Advanced

More in-depth discussion of certain issues, intended for people interested in customising what idiscore does.

How idiscore deidentifies a dataset

This section gives a sense of what the method idiscore.core.Core.deidentify() actually does, starting with a very specific example.

  • A dataset is fed into idiscore.core.Core.deidentify() on a default idiscore instance. What will happen?

  • Suppose that the dataset contains the DICOM element 0010, 0010 (PatientName) - Jane Smith

  • An idiscore.operators.Operator() is applied to this element. In the default case this is idiscore.operators.Empty(). This will keep the element, but remove its value.

  • The Empty operator was applied because the default profile contains the Rule 0010, 0010 (PatientName) - Empty
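The lookup described in these bullets can be modelled in a few lines. This is a simplified sketch using plain dictionaries, not idiscore's actual implementation: `empty`, `keep`, `PROFILE`, and `apply_rule` are hypothetical stand-ins for the operator, the profile, and the rule-matching step, and keeping elements that have no rule is a choice made for this toy model only.

```python
# Toy model: a dataset maps (group, element) tags to values.
dataset = {(0x0010, 0x0010): "Jane Smith", (0x0008, 0x0060): "CT"}


def empty(value):
    """Keep the element but blank its value (stand-in for an Empty operator)."""
    return ""


def keep(value):
    """Leave the value untouched."""
    return value


# Hypothetical profile: one rule per tag, pairing the tag with an operator.
PROFILE = {(0x0010, 0x0010): empty}


def apply_rule(tag, value, profile):
    """Apply the matching operator; tags without a rule are kept in this toy model."""
    operator = profile.get(tag, keep)
    return operator(value)


deidentified = {tag: apply_rule(tag, value, PROFILE) for tag, value in dataset.items()}

# PatientName is still present in the result, but its value is now empty.
print(deidentified)
```

The point of the sketch is the separation of concerns: the profile decides which operator applies to which tag, and the operator alone decides what happens to the value.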

Overview

  • idiscore.core.Core.deidentify() deidentifies a dataset in 4 steps:

    1. idiscore.core.Core.apply_bouncers() Can reject a dataset if it is considered too hard to deidentify.

    2. idiscore.core.Core.apply_pixel_processing() Removes part of the image data if required. If the image data is unknown or something else goes wrong, the dataset is rejected.

    3. idiscore.core.Core.apply_rules() Process all DICOM elements. Remove, replace, keep, according to the profile that was set. See for example all rules for the idiscore default profile. This step is the most involved of the steps listed here. It will be

    4. Insert any new elements into the dataset. idiscore.insertions.get_deidentification_method(), for example, generates an element that indicates what method was used for deidentification.
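The four steps can be sketched as a small pipeline. This is a toy model on plain dictionaries, assuming nothing about idiscore's internals: the exception name `DatasetRejected`, the rejection condition, the `PixelRows` key, and the string-based rule actions are all hypothetical and only illustrate the order of the steps.

```python
class DatasetRejected(Exception):
    """Raised when a dataset cannot be deidentified safely (hypothetical name)."""


def apply_bouncers(dataset):
    # Step 1: refuse datasets considered too hard to deidentify.
    if dataset.get("Modality") == "UNKNOWN":
        raise DatasetRejected("modality not supported")


def apply_pixel_processing(dataset):
    # Step 2: remove part of the image if required (here: drop a banner row).
    if dataset.get("BurnedInAnnotation") == "YES":
        dataset["PixelRows"] = dataset["PixelRows"][1:]
    return dataset


def apply_rules(dataset, rules):
    # Step 3: process every element according to the profile.
    result = {}
    for name, value in dataset.items():
        action = rules.get(name, "keep")
        if action == "remove":
            continue
        result[name] = "" if action == "empty" else value
    return result


def insert_elements(dataset):
    # Step 4: record how the dataset was deidentified.
    dataset["DeidentificationMethod"] = "toy pipeline"
    return dataset


def deidentify(dataset, rules):
    apply_bouncers(dataset)
    dataset = apply_pixel_processing(dict(dataset))  # work on a copy
    dataset = apply_rules(dataset, rules)
    return insert_elements(dataset)


rules = {"PatientName": "empty", "PatientBirthDate": "remove"}
original = {
    "PatientName": "Jane Smith",
    "PatientBirthDate": "19800101",
    "Modality": "CT",
    "BurnedInAnnotation": "YES",
    "PixelRows": ["banner", "row1", "row2"],
}
clean = deidentify(original, rules)
print(clean)
```

Rejection happens before any processing, so a dataset that fails the first step is never partially deidentified.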

How to modify and extend processing

Custom profile

"""You can set your own rules for specific DICOM tags. Be aware that this might

mean the deidentification is no longer DICOM-compliant
"""

import pydicom

from idiscore.core import Core, Profile
from idiscore.defaults import get_dicom_rule_sets
from idiscore.identifiers import RepeatingGroup, SingleTag
from idiscore.operators import Hash, Remove
from idiscore.rules import Rule, RuleSet

# Custom rules that will hash the patient name and remove all curve data
my_ruleset = RuleSet(
    rules=[
        Rule(SingleTag("PatientName"), Hash()),
        Rule(RepeatingGroup("50xx,xxxx"), Remove()),
    ],
    name="My Custom RuleSet",
)

sets = get_dicom_rule_sets()  # Contains official DICOM deidentification rules
profile = Profile(  # add custom rules to basic profile
    rule_sets=[sets.basic_profile, my_ruleset]
)
core = Core(profile)  # Create a deidentification core

# read a DICOM dataset from file and write to another
core.deidentify(pydicom.dcmread("my_file.dcm")).save_as("deidentified.dcm")

Each Rule above consists of two parts: an Identifier, which designates what the rule applies to, and an Operator, which defines what the rule does.

Custom processing

If the existing Operators in idiscore.operators are not enough, you can define your own by extending idiscore.operators.Operator(). If these operators could be useful for other users as well, please consider creating a pull request (see Contributing).
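A custom operator might look like the sketch below. It assumes the base class exposes an `apply(element, dataset)` method that returns the processed element. That signature, and the `Operator` and `Element` classes defined here, are stand-ins so the example runs without idiscore or pydicom installed; check idiscore.operators for the real interface before subclassing. `KeepYear` is a hypothetical operator.

```python
class Operator:
    """Stand-in for idiscore.operators.Operator (interface is an assumption)."""

    def apply(self, element, dataset=None):
        raise NotImplementedError


class Element:
    """Minimal stand-in for a DICOM data element: a keyword and a value."""

    def __init__(self, keyword, value):
        self.keyword = keyword
        self.value = value


class KeepYear(Operator):
    """Hypothetical operator: reduce a DICOM date (YYYYMMDD) to January 1st of its year."""

    def apply(self, element, dataset=None):
        value = str(element.value)
        if len(value) != 8 or not value.isdigit():
            # Not a full DICOM date; blank it rather than risk leaking it.
            return Element(element.keyword, "")
        return Element(element.keyword, value[:4] + "0101")


birth_date = Element("PatientBirthDate", "19800423")
result = KeepYear().apply(birth_date)
print(result.value)  # 19800101
```

With idiscore installed, the stand-in classes would be dropped and `KeepYear` would extend idiscore.operators.Operator instead. It could then presumably be paired with an Identifier in a Rule in the same way as the built-in operators shown in the custom profile example.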