modules

All modules in idiscore

idiscore.annotation module

idiscore.bouncers module

class idiscore.bouncers.Bouncer[source]

Bases: object

Inspects a dataset and either rejects it or lets it through

description = 'Bouncer'
inspect(dataset: Dataset)[source]

Check given dataset, raise exception if it should be rejected

Parameters

dataset (Dataset) – The DICOM dataset to inspect

Return type

None

Raises

BouncerException – When this dataset cannot be deidentified for any reason

exception idiscore.bouncers.BouncerException[source]

Bases: IDISCoreError

class idiscore.bouncers.RejectEncapsulatedImageStorage[source]

Bases: Bouncer

description = 'Reject encapsulated PDF and CDA'
inspect(dataset: Dataset)[source]

Check given dataset, raise exception if it should be rejected

Parameters

dataset (Dataset) – The DICOM dataset to inspect

Return type

None

Raises

BouncerException – When this dataset cannot be deidentified for any reason

class idiscore.bouncers.RejectKOGSPS[source]

Bases: Bouncer

description = 'Reject PresentationStorage and KeyObjectSelectionDocument'
inspect(dataset: Dataset)[source]

Rejects three types of DICOM objects:

  • 1.2.840.10008.5.1.4.1.1.11.1 - GrayscaleSoftcopyPresentationStateStorage

  • 1.2.840.10008.5.1.4.1.1.88.59 - KeyObjectSelectionDocumentStorage

  • 1.2.840.10008.5.1.4.1.1.11.2 - ColorSoftcopyPresentationStateStorage

These often contain IDs and physician names in their SeriesDescription. See ticket #8465

Raises

BouncerException – When the dataset is one of these types
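As a rough illustration of the check described above, the following standalone sketch compares a SOPClassUID against the three listed UIDs. It uses a plain dict in place of a pydicom Dataset. The names REJECTED_SOP_CLASSES and inspect_dataset, and the locally defined BouncerException, are stand-ins chosen for this example and are not taken from the idiscore source.

```python
class BouncerException(Exception):
    """Local stand-in for idiscore.bouncers.BouncerException."""


# The three SOP class UIDs listed in the description above
REJECTED_SOP_CLASSES = {
    "1.2.840.10008.5.1.4.1.1.11.1": "GrayscaleSoftcopyPresentationStateStorage",
    "1.2.840.10008.5.1.4.1.1.88.59": "KeyObjectSelectionDocumentStorage",
    "1.2.840.10008.5.1.4.1.1.11.2": "ColorSoftcopyPresentationStateStorage",
}


def inspect_dataset(dataset: dict) -> None:
    """Raise BouncerException if the dataset has a rejected SOP class."""
    sop_class = dataset["SOPClassUID"]
    if sop_class in REJECTED_SOP_CLASSES:
        raise BouncerException(
            f"Rejected {REJECTED_SOP_CLASSES[sop_class]} ({sop_class})"
        )
```

A dataset with any other SOPClassUID passes through and the function returns None.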

class idiscore.bouncers.RejectNonStandardDicom[source]

Bases: Bouncer

description = 'Reject non-standard DICOM types by SOPClassUID'
inspect(dataset: Dataset)[source]

Reject all DICOM that is not one of the standard SOPClass types.

All standard types are listed in DICOM PS3.4 section B.5: http://dicom.nema.org/dicom/2013/output/chtml/part04/sect_B.5.html

idiscore.bouncers.handle_required_tag_not_found(func)[source]

Decorator for handling missing dataset keys, together with RequiredDataset()

Reduces duplicated code in most Bouncer.inspect() definitions
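The general shape of such a decorator might look like the sketch below. Which exception the real decorator raises is not stated here, so re-raising as BouncerException is an assumption, and both exception classes are local stand-ins. inspect_modality is a hypothetical check used only to exercise the decorator.

```python
import functools


class RequiredTagNotFound(Exception):
    """Local stand-in for idiscore.dataset.RequiredTagNotFound."""


class BouncerException(Exception):
    """Local stand-in for idiscore.bouncers.BouncerException."""


def handle_required_tag_not_found(func):
    """Re-raise a missing-tag error as a rejection (assumed behaviour)."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except RequiredTagNotFound as e:
            raise BouncerException(f"Required tag missing: {e}") from e

    return wrapper


@handle_required_tag_not_found
def inspect_modality(dataset: dict) -> None:
    # Hypothetical check that needs the Modality element
    if "Modality" not in dataset:
        raise RequiredTagNotFound("Modality")
```

With this pattern each inspect() body can access required elements directly and leave the error translation to the decorator.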

idiscore.core module

idiscore.dataset module

Additions to the pydicom Dataset object

class idiscore.dataset.RequiredDataset(*args: Union[Dataset, MutableMapping[BaseTag, Union[DataElement, RawDataElement]]], **kwargs: Any)[source]

Bases: Dataset

A pydicom Dataset that raises distinctive errors when accessing missing keys

Written specifically to handle missing keys on a dataset. By default a Dataset instance raises KeyError and AttributeError. These are too general to catch safely around larger pieces of code, and wrapping every individual key access in its own try/except block is verbose.

Raises

RequiredTagNotFound – When a requested key is not found in this dataset. This applies to attribute access, like dataset.PatientID, and to dict access, like dataset[‘PatientID’]

Notes

Init like this:

>>> ds = Dataset()
>>> rds = RequiredDataset(ds)

Now you can handle missing keys cleanly without accidentally catching other KeyErrors:

>>> try:
...     important_dataset_check(rds)
... except RequiredTagNotFound:
...     print('check failed due to missing information')
exception idiscore.dataset.RequiredTagNotFound[source]

Bases: IDISCoreError

idiscore.defaults module

idiscore.delta module

class idiscore.delta.Delta(tag: BaseTag, before, after)[source]

Bases: object

A change in a DICOM element value after deidentification

full_description() str[source]

Full human-readable description of the change that happened

has_changed() bool[source]

Has changed or has been removed after deidentification

property status: str
property tag_name: str
class idiscore.delta.DeltaStatusCodes[source]

Bases: object

How has a DICOM element changed?

ALL = {'CHANGED', 'CREATED', 'EMPTIED', 'REMOVED', 'UNCHANGED'}
CHANGED = 'CHANGED'
CREATED = 'CREATED'
EMPTIED = 'EMPTIED'
REMOVED = 'REMOVED'
UNCHANGED = 'UNCHANGED'
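How a before/after pair maps onto these codes is not spelled out here. The sketch below shows one plausible classification, under the assumption that None means "element not present" and an empty string means "present but emptied". The function name classify and these rules are illustrative, not the library's implementation.

```python
from typing import Optional

CHANGED, CREATED, EMPTIED, REMOVED, UNCHANGED = (
    "CHANGED", "CREATED", "EMPTIED", "REMOVED", "UNCHANGED",
)


def classify(before: Optional[str], after: Optional[str]) -> str:
    """Map a before/after value pair to a status code (assumed rules).

    None stands for 'element not present in the dataset'.
    """
    if before is None and after is None:
        return UNCHANGED
    if before is None:
        return CREATED
    if after is None:
        return REMOVED
    if after == before:
        return UNCHANGED
    if after == "":
        return EMPTIED
    return CHANGED
```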

idiscore.exceptions module

exception idiscore.exceptions.AnnotationValidationFailedError[source]

Bases: IDISCoreError

exception idiscore.exceptions.IDISCoreError[source]

Bases: Exception

Base for all exceptions in IDIS core

exception idiscore.exceptions.SafePrivateError[source]

Bases: IDISCoreError

idiscore.identifiers module

Ways to designate a DICOM tag or a group of DICOM tags

class idiscore.identifiers.PrivateBlockTagIdentifier(tag: str)[source]

Bases: TagIdentifier

A private DICOM tag with a private creator. Like ‘0013,[MyCompany]01’

In this example [MyCompany] refers to whatever block was reserved by private creator identifier ‘MyCompany’

For more info on private blocks, see DICOM standard part 5, section 7.8.1 (‘Private Data Elements’)

BLOCK_TAG_REGEX = re.compile('(?P<group>[0-9A-F]{4}),?\\s?\\[(?P<private_creator>.*)\\](?P<element>[0-9,A-F]*)', re.IGNORECASE)
as_python() str[source]

For special export. Python code that recreates this instance

classmethod init_explicit(group: int, private_creator: str, element: int)[source]

Create with explicit parameters. This cannot be the main init because TagIdentifier classes need to be instantiable from a single string and uphold cls(cls.tag)=cls

Parameters
  • group (int) – DICOM group, between 0x0000 and 0xFFFF

  • private_creator (str) – Name of the private creator for this tag

  • element (int) – The final two hex digits (one byte) of the element, between 0x00 and 0xFF

key() str[source]

For sane sorting, make sure this matches the key format of other identifiers

matches(element: DataElement) bool[source]

True if private element has been created by private creator and the rest of the group and element match up

name() str[source]

Human readable name for this tag

number_of_matchable_tags() int[source]

How many tags could this identifier match?

classmethod parse_tag(tag: str) Tuple[int, str, int][source]

Parses ‘xxxx,[creator]yy’ into xxxx, creator and yy components. xxxx and yy are interpreted as hexadecimals

Parameters

tag (str) – Format: ‘xxxx,[creator]yy’ where xxxx and yy are hexadecimals. Case insensitive.

Returns

The xxxx (int), creator (str) and yy (int) components of tag string ‘xxxx,[creator]yy’, where xxxx and yy are read as hexadecimals

Return type

Tuple[int, str, int]

Raises

ValueError – When input cannot be parsed
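The parsing described above can be sketched with the BLOCK_TAG_REGEX pattern shown earlier for this class. This is a standalone approximation: the use of fullmatch and the exact error messages are assumptions, not the library's code.

```python
import re
from typing import Tuple

# Same pattern as BLOCK_TAG_REGEX shown above
BLOCK_TAG_REGEX = re.compile(
    r"(?P<group>[0-9A-F]{4}),?\s?\[(?P<private_creator>.*)\](?P<element>[0-9,A-F]*)",
    re.IGNORECASE,
)


def parse_tag(tag: str) -> Tuple[int, str, int]:
    """Split 'xxxx,[creator]yy' into (group, creator, element)."""
    match = BLOCK_TAG_REGEX.fullmatch(tag.strip())
    if match is None:
        raise ValueError(f"Cannot parse '{tag}' as 'xxxx,[creator]yy'")
    try:
        # Both numeric parts are read as hexadecimals
        group = int(match.group("group"), 16)
        element = int(match.group("element"), 16)
    except ValueError as e:
        raise ValueError(f"Cannot parse '{tag}': {e}") from e
    return group, match.group("private_creator"), element
```

For example, parse_tag("0013,[MyCompany]01") yields group 0x0013, creator "MyCompany" and element 0x01.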

property tag: str
static to_tag(group: int, private_creator: str, element: int) str[source]

Tag string like ‘1301,[creator]01’ from individual elements

Parameters
  • group (int) – DICOM group, between 0x0000 and 0xFFFF

  • private_creator (str) – Name of the private creator for this tag

  • element (int) – The final two hex digits (one byte) of the element, between 0x00 and 0xFF
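A minimal sketch of this formatting step, assuming lowercase hex output and adding range checks that follow the parameter descriptions above. Whether the real method validates its input is not stated here.

```python
def to_tag(group: int, private_creator: str, element: int) -> str:
    """Format as 'xxxx,[creator]yy' with lowercase hex digits."""
    if not 0x0000 <= group <= 0xFFFF:
        raise ValueError(f"group {group:#x} out of range 0x0000-0xFFFF")
    if not 0x00 <= element <= 0xFF:
        raise ValueError(f"element {element:#x} out of range 0x00-0xFF")
    # Zero-pad the group to 4 hex digits and the element to 2
    return f"{group:04x},[{private_creator}]{element:02x}"
```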

class idiscore.identifiers.PrivateTags[source]

Bases: TagIdentifier

Matches any private DICOM tag. A private tag has an odd group number

static as_python() str[source]

For special export. Python code that recreates this instance

key() str[source]

String used in comparison operators

Also, a key should contain all information needed to recreate an instance. If ‘tag’ is a TagIdentifier instance, the following should hold:

>>> tag(tag.key()) == tag
matches(element: DataElement) bool[source]

The given element matches this identifier

name() str[source]

Human-readable name for this tag

number_of_matchable_tags() int[source]

The number of distinct tags that this identifier could match

Used to determine order of matching (specific -> general)

class idiscore.identifiers.RepeatingGroup(tag: Union[str, RepeatingTag])[source]

Bases: TagIdentifier

A DICOM tag where not all elements are filled. Like (50xx,xxxx)

as_python() str[source]

For special export. Python code that recreates this instance

key() str[source]

For sane sorting, make sure this matches the key format of other identifiers

matches(element: DataElement) bool[source]

True if the tag values match this repeater in all places without an ‘x’

name() str[source]

Human readable name for this tag

number_of_matchable_tags() int[source]

The number of distinct tags that this identifier could match

Used to determine order of matching (specific -> general)

class idiscore.identifiers.RepeatingTag(tag: str)[source]

Bases: object

DICOM tag with x’s in it to denote wildcards, like (50xx,xxxx) for curve data

See http://dicom.nema.org/medical/dicom/current/output/chtml/part05/sect_7.6.html

Raises

ValueError – on init if tag cannot be parsed as a DICOM repeater group

Notes

I would prefer to take any pydicom way of working with repeater tags, but the current version of pydicom (2.0) only offers limited lookup support as far as I can see

as_mask() int[source]

Byte mask that can remove the byte positions that have value ‘x’

RepeatingTag(‘0010,xx10’).as_mask() -> 0xffff00ff

RepeatingTag(‘50xx,xxxx’).as_mask() -> 0xff000000

name() str[source]

Human-readable name for this repeater tag, from pydicom lists

number_of_wildcard_positions() int[source]

Number of x’s in this wildcard

static parse_tag_string(tag: str) str[source]

Cleans tag string and outputs it in standard format. Raises ValueError if tag is not of the correct format like (0010,10xx).

Returns

Standard format: an 8-character hex string with ‘x’ in wildcard positions, like 0010xx10 or 75f300xx

Return type

str

static_component() int[source]

The int value of all bytes of this tag that are not ‘x’

RepeatingTag(‘0010,xx10’).static_component() -> 0x00100010

RepeatingTag(‘50xx,xxxx’).static_component() -> 0x50000000
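The mask and the static component together allow a match test with a single bitwise AND. The sketch below works on the standard 8-character string format (lowercase ‘x’ as wildcard) and reproduces the example values given for as_mask() and static_component(). The free functions and the matches helper are illustrative and not part of the documented API.

```python
def as_mask(tag: str) -> int:
    """0xf for every fixed hex digit, 0x0 for every 'x' wildcard."""
    return int("".join("0" if c == "x" else "f" for c in tag), 16)


def static_component(tag: str) -> int:
    """Value of the fixed digits, with every 'x' read as 0."""
    return int(tag.replace("x", "0"), 16)


def matches(tag: str, value: int) -> bool:
    """True if value equals the tag in all non-wildcard positions."""
    return (value & as_mask(tag)) == static_component(tag)
```

Masking zeroes out the wildcard positions of the candidate value, so only the fixed digits take part in the comparison.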

class idiscore.identifiers.SingleTag(tag: Union[BaseTag, str, Tuple[int, int]])[source]

Bases: TagIdentifier

Matches a single DICOM tag like (0010,0010) or ‘PatientName’

as_python() str[source]

For special export. Python code that recreates this instance

key() str[source]

Return a valid Tag() string argument

matches(element: DataElement) bool[source]

The given element matches this identifier

name() str[source]

Human-readable name for this tag

number_of_matchable_tags() int[source]

The number of distinct tags that this identifier could match

Used to determine order of matching (specific -> general)

class idiscore.identifiers.TagIdentifier[source]

Bases: object

Identifies a single DICOM tag or repeating group like (50xx,xxxx)

Using just DICOM tags is too limited for defining deidentification. We want to be able to represent for example:

  • all curves (50xx,xxxx)

  • a private tag with private creator group (01[PrivateCreatorName],0010)

as_python() str[source]

For special export. Python code that recreates this instance

key() str[source]

String used in comparison operators

Also, a key should contain all information needed to recreate an instance. If ‘tag’ is a TagIdentifier instance, the following should hold:

>>> tag(tag.key()) == tag
matches(element: DataElement) bool[source]

The given element matches this identifier

name() str[source]

Human-readable name for this tag

number_of_matchable_tags() int[source]

The number of distinct tags that this identifier could match

Used to determine order of matching (specific -> general)

idiscore.identifiers.clean_tag_string(x)[source]

Remove common clutter from pydicom Tag.__str__() output

idiscore.identifiers.get_keyword(tag)[source]

Human-readable keyword for known DICOM tags, or ‘Unknown’

idiscore.image_processing module

Classes and methods for working with image part of a DICOM dataset

exception idiscore.image_processing.CriterionException[source]

Bases: IDISCoreError

class idiscore.image_processing.PIILocation(areas: List[SquareArea], criterion: Optional[Callable[[Dataset], bool]] = None)[source]

Bases: object

One or more areas in a DICOM image slice that might contain Personally Identifiable Information (PII)

Notes

A PIILocation is 2D. Cleaning will be done on each slice individually.

Responsibilities:

  • Holds location information. Does not alter PixelData itself

  • Determine whether it applies to a given Dataset

exists_in(dataset: Dataset) bool[source]

True if the given PII location exists in the given dataset

Raises

CriterionException – If for some reason no True or False response can be given for this dataset

class idiscore.image_processing.PIILocationList(locations: Optional[List[PIILocation]] = None)[source]

Bases: object

Defines where in images there might be Personally Identifiable Information

exception idiscore.image_processing.PixelDataProcessorException[source]

Bases: IDISCoreError

class idiscore.image_processing.PixelProcessor(location_list: PIILocationList)[source]

Bases: object

Finds and removes burned-in sensitive information in images

Notes

Responsibilities:

  • Checking whether a dataset needs cleaning of its pixel data

  • Checking whether redaction can be performed

  • Actually performing the blackout

clean_pixel_data(dataset: Dataset) Dataset[source]

Remove pixel data that needs cleaning and mark the dataset as safe

If this dataset does not look suspicious it will be returned unchanged

Raises

PixelDataProcessorException – If pixel data needs cleaning but no information can be found

get_locations(dataset: Dataset) List[PIILocation][source]

Get all locations with person information in the current dataset

Raises

PixelDataProcessorException – When locations cannot be found properly

static needs_cleaning(dataset: Dataset) bool[source]

Whether this dataset should be rejected as unsafe without cleaning

Made this into a separate method as for many DICOM datasets you can reasonably skip image processing altogether.

Raises

PixelDataProcessorException – When it cannot be determined whether this dataset needs cleaning or not. Usually due to missing DICOM elements

class idiscore.image_processing.SquareArea(origin_x: int, origin_y: int, width: int, height: int)[source]

Bases: object

A 2D rectangular area in pixel coordinates

height: int
origin_x: int
origin_y: int
width: int
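To make the role of these four fields concrete, here is a small sketch of blacking out an area in a 2D pixel grid. It uses nested lists instead of real PixelData. The function black_out, the row/column orientation and the clipping behaviour are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SquareArea:
    """Same four fields as the class above."""

    origin_x: int
    origin_y: int
    width: int
    height: int


def black_out(pixels: List[List[int]], area: SquareArea) -> List[List[int]]:
    """Return a copy of a 2D pixel grid with the given area set to 0.

    pixels is indexed as pixels[row][column], so y selects the row.
    Parts of the area that fall outside the grid are ignored.
    """
    result = [row[:] for row in pixels]
    y_end = min(area.origin_y + area.height, len(result))
    for y in range(max(area.origin_y, 0), y_end):
        x_end = min(area.origin_x + area.width, len(result[y]))
        for x in range(max(area.origin_x, 0), x_end):
            result[y][x] = 0
    return result
```

Because a PIILocation is 2D, a multi-slice image would apply the same operation to each slice in turn.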

idiscore.insertions module

Common DICOM elements you might like to insert into deidentified datasets

This includes the insertions from DICOM PS3.15 E1-1.6:

The attribute Patient Identity Removed (0012,0062) shall be replaced or added to the dataset with a value of YES, and one or more codes from CID 7050 “De-identification Method” corresponding to the profile and options used shall be added to De-identification Method Code Sequence (0012,0064). A text string describing the method used may also be inserted in or added to De-identification Method (0012,0063), but is not required.

idiscore.insertions.get_deidentification_method(method: str = 'idiscore 1.0.3') DataElement[source]

Create the element (0012,0063) - DeIdentificationMethod

A string description of the deidentification method used

Parameters

method (str, optional) – String representing the deidentification method used. Defaults to ‘idiscore <version>’

idiscore.insertions.get_idis_code_sequence(ruleset_names: List[str]) DataElement[source]

Create the element (0012,0064) - DeIdentificationMethodCodeSequence

This sequence specifies what kind of anonymization has been performed. It is quite free form. This implementation uses the following format:

DeIdentificationMethodCodeSequence will contain the code of each official DICOM deidentification profile that was used. Codes are taken from Table CID 7050

Parameters

ruleset_names (List[str]) – list of names as defined in nema.E1_1_METHOD_INFO

Returns

Sequence element (0012,0064) - DeIdentificationMethodCodeSequence. Will contain the code of each official DICOM deidentification profile passed

Return type

DataElement

Raises

ValueError – When any name in ruleset_names is not recognized as a standard DICOM rule set

idiscore.nema module

Encodes official NEMA information like Basic Application Level Confidentiality Profile and Options as defined in table E1-1 here: http://dicom.nema.org/medical/dicom/current/output/chtml/part15/sect_E.3.html

This module should model public DICOM information. Any additional information such as default implementations for the action codes should be put in ‘rule_sets.py’

class idiscore.nema.ActionCode(key, var_name)

Bases: tuple

key

Alias for field number 0

var_name

Alias for field number 1

class idiscore.nema.ActionCodes[source]

Bases: object

NEMA specifications from table E1-1 of what to do with each tag

Modelling these to lessen room for error and to make it easier to write this to disk

ALL = {ActionCode(key='C', var_name='CLEAN'), ActionCode(key='D', var_name='DUMMY'), ActionCode(key='K', var_name='KEEP'), ActionCode(key='U', var_name='UID'), ActionCode(key='X', var_name='REMOVE'), ActionCode(key='X/D', var_name='REMOVE_OR_DUMMY'), ActionCode(key='X/Z', var_name='REMOVE_OR_EMPTY'), ActionCode(key='X/Z/D', var_name='REMOVE_OR_EMPTY_OR_DUMMY'), ActionCode(key='X/Z/U*', var_name='REMOVE_OR_EMPTY_OR_UID'), ActionCode(key='Z', var_name='EMPTY'), ActionCode(key='Z/D', var_name='REPLACE_OR_DUMMY')}
CLEAN = ActionCode(key='C', var_name='CLEAN')
DUMMY = ActionCode(key='D', var_name='DUMMY')
EMPTY = ActionCode(key='Z', var_name='EMPTY')
KEEP = ActionCode(key='K', var_name='KEEP')
PER_STRING = {'C': ActionCode(key='C', var_name='CLEAN'), 'D': ActionCode(key='D', var_name='DUMMY'), 'K': ActionCode(key='K', var_name='KEEP'), 'U': ActionCode(key='U', var_name='UID'), 'X': ActionCode(key='X', var_name='REMOVE'), 'X/D': ActionCode(key='X/D', var_name='REMOVE_OR_DUMMY'), 'X/Z': ActionCode(key='X/Z', var_name='REMOVE_OR_EMPTY'), 'X/Z/D': ActionCode(key='X/Z/D', var_name='REMOVE_OR_EMPTY_OR_DUMMY'), 'X/Z/U*': ActionCode(key='X/Z/U*', var_name='REMOVE_OR_EMPTY_OR_UID'), 'Z': ActionCode(key='Z', var_name='EMPTY'), 'Z/D': ActionCode(key='Z/D', var_name='REPLACE_OR_DUMMY')}
REMOVE = ActionCode(key='X', var_name='REMOVE')
REMOVE_OR_DUMMY = ActionCode(key='X/D', var_name='REMOVE_OR_DUMMY')
REMOVE_OR_EMPTY = ActionCode(key='X/Z', var_name='REMOVE_OR_EMPTY')
REMOVE_OR_EMPTY_OR_DUMMY = ActionCode(key='X/Z/D', var_name='REMOVE_OR_EMPTY_OR_DUMMY')
REMOVE_OR_EMPTY_OR_UID = ActionCode(key='X/Z/U*', var_name='REMOVE_OR_EMPTY_OR_UID')
REPLACE_OR_DUMMY = ActionCode(key='Z/D', var_name='REPLACE_OR_DUMMY')
UID = ActionCode(key='U', var_name='UID')
classmethod get_code(key: str)[source]

Return the action code that corresponds to the given key string
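The lookup can be pictured as a dictionary keyed on the code string, as PER_STRING above suggests. The sketch below rebuilds a small subset of that table. Raising ValueError for an unknown key is an assumption, since the behaviour for unrecognised strings is not documented here.

```python
from collections import namedtuple

ActionCode = namedtuple("ActionCode", ["key", "var_name"])

# A subset of the codes listed above, indexed by their key string
PER_STRING = {
    code.key: code
    for code in (
        ActionCode(key="X", var_name="REMOVE"),
        ActionCode(key="Z", var_name="EMPTY"),
        ActionCode(key="X/Z", var_name="REMOVE_OR_EMPTY"),
    )
}


def get_code(key: str) -> ActionCode:
    """Look up the action code for a key string such as 'X/Z'."""
    try:
        return PER_STRING[key]
    except KeyError:
        raise ValueError(f"Unknown action code '{key}'") from None
```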

class idiscore.nema.NemaDeidMethodInfo(table_header, full_name, short_name, code)

Bases: tuple

code

Alias for field number 3

full_name

Alias for field number 1

short_name

Alias for field number 2

table_header

Alias for field number 0

class idiscore.nema.RawNemaRuleSet(rules: List[Tuple[TagIdentifier, ActionCode]], name: str, code: str)[source]

Bases: object

Defines the action code from table E1-1 for each DICOM identifier

‘raw’ because an action code is just a string and cannot be applied to a tag. This class defines an intermediate stage in parsing the DICOM confidentiality options. Each identifier has been parsed, but operations have not been assigned

compile(action_mapping: Dict[ActionCode, Operator]) RuleSet[source]

Replace each action code (string) with actual operator (function)
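Conceptually, compiling is a substitution of each action code by the operator mapped to it. The following sketch shows that step with plain strings and callables standing in for TagIdentifier, ActionCode and Operator. The name compile_rules and the KeyError for unmapped codes are assumptions for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-ins: a tag string, an action code key, and an operator
TagString = str
ActionKey = str
Operation = Callable[[str], str]


def compile_rules(
    raw_rules: List[Tuple[TagString, ActionKey]],
    action_mapping: Dict[ActionKey, Operation],
) -> Dict[TagString, Operation]:
    """Replace each action code with the operation mapped to it."""
    compiled = {}
    for tag, action in raw_rules:
        if action not in action_mapping:
            raise KeyError(f"No operation defined for action code '{action}'")
        compiled[tag] = action_mapping[action]
    return compiled
```

Keeping the mapping separate means the same raw rule set can be compiled with different operator choices for ambiguous codes.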

idiscore.operators module

class idiscore.operators.Clean(safe_private: Optional[SafePrivateDefinition] = None, delta_provider: Optional[TimeDeltaProvider] = None)[source]

Bases: Operator

Replace with values of similar meaning known not to contain identifying information and consistent with the VR

‘similar meaning’ is open to interpretation.

Also handles private tags

apply(element: DataElement, dataset: Optional[Dataset] = None) DataElement[source]

Perform this operation on the given element.

Parameters
  • element (DataElement) – The DICOM element to operate on

  • dataset (Dataset, optional) – The DICOM dataset that this element comes from. This can be inspected to determine what to do with element. Should not be changed in any way. Defaults to None

Returns

A new DataElement instance to replace the given element with

Return type

DataElement

Raises
  • ValueError – When this operation cannot be performed on this element. For example when the data element has a number ValueType but the operation is for a string

  • ElementShouldBeRemoved – Signals that this element should be removed from the dataset. Operators cannot do this by themselves as they can only operate on the element given

clean_date_time(element: DataElement, dataset: Dataset) DataElement[source]

Clean a DICOM date or time

Do this by subtracting a random increment from it

clean_private(element: DataElement, dataset: Dataset) DataElement[source]

Clean private DICOM element

is_safe(element: DataElement, dataset: Dataset) bool[source]

True if this element is safe according to safe private definition

Raises

SafePrivateError – If for some reason it cannot be determined whether this is safe

name = 'Clean'
static parse_date_time(value: str) Tuple[str, datetime][source]

Parse DICOM date, datetime or time string

Parameters

value (str) – A DICOM date, datetime or time string

Returns

strptime date format string, parsed datetime instance

Return type

Tuple[str, datetime]

Raises

ValueError – If value cannot be parsed
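Returning the format string together with the parsed value makes it possible to write a shifted value back in the same form. The sketch below shows the idea for three fixed-width forms only. Real DICOM DT and TM values may also carry fractional seconds or a timezone offset, which this simplified version does not handle. FORMAT_BY_LENGTH and shift_back are names invented for the example.

```python
from datetime import datetime, timedelta
from typing import Tuple

# Simplification: only three common fixed-width forms are handled
FORMAT_BY_LENGTH = {
    8: "%Y%m%d",  # DA, e.g. 20200131
    6: "%H%M%S",  # TM, e.g. 134501
    14: "%Y%m%d%H%M%S",  # DT, e.g. 20200131134501
}


def parse_date_time(value: str) -> Tuple[str, datetime]:
    """Return (strptime format, parsed datetime) for a DICOM string."""
    try:
        fmt = FORMAT_BY_LENGTH[len(value)]
    except KeyError:
        raise ValueError(f"Unsupported date/time value '{value}'") from None
    return fmt, datetime.strptime(value, fmt)  # raises ValueError if invalid


def shift_back(value: str, delta: timedelta) -> str:
    """Subtract delta and write the result back in the original format."""
    fmt, parsed = parse_date_time(value)
    return (parsed - delta).strftime(fmt)
```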

exception idiscore.operators.ElementShouldBeRemoved[source]

Bases: IDISCoreError

class idiscore.operators.Empty[source]

Bases: Operator

Make the content of element empty

apply(element: DataElement, dataset: Optional[Dataset] = None) DataElement[source]

Perform this operation on the given element.

Parameters
  • element (DataElement) – The DICOM element to operate on

  • dataset (Dataset, optional) – The DICOM dataset that this element comes from. This can be inspected to determine what to do with element. Should not be changed in any way. Defaults to None

Returns

A new DataElement instance to replace the given element with

Return type

DataElement

Raises
  • ValueError – When this operation cannot be performed on this element. For example when the data element has a number ValueType but the operation is for a string

  • ElementShouldBeRemoved – Signals that this element should be removed from the dataset. Operators cannot do this by themselves as they can only operate on the element given

name = 'Empty'
class idiscore.operators.Hash[source]

Bases: Operator

Replace value with an MD5 hash of that value
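In its simplest form, the hashing step could look like the sketch below. UTF-8 encoding and a hexadecimal digest are assumptions, since the encoding and output form are not specified here.

```python
import hashlib


def md5_hash(value: str) -> str:
    """Hex MD5 digest of the UTF-8 encoded value."""
    return hashlib.md5(value.encode("utf-8")).hexdigest()
```

The same input always produces the same 32-character output, so hashed values remain consistent across datasets.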

apply(element: DataElement, dataset: Optional[Dataset] = None) DataElement[source]

Perform this operation on the given element.

Parameters
  • element (DataElement) – The DICOM element to operate on

  • dataset (Dataset, optional) – The DICOM dataset that this element comes from. This can be inspected to determine what to do with element. Should not be changed in any way. Defaults to None

Returns

A new DataElement instance to replace the given element with

Return type

DataElement

Raises
  • ValueError – When this operation cannot be performed on this element. For example when the data element has a number ValueType but the operation is for a string

  • ElementShouldBeRemoved – Signals that this element should be removed from the dataset. Operators cannot do this by themselves as they can only operate on the element given

name = 'Hash'
class idiscore.operators.HashUID(root_uid: Optional[str] = None)[source]

Bases: Operator

Replace element with a valid UID

apply(element: DataElement, dataset: Optional[Dataset] = None) DataElement[source]

Perform this operation on the given element.

Parameters
  • element (DataElement) – The DICOM element to operate on

  • dataset (Dataset, optional) – The DICOM dataset that this element comes from. This can be inspected to determine what to do with element. Should not be changed in any way. Defaults to None

Returns

A new DataElement instance to replace the given element with

Return type

DataElement

Raises
  • ValueError – When this operation cannot be performed on this element. For example when the data element has a number ValueType but the operation is for a string

  • ElementShouldBeRemoved – Signals that this element should be removed from the dataset. Operators cannot do this by themselves as they can only operate on the element given

static ctp_hash_uid(prefix: str, uid: str)[source]

Implementation of CTP function hashUID(prefix, uid)

Generates a hash of the given UID with the given prefix. Modelled as closely as possible to the java function https://mircwiki.rsna.org/index.php?title=The_CTP_DICOM_Anonymizer#.40hashuid.28root.2CElementName.29

Parameters
  • prefix (str) – DICOM prefix for your organization to prepend in output.

  • uid (str) – original UID

Returns

hashed UID

Return type

str
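One way to derive a valid UID from a hash is sketched below: take the MD5 digest as a decimal number, prepend the prefix, and truncate to the 64-character maximum length of a DICOM UID. This has not been checked against the CTP Java code, so the details (digest-to-decimal conversion, dot handling, truncation) are assumptions, and the output should not be expected to match CTP or idiscore byte for byte.

```python
import hashlib

MAX_UID_LENGTH = 64  # maximum length of a DICOM UID


def hash_uid(prefix: str, uid: str) -> str:
    """Derive a numeric UID from prefix and an MD5 hash of uid (sketch)."""
    digest = hashlib.md5(uid.encode("utf-8")).digest()
    # Decimal form keeps the result within the digits-and-dots UID alphabet
    number = str(int.from_bytes(digest, byteorder="big"))
    root = prefix if prefix.endswith(".") else prefix + "."
    return (root + number)[:MAX_UID_LENGTH]
```

Because the mapping is deterministic, references between datasets that share a UID stay intact after deidentification.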

name = 'HashUID'
class idiscore.operators.Keep[source]

Bases: Operator

Keep the given element as is. Make no changes

apply(element: DataElement, dataset: Optional[Dataset] = None) DataElement[source]

Perform this operation on the given element.

Parameters
  • element (DataElement) – The DICOM element to operate on

  • dataset (Dataset, optional) – The DICOM dataset that this element comes from. This can be inspected to determine what to do with element. Should not be changed in any way. Defaults to None

Returns

A new DataElement instance to replace the given element with

Return type

DataElement

Raises
  • ValueError – When this operation cannot be performed on this element. For example when the data element has a number ValueType but the operation is for a string

  • ElementShouldBeRemoved – Signals that this element should be removed from the dataset. Operators cannot do this by themselves as they can only operate on the element given

name = 'Keep'
class idiscore.operators.Operator[source]

Bases: object

Base class for something that can change a DICOM data element.

Like changing the value, hashing it, removing the entire element, etc. Takes care of input validation, raising exceptions when needed

Notes

Responsibilities

An Operator:

  • Can change the single DICOM data element that is fed to it

  • Can inspect the dataset that is passed to it

  • Can take init arguments and connect to external resources if needed

  • Should NOT alter the dataset that is passed to it

apply(element: DataElement, dataset: Optional[Dataset] = None) DataElement[source]

Perform this operation on the given element.

Parameters
  • element (DataElement) – The DICOM element to operate on

  • dataset (Dataset, optional) – The DICOM dataset that this element comes from. This can be inspected to determine what to do with element. Should not be changed in any way. Defaults to None

Returns

A new DataElement instance to replace the given element with

Return type

DataElement

Raises
  • ValueError – When this operation cannot be performed on this element. For example when the data element has a number ValueType but the operation is for a string

  • ElementShouldBeRemoved – Signals that this element should be removed from the dataset. Operators cannot do this by themselves as they can only operate on the element given
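The contract described above (return a new element, or signal removal by raising) can be illustrated with a small standalone sketch. Element is a hypothetical, minimal replacement for pydicom's DataElement, and deidentify is an invented driver function. Neither reflects the actual idiscore internals.

```python
from dataclasses import dataclass, replace
from typing import Dict


class ElementShouldBeRemoved(Exception):
    """Local stand-in for idiscore.operators.ElementShouldBeRemoved."""


@dataclass(frozen=True)
class Element:
    """Hypothetical, minimal replacement for pydicom's DataElement."""

    tag: str
    value: str


class Empty:
    def apply(self, element: Element, dataset=None) -> Element:
        # Return a new element; the input is left untouched
        return replace(element, value="")


class Remove:
    def apply(self, element: Element, dataset=None) -> Element:
        # An operator cannot delete from the dataset itself, so it signals
        raise ElementShouldBeRemoved()


def deidentify(dataset: Dict[str, Element], operators: Dict[str, object]):
    """Apply one operator per tag and return a new dataset dict."""
    result = {}
    for tag, element in dataset.items():
        operator = operators.get(tag)
        if operator is None:
            result[tag] = element  # no operator in this sketch: keep as is
            continue
        try:
            result[tag] = operator.apply(element, dataset)
        except ElementShouldBeRemoved:
            pass  # the caller, not the operator, drops the element
    return result
```

Raising instead of deleting keeps every operator free of side effects on the dataset it inspects.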

name = 'Base Operation'
class idiscore.operators.Remove[source]

Bases: Operator

Remove the given element completely

apply(element: DataElement, dataset: Optional[Dataset] = None)[source]

Perform this operation on the given element.

Parameters
  • element (DataElement) – The DICOM element to operate on

  • dataset (Dataset, optional) – The DICOM dataset that this element comes from. This can be inspected to determine what to do with element. Should not be changed in any way. Defaults to None

Returns

A new DataElement instance to replace the given element with

Return type

DataElement

Raises
  • ValueError – When this operation cannot be performed on this element. For example when the data element has a number ValueType but the operation is for a string

  • ElementShouldBeRemoved – Signals that this element should be removed from the dataset. Operators cannot do this by themselves as they can only operate on the element given

name = 'Remove'
class idiscore.operators.Replace[source]

Bases: Operator

Replace element with a dummy value

apply(element: DataElement, dataset: Optional[Dataset] = None) DataElement[source]

Perform this operation on the given element.

Parameters
  • element (DataElement) – The DICOM element to operate on

  • dataset (Dataset, optional) – The DICOM dataset that this element comes from. This can be inspected to determine what to do with element. Should not be changed in any way. Defaults to None

Returns

A new DataElement instance to replace the given element with

Return type

DataElement

Raises
  • ValueError – When this operation cannot be performed on this element. For example when the data element has a number ValueType but the operation is for a string

  • ElementShouldBeRemoved – Signals that this element should be removed from the dataset. Operators cannot do this by themselves as they can only operate on the element given

name = 'Replace'
class idiscore.operators.SetFixedValue(value: Union[str, int, object])[source]

Bases: Operator

Replace element with a fixed value from a list of tag-value pairs

apply(element: DataElement, dataset: Optional[Dataset] = None) DataElement[source]

Perform this operation on the given element.

Parameters
  • element (DataElement) – The DICOM element to operate on

  • dataset (Dataset, optional) – The DICOM dataset that this element comes from. This can be inspected to determine what to do with element. Should not be changed in any way. Defaults to None

Returns

A new DataElement instance to replace the given element with

Return type

DataElement

Raises
  • ValueError – When this operation cannot be performed on this element. For example when the data element has a number ValueType but the operation is for a string

  • ElementShouldBeRemoved – Signals that this element should be removed from the dataset. Operators cannot do this by themselves as they can only operate on the element given

name = 'SetFixedValue'
class idiscore.operators.TimeDeltaProvider[source]

Bases: object

Generates a random shift in time to use when cleaning dates.

Returns the same output for data sets in the same study

static extract_key(dataset: Dataset) str[source]

Extracts a key from dataset. Data sets with the same key will be given the same delta

Raises

ValueError – If key cannot be generated

static generate_random_delta() timedelta[source]

Anything from 0 up to 5 years, 23 hours, 59 minutes and 59 seconds

get_delta(dataset: Dataset) timedelta[source]

Returns the same delta if a dataset belongs to a series already seen

If series cannot be determined, return random delta
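The caching behaviour can be sketched as a dictionary from key to delta. In the sketch below the key is read from a plain dict under "StudyInstanceUID", which is a hypothetical choice, and the random range is one reading of the description of generate_random_delta(). The class name CachedDeltaProvider is invented for the example.

```python
import random
from datetime import timedelta
from typing import Dict


class CachedDeltaProvider:
    """Hands out one random timedelta per key and remembers it."""

    def __init__(self) -> None:
        self._deltas: Dict[str, timedelta] = {}

    @staticmethod
    def generate_random_delta() -> timedelta:
        # Assumed range: up to 5 years of whole days plus up to 23:59:59
        return timedelta(
            days=random.randint(0, 5 * 365),
            seconds=random.randint(0, 24 * 60 * 60 - 1),
        )

    def get_delta(self, dataset: dict) -> timedelta:
        key = dataset.get("StudyInstanceUID")  # hypothetical choice of key
        if key is None:
            return self.generate_random_delta()  # cannot group: fresh delta
        if key not in self._deltas:
            self._deltas[key] = self.generate_random_delta()
        return self._deltas[key]
```

Using one delta per key preserves the intervals between dates within a group of related datasets.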

idiscore.private_processing module

Classes and methods for handling private DICOM elements

Is a private tag safe to keep? This cannot be answered with regular rules of the form tag -> operation. Sometimes you need to inspect the entire dataset, for example to check modality or vendor.

class idiscore.private_processing.SafePrivateBlock(tags: Iterable[Union[PrivateBlockTagIdentifier, str]], criterion: Optional[Callable[[Dataset], bool]] = None, comment: str = '')[source]

Bases: object

Defines when one or more private DICOM elements can be considered ‘safe’

Safe as in ‘not containing personally identifiable information’

get_safe_private_tags(dataset: Dataset) Set[TagIdentifier][source]

The private tags that are safe to keep, given this dataset

Raises

CriterionException – If no True or False response can be given for this dataset

tags_are_safe(dataset: Dataset) bool[source]

True if these private tags are safe to keep in this dataset

static to_tag_identifier(tag_or_string: Union[PrivateBlockTagIdentifier, str]) PrivateBlockTagIdentifier[source]

Cast any string to tag identifier. If already a TagIdentifier do nothing

Return type

TagIdentifier

Raises

ValueError – if tag is string and is not in the correct format

class idiscore.private_processing.SafePrivateDefinition(blocks: List[SafePrivateBlock])[source]

Bases: object

Holds all information on which private tags can be considered safe

Contains one or more SafePrivateBlocks

is_safe(element: DataElement, dataset: Dataset) bool[source]

True if the given private element in the given dataset is safe to keep

Raises

SafePrivateError – If for some reason it cannot be determined whether this is safe

safe_identifiers(dataset: Dataset) List[TagIdentifier][source]

All tags that are safe to keep given this dataset

Raises

SafePrivateError – If safe identifiers cannot be determined

idiscore.rule_sets module

Common sets of rules to deidentify multiple DICOM elements

Contains default implementations of the DICOM standard deidentification profiles and options and other useful sets

class idiscore.rule_sets.DICOMRuleSets(action_mapping: Optional[Dict[ActionCode, Operator]] = None)[source]

Bases: object

Holds the rule sets for DICOM deidentification basic profile and options

These are lists of rules that implement the actions designated in table E3

Notes

More information on profile and options found here: http://dicom.nema.org/medical/dicom/current/output/chtml/part15/sect_E.3.html

idiscore.rules module

class idiscore.rules.Rule(identifier: Union[TagIdentifier, BaseTag], operation: Operator)[source]

Bases: object

Defines what to do with a single DICOM element or single group of elements

as_human_readable() str[source]
matches(element: DataElement) bool[source]

True if this rule matches the given DICOM element

number_of_matchable_tags() int[source]

The number of distinct DICOM tags that this rule could match

class idiscore.rules.RuleSet(rules: Iterable[Rule], name: str = 'RuleSet')[source]

Bases: object

Defines what to do to one or more DICOM tags

Models part of a deidentification procedure, such as the Basic Application Level Confidentiality Options in DICOM (e.g. Retain Safe Private Option)

as_dict() Dict[TagIdentifier, Rule][source]
as_human_readable_list() str[source]

All rules in this set sorted by tag name

get_rule(element: DataElement) Optional[Rule][source]

The most specific rule for the given DICOM element, or None if not found

Returns

  • Rule – Most specific rule for the given DICOM tag

  • None – If no rule matches the given DICOM tag

Notes

It is possible for multiple rules to match. Lookup is always done from specific to general. For example, when getting a rule for element with tag (0010,0010):

  • A rule for (0010,0010) is preferred over (0010,00xx)

  • A rule for (0010,00xx) is preferred over (0010,xx10)

  • A rule for (0010,xx10) is preferred over (xxxx,0010)

Generality is determined by the number_of_matchable_tags() function of each rule. The more tags that could be matched, the more general the rule is
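The specific-to-general lookup can be sketched with wildcard patterns in the 8-character string form. The functions below are illustrative stand-ins that operate on strings rather than Rule and DataElement objects. Counting 16 possibilities per ‘x’ is an assumption about how matchable tags are counted.

```python
from typing import List, Optional


def number_of_matchable_tags(pattern: str) -> int:
    """Each 'x' wildcard can stand for any of 16 hex digits."""
    return 16 ** pattern.count("x")


def pattern_matches(pattern: str, tag: str) -> bool:
    """True if tag equals pattern in every non-wildcard position."""
    return len(pattern) == len(tag) and all(
        p in ("x", t) for p, t in zip(pattern, tag)
    )


def get_rule(patterns: List[str], tag: str) -> Optional[str]:
    """Most specific matching pattern, or None if nothing matches."""
    candidates = [p for p in patterns if pattern_matches(p, tag)]
    return min(candidates, key=number_of_matchable_tags, default=None)
```

In this sketch two patterns with the same number of wildcards, such as 001000xx and 0010xx10, tie on the count, and min() then returns whichever comes first in the input list.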

static is_single_tag_rule(rule: Rule) bool[source]

Targets only a single DICOM tag

remove(rule: Rule)[source]

Remove the given rule from this set

Raises

KeyError – If rule is not in this set

property rules: Set[Rule]

All rules in this list

static tag_to_key(tag: BaseTag) str[source]

Represent tag as single 8 char hex string like ‘00100010’

This is the format used as dict key internally

idiscore.settings module

idiscore.templates module

Jinja templates. Putting these in a separate module because indentation is difficult when inlining templates inside classes and functions

idiscore.templates.make_h1(text)[source]
idiscore.templates.make_h2(text)[source]
idiscore.templates.make_h3(text)[source]

idiscore.validation module