EDRN DICOM De-identification Process
Version: 0.1 (Draft)
Date: 2025-02-21
Following industry best practices, EDRN uses a standards-based approach to DICOM de-identification to ensure that images are free of Protected Health Information (PHI). EDRN recommends a de-identification process that meets the HIPAA Safe Harbor method as defined in section 164.514(b)(2) of the HIPAA Privacy Rule. The standard for de-identification of DICOM objects is outlined in DICOM PS 3.15, Annex E (DICOM Standard). This approach is harmonized with the National Cancer Institute’s Cancer Imaging Archive (TCIA) Submission and De-Identification process (TCIA Guide).
Required DICOM Metadata
EDRN DICOM data submissions must include specific required DICOM tags, as defined in the DRAFT DICOM Extensions Data Model. Additional optional tags may be included based on modality type or imaging use case.
De-Identification Approach
- At the submitting site, DICOM tags deemed unsafe must be modified or removed before the data leave the site’s host computer (see Table 1 - DICOM Tags Modified or Removed at the Source Site, maintained by TCIA).
- Each submitting site is responsible for performing de-identification using software tools appropriate for its workflow.
- EDRN does not provide specific de-identification software but requires compliance with best practices.
- EDRN recommends following the Basic Application Level Confidentiality Profile (DICOM PS 3.15, Annex E), which provides a framework for de-identification.
The following options should be applied to balance de-identification and data utility:
Primary De-Identification Requirements
Patient Identifiers
- Patient ID (0010,0020) should be modified. The EDRN Data Management and Coordinating Center (DMCC) or EDRN site must implement a de-identified ID mapping between the original Patient ID and the new ID used within EDRN LabCAS.
- Patient Name (0010,0010) should be modified to "ANONYMOUS" to ensure compatibility with DICOM viewers.
- Patient Identity Removed (0012,0062) must be set to "YES" to indicate de-identification.
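The three patient-identifier rules above can be sketched as follows. This is a minimal illustration, not an EDRN tool: it assumes the DICOM header has already been read into a plain dictionary keyed by (group, element) tuples, and the function name, the `id_map` dictionary, and the "EDRN-" ID format are hypothetical. A real workflow would use a DICOM toolkit and would store the ID mapping securely at the DMCC or the site.

```python
import secrets

PATIENT_NAME = (0x0010, 0x0010)
PATIENT_ID = (0x0010, 0x0020)
PATIENT_IDENTITY_REMOVED = (0x0012, 0x0062)


def deidentify_patient(header, id_map):
    """Return a copy of `header` with the patient identifiers replaced.

    `id_map` maps original Patient IDs to de-identified IDs and is updated
    in place, so the same patient always receives the same new ID.
    """
    original_id = header[PATIENT_ID]
    if original_id not in id_map:
        # Hypothetical ID format: random, so it reveals nothing about the original ID.
        id_map[original_id] = "EDRN-" + secrets.token_hex(4).upper()
    cleaned = dict(header)
    cleaned[PATIENT_ID] = id_map[original_id]
    cleaned[PATIENT_NAME] = "ANONYMOUS"
    cleaned[PATIENT_IDENTITY_REMOVED] = "YES"
    return cleaned
```

Because the mapping is kept outside the image, two scans of the same patient receive the same de-identified ID without the original ID ever appearing in the submitted data.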
Institution Identification
- Institution Name (0008,0080) must be modified as part of the de-identification procedure. To prevent institutional traceability, replace it with 'ANONYMOUS' if the study procedure calls for it.
- Ensure Institution Name does not appear in any other text fields in the header.
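One way to check that the institution name does not survive elsewhere in the header is a simple case-insensitive search, sketched below. The header is again assumed to be a plain dictionary keyed by (group, element) tuples, and `scrub_institution` is a hypothetical helper name. An exact-substring search will miss abbreviations and misspellings, so it supplements manual review rather than replacing it.

```python
INSTITUTION_NAME = (0x0008, 0x0080)


def scrub_institution(header, replacement="ANONYMOUS"):
    """Replace Institution Name and list other text fields that still mention it."""
    original = str(header.get(INSTITUTION_NAME, "")).strip()
    cleaned = dict(header)
    if INSTITUTION_NAME in cleaned:
        cleaned[INSTITUTION_NAME] = replacement
    leaks = []
    if original:
        needle = original.casefold()
        for tag, value in cleaned.items():
            if tag == INSTITUTION_NAME:
                continue  # already replaced above
            if isinstance(value, str) and needle in value.casefold():
                leaks.append(tag)
    return cleaned, leaks
```

Any tag returned in `leaks` still contains the original institution name and needs to be edited before submission.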
Dates
Retain Longitudinal Dates with Modified Dates Option
- Modify all dates while preserving the time intervals between scans to allow researchers to analyze longitudinal studies without disclosing actual dates.
- Longitudinal Temporal Information Modified (0028,0303) must be set to "MODIFIED" to indicate that dates were adjusted.
- Optionally, Longitudinal Temporal Offset from Event (0012,0052) and Baseline Year (0013,1051) may be used.
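Interval-preserving date modification can be illustrated with a fixed per-patient offset. The sketch below assumes dates are stored in the DICOM DA format (YYYYMMDD) in a dictionary keyed by (group, element) tuples; the function name and the choice of offset are hypothetical, and how a site selects and protects the offset is outside the scope of this example.

```python
from datetime import datetime, timedelta

STUDY_DATE = (0x0008, 0x0020)
LONGITUDINAL_TEMPORAL_INFORMATION_MODIFIED = (0x0028, 0x0303)


def shift_dates(header, date_tags, offset_days):
    """Shift every listed YYYYMMDD date by the same number of days."""
    cleaned = dict(header)
    for tag in date_tags:
        value = cleaned.get(tag)
        if not value:
            continue  # absent or empty date: nothing to shift
        shifted = datetime.strptime(value, "%Y%m%d") + timedelta(days=offset_days)
        cleaned[tag] = shifted.strftime("%Y%m%d")
    # Record that the dates no longer reflect the real calendar dates.
    cleaned[LONGITUDINAL_TEMPORAL_INFORMATION_MODIFIED] = "MODIFIED"
    return cleaned
```

As long as every scan from one patient is shifted by the same `offset_days`, the number of days between any two scans is unchanged.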
Exam Identifiers
Unique Identifiers (UIDs) are modified to prevent traceability while preserving internal referencing. If left unchanged, original UIDs can be linked to hospital databases or PACS (Picture Archiving and Communication System) records, which may allow an image to be traced back to a specific patient or institution.
- To modify UIDs:
  - The original UID is replaced with a newly generated UID under a standardized root, so there is no direct link to the original system.
  - Some systems generate new UIDs by hashing or encoding the original UID, which keeps images linked within a study while preventing reverse identification.
  - If the new UID exceeds the DICOM-specified maximum length of 64 characters, it may be truncated by retaining the end of the UID, provided uniqueness is maintained.
- Maintaining internal referencing:
  - Studies, series, and images within a dataset remain correctly linked despite UID modification.
  - Multi-frame studies and referenced images in radiotherapy plans, PET/CT fusion images, and segmentations remain valid, so critical data relationships are not broken.
- Original accession numbers and internal study identifiers are removed to prevent unintended traceability.
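The hash-based approach to UID replacement can be sketched as below. This is one possible scheme, not the method required by EDRN or TCIA. It assumes a site-held secret key (hypothetical) so that the mapping cannot be reproduced by someone who only knows the original UIDs, and it borrows the "2.25" root that DICOM defines for UUID-derived UIDs; a site with its own registered root would substitute it.

```python
import hashlib
import hmac


def remap_uid(original_uid, secret):
    """Derive a replacement UID deterministically from the original UID.

    The same input always gives the same output, so references between
    studies, series, and images stay consistent after remapping.
    """
    digest = hmac.new(secret, original_uid.encode("ascii"), hashlib.sha256).digest()
    # Use 128 bits of the keyed hash as a single decimal component.
    number = int.from_bytes(digest[:16], "big")
    # "2.25." plus at most 39 digits stays under the 64-character UID limit.
    return "2.25." + str(number)
```

Applying `remap_uid` to every UID-valued attribute, including referenced UIDs, keeps cross-references intact because each original UID always maps to the same replacement.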
Pixel Data
- Some imaging systems embed PHI directly in the image pixels (e.g., burned-in annotations or overlays with patient information). This information must be irreversibly removed or masked to protect patient privacy while preserving critical diagnostic information.
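Irreversible masking means overwriting the affected pixel values rather than hiding them behind an overlay. The sketch below assumes the pixel data has been decoded into a 2D list of integers and that the rectangle containing the burned-in text is already known (for example, from manual review); `mask_region` is a hypothetical helper name, and locating the text automatically is not addressed here.

```python
def mask_region(pixels, top, left, height, width, fill=0):
    """Return a copy of a 2D pixel grid with one rectangle overwritten.

    The original values inside the rectangle are replaced with `fill`,
    so they cannot be recovered from the returned grid.
    """
    masked = [list(row) for row in pixels]
    for r in range(top, min(top + height, len(masked))):
        for c in range(left, min(left + width, len(masked[r]))):
            masked[r][c] = fill
    return masked
```

Keeping the rectangle as small as possible limits the loss of diagnostic content outside the annotated area.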
Patient Demographics
- Retaining key demographics supports research, provided that PHI is removed.
- Fields retained: Sex, Age, Size, Weight, Ethnic Group, Smoking Status, and Pregnancy Status.
- If the subject is over 89 years old, age is recorded as 90+.
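The age cap can be applied mechanically. DICOM stores ages as a four-character Age String of the form nnnD, nnnW, nnnM, or nnnY, which has no room for a "+" sign, so the sketch below assumes that "90+" is written into the header as "090Y". That encoding, and the function name, are assumptions for illustration.

```python
def cap_age(age_string):
    """Cap a DICOM Age String (e.g. "093Y") at 90 years."""
    value = int(age_string[:3])
    unit = age_string[3].upper()
    # Ages given in days, weeks, or months top out at 999 units,
    # which is below 90 years, so only the "Y" unit needs capping.
    if unit == "Y" and value > 89:
        return "090Y"
    return age_string
```

For example, `cap_age("093Y")` returns `"090Y"`, while `cap_age("045Y")` and `cap_age("018M")` are returned unchanged.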
Free Text Fields
DICOM includes many free-text fields where human operators (such as radiologists or technicians) can enter information about a scan. These fields can sometimes contain PHI and must be reviewed and modified accordingly.
- The following fields are crucial for organizing and classifying images. Review and modify them so that scientifically valuable descriptors are retained while PHI is eliminated:
- Study Description (0008,1030) – Provides a general description of the imaging study.
- Series Description (0008,103E) – Helps differentiate between different image series within a study.
- Protocol Name (0018,1030) – Can be useful for standardizing scan protocols but must be reviewed for PHI.
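Review of these fields can be supported by an automated first pass that flags suspicious values for a human to inspect. The sketch below assumes a header dictionary keyed by (group, element) tuples. The two patterns (calendar-style dates and long digit runs that might be record numbers) are hypothetical examples; a real list would be site-specific, and a clean result does not prove that a field is free of PHI.

```python
import re

FREE_TEXT_TAGS = {
    (0x0008, 0x1030): "Study Description",
    (0x0008, 0x103E): "Series Description",
    (0x0018, 0x1030): "Protocol Name",
}

# Hypothetical patterns chosen for illustration only.
SUSPICIOUS = [
    re.compile(r"\b\d{1,2}[/-]\d{1,2}[/-]\d{2,4}\b"),  # dates such as 03/14/2021
    re.compile(r"\b\d{6,}\b"),                         # long digit runs
]


def flag_free_text(header):
    """Return {field name: value} for free-text fields that need manual review."""
    flagged = {}
    for tag, name in FREE_TEXT_TAGS.items():
        value = header.get(tag, "")
        if any(pattern.search(value) for pattern in SUSPICIOUS):
            flagged[name] = value
    return flagged
```

Flagged values are reported rather than rewritten automatically, since deciding which part of a description is scientifically useful requires human judgment.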
Private Tags
Retain only private DICOM tags that are confirmed to be safe and scientifically relevant.
- Some scanner-specific private tags contain valuable imaging parameters for research.
- Private tags containing PHI must be removed.
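In DICOM, private tags are those with an odd group number, which makes an allowlist filter straightforward to sketch. The example below assumes a header dictionary keyed by (group, element) tuples and a site-maintained set of private tags already confirmed to be safe; the helper names are hypothetical. The allowlist would also need to include the private creator elements that identify each retained block.

```python
def is_private(tag):
    """Private DICOM tags are those with an odd group number."""
    group, _element = tag
    return group % 2 == 1


def filter_private_tags(header, safe_private_tags):
    """Keep standard tags and allowlisted private tags; drop other private tags."""
    return {
        tag: value
        for tag, value in header.items()
        if not is_private(tag) or tag in safe_private_tags
    }
```

Defaulting to removal means that a private tag nobody has reviewed is dropped rather than passed through.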
Device Information
- Scanner-related details are anonymized if they contain PHI.
Final Review Process
- Sites must validate that de-identified images are compliant and suitable for research prior to uploading.
- Data is reviewed to confirm adherence to Table 1 before submission.
- Burned-in PHI checks are performed to ensure compliance.
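Part of the final review can be automated as a pre-upload check on each header. The sketch below covers only the flags and values named in this document, under the same assumption of a header dictionary keyed by (group, element) tuples; `review_header` is a hypothetical name. It does not replace the review against Table 1 or the burned-in PHI checks.

```python
def review_header(header, safe_private_tags=frozenset()):
    """Return a list of problems found; an empty list means these checks passed."""
    problems = []
    if header.get((0x0012, 0x0062)) != "YES":
        problems.append("Patient Identity Removed (0012,0062) is not YES")
    if header.get((0x0010, 0x0010)) != "ANONYMOUS":
        problems.append("Patient Name (0010,0010) is not ANONYMOUS")
    if header.get((0x0028, 0x0303)) != "MODIFIED":
        problems.append(
            "Longitudinal Temporal Information Modified (0028,0303) is not MODIFIED"
        )
    for group, element in header:
        # Odd group numbers mark private tags, which must be on the allowlist.
        if group % 2 == 1 and (group, element) not in safe_private_tags:
            problems.append(f"Unreviewed private tag ({group:04X},{element:04X})")
    return problems
```

A site could run this over every file in a submission and hold back any file for which the returned list is not empty.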
This approach ensures that DICOM data remains valuable for research while fully protecting patient privacy and is aligned with TCIA's de-identification process.