Minutes of the EDRN Data Sharing and Informatics Subcommittee, 11/18/2024
EDRN Data Sharing and Informatics Subcommittee Meeting
Monday, November 18, 2024
Present (in BOLD):
- NASA/JPL: Dan Crichton, Sean Kelly, Heather Kincaid, Ashish Mahabal
- Arizona State University: Ji Qiu
- Boston University: Jennifer Beane
- EVMS: Julius Nyalwidhe
- DMCC: Jackie Dahlgren, Royce Malnik
- Johns Hopkins: Zhen Zhang
- NCI: Amanda Skarlupka, Guillermo Marquez, Christos Patriotis, Juan Miguel Villanueva
- PNNL: Tao Liu
- University of California: William Hsu
- University of North Carolina: Kristen Anton
Current Action Items:
- JPL to 1) populate the grid of Roles and Responsibilities for FAIR-based data presented on the call to the Public Portal, 2) document more, and 3) promote investigator trainings to ensure that NCI policies are followed.
- JPL to follow up with the lead PI of each project listed in rows 25-35 of the LabCAS Holdings Nov 2023 spreadsheet to determine whether the projects can be considered public. DONE
- PIs are asked to review their data in LabCAS and let JPL know of any issues.
- Discuss a roadmap for additional hackathons and workshops.
- JPL to schedule a DICOM Header Standards call with the EDRN DICOM Imaging Investigators.
- JPL to review the EDRN FAIR Data Guidance Page and Training for each Collaborative Group on the next call.
Agenda/Discussion:
- AI Workshop Report: The draft has been circulated to the program committee and is ready to post on the portal. The videos and slides from the workshop have been posted to the Public Portal. The next steps are:
  - Continue establishing a community of practice in AI within EDRN and with the cancer biomarker community.
  - Provide additional workshops and hackathons, have posters at the upcoming EDRN Scientific Workshop, and discuss a roadmap for additional events at the next meeting.
  - Pursue a special journal issue on cancer biomarkers and AI.
  - Drive forward one or more projects from the hackathon as an EDRN use case.
  - Work with the Data Sharing and Informatics Subcommittee within the EDRN to enhance data usability and reuse.
  - Increase access to shared computation for joint AI projects.
- EDRN LabCAS Data Holdings 2024 Report: Dan Crichton discussed the spreadsheet and noted ongoing work with NCI and the EDRN program managers. There are 30,000 data sets in LabCAS, and the goal is to identify which are public and where each stands in the review process. This calls for a standard nomenclature, that is, a standard set of values for delivery status that identifies what data is being delivered. Sites are reaching out to JPL for ways to deliver their data, and JPL is working on a plan for this. For the data it already holds, JPL is reviewing compliance and completeness. JPL sees a significant increase in the need for metadata and wants to work with EDRN members to improve the process and ensure the data follows a FAIR-based approach as much as possible. A progress report will be sent to NCI. JPL is also working to make sure imaging headers are deidentified and to address questions about the deidentification process, and will reach out to the sites if questions arise. Training is planned at some of the collaborative groups. Dan Crichton discussed ways the sites can prepare their data before it is uploaded to LabCAS:
  - Review raw data files
  - Ensure data is deidentified
  - Provide required metadata
  - Organize the data
  - Upload your data and supplemental files
  - Review data in LabCAS
  - Work with domain experts or site reviewers to validate data capture and usability
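As an illustration of the "Ensure data is deidentified" step, the following is a minimal Python sketch of a pre-upload check. It assumes the imaging headers have already been read into a plain dictionary keyed by DICOM attribute keyword; the list of identifying attributes is a hypothetical example, not an EDRN or NCI policy, and a non-empty value only flags a field for human review (a pseudonymous ID, for instance, may be acceptable).

```python
# Hypothetical list of attributes to review; illustrative only,
# not an EDRN or NCI deidentification policy.
IDENTIFYING_KEYWORDS = (
    "PatientName",
    "PatientID",
    "PatientBirthDate",
    "PatientAddress",
    "ReferringPhysicianName",
    "InstitutionName",
    "AccessionNumber",
)


def find_identifying_fields(header):
    """Return the listed keywords that still carry a non-empty value."""
    flagged = []
    for keyword in IDENTIFYING_KEYWORDS:
        value = header.get(keyword)
        if value is not None and str(value).strip():
            flagged.append(keyword)
    return flagged


# Example header, assumed to be already parsed into a dict.
header = {
    "PatientName": "",
    "PatientID": "12345",
    "Modality": "MR",
    "InstitutionName": "Example Hospital",
}
flagged = find_identifying_fields(header)
print(flagged)  # ['PatientID', 'InstitutionName']
```

Fields reported here would be candidates for review by the site before upload, not automatic failures.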
Action Plans for PIs to Ensure FAIR Compliance in LabCAS:
- Review existing data: assess data already deposited in LabCAS for completeness and compliance
- Check for missing elements: ensure each data collection includes all necessary data files
- Provide required metadata: verify metadata requirements and submit any missing information
- Coordinate with JPL IC: upload missing data or metadata by contacting JPL IC
- For new data collections: contact JPL IC to upload data generated during this funding cycle. JPL has plans to work with sites to ingest data.
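The "check for missing elements" and "provide required metadata" items above could be approached with a simple completeness report, sketched below in Python. The field names in REQUIRED_FIELDS and the example collections are hypothetical placeholders; the actual LabCAS metadata requirements should be confirmed with JPL IC.

```python
# Hypothetical required fields; the real LabCAS metadata requirements
# should be confirmed with JPL IC.
REQUIRED_FIELDS = ("collection_name", "lead_pi", "institution", "organ", "data_type")


def missing_metadata(record):
    """Return the required fields that are absent or blank in one record."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


def completeness_report(collections):
    """Map each collection name to its missing fields, skipping complete ones."""
    report = {}
    for name, record in collections.items():
        missing = missing_metadata(record)
        if missing:
            report[name] = missing
    return report


# Made-up example records, for illustration only.
collections = {
    "example_collection_a": {
        "collection_name": "Example A",
        "lead_pi": "PI One",
        "institution": "Site 1",
        "organ": "Lung",
        "data_type": "CT",
    },
    "example_collection_b": {
        "collection_name": "Example B",
        "lead_pi": "",
        "institution": "Site 2",
    },
}
print(completeness_report(collections))
# {'example_collection_b': ['lead_pi', 'organ', 'data_type']}
```

A report of this kind gives a PI a concrete list of items to send to JPL IC.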
- Follow up on Hackathon: Ashish Mahabal reviewed what was done and the next steps. He has reached out to the teams to see if they want to present at the March EDRN Scientific Workshop. Funding will be needed to explore the data further, for example through code generation. The goal is to make sure it is possible to do something very basic with existing models using Jupyter notebooks.
- Federated Learning Compliance with EDRN Policy: Dan Crichton will put together slides and work with Amanda Skarlupka on the emerging guidance for Federated Learning data sharing. This is a follow-up to a call that occurred a few months ago.
- DICOM Header Standards Working Group (PMRI, Lung, and other DICOM experts): Proposed working group to address standards for capturing DICOM images with standardized headers, with potential overlap across modalities. The proposed next steps are:
  - Establish minimum DICOM header standards for EDRN.
  - Ensure DICOM files across modalities are AI-ready and interoperable.
  - The group may only need to meet a few times to develop these standards. Potential members: Eugene Koay, William Hsu, Matt Schabath, Heather Kincaid, Guillermo Marquez, Jackie Dahlgren, and Eardi Lila.
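To make the idea of a minimum DICOM header standard concrete, here is a minimal Python sketch of how conformance might be checked once such a standard exists. The common and per-modality attribute lists are hypothetical placeholders chosen for illustration (the keywords themselves are standard DICOM attribute keywords); the actual minimum set would be defined by the working group. Headers are assumed to be already parsed into dictionaries.

```python
# Hypothetical minimum attribute sets; the actual EDRN minimum would be
# defined by the proposed working group.
COMMON_REQUIRED = ("Modality", "Manufacturer", "PixelSpacing", "SliceThickness")
MODALITY_REQUIRED = {
    "MR": ("MagneticFieldStrength", "RepetitionTime", "EchoTime"),
    "CT": ("KVP", "ConvolutionKernel"),
}


def missing_header_fields(header):
    """Return required keywords that are absent or empty for this header's modality."""
    modality = header.get("Modality")
    # Unknown modalities are checked against the common set only.
    required = COMMON_REQUIRED + MODALITY_REQUIRED.get(modality, ())
    return [k for k in required if header.get(k) in (None, "")]


# Made-up example header, assumed to be already parsed into a dict.
ct_header = {
    "Modality": "CT",
    "Manufacturer": "ExampleVendor",
    "PixelSpacing": [0.7, 0.7],
    "SliceThickness": 1.25,
    "KVP": 120,
}
print(missing_header_fields(ct_header))  # ['ConvolutionKernel']
```

Keeping the shared and modality-specific lists separate reflects the potential overlap across modalities noted above.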
Next Call: Monday, January 20, 2025 at 1pm Eastern/10am Pacific. (Note: this call will need to be rescheduled because it falls on Martin Luther King Jr. Day.) The December meeting is canceled.