DAIR Digital Breast Tomosynthesis Lesion Detection Challenge: Phase 2
The American Association of Physicists in Medicine (AAPM) is sponsoring a Challenge for the detection of biopsy-proven breast lesions on digital breast tomosynthesis (DBT) images. The results of the Challenge will be announced at the Grand Challenges Symposium session of the 2021 AAPM Annual Meeting.
The winning team will receive a $1000 prize sponsored by the Duke Center for Artificial Intelligence in Radiology (DAIR), contingent on making their code publicly available on GitHub. Depositing code is required only to be eligible for the $1000 prize, but all participants are encouraged to make their code publicly available. Additionally, one member from each of the two top-performing teams will receive a waiver of the meeting registration fee in order to present their methods during the AAPM Annual Meeting.
- Submissions allowed for validation set evaluation: June 15, 2021
- Submissions allowed for test set evaluation: July 11, 2021
- Test set submission deadline for participants: July 21, 2021
- Winning teams notified of Challenge results: July 24, 2021
- Winning results presented at the Grand Challenges Symposium session of the 2021 AAPM Annual Meeting: July 28, 2021, 10:30 - 11:30 ET (Please note, winning and runner-up Team Leaders are required to attend and present on their Challenge results at this session)
DBTex2 Challenge Format
The goal of the DBTex2 Challenge is to detect breast lesions that subsequently underwent biopsy and provide the location and size of a bounding box, as well as a confidence score, for each detected lesion candidate. The dataset contains DBT exams with breast cancers, biopsy-proven benign lesions, actionable non-biopsied findings, as well as normals (scans without any findings). The task is to detect biopsy-proven lesions only.
A predicted box is counted as a true positive if the distance in pixels in the original image between its center point and the center of a ground truth box is less than half of the ground truth box's diagonal or 100 pixels, whichever is larger. For the third dimension, the ground truth bounding box is assumed to span 25% of the volume's slices before and after the ground truth center slice, and the predicted box's center slice must fall within this range to be considered a true positive. Actionable lesions that did not undergo biopsy do not have annotations (ground truth boxes).
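For illustration, a minimal Python sketch of this criterion follows. It is not the official evaluation code; box fields match the submission format defined below, and taking the ground-truth center slice as the middle of its annotated slice range is an assumption:

```python
import numpy as np

def is_true_positive(pred, gt, num_slices):
    """Illustrative check of one predicted box against one ground-truth box.
    `pred` and `gt` are dicts with keys X, Y, Z, Width, Height, Depth
    (0-based, original-image pixels); `num_slices` is the number of
    slices in the DBT volume. Not the official evaluation code."""
    # In-plane test: distance between box centers must be less than
    # half the ground-truth box diagonal or 100 pixels, whichever is larger.
    pred_center = np.array([pred["X"] + pred["Width"] / 2,
                            pred["Y"] + pred["Height"] / 2])
    gt_center = np.array([gt["X"] + gt["Width"] / 2,
                          gt["Y"] + gt["Height"] / 2])
    tolerance = max(np.hypot(gt["Width"], gt["Height"]) / 2, 100)
    in_plane_hit = np.linalg.norm(pred_center - gt_center) < tolerance

    # Depth test: the ground-truth box is assumed to span 25% of the
    # volume's slices before and after its center slice; the predicted
    # center slice must fall inside that range. Taking the ground-truth
    # center slice as Z + Depth/2 is an assumption of this sketch.
    gt_center_slice = gt["Z"] + gt["Depth"] / 2
    pred_center_slice = pred["Z"] + pred["Depth"] / 2
    depth_hit = abs(pred_center_slice - gt_center_slice) <= 0.25 * num_slices

    return bool(in_plane_hit and depth_hit)
```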
The primary performance metric is the average sensitivity at 1, 2, 3, and 4 false positives (FPs) per DBT view, computed using only views with a biopsied finding. The secondary performance metric is the sensitivity at 2 FPs per image over all test views, as described in arxiv.org/pdf/2011.07995.pdf. Submissions will be ranked by the primary performance metric, with the secondary performance metric used as a tie-breaker. Please note, participants may not withdraw from the competition once a test submission is made.
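The averaging can be sketched FROC-style: pool all predicted boxes across the relevant views, sort them by confidence, and read off the fraction of lesions found at each false-positive budget. The sketch below is illustrative only (not the official scoring code) and assumes each true-positive flag corresponds to a distinct ground-truth lesion:

```python
import numpy as np

def average_sensitivity(scores, is_tp, num_lesions, num_views,
                        fp_rates=(1, 2, 3, 4)):
    """Illustrative average sensitivity at the given FPs per view.
    `scores` and `is_tp` are parallel arrays over all predicted boxes;
    each True flag is assumed to hit a distinct ground-truth lesion."""
    order = np.argsort(scores)[::-1]      # descending confidence
    tp_flags = np.asarray(is_tp, dtype=bool)[order]
    cum_tp = np.cumsum(tp_flags)          # lesions found so far
    cum_fp = np.cumsum(~tp_flags)         # false positives so far
    sensitivities = []
    for rate in fp_rates:
        allowed_fp = rate * num_views     # FP budget at this rate
        within = cum_fp <= allowed_fp     # operating points within budget
        found = cum_tp[within][-1] if within.any() else 0
        sensitivities.append(found / num_lesions)
    return float(np.mean(sensitivities))
```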
By participating in the DBTex2 Challenge, participants acknowledge its educational, friendly-competition, and community-building nature and commit to conduct consistent with this spirit, for the advancement of the medical imaging research community.
Data Availability
Validation and Test Set Labels + Bounding Boxes
NOTICE: The labels and lesion bounding boxes for the BCS-DBT dataset's validation and test sets were recently released at https://www.cancerimagingarchive.net/collection/breast-cancer-screening-dbt/ under "Data Access".
The dataset for the DBTex2 Challenge contains a total of 5,610 breast tomosynthesis studies from 5,060 patients: the training set contains 4,838 studies, the validation set contains 312 studies, and the test set contains 460 studies. A study may have multiple reconstruction volumes/DICOM files corresponding to different anatomical views (i.e. RCC, LCC, RMLO, or LMLO). Each DICOM file corresponds to a distinct combination of patient, study, and view; there are 19,148 DICOM files in the training set, 1,163 in the validation set, and 1,721 in the test set.
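Each reconstruction volume is a multi-frame DICOM file. A minimal loading sketch with pydicom is shown below; the file path is hypothetical, and the challenge's companion tooling may provide its own loader with additional orientation handling:

```python
import pydicom

# Minimal sketch of loading one DBT reconstruction volume.
# "path/to/view.dcm" is a placeholder path, not a real dataset file.
ds = pydicom.dcmread("path/to/view.dcm")
volume = ds.pixel_array                 # shape: (slices, rows, columns)
num_slices, height, width = volume.shape
print(ds.PatientID, getattr(ds, "ViewPosition", "?"), volume.shape)
```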
Release of the training set (with truth): May 24, 2021
The training set will be representative of the technical properties (equipment, acquisition parameters, file format) and the nature of lesions in the validation and test sets. An associated file in CSV format will include the DBT scan identifier and the bounding box definition for every lesion. Training data is available from The Cancer Imaging Archive (see the link above).
Release of the validation set (without truth): May 24, 2021
The locations of lesions will not be provided for the validation set. The validation set needs to be processed, manipulated, and analyzed without human intervention. Validation set output submitted through the online challenge interface will contribute to the challenge leader board.
Release of the test set (without truth): May 24, 2021
The locations of lesions will also not be provided for the test set. The test set needs to be processed, manipulated, and analyzed without human intervention.
Deadline for participants to submit test set output: July 21, 2021
Participants should submit their test set output through the online Challenge interface by July 21, 2021. This interface will open for test set submissions on July 11, 2021; a similar interface for validation set submissions will open on June 15, 2021. An acknowledgment will be sent within 2 business days of receipt of results. If no confirmation of receipt is received, please contact the Challenge organizers.
Note that the submission of test set output will not be considered complete unless it is accompanied by (1) an agreement to be acknowledged in the Acknowledgment section (by name and institution, but without any link to the performance score of your particular method) of any manuscript that results from the challenge, and (2) a one-paragraph statement of the methods used to obtain the submitted results, including the approach and dataset(s) used to train your system, the image analysis and segmentation methods used (if applicable), the type of classifier, and any relevant references for these methods that should be cited in the challenge overview manuscript. Also note that eligibility for the $1000 prize requires making your code publicly available on GitHub.
Output format for the DBTex2 Challenge test set results:
Submissions to the system should contain output results for all cases in a single CSV file and give the location and size of a bounding box for each detected lesion candidate as well as a confidence score that this detection represents an actual lesion.
Formatting your submission file:
The output of your method submitted to the evaluation system should be a single CSV file with the following columns (a worked example appears after the list):
- PatientID: string - patient identifier
- StudyUID: string - study identifier
- View: string - view name, one of: RCC, LCC, RMLO, LMLO
- X: integer - X coordinate (on the horizontal axis) of the left edge of the predicted bounding box in 0-based indexing (for the left-most column of the image x=0)
- Width: integer - predicted bounding box width (along the horizontal axis)
- Y: integer - Y coordinate (on the vertical axis) of the top edge of the predicted bounding box in 0-based indexing (for the top-most row of the image y=0)
- Height: integer - predicted bounding box height (along the vertical axis)
- Z: integer - the first bounding box slice number in 0-based indexing (for the first slice of the image z=0)
- Depth: integer - predicted bounding box slice span (size along the depth axis)
- Score: float - predicted bounding box confidence score indicating the confidence level that the detection represents an actual lesion. This score can have an arbitrary scale, but it must be consistent across all cases within a single submission (e.g. 0.0 – 1.0)
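For illustration, the following sketch writes a minimally valid submission file with Python's standard csv module; the identifiers and box values are made up:

```python
import csv

# Column order follows the format specification above.
columns = ["PatientID", "StudyUID", "View", "X", "Width",
           "Y", "Height", "Z", "Depth", "Score"]

with open("submission.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=columns)
    writer.writeheader()
    writer.writerow({
        "PatientID": "DBT-P00001",   # hypothetical identifier
        "StudyUID": "DBT-S000001",   # hypothetical identifier
        "View": "LCC",
        "X": 1200, "Width": 250,     # left edge / width in pixels
        "Y": 800, "Height": 230,     # top edge / height in pixels
        "Z": 30, "Depth": 12,        # first slice / slice span
        "Score": 0.87,               # confidence, consistent scale
    })
```

All detections across all views go into this one file, one row per detected lesion candidate.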
Conflict of Interest:
All participants must attest that they are not directly affiliated with the labs of any of the DBTex2 organizers or major contributors. Please refer to the Challenge Organizer Guidance document of the AAPM Working Group on Grand Challenges.
Challenge Website:
spie-aapm-nci-dair.westus2.cloudapp.azure.com/competitions/6
Organizers and Major Contributors:
- Maciej Mazurowski, Duke University (maciej.mazurowski@duke.edu)
- Sam Armato, University of Chicago (s-armato@uchicago.edu)
- Karen Drukker, University of Chicago (kdrukker@uchicago.edu)
- Lubomir Hadjiiski, University of Michigan (lhadjisk@umich.edu)
- Kenny Cha, FDA (Kenny.Cha@fda.hhs.gov)
- Keyvan Farahani, NIH/NCI (farahank@mail.nih.gov)
- Mateusz Buda, Duke University (mateusz.buda@duke.edu)
- Jichen Yang, Duke University (jy168@duke.edu)
- Nick Konz, Duke University (nicholas.konz@duke.edu)
- Ashirbani Saha, Duke University (as698@duke.edu)
- Reshma Munbodh, Brown University (reshma_munbodh@brown.edu)
- Jinzhong Yang, MD Anderson (jyang4@mdanderson.org)
- Nicholas Petrick, FDA (nicholas.petrick@fda.hhs.gov)
- Justin Kirby, NIH/NCI (kirbyju@mail.nih.gov)
- Jayashree Kalpathy-Cramer, Harvard University (kalpathy@nmr.mgh.harvard.edu)
- Benjamin Bearce, Massachusetts General Hospital