
The AAPM CT Metal Artifact Reduction (CT-MAR) challenge has concluded. The challenge provided 14,000 CT training datasets generated with a hybrid data simulation that combined different anatomies, such as lung, abdomen, liver, head, and pelvis, with different metal materials. Each dataset includes five components: CT sinograms (uncorrected and labels), CT reconstructed images (uncorrected and labels), and metal masks. For the final evaluation, a total of 29 clinical datasets with metal artifacts were provided in both the sinogram and image domains. The inserted metals include surgical clips, dental fillings, and hip prostheses. The participants' submitted images were evaluated using our scoring metrics.

Of the 105 registered institutions, 26 participants completed all phases of the competition.

The top five performing teams were:

  1. Team name: JLAB
    Members: Yi Guo, Jianhua Ma, Yongbo Wang, Zhaoying Bian, and Dong Zeng  
    Institution: Southern Medical University, China
    Final score = 0.96

  2. Team name: NIMS
    Members: Hyoung Suk Park and Kiwan Jeon
    Institution: National Institute for Mathematical Sciences, South Korea
    Final score = 0.98

  3. Team name: fanstan
    Members: Fuxin Fan and Mareike Thies
    Institution: Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany
    Final score = 0.99

  4. Team name: FDMAR
    Members: Zilong Li, Chenglong Ma, and Hongming Shan
    Institution: Fudan University, China
    Final score = 1.05

  5. Team name: MIR-MAR
    Members: Da-in Choi, Sungho Yun, and Subong Hyun
    Institution: Korea Advanced Institute of Science and Technology, South Korea
    Final score = 1.06

Eight metrics were used to evaluate the submitted MAR images relative to the ground truth images: RMSE, noise, image sharpness, streak amplitude, SSIM, metal integrity, bone integrity, and proton beam range for radiotherapy. For each metric, a fractional score between 0 (good) and 4 (bad) was assigned, so lower scores are better. A score of 0 corresponds to no relevant differences from the ground truth, a score of 2 corresponds to the MAR capability of a state-of-the-art NMAR algorithm (where applicable), and a score of 4 corresponds to no improvement. The overall score was computed as the average of the eight metric scores over all cases.
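
As a rough illustration of the final averaging step only, the sketch below computes an overall score from per-case metric scores that are already on the 0 to 4 scale. The function name `overall_score`, the metric key names, and the input layout (one row of eight scores per case) are assumptions made for this example; how each raw metric is mapped onto the 0 to 4 scale is not described here and is not reproduced.

```python
from statistics import fmean

# Hypothetical key names for the eight metrics listed above.
METRICS = (
    "rmse", "noise", "sharpness", "streak_amplitude",
    "ssim", "metal_integrity", "bone_integrity", "proton_beam_range",
)


def overall_score(case_scores):
    """Average the eight per-metric scores over all cases.

    case_scores: a sequence with one entry per case, each entry a
    sequence of eight fractional scores in [0, 4] (0 = good, 4 = bad).
    """
    if not case_scores:
        raise ValueError("at least one case is required")
    flat = []
    for row in case_scores:
        if len(row) != len(METRICS):
            raise ValueError("each case needs exactly eight metric scores")
        for value in row:
            if not 0.0 <= value <= 4.0:
                raise ValueError("scores must lie between 0 and 4")
            flat.append(float(value))
    # Every case contributes the same number of metrics, so the mean over
    # all entries equals the mean of the per-case means.
    return fmean(flat)
```

With this layout, a submission that matched the ground truth on every metric in every case would score 0, and one that showed no improvement anywhere would score 4.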