Program Information

Veracity of Data Elements in Radiation Oncology Incident Learning Systems

A Kapur¹*, S Evans², D Brown³, G Ezzell⁴, D Hoopes⁵, S Dieterich⁶, K Kapetanovic⁷, C Tomlinson⁸, (1) Northwell Health System, New Hyde Park, NY, (2) Yale University, New Haven, CT, (3) University of California, San Diego, La Jolla, CA, (4) Mayo Clinic Arizona, Phoenix, AZ, (5) The University of California San Diego, San Diego, CA, (6) UC Davis Medical Center, Sacramento, CA, (7) American Society for Radiation Oncology, Fairfax, VA, (8) American Society for Radiation Oncology, Fairfax, VA

Presentations

TU-D-201-4 (Tuesday, August 2, 2016) 11:00 AM - 12:15 PM Room: 201

Purpose:
Incident learning systems encompass volumes, varieties, values, and velocities of underlying data elements consistent with the V’s of big data. Veracity, the fifth V, exists only if inter-rater reliability (IRR) within the data elements is high. The purpose of this work was to assess IRR in the nationally deployed RO-ILS: Radiation Oncology-Incident Learning System®, sponsored by the American Society for Radiation Oncology (ASTRO) and the American Association of Physicists in Medicine (AAPM).

Methods:
Ten incident reports covering a wide range of scenarios were created in standardized narrative and video formats. To assess IRR on a nationally representative level, these reports, along with two published narratives from the International Commission on Radiological Protection, were disseminated to 67 volunteers of multiple disciplines from 26 institutions. The volunteers were instructed to independently enter the associated data elements in a test version of RO-ILS over a 3-week period. All responses were aggregated into a spreadsheet, and IRR was assessed using free-marginal kappa metrics.
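For readers unfamiliar with the metric, the following is a minimal sketch of a free-marginal multirater kappa, in which chance agreement is taken as 1/k for k available categories. It is an illustration only and not the spreadsheet calculation used in this study; the function name, the data layout, and the toy ratings are hypothetical.

```python
from collections import Counter


def free_marginal_kappa(ratings, num_categories):
    """Free-marginal multirater kappa (illustrative sketch).

    ratings: one list per rated case, holding the category chosen by each rater.
    num_categories: number of categories (k) raters could choose from.
    Assumes chance agreement is 1/k, i.e. raters are not constrained in how
    many cases they assign to each category.
    """
    if num_categories < 2:
        raise ValueError("need at least two categories")
    if not ratings:
        raise ValueError("need at least one rated case")

    agreement_sum = 0.0
    for case in ratings:
        n = len(case)
        if n < 2:
            raise ValueError("each case needs at least two raters")
        counts = Counter(case)
        # Ordered rater pairs that agree, out of all n * (n - 1) ordered pairs.
        agreeing_pairs = sum(c * (c - 1) for c in counts.values())
        agreement_sum += agreeing_pairs / (n * (n - 1))

    p_observed = agreement_sum / len(ratings)
    p_chance = 1.0 / num_categories
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical toy data: two cases, three raters, three possible categories.
toy_ratings = [
    ["near-miss", "near-miss", "near-miss"],    # full agreement
    ["near-miss", "incident", "unsafe-cond"],   # no agreement
]
toy_kappa = free_marginal_kappa(toy_ratings, num_categories=3)
```

In the toy data, observed agreement is 0.5 and chance agreement is 1/3, so the resulting kappa is 0.25. A value of 1 indicates perfect agreement, and a value of 0 indicates agreement no better than chance.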

Results:
Forty-eight volunteers from 21 institutions completed all reports within the study period. The average kappa score for all raters across all critical data elements was 0.659 [range 0.326-1.000]. Statistically significant differences (p < 0.05) were noted between reporters of different disciplines and between raters with varying levels of experience. Kappa scores were high for event classification (0.781) and contributory factors (0.777) and low for likelihood-of-harm (0.326). IRR was highest among AAPM-ASTRO members (0.672) and lowest among trainees (0.463).

Conclusion:
A moderate-to-substantial level of IRR in RO-ILS was noted in this study. Although the number of events reviewed was small, quantitative assessment of data veracity identified opportunities for improving the taxonomy of the lower-scoring data elements, as well as specific educational targets for training. Acting on these findings is expected to improve the quality of the data garnered from RO-ILS.
