Evaluation of Combined Artificial Intelligence and Radiologist Assessment to Interpret Screening Mammograms

Bibliographic Details
Title: Evaluation of Combined Artificial Intelligence and Radiologist Assessment to Interpret Screening Mammograms
Authors: Hyo-Eun Kim, Jiashi Feng, Stephen H. Friend, Ljubomir Buturovic, Dezső Ribli, Luis Caballero, Li Shen, Fredrik Strand, Yaroslav Nikulin, Krzysztof J. Geras, Kyunghyun Cho, Elias Chaibub Neto, Rami Ben-Ari, Christoph I. Lee, Zequn Jie, Imane Nedjar, Felix Nensa, Darvin Yi, Shivanthan A.C. Yohanandan, Bruce Hoff, Justin Guinney, Jaime S. Cardoso, Russell B. McBride, Mengling Feng, Yiqiu Shen, Simona Rabinovici-Cohen, Ethan Goan, Stefan Harrer, Sven Koitka, Michael Kawczynski, Hari Trivedi, Karl Trygve Kalleberg, Christoph M. Friedrich, F. Albiol, Dimitri Perrin, Jose Costa Pereira, Umar Asif, Bibo Shi, Zbigniew Wojna, Antonio Jimeno Yepes, Peter Lindholm, Berkman Sahiner, Sijia Wang, Thea Norman, Weiva Sieh, Joyce Cahoon, Gerard Cardoso Negrie, Pavitra Krishnaswamy, Diana S. M. Buist, Alberto Albiol, Lester Mackey, Hwejin Jung, Laurie R. Margolies, Gaurav Pandey, Can Son Khoo, William Lotter, Yuanfang Guan, Thomas Yu, Andrew D. Trister, Stephen Morrell, Gustavo Stolovitzky, A. Gregory Sorensen, Clinton Fookes, Mehmet Eren Ahsen, David D. Cox, Jae Ho Sohn, Hao Du, Thomas Schaffter, Joseph H. Rothstein, Eduardo Castro, Joseph Y. Lo, Daniel L. Rubin, Obioma Pelka
Source: JAMA Network Open
Publisher Information: American Medical Association (AMA), 2020.
Publication Year: 2020
Subject Terms: Adult, medicine.medical_specialty, Medizin, MEDLINE, Breast Neoplasms, Diagnostic accuracy, Sensitivity and Specificity, 030218 nuclear medicine & medical imaging, 03 medical and health sciences, Deep Learning, 0302 clinical medicine, Breast cancer, Artificial Intelligence, Image Interpretation, Computer-Assisted, Radiologists, medicine, False positive paradox, Humans, Mammography, Risk factor, Early Detection of Cancer, Aged, Sweden, medicine.diagnostic_test, Screening mammography, business.industry, Correction, General Medicine, Middle Aged, medicine.disease, United States, 3. Good health, Online Only, 030220 oncology & carcinogenesis, Female, Other, Radiology, Artificial intelligence, business, Validation cohort, Algorithms
Description: Importance: Mammography screening currently relies on subjective human interpretation. Artificial intelligence (AI) advances could be used to increase mammography screening accuracy by reducing missed cancers and false positives. Objective: To evaluate whether AI can overcome human mammography interpretation limitations with a rigorous, unbiased evaluation of machine learning algorithms. Design, Setting, and Participants: In this diagnostic accuracy study conducted between September 2016 and November 2017, an international, crowdsourced challenge was hosted to foster AI algorithm development focused on interpreting screening mammography. More than 1100 participants comprising 126 teams from 44 countries participated. Analysis began November 18, 2016. Main Outcomes and Measurements: Algorithms used images alone (challenge 1) or combined images, previous examinations (if available), and clinical and demographic risk factor data (challenge 2) and output a score that translated to cancer yes/no within 12 months. Algorithm accuracy for breast cancer detection was evaluated using area under the curve and algorithm specificity compared with radiologists' specificity with radiologists' sensitivity set at 85.9% (United States) and 83.9% (Sweden). An ensemble method aggregating top-performing AI algorithms and radiologists' recall assessment was developed and evaluated. Results: Overall, 144 231 screening mammograms from 85 580 US women (952 cancer positive ≤12 months from screening) were used for algorithm training and validation. A second independent validation cohort included 166 578 examinations from 68 008 Swedish women (780 cancer positive). The top-performing algorithm achieved an area under the curve of 0.858 (United States) and 0.903 (Sweden) and 66.2% (United States) and 81.2% (Sweden) specificity at the radiologists' sensitivity, lower than community-practice radiologists' specificity of 90.5% (United States) and 98.5% (Sweden). Combining top-performing algorithms and US radiologist assessments resulted in a higher area under the curve of 0.942 and achieved a significantly improved specificity (92.0%) at the same sensitivity. Conclusions and Relevance: While no single AI algorithm outperformed radiologists, an ensemble of AI algorithms combined with radiologist assessment in a single-reader screening environment improved overall accuracy. This study underscores the potential of using machine learning methods for enhancing mammography screening interpretation.
ISSN: 2574-3805
Open Access: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::0c12077daf27f336825279d2831286b1
https://doi.org/10.1001/jamanetworkopen.2020.0265
Rights: OPEN
Accession Number: edsair.doi.dedup.....0c12077daf27f336825279d2831286b1
Database: OpenAIRE
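
Note on the evaluation metric: the abstract compares each algorithm's specificity with the radiologists' specificity at a fixed sensitivity (85.9% in the United States, 83.9% in Sweden). The following is a minimal sketch of how such a comparison might be computed from per-exam scores. It is not the study's code. The function name, the tie-handling rule (an exam is flagged when its score is at or above the threshold), and the toy data are all assumptions made for illustration.

```python
import math


def specificity_at_sensitivity(scores, labels, target_sensitivity):
    """Return (threshold, sensitivity, specificity) for the highest score
    threshold whose sensitivity reaches at least target_sensitivity.

    Assumption: an exam is flagged positive when score >= threshold.
    labels use 1 for cancer within the follow-up window and 0 otherwise.
    """
    if not 0 < target_sensitivity <= 1:
        raise ValueError("target_sensitivity must be in (0, 1]")
    positives = sorted(
        (s for s, y in zip(scores, labels) if y == 1), reverse=True
    )
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative exam")

    # Smallest number of cancers that must be flagged to reach the target.
    # The small epsilon guards against floating-point overshoot in the product.
    k = max(1, math.ceil(target_sensitivity * len(positives) - 1e-9))
    threshold = positives[k - 1]

    sensitivity = sum(s >= threshold for s in positives) / len(positives)
    specificity = sum(s < threshold for s in negatives) / len(negatives)
    return threshold, sensitivity, specificity


# Toy data (made up for illustration, not taken from the study).
toy_scores = [0.9, 0.8, 0.4, 0.2, 0.1, 0.3, 0.5, 0.05, 0.15]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0, 0]
result = specificity_at_sensitivity(toy_scores, toy_labels, 0.75)
print(result)  # (0.4, 0.75, 0.8)
```

In the toy data, flagging three of the four cancers requires a threshold of 0.4, and four of the five cancer-free exams score below it, giving a specificity of 0.8. Running the same function on an algorithm's scores with the target set to the radiologists' sensitivity would give a figure comparable in kind to the specificities quoted in the abstract.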