Preparation of abdominal computed tomography data set for patients with abdominal aortic aneurysm

Abstract

BACKGROUND: Artificial intelligence (AI) technologies are being actively implemented in the processing and analysis of diagnostic medical images. The accuracy and reliability of AI algorithms are determined by the size and quality of the training data sets. Currently, there is a need for more open-access data sets, particularly of abdominal aortic CT angiography (CTA) studies. Existing abdominal aortic CTA data sets are limited by binary labeling (classification of the entire study) and a small number of examinations. In addition, most examinations contain no signs of aortic pathology; given the variability of such pathology, this significantly limits their use for AI training, because the purpose of these algorithms is to detect pathology.

AIM: To prepare a CTA data set for patients with abdominal aortic aneurysm.

METHODS: Using a CTA data set with signs of abdominal aortic aneurysm, the stages and features of creating a data set for AI training in accordance with the methodological recommendations were considered. Based on the basic diagnostic requirements for the selected clinical task, the terms of reference for preparing the data set were drawn up, the required sample size was calculated, and the optimal annotation scenario was determined. The next stage included selection of the initial abdominal CT data in the Unified Radiology Information System, data anonymization, semi-automatic labeling of the region of interest (aortic wall and aortic bed) using the 3D Slicer tool, verification of the labeling by an examining radiologist, and documentation of intermediate results.

RESULTS: The calculated sample size was 100 scans that contain the arterial phase and have a slice thickness of up to 1.2 mm. The ratio of “normal” to “pathology” classes was set at 1:4. Partial annotation of the data (50%) was performed.

CONCLUSIONS: A methodology for preparing CTA data sets was developed. Once the necessary procedures are completed, the resulting data set will be made publicly available and may be used for training and testing AI algorithms and for scientific research.
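
The figures reported in RESULTS can be checked with a short sketch. It is an illustration only, not the authors' tooling. It assumes that the 1:4 ratio is read in the order written (normal to pathology) and that the 50% annotation figure refers to the share of scans. The `Scan` record, its field names, and the helper functions are hypothetical.

```python
from dataclasses import dataclass

# Values taken from the abstract.
TOTAL_SCANS = 100
NORMAL_TO_PATHOLOGY = (1, 4)   # assumed order: normal : pathology
ANNOTATED_FRACTION = 0.5       # assumed to mean the share of scans
MAX_SLICE_THICKNESS_MM = 1.2


def class_counts(total, ratio):
    """Split `total` scans into (normal, pathology) counts for a given ratio."""
    normal_part, pathology_part = ratio
    parts = normal_part + pathology_part
    if total % parts != 0:
        raise ValueError("total is not divisible by the sum of the ratio parts")
    unit = total // parts
    return unit * normal_part, unit * pathology_part


@dataclass
class Scan:
    """Hypothetical minimal record describing one CT study."""
    scan_id: str
    has_arterial_phase: bool
    slice_thickness_mm: float


def is_eligible(scan):
    """Apply the two inclusion criteria stated in the abstract."""
    return (scan.has_arterial_phase
            and scan.slice_thickness_mm <= MAX_SLICE_THICKNESS_MM)


normal, pathology = class_counts(TOTAL_SCANS, NORMAL_TO_PATHOLOGY)
annotated = round(TOTAL_SCANS * ANNOTATED_FRACTION)
print(normal, pathology, annotated)
```

Under these assumptions, the split is 20 scans without pathology and 80 with pathology, and 50 scans are annotated.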

About the authors

Maria Kodenko

Moscow Center for Diagnostics and Telemedicine; Bauman Moscow State Technical University

Email: m.r.kodenko@yandex.ru
ORCID iD: 0000-0002-0166-3768
SPIN-code: 5789-0319
Russian Federation, Moscow; Moscow

Tatiana Makarova

Russian Medical Academy of Continuous Professional Education

Author for correspondence.
Email: makarova97.mail@gmail.com
ORCID iD: 0000-0002-6274-5636
Russian Federation, Moscow

Copyright © Eco-Vector, 2023

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

The media outlet is registered by the Federal Service for Supervision of Communications, Information Technology, and Mass Media (Roskomnadzor).
Registration number and date of the registration decision: series PI No. FS 77 - 79539, dated November 9, 2020.