Preparation of abdominal computed tomography data set for patients with abdominal aortic aneurysm

Abstract

BACKGROUND: Artificial intelligence (AI) technologies are being actively adopted for processing and analyzing diagnostic medical images. The accuracy and reliability of AI algorithms depend on the size and quality of the training data sets. There is currently a need for more open-access data sets, particularly of abdominal aortic CT angiography (CTA) studies. Existing abdominal aortic CTA data sets are limited by binary labeling (classification of the entire study) and a small number of examinations. In addition, most examinations contain no signs of aortic pathology; given the variability of that pathology, this significantly limits their use for AI training, since the purpose of such algorithms is to detect pathology.

AIM: To prepare a CTA data set for patients with abdominal aortic aneurysm.

METHODS: Using a CTA data set with signs of abdominal aortic aneurysm, the stages and specifics of creating a data set for AI training in accordance with the methodological recommendations were considered. Based on the basic diagnostic requirements for the selected clinical task, the terms of reference for data set preparation were drawn up, the required sample size was calculated, and the optimal annotation scenario was determined. The next stage included selection of the initial abdominal CT data in the Unified Radiology Information System, data anonymization, semi-automatic labeling of the region of interest (aortic wall and aortic bed) using the 3D Slicer tool, verification of the labeling by an examining radiologist, and documentation of intermediate results.
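The abstract does not describe how the anonymization step was carried out. As an illustration only, the following minimal Python sketch shows one common approach: dropping direct identifiers from study metadata and replacing the patient ID with a salted hash. The metadata is modeled as a plain dictionary; the function name `anonymize`, the salt value, and the chosen subset of attribute keywords are assumptions made for this example, not the authors' procedure.

```python
import hashlib

# Illustrative subset of DICOM attribute keywords that directly identify a patient.
# Which attributes the authors actually removed is not stated in the abstract.
IDENTIFYING_KEYS = {"PatientName", "PatientBirthDate", "PatientAddress"}


def anonymize(metadata: dict, salt: str) -> dict:
    """Return a copy with direct identifiers dropped and PatientID pseudonymized."""
    cleaned = {k: v for k, v in metadata.items() if k not in IDENTIFYING_KEYS}
    if "PatientID" in cleaned:
        raw = (salt + str(cleaned["PatientID"])).encode("utf-8")
        # A salted hash keeps studies of one patient linkable without exposing the ID.
        cleaned["PatientID"] = hashlib.sha256(raw).hexdigest()[:16]
    return cleaned


# Hypothetical record, for demonstration only.
record = {
    "PatientName": "DOE^JOHN",
    "PatientID": "12345",
    "PatientBirthDate": "19500101",
    "SliceThickness": 1.0,
}
anon = anonymize(record, salt="hypothetical-project-salt")
```

Technical attributes such as slice thickness are kept, since they are needed later for study selection.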

RESULTS: The calculated sample size was 100 scans containing the arterial phase, with a slice thickness of up to 1.2 mm. The ratio of “normal” to “pathology” classes was set at 1:4. Partial annotation of the data (50%) has been performed.
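The figures reported above can be checked with a short Python sketch. It encodes the stated inclusion criteria (arterial phase present, slice thickness up to 1.2 mm) and derives the class counts implied by a 1:4 ratio in a 100-scan sample. The `Study` record, its field names, and the helper functions are hypothetical, and reading "50%" as half of the scans is an assumption.

```python
from dataclasses import dataclass

MAX_SLICE_THICKNESS_MM = 1.2      # from the abstract
TOTAL_SCANS = 100                 # from the abstract
NORMAL_TO_PATHOLOGY = (1, 4)      # from the abstract
ANNOTATED_FRACTION = 0.5          # assumed to mean 50% of the scans


@dataclass
class Study:
    """Hypothetical study record; field names are illustrative."""
    study_id: str
    has_arterial_phase: bool
    slice_thickness_mm: float


def is_eligible(study: Study) -> bool:
    """Apply the two inclusion criteria stated in the results."""
    return study.has_arterial_phase and study.slice_thickness_mm <= MAX_SLICE_THICKNESS_MM


def class_counts(total: int, ratio: tuple) -> tuple:
    """Split `total` scans into (normal, pathology) counts for a given ratio."""
    normal_parts, pathology_parts = ratio
    unit, remainder = divmod(total, normal_parts + pathology_parts)
    if remainder:
        raise ValueError("total is not divisible by the sum of the ratio parts")
    return unit * normal_parts, unit * pathology_parts


normal_count, pathology_count = class_counts(TOTAL_SCANS, NORMAL_TO_PATHOLOGY)
annotated_count = round(TOTAL_SCANS * ANNOTATED_FRACTION)
```

Under these assumptions the sample contains 20 scans without pathology and 80 with signs of aneurysm, of which 50 scans in total are annotated so far.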

CONCLUSIONS: A methodology for preparing CTA data sets was developed. Once the necessary procedures are completed, the generated data set will be made publicly available and may be used for training and testing AI algorithms and for scientific research.

About the authors

Maria R. Kodenko

Moscow Center for Diagnostics and Telemedicine; Bauman Moscow State Technical University

Email: m.r.kodenko@yandex.ru
ORCID iD: 0000-0002-0166-3768
SPIN-code: 5789-0319
Russian Federation, Moscow; Moscow

Tatiana A. Makarova

Russian Medical Academy of Continuous Professional Education

Author for correspondence.
Email: makarova97.mail@gmail.com
ORCID iD: 0000-0002-6274-5636
Russian Federation, Moscow

References

  1. Pavlov NA, Andreychenko AE, Vladzymyrskyy AV, et al. Reference medical datasets (MosMedData) for independent external evaluation of algorithms based on artificial intelligence in diagnostics. Digital Diagnostics. 2021;2(1):49–65. (In Russ). doi: 10.17816/DD60635
  2. Tsentr diagnostiki i telemeditsiny [Internet]. Nabor dannykh dlya self-testa diagnosticheskogo MosMedData KT s priznakami anevrizmy aorty tip III [cited 2023 Jun 05]. Available from: https://mosmed.ai/datasets/mosmeddata-kt-s-priznakami-anevrizmi-aorti-tip-iii. (In Russ).
  3. Aorta segmentation [Internet] [cited 2023 Jun 05]. Available from: https://www.kaggle.com/datasets/licethyaneth/aorta-segmentation.
  4. Reglament podgotovki naborov dannykh s opisaniem podkhodov k formirovaniyu reprezentativnoi vyborki dannykh. P. 1: metodicheskie rekomendatsii: [preprint]. Morozov SP, Vladzimirskii AV, Andreichenko AE, et al. Moscow: GBUZ “NPKTs DiT DZM”, 2021. 40 p. (Luchshie praktiki luchevoi i instrumental'noi diagnostiki; iss. 103). (In Russ).
  5. Bazovye diagnosticheskie trebovaniya k rezul’tatam raboty II [Internet] [cited 2023 Jun 05]. Available from: https://mosmed.ai/ai/docs/. (In Russ).
  6. Homeyer A, Geißler C, Schwen LO, et al. Recommendations on compiling test datasets for evaluating artificial intelligence solutions in pathology. Mod Pathol. 2022;35(12):1759–1769. doi: 10.1038/s41379-022-01147-y

Copyright (c) 2023 Eco-Vector

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Mass media outlet registered by the Federal Service for Supervision of Communications, Information Technology and Mass Media (Roskomnadzor).
Registration number and date of the registration decision: series PI No. FS 77 - 79539, dated November 9, 2020.

