Assessing the probability of metastatic mediastinal lymph node involvement in patients with non small cell lung cancer using convolutional neural networks on chest computed tomography
- Authors: Shevtsov A.E.1, Tominin I.D.1, Tominin V.D.1, Malevanniy V.M.1, Esakov Y.S.2, Tukvadze Z.G.2, Nefedov A.O.3, Yablonskii P.K.3, Gavrilov P.V.3, Kozlov V.V.4, Blokhina M.E.5, Nalivkina E.A.5, Gombolevskiy V.A.1,6, Vasilev Y.A.7, Dugova M.N.1, Chernina V.Y.1, Omelyanskaya O.V.7, Reshetnikov R.V.7, Blokhin I.A.7, Belyaev M.G.1
- Affiliations:
- IRA Labs
- Moscow City Clinical Oncological Hospital № 1
- Saint-Petersburg State Research Institute of Phthisiopulmonology
- Novosibirsk Regional Clinical Oncology Dispensary
- AstraZeneca Pharmaceuticals LLC
- Artificial Intelligence Research Institute
- Research and Practical Clinical Center for Diagnostics and Telemedicine Technologies
- Issue: Vol 5, No 4 (2024)
- Pages: 765-783
- Section: Technical Reports
- Submitted: 15.05.2024
- Accepted: 25.09.2024
- Published: 19.12.2024
- URL: https://jdigitaldiagnostics.com/DD/article/view/632008
- DOI: https://doi.org/10.17816/DD632008
- ID: 632008
Abstract
BACKGROUND: Lung cancer is the second most common cancer worldwide, accounting for approximately 20% of all cancer-related deaths, with a 5-year survival rate below 10% in very late-stage cases. For the prevalent non-small cell lung cancer (NSCLC), recent guidelines advise staging based on the 8th edition of the TNM classification, highlighting the importance of mediastinal lymph node involvement. While noninvasive methods are generally accurate, they often lack sensitivity, and invasive methods may not be suitable for all patients. Advances in deep learning show potential for solving such problems. However, most research focuses on algorithm development rather than clinical relevance. Moreover, no study has addressed the malignancy of individual lymph nodes, which limits comprehensive analysis and interpretability and leaves clinicians without sufficient means to validate the results effectively.
AIM: To develop a local data trained and validated algorithm for segmenting each mediastinal lymph node in chest computed tomography (CT) and assessing the probability of its involvement in metastasis.
MATERIALS AND METHODS: Initially, IASLC lymph node stations are segmented, providing a bounding box of the mediastinum for further processing. Next, the image is cropped to this box and passed through a second network to identify and mask all visible lymph nodes. Finally, each detected lymph node is extracted, stacked with its mask, and evaluated by a feed-forward network to determine malignancy probabilities.
RESULTS: The pipeline achieved an average recall of 0.74±0.01 and an object Dice score of 0.53±0.26 for the clinically relevant lymph node segmentation task. Furthermore, it reached a ROC AUC of 0.73 for predicting a patient’s N-stage, outperforming traditional size-based criteria.
CONCLUSIONS: The proposed algorithm opens opportunities for further research aimed at optimizing the management of patients with nonenlarged intrathoracic lymph nodes, thus improving the quality of medical care for patients with cancer.
Background
Lung cancer is the second most prevalent cancer globally. In 2018, 2.1 million new lung cancer cases were reported, with 1.8 million deaths, representing approximately 20% of all cancer-related deaths [1]. The 5-year survival rate for early-stage lung cancer is 68%–92%. However, in advanced disease, it drops to <10%, accounting for 42% of lung cancer cases [2]. Thus, an early diagnosis and timely treatment are crucial for improving survival and lowering treatment costs.
Screening examinations are crucial for detecting early-stage lung cancer because they can identify the disease in asymptomatic high-risk patients. These examinations are recommended in patients aged over 50 years who are currently smoking or who have quit smoking within the last 15 years (smoking index ≥20) [3]. Low-dose computed tomography (CT) is used to screen for lung pathology because it has been proven effective in multiple randomized prospective studies [6–11]. However, this noninvasive technique, while associated with minimal risks, provides insufficient information.
It is recommended to utilize minimally invasive biopsies or noninvasive positron emission tomography with CT (PET/CT) to acquire initial results [4, 5]. For patients with confirmed lung cancer, staging is crucial for determining the degree of metastasis and selecting the best treatment strategy based on the cancer type.
Recent guidelines recommend using the TNM staging system (8th edition, 2017) for staging non-small cell lung cancer (NSCLC), the most prevalent lung cancer, where T indicates the primary tumor size and invasiveness, N the degree of spread to thoracic lymph nodes, and M represents distant metastasis [12]. This system enables the determination of disease stage based on clinical examination findings (typically prior to surgery, using noninvasive methods), histopathological findings, and repeated staging following treatment.
Since thoracic lymph node involvement is frequent in lung cancer, evaluating regional lymph node involvement is essential for staging the disease and selecting appropriate treatment approaches. Whether aggressive surgery or adjuvant therapy is employed in NSCLC patients depends on the extent of mediastinal lymph node involvement [4, 5]. There are currently two main types of treatment strategies for NSCLC patients:
- Perform PET/CT followed by diagnostic surgery [5, 13, 14];
- Perform diagnostic surgery irrespective of the PET/CT findings [4].
The National Comprehensive Cancer Network guidelines recommend radical surgery as a preferable therapeutic option in patients with early NSCLC, while radiotherapy or chemotherapy is considered appropriate for patients with advanced NSCLC [15].
Despite the wide range of approaches for detecting mediastinal lymph node involvement, it can be challenging to confirm or rule out the presence of metastases. Studies performed by Roberts et al. [6] and Kanzaki et al. [17] show that PET/CT yields higher misdiagnosis and false-negative rates for lymph node involvement than histological examination, which is considered the gold standard. Moreover, PET/CT is not accessible to most patients in remote areas [18]. Diagnostic surgeries, even minimally invasive ones, require anesthesia, which may be contraindicated in some patients. This necessitates the use of noninvasive, cost-effective techniques for predicting mediastinal lymph node involvement in patients with primary NSCLC.
Advancements in deep learning technologies make it possible to address these challenges [19]. Published studies have demonstrated encouraging outcomes for single or group lymph node segmentation [20–22]. However, most studies focus on algorithm development rather than on the clinical relevance of the findings. Moreover, to the best of our knowledge, no studies have assessed individual lymph node involvement, which limits the possibilities of integrating and independently validating the findings in clinical practice [23–25].
Determining the extent of lymph node involvement is crucial for administering effective treatment in patients with NSCLC. Clinicians employ two main groups of methods:
- Noninvasive: techniques that do not physically affect the patient’s body;
- Invasive: diagnostic surgeries.
Despite their high sensitivity and low rates of post-operative tumor grade changes, invasive techniques necessitate anesthesia and surgical intervention [14, 26]. This calls for more affordable and accessible noninvasive techniques.
Several studies have revealed the limitations of using the short axis diameter as the only parameter for determining the histological status of lymph nodes on CT or magnetic resonance imaging (MRI) [27–29]. For example, Brown et al. [27] reported that mesorectal MRI findings do not differentiate between histologically benign (2–10 mm) and malignant (3–15 mm) lymph nodes. The same problem arises in MRI of the head and neck, where the standard threshold of 10 mm yields a sensitivity and specificity of 0.88 and 0.39, respectively [28, 29]. However, adding morphological criteria such as irregular boundaries or mixed signal intensity yields a sensitivity of 0.85 (95% confidence interval [CI]: 0.74–0.92) and a specificity of 0.97 (95% CI: 0.95–0.99). Although a recent study reported the classification accuracy for various combinations of criteria, it did not propose a standardized approach [30]. To address this issue, Elsholtz et al. [31] developed the Node-RADS lymph node assessment system, which assigns each visible lymph node to one of five categories based on its short axis diameter, texture, border, and shape. It also considers other factors, such as the anatomical site.
When used with PET, CT with contrast yields precise results. Significant differences (p < 0.05) in sensitivity and specificity were observed when ascertaining the degree of lymph node involvement using PET/CT compared to CT with contrast (0.78 and 0.92 vs. 0.56 and 0.73, respectively) [32]. However, this technique is expensive and not available to patients in remote areas [18].
Algorithmic Approach
Mediastinal lymph node segmentation and classification remain unexplored because of the lack of publicly available high-quality datasets. However, several studies have explored the algorithm components proposed in this paper.
Lymph Node Segmentation
Over the last 5–10 years, several approaches to volumetric medical image segmentation have emerged. Some architectures, such as DeepMedic, 3D U-Net, and V-Net, provide credible results on publicly available medical image sets [33–37]. Pyramid convolutional neural networks have been proposed for lymph node segmentation. Employing fourfold cross-validation, Iuga et al. [21] reported a detection rate of 0.77 for large lymph nodes; however, the false positive rate was 10.3 per case. Furthermore, this approach has low sensitivity (0.34) for lymph nodes measuring 5–10 mm, with a total Dice score of 0.44.
Identification of Lymph Node Groups
Building on their earlier work, Iuga et al. [21, 22] used multiclass classification to analyze the distribution of mediastinal lymph nodes by group. The top-1, top-2, and top-3 accuracy values were high (0.86, 0.94, and 0.96, respectively). Despite these findings, the proposed algorithm lacks sensitivity for critical lymph node groups and only partially meets the recommendations of Goldstraw et al. [2]. In contrast, Guo et al. [20] reported effective segmentation with a Dice score of 0.81 ± 0.06. However, the authors did not ascertain the accuracy of the lymph node distribution and its influence on the extent of regional lymph node involvement.
Classification of lymph node involvement
Previous studies have proposed algorithms for the indirect analysis of mediastinal lymph node involvement using primary tumor image features, without indicating specific groups or individual lymph nodes [23–25]. This is partially due to the difficulties in acquiring precise reference labels: determining the location of each lymph node on CT images after the biomaterial has been obtained can be challenging. The simplest solution is to assign a label to each lymph node, similar to the method employed for grading pulmonary nodules. Simple patch-based convolutional neural networks achieve an area under the ROC curve (AUC) of 0.928 ± 0.027 for classification. Pretraining a convolutional autoencoder for image patch reconstruction and using its encoder as the basis for metastasis classification improves outcomes, with an AUC of 0.936 ± 0.009 [38]. Alternatively, with weak supervision, a shared histological label can be established for many lymph nodes from the same group. Dubost et al. [39] proposed using a single label for training while generating a prognostic map at the output. However, max-pooling can result in the loss of significant information for small targets, such as lymph nodes, limiting the efficacy of their histological status assessment.
Aim
To develop and validate a CT-based algorithm trained on internal data for individual mediastinal lymph node segmentation and to predict the probability of metastasis for each lymph node.
Materials and methods
Study Design
This was an observational, single-center, retrospective study.
Eligibility Criteria
Inclusion criteria:
- Histologically confirmed NSCLC;
- Availability of contrast-enhanced chest CT data (venous phase) with a slice thickness of ≤1 mm;
- An interval of no more than two months between the CT and surgery.
Non-inclusion criteria:
- Lack of data obtained through chest CT with contrast (venous phase) or lymph node biopsy;
- More than two months between the CT and surgery.
Exclusion criteria:
- CT artifacts preventing reliable assessment;
- Low diagnostic value of the biopsy findings.
Study Site
Patients who had undergone chest CT with contrast and thoracic lymph node biopsy were enrolled at the City Clinical Oncology Hospital No. 1 (Moscow).
Intervention
The proposed algorithm for lymph node segmentation and metastasis classification involves a three-stage process:
First stage: identification of lymph node groups and segmentation of the mediastinum as the region of interest, which is essential for determining the extent of regional lymph node involvement [12];
Second stage: cropping the input image to the bounding box of the mediastinum and segmenting all visible lymph nodes;
Third stage: analyzing all identified lymph nodes using a feedforward network to determine the probability of metastasis.
The results provide information on lymph node involvement in specific groups, enabling evaluation of the degree of involvement based on the tumor site. The following subsections address, in turn, the segmentation of lymph node groups, the segmentation of individual lymph nodes, and the classification of lymph node involvement.
Segmentation of the lymph node groups
In patients with NSCLC, the affected lymph nodes are located in a narrow range (mediastinal area). The anatomical and primary tumor sites dictate the extent of regional lymph node involvement [12]. The International Association for the Study of Lung Cancer (IASLC) guidelines recognize ten lymph node groups in the mediastinum [40]. Lymph node groups near the trachea and bronchi are divided into left and right groups. No additional specialized classification system is used for the subcarinal lymph nodes. During diagnostic procedures, biopsies are not typically performed for Groups 1, 8, and 9 lymph nodes. Therefore, they were excluded from this study.
This study used a two-component U-Net model for the 3D segmentation of lymph node groups (Fig. 1) [41]. The first component distinguished the mediastinum from the background, while the second component assigned each voxel within the mediastinal mask to a specific lymph node group. The network was built from ResBlocks, with batch normalization and the ReLU activation function applied after each convolution except the output layer [42–44].
Fig. 1. Three-stage algorithm for lymph node segmentation and metastasis classification: a, segmentation of lymph node groups; b, cropping the image to the bounding box and processing it using a second network; c, extracting each identified lymph node, stacking it with the respective mask, and assessing it through a feedforward network. LN, lymph node.
Lymph node segmentation
To save computing resources during the second stage, the region of interest is cropped using the bounding box of the mediastinum obtained in the first stage. If the size in the axial view is smaller than 128 pixels, padding is used to reach the minimum size of 128 pixels.
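The minimum-size rule for the mediastinal crop can be sketched as follows. This is a minimal illustration, assuming that an axial dimension smaller than 128 pixels is padded up to 128 and that the padding is split symmetrically; the function names `axial_padding` and `padded_crop_shape` and the symmetric split are hypothetical and are not specified in the paper.

```python
def axial_padding(size: int, minimum: int = 128) -> tuple[int, int]:
    """Return (before, after) padding so that size + before + after >= minimum."""
    missing = max(0, minimum - size)  # nothing to add if the crop is large enough
    before = missing // 2             # assumption: symmetric padding
    after = missing - before
    return before, after


def padded_crop_shape(height: int, width: int, minimum: int = 128) -> tuple[int, int]:
    """Axial shape of the mediastinal crop after padding each axis to the minimum."""
    pad_h = axial_padding(height, minimum)
    pad_w = axial_padding(width, minimum)
    return height + sum(pad_h), width + sum(pad_w)


# A 100 x 150 crop is padded along the first axis only.
print(axial_padding(100))           # (14, 14)
print(padded_crop_shape(100, 150))  # (128, 150)
```

Crops that already meet the minimum are left unchanged, so the rule only affects narrow bounding boxes.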
At this stage, the architecture was similar to the one used for the segmentation of the lymph node groups. However, it has a single binary output, the prognostic map for lymph node segmentation, and uses more channels and fewer levels (see Fig. 1). This design is adapted to individual lymph nodes, which are substantially smaller than the groups, so a large receptive field is not required. The additional channels provide extra features that boost segmentation accuracy.
Classification of lymph node involvement
In weak supervision, a metastasis label is assigned to a lymph node group, whereas the probability of metastasis is predicted for each lymph node. Each detected lymph node is extracted as a 32 × 32 image fragment and stacked with its mask, so that all identified objects can be combined in a single dataset. The data are processed by a convolutional neural network with a five-level ResNet architecture, followed by max-pooling to collapse the spatial dimensions [42]. Finally, a fully connected layer and a sigmoid produce the probability of metastasis for each lymph node.
The task is further complicated by the fact that, whereas every lymph node in a benign group is benign, an individual lymph node in a malignant group may or may not contain metastases. Thus, a benign lymph node group has no affected lymph nodes, whereas a malignant group must have at least one. To satisfy this requirement, a dedicated loss function is used. For all lymph nodes in benign groups, the binary cross-entropy loss is applied to the predicted probability of metastasis. For lymph nodes in malignant groups, the loss is applied only if the network predicts that all lymph nodes in the group are benign (see Fig. 1). This method has both advantages and disadvantages.
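The group-level loss described above could look roughly like the sketch below. It is an illustration under stated assumptions, not the authors' implementation: the 0.5 decision threshold, the choice to penalize only the most suspicious node of a malignant group, and all function names are hypothetical. The positive-class weight of 100 follows the coefficient reported in the Training section.

```python
import math

EPS = 1e-7  # numerical guard against log(0)


def benign_group_loss(probs: list[float]) -> float:
    """Mean binary cross-entropy with target 0 for every node of a benign group."""
    return sum(-math.log(max(1.0 - p, EPS)) for p in probs) / len(probs)


def malignant_group_loss(probs: list[float], pos_weight: float = 100.0,
                         threshold: float = 0.5) -> float:
    """Penalize a malignant group only if every node is predicted benign.

    Assumption: the penalty is the weighted cross-entropy of the most
    suspicious node against target 1; the paper does not spell this out.
    """
    top = max(probs)
    if top >= threshold:  # at least one node is already predicted malignant
        return 0.0
    return -pos_weight * math.log(max(top, EPS))


def group_loss(probs: list[float], group_is_malignant: bool) -> float:
    """Dispatch to the benign or malignant rule based on the histology label."""
    if group_is_malignant:
        return malignant_group_loss(probs)
    return benign_group_loss(probs)


print(group_loss([0.2, 0.9], group_is_malignant=True))  # 0.0
```

With this rule, a malignant group contributes no gradient once any of its nodes is flagged, which matches the requirement that at least one node, but not necessarily all, is affected.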
Experiment
Data
The publicly available dataset exhibited several limitations [45]:
- It contains inadequate information on the diagnosis and histological status of the mediastinal lymph nodes;
- The groups are not specified, and the annotation only includes lymph nodes with a short axis diameter of >10 mm;
- Contrast-enhanced CT images were likely obtained during the arterial rather than the venous phase.
Thus, a private dataset for 60 patients with confirmed NSCLC who underwent diagnostic surgery to assess specific lymph node groups was used.
The following inclusion criteria were applied to the dataset:
- presence of CT images acquired during the venous contrast phase, which allows the most effective differentiation between the lymph nodes and surrounding structures (particularly blood vessels);
- diagnostic surgery performed within two months after the previous examination, which included the venous phase contrast-enhanced CT images;
- contrast-enhanced CT slice thickness of <1 mm, with eight image series selected for annotation.
Lymph node groups
The same radiologist annotated the lymph node groups, meticulously adhering to the IASLC guidelines for generating prognostic maps for the mediastinum [40]. The annotation protocol required that large blood vessels, such as the aorta, pulmonary trunk, and azygos vein, as well as the esophagus, be differentiated from the lymph node groups (Fig. 2).
Fig. 2. Example of lymph node group annotation at different mediastinal levels.
Lymph nodes
Two radiologists annotated the mediastinal lymph nodes and assigned binary masks to all the identifiable lymph nodes. If the boundaries between several lymph nodes were uncertain, a single mask was assigned.
Degree of lymph node involvement
Lymph node group involvement was determined based on the outcomes of video-assisted mediastinoscopic lymphadenectomy (VAMLA) [14]. A biopsy of the dissected lymph nodes was performed to establish their status. During the final stage, each group was assigned one of the three labels based on the histological examination findings:
- no data (lymph node groups that were not resected);
- benign;
- malignant.
The statistics for the training dataset are presented in Table 1.
Table 1. Training dataset: biopsy findings for the lymph node groups
Lymph node groups | Benign | Malignant | No data
Right low cervical, supraclavicular, and sternal notch nodes | 0 | 0 | 8
Left low cervical, supraclavicular, and sternal notch nodes | 0 | 0 | 8
Right upper paratracheal nodes | 5 | 2 | 1
Left upper paratracheal nodes | 3 | 0 | 5
Prevascular nodes | 0 | 0 | 8
Prevertebral (retrotracheal) nodes | 0 | 0 | 8
Right lower paratracheal nodes | 6 | 2 | 0
Left lower paratracheal nodes | 8 | 0 | 0
Subaortic nodes | 0 | 1 | 7
Paraaortic nodes | 0 | 1 | 7
Subcarinal nodes | 6 | 1 | 1
Paraesophageal nodes | 0 | 0 | 8
Pulmonary ligament nodes | 1 | 0 | 7
Right hilar nodes | 2 | 2 | 4
Left hilar nodes | 2 | 1 | 5
Note. Regional lymph node classification according to the International Association for the Study of Lung Cancer (IASLC) guidelines.
Training
All experiments were performed using a standard pre-processing method, with input images resampled to a fixed voxel spacing (1, 1, 1). The CT intensities were clipped to the soft tissue window (160–240 HU) and scaled to the 0–1 range. A pretrained neural network proposed by Goncharov et al. [46] was used to crop the input image to the lung area, and the crop was then adjusted to fit the mediastinal area. Given the small dataset, heavy augmentation was used during training, including 10° and 90° rotations, random shifts, and vertical and horizontal flips (Fig. 3).
Fig. 3. Example of assigning the assessed statistical parameters to different connected components of the sample mask (a) and the respective logit mask (b); self-logit and hit-logit, personal statistics for each connected component; hit-dice, shared parameters for a pair of connected components with a positive value (c).
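The intensity normalization mentioned in the Training section can be illustrated with a short sketch. It assumes that normalization means clipping Hounsfield values to a window and rescaling them linearly to the 0–1 range; the window bounds are passed as parameters, and the function names are hypothetical. The example uses the bounds as printed in the text (160 and 240 HU).

```python
def window_and_scale(hu: float, low: float, high: float) -> float:
    """Clip a Hounsfield value to [low, high] and rescale it linearly to [0, 1]."""
    if high <= low:
        raise ValueError("high must be greater than low")
    clipped = min(max(hu, low), high)
    return (clipped - low) / (high - low)


def preprocess_slice(slice_hu: list[list[float]], low: float,
                     high: float) -> list[list[float]]:
    """Apply the window to every pixel of a 2D slice given as nested lists."""
    return [[window_and_scale(v, low, high) for v in row] for row in slice_hu]


# Values below the window map to 0, values above it map to 1.
print(preprocess_slice([[-1000.0, 200.0, 1000.0]], low=160.0, high=240.0))
# [[0.0, 0.5, 1.0]]
```

In practice the same operation would be applied to whole volumes with an array library; nested lists are used here only to keep the sketch dependency-free.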
Segmentation of lymph node groups
The first stage was trained for 30,000 iterations using the Adam optimizer with mixed precision and gradient scaling. The loss for the first component was binary cross-entropy with adaptive foreground voxel reweighting; for the second component, cross-entropy was employed. The learning rate remained constant at 0.003 throughout training.
Lymph node segmentation
The second stage was trained for 70,000 iterations using the Adam optimizer with mixed precision and gradient scaling. As in the first stage, the loss was binary cross-entropy with adaptive foreground voxel reweighting. The learning rate was 10⁻³ at the beginning of training and was reduced by factors of 3, 3, 2, and 2 at 5,000, 15,000, 50,000, and 60,000 iterations, respectively.
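The learning rate schedule of the second stage can be written as a small piecewise-constant function. This sketch assumes that each reduction takes effect exactly at the listed iteration and stays in force afterwards; the function name `learning_rate` is hypothetical.

```python
def learning_rate(iteration: int, base_lr: float = 1e-3) -> float:
    """Piecewise-constant schedule: divide the rate at each milestone reached."""
    # (milestone iteration, division factor), as listed in the text
    milestones = [(5_000, 3.0), (15_000, 3.0), (50_000, 2.0), (60_000, 2.0)]
    lr = base_lr
    for step, factor in milestones:
        if iteration >= step:
            lr /= factor
    return lr


for it in (0, 5_000, 15_000, 50_000, 60_000):
    print(it, f"{learning_rate(it):.2e}")
# 0 1.00e-03
# 5000 3.33e-04
# 15000 1.11e-04
# 50000 5.56e-05
# 60000 2.78e-05
```

By the end of training, the rate is 36 times smaller than the initial value.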
Classification of lymph node involvement
The classification network was trained for 40,000 iterations using the Adam optimizer, with the loss reweighted for positive class examples (coefficient 100). The learning rate was 3 × 10⁻⁵ and remained constant throughout training.
Parameters
To assess the first stage, the accuracy of assigning lymph nodes to their respective groups was measured, as conventional segmentation metrics were of little relevance in this case. For the third stage, the final patient-level predictions were assessed: the ability to distinguish patients with and without regional lymph node metastases was evaluated using the AUC.
FROC method
The FROC method described by Van Ginneken et al. [36] was used to assess the quality of lymph node detection. The FROC curve represents the relationship between the object-level response of the model (Y-axis) and the average number of false positives per image (X-axis).
The FROC curves were plotted employing a sample mask for the entire image (Fig. 3a) and the respective logit map, which was converted to binary form using the threshold of 0 to obtain a logit mask (Fig. 3b). Both masks were then divided into connected components. Each connected component was assigned three statistical parameters:
- self-logit: maximum logit value within the connected component; is set to infinity if the connected component is within the sample mask;
- hit-dice: maximum Dice score between the selected connected component and another mask (Fig. 3c);
- hit-logit: the same statistical parameter as self-logit; however, it is derived from a connected component within another mask that matches the first mask in terms of the hit-dice value; is set to negative infinity if hit-dice = 0.
These statistical parameters enable the hit-logit value to be used to plot an FROC curve by selecting different thresholds l. Moreover, the hit-dice value was considered when the curve points were obtained to check the hit condition: two connected components were considered a hit if their hit-dice value was positive. Thus, for the selected threshold l:
- The false positive value was defined as the number of connected components within a logit mask with self-logit > l but with hit-dice = 0;
- The true positive value was defined as the number of connected components within a sample mask with hit-logit > l and hit-dice > 0;
- The false negative value was defined as the number of connected components within a sample mask with hit-logit ≤ l or hit-dice = 0.
In experiments, we used l thresholds in the range of 0.1 to max-logit, where the max-logit value is the maximum logit value for the control sample of the predictions.
In contrast to traditional approaches for plotting such curves, this approach ensures uniformity because each connected component within a prognostic mask appears in full or does not appear at all, preventing the separation or merging of the connected components. Given the limited accuracy of the floating point calculations, we used a logit value search instead of using a classical probability search. It allows plotting curves for the whole range, because large logit values can be found with greater accuracy than large probabilities, which are usually rounded to 1.
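The counting rules for one FROC point can be sketched from precomputed per-component statistics. This is a minimal illustration that assumes the connected components and their self-logit, hit-dice, and hit-logit values have already been extracted; the `Component` class and the `froc_point` function are hypothetical names. False negatives are counted as the sample-mask components that are not true positives, which is the complement of the true positive rule.

```python
import math
from dataclasses import dataclass


@dataclass
class Component:
    self_logit: float  # max logit inside the component (inf for sample-mask components)
    hit_dice: float    # best Dice score against any component of the other mask
    hit_logit: float   # self-logit of the best-matching component (-inf if no overlap)


def froc_point(predicted: list[Component], reference: list[Component],
               threshold: float, n_images: int) -> tuple[float, float]:
    """Return (mean false positives per image, recall) at one logit threshold."""
    fp = sum(1 for c in predicted if c.self_logit > threshold and c.hit_dice == 0)
    tp = sum(1 for c in reference if c.hit_logit > threshold and c.hit_dice > 0)
    fn = len(reference) - tp  # every sample-mask component is either hit or missed
    recall = tp / (tp + fn) if reference else 0.0
    return fp / n_images, recall


# Toy example: one image, three predicted and three reference components.
predicted = [Component(2.0, 0.6, math.inf), Component(1.0, 0.0, -math.inf),
             Component(0.5, 0.3, math.inf)]
reference = [Component(math.inf, 0.6, 2.0), Component(math.inf, 0.3, 0.5),
             Component(math.inf, 0.0, -math.inf)]
print(froc_point(predicted, reference, threshold=0.7, n_images=1))
```

Sweeping `threshold` over the range of observed logits and collecting the resulting pairs gives the points of the curve.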
Mean response
The FROC method yields comprehensive analysis results; however, their interpretation can be challenging. We used an additional parameter to summarize the data derived from the FROC curves. This paper reports the mean response, that is, the response averaged over false positive values from 0 to 5 per image in increments of 0.01. Because it reflects a crucial clinical characteristic (the number of detected lesions per case), this parameter establishes the detection efficacy and functions as the primary quality criterion in this investigation.
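The averaging of the FROC curve can be sketched as follows. The interpolation rule is an assumption, since the text does not state one: the response at a given false positive budget is taken as the best response among curve points whose false positive rate does not exceed that budget. The function name `mean_response` is hypothetical.

```python
def mean_response(curve: list[tuple[float, float]], max_fp: float = 5.0,
                  step: float = 0.01) -> float:
    """Average response over false-positive-per-image values 0, step, ..., max_fp.

    `curve` holds (false positives per image, response) points. Assumption:
    the response at a given budget is the best response among curve points
    whose false positive rate does not exceed that budget.
    """
    n_points = round(max_fp / step) + 1
    total = 0.0
    for k in range(n_points):
        budget = k * step
        reachable = [resp for fp, resp in curve if fp <= budget]
        total += max(reachable, default=0.0)
    return total / n_points


# A detector with a constant response of 0.5 at any budget averages to 0.5.
print(mean_response([(0.0, 0.5)]))
```

A curve that only reaches high response at many false positives per image is penalized, because the low-budget part of the range contributes low values to the average.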
Dice Score of an object
The Dice score is the most frequently employed metric for assessing segmentation [37]. However, averaging it over multiple objects in several images may cause small objects to be obscured by large ones, which is a significant drawback. This paper therefore reports the average Dice score per object:
$$\mathrm{DSC}_{\mathrm{object}} = \frac{1}{N} \sum_{i=1}^{N} \frac{1}{M} \sum_{j=1}^{M} \mathrm{DSC}(Y_j, \hat{Y}_j), \qquad \mathrm{DSC}(Y, \hat{Y}) = \frac{2\,|Y \cap \hat{Y}|}{|Y| + |\hat{Y}|} \quad (1)$$
where N is the number of images in the control sample; M is the number of lesions within the sample mask of a given image; Y_j is the set of voxels associated with the j-th connected component within this mask; and Ŷ_j is the connected component of the prognostic mask with the highest Dice score (DSC) with respect to Y_j.
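The per-object Dice score can be sketched with voxel sets, mirroring the set notation of the definitions above. The averaging order (first over the objects of an image, then over images) and the handling of images without reference objects are assumptions; the names `dice` and `object_dice` are hypothetical.

```python
Voxel = tuple[int, int, int]


def dice(a: set[Voxel], b: set[Voxel]) -> float:
    """Dice score between two voxel sets."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def object_dice(images: list[tuple[list[set[Voxel]], list[set[Voxel]]]]) -> float:
    """Average per-object Dice over images.

    Each image is a pair (reference components, predicted components).
    Images without reference components are skipped (an assumption).
    """
    per_image = []
    for reference, predicted in images:
        if not reference:
            continue
        # Each reference object is scored against its best-matching prediction.
        scores = [max((dice(y, y_hat) for y_hat in predicted), default=0.0)
                  for y in reference]
        per_image.append(sum(scores) / len(scores))
    return sum(per_image) / len(per_image) if per_image else 0.0


# One image: a two-voxel node that is half detected and a one-voxel node that is missed.
reference = [{(0, 0, 0), (0, 0, 1)}, {(5, 5, 5)}]
predicted = [{(0, 0, 0)}]
print(object_dice([(reference, predicted)]))  # (2/3 + 0) / 2 = 0.333...
```

Because every object contributes one score regardless of its volume, a missed small node lowers the result as much as a missed large one.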
Results
Lymph Node Segmentation
For the second stage, the metrics are reported for different size ranges:
- All lymph nodes;
- Lymph nodes >5 mm in size, which are clinically significant according to the guidelines;
- Lymph nodes >10 mm in size, a threshold used as the baseline criterion of potential metastasis [31].
The results are presented in Table 2 and Table 3.
Table 2. Lymph node detection rates for groups based on the short axis diameter
Group | Mean response | Dice score
d > 0 mm | 0.48 ± 0.01 | 0.53 ± 0.24
d > 5 mm | 0.74 ± 0.01 | 0.53 ± 0.26
d > 10 mm | 0.95 ± 0.01 | 0.56 ± 0.26
Note. d, short axis diameter.
Table 3. Patient status regarding regional lymph node involvement, maximum short axis diameter, and maximum predicted probability of metastasis
Patient | Lymph node involvement | Maximum d, mm | Maximum predicted probability
0 | 0 | 11 | 0.29
1 | 0 | 14 | 0.41
2 | + | 38 | 1.00
3 | + | 14 | 0.54
4 | + | 14 | 0.54
5 | 0 | 21 | 0.83
6 | + | 15 | 0.61
7 | + | 11 | 0.96
Note. Lymph nodes selected by diameter can differ from the lymph nodes with the highest probability of metastasis. d, short axis diameter.
Despite the low detection rate in the first group, the convolutional neural networks demonstrated high sensitivity for the highest-risk (last) group at a low false positive rate (three per case) (Fig. 4, 5).
Fig. 4. Accuracy of assigning lymph nodes to groups in accordance with the International Association for the Study of Lung Cancer (IASLC) guidelines.
Fig. 5. Detection results during lymph node segmentation. FP, false positive; d, short axis diameter.
Classification of Lymph Node Involvement
The metastasis classification outcomes are consistent with those obtained using a simplified approach, where the lymph node with the greatest short axis diameter was considered a marker of metastasis. Using a threshold of 10 mm, this straightforward criterion produced three false positive findings. The proposed algorithm is more effective than this approach (Fig. 6), providing a larger AUC (0.73 vs. 0.53). According to the Node-RADS classification, lymph nodes with a short axis diameter of >30 mm should definitely be considered metastatic. The only error was detected in patient 5: the selected lymph node had a very high predicted probability of metastasis (Fig. 7).
Fig. 6. Comparison of baseline criteria derived from the short axis diameter for predicting patient status regarding the degree of regional lymph node involvement and the proposed algorithm. TPR, true positive rate; FPR, false positive rate; SAD, short axis diameter.
Fig. 7. Lymph nodes with the highest probability of metastasis for each patient. N0, no metastasis; N+, metastasis.
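The size-based baseline can be checked directly against the values in Table 3. The sketch below assumes that a patient is called positive when the maximum short axis diameter is strictly greater than the threshold; the function name `size_criterion_counts` is hypothetical.

```python
# Table 3: (patient, node involvement confirmed, maximum short axis diameter in mm)
patients = [
    (0, False, 11), (1, False, 14), (2, True, 38), (3, True, 14),
    (4, True, 14), (5, False, 21), (6, True, 15), (7, True, 11),
]


def size_criterion_counts(rows, threshold_mm: float = 10.0) -> dict[str, int]:
    """Confusion counts when a patient is called N+ if max diameter > threshold."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for _, involved, diameter in rows:
        predicted = diameter > threshold_mm
        if predicted and involved:
            counts["tp"] += 1
        elif predicted:
            counts["fp"] += 1
        elif involved:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


print(size_criterion_counts(patients))  # {'tp': 5, 'fp': 3, 'tn': 0, 'fn': 0}
```

Every patient in the table has a node larger than 10 mm, so the criterion flags all eight and the three patients without involvement become the false positives mentioned in the text.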
Discussion
The proposed loss function exhibits both advantages and disadvantages. It allows a convolutional neural network to independently determine the relationship between lymph nodes classified as malignant groups, using prior experience to make modifications. Conversely, it is limited in its capacity to assign respective probabilities to positive class examples because specific information for each lymph node in the malignant group is unavailable. This training method can improve sensitivity but can also increase the false positive rate.
The clinical classification of lymph node locations places stringent requirements on the method. However, its main disadvantage is the small dataset, which lacks examples of different lymph nodes with and without metastasis. This is primarily due to the labor- and time-intensive process of establishing the boundaries of the lymph node groups and individual lymph nodes from scratch. For one patient, mapping lymph node groups and individual lymph nodes takes approximately 1 hour and 2–3 hours, respectively. Ambiguous human anatomy criteria further complicate the process, making it unfeasible to establish broad guidelines. Thus, it is essential to augment the efficacy of the algorithm parameters by expanding the training dataset by including new cases of lymph node involvement without enlargement and enlarged lymph nodes without metastasis for each group of thoracic lymph nodes.
Multiphase CT images can significantly boost the capability to determine the degree of lymph node involvement by providing detailed information on each lymph node with and without intravenous contrast enhancement. The venous contrast phase is especially valuable [25]. However, the potential benefits of multiphase CT, including the native (without contrast), arterial, venous, and delayed intravenous contrast phases, are poorly understood. Their assessment can provide further insight into the mechanism of contrast uptake and distribution in the lymph nodes.
Conclusion
This paper presents a three-stage algorithm for lymph node segmentation and metastasis classification in patients with NSCLC. Training was conducted using the histological confirmation results for the lymph node groups. The proposed algorithm has an overall response of 0.74 ± 0.01 and a Dice score of 0.53 ± 0.26 for the segmentation of clinically significant lymph nodes (with a short axis diameter of 5 mm). It also has an AUC of 0.73 for predicting patient status regarding the degree of regional lymph node involvement. Thus, the proposed three-stage algorithm is superior to the conventional size-based methods. In enlarged lymph nodes (with a short axis diameter of 10 mm), segmentation was more effective, with an overall response of 0.95 and a Dice score of 0.56. This provides an opportunity for future studies to improve the quality of cancer therapy and the treatment of patients without thoracic lymph node enlargement.
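For readers less familiar with the segmentation metric quoted above, the Dice score between a predicted mask A and a reference mask B is 2|A∩B| / (|A| + |B|). The following is a minimal sketch on toy masks represented as sets of voxel coordinates; the representation, the function name, and the convention for two empty masks are illustrative assumptions, not the evaluation code of this study.

```python
def dice_score(pred_voxels, true_voxels):
    """Dice overlap between two binary masks given as collections of voxel coordinates."""
    pred, true = set(pred_voxels), set(true_voxels)
    if not pred and not true:
        return 1.0  # convention assumed here: two empty masks agree perfectly
    return 2.0 * len(pred & true) / (len(pred) + len(true))

# Toy example: a 4-voxel reference node and a 3-voxel prediction sharing 2 voxels.
true_mask = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)}
pred_mask = {(0, 0, 1), (0, 1, 1), (0, 1, 2)}

print(dice_score(pred_mask, true_mask))  # 2 * 2 / (3 + 4) = 4/7
```

Because the denominator counts voxels from both masks, small structures such as lymph nodes are penalized heavily for boundary errors of even one or two voxels, which is worth keeping in mind when interpreting Dice values for small nodes.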
Moreover, the proposed algorithm can be incorporated into the current management protocol for patients with confirmed NSCLC as a transitional step between the initial diagnosis and PET/CT. This algorithm has several potential applications. If the algorithm predicts a low probability of mediastinal lymph node involvement, radical surgery can be considered immediately, eliminating the need for PET/CT and diagnostic surgery. In contrast, if there is a high chance of metastasis, radical surgery is not recommended. In this case, neoadjuvant chemotherapy can be considered without the need for PET/CT and diagnostic surgery. It is anticipated that this algorithm will be more affordable and accessible to patients, even if its accuracy is on par with PET/CT.
Additional information
Funding source. This work was not supported by any external sources of funding.
Competing interests. The authors declare that they have no competing interests.
Authors’ contribution. All authors made substantial contributions to the conception of the work; the acquisition, analysis, and interpretation of data; and the drafting and revision of the manuscript. All authors gave final approval of the version to be published and agree to be accountable for all aspects of the work. A.E. Shevtsov: literature search on the article topic, data analysis, processing of research results, manuscript writing; I.D. Tominin: dataset formation, processing of research results, expert evaluation of information; V.D. Tominin, V.M. Malevanny: processing of research results, expert evaluation of information; Z.G. Tukvadze: literature search on the article topic, manuscript writing; Yu.S. Esakov, V.V. Kozlov: expert evaluation of information, manuscript editing; A.O. Nefedov, P.K. Yablonsky, P.V. Gavrilov, Yu.A. Vasiliev, O.V. Omelyanskaya, I.A. Blokhin: expert evaluation of information; M.E. Blokhina, E.A. Nalivkina: research concept, expert evaluation of information, approval of the final manuscript version; V.A. Gombolevsky, M.G. Belyaev: research concept, expert evaluation of information, manuscript writing, approval of the final manuscript version; M.N. Dugova, V.Yu. Chernina: research concept, literature search on the article topic, expert evaluation of information, manuscript editing; R.V. Reshetnikov: expert evaluation of information, approval of the final manuscript version.
Acknowledgments. The authors would like to thank Shukran Ragimov and Anatolii Akhmedov for extracting the biopsy results for the IASLC lymph node stations, and Anastasia Nikulina and Ekaterina Chukanova for lymph node annotation.
About the authors
Alexey E. Shevtsov
IRA Labs
Author for correspondence.
Email: a.shevtsov@ira-labs.com
ORCID iD: 0000-0003-3085-4325
Russian Federation, Moscow
Iaroslav D. Tominin
IRA Labs
Email: ya.tominin@ira-labs.com
ORCID iD: 0000-0002-7210-7208
Russian Federation, Moscow
Vladislav D. Tominin
IRA Labs
Email: v.tominin@ira-labs.com
ORCID iD: 0000-0001-5678-3452
Russian Federation, Moscow
Vsevolod M. Malevanniy
IRA Labs
Email: v.malevanniy@ira-labs.com
ORCID iD: 0009-0005-8804-2102
Russian Federation, Moscow
Yury S. Esakov
Moscow City Clinical Oncological Hospital № 1
Email: lungsurgery@mail.ru
ORCID iD: 0000-0002-5933-924X
SPIN-code: 8424-0756
MD, Cand. Sci. (Medicine)
Russian Federation, Moscow
Zurab G. Tukvadze
Moscow City Clinical Oncological Hospital № 1
Email: tukvadze.z.med@gmail.com
ORCID iD: 0000-0002-4550-6107
Russian Federation, Moscow
Andrey O. Nefedov
Saint-Petersburg State Research Institute of Phthisiopulmonology
Email: herurg78@mail.ru
ORCID iD: 0000-0001-6228-182X
SPIN-code: 2365-9458
MD, Cand. Sci. (Medicine)
Russian Federation, Saint Petersburg
Piotr K. Yablonskii
Saint-Petersburg State Research Institute of Phthisiopulmonology
Email: glhirurgb2@mail.ru
ORCID iD: 0000-0003-4385-9643
SPIN-code: 3433-2624
MD, Dr. Sci. (Medicine), Professor
Russian Federation, Saint Petersburg
Pavel V. Gavrilov
Saint-Petersburg State Research Institute of Phthisiopulmonology
Email: spbniifrentgen@mail.ru
ORCID iD: 0000-0003-3251-4084
SPIN-code: 7824-5374
MD, Cand. Sci. (Med.)
Russian Federation, Saint Petersburg
Vadim V. Kozlov
Novosibirsk Regional Clinical Oncology Dispensary
Email: vadimkozlov80@mail.ru
ORCID iD: 0000-0003-3211-5139
SPIN-code: 8045-4286
MD, Cand. Sci. (Medicine)
Russian Federation, Novosibirsk
Mariya E. Blokhina
AstraZeneca Pharmaceuticals LLC
Email: mariya.blokhina@astrazeneca.com
ORCID iD: 0009-0002-9008-9485
MD
Russian Federation, Moscow
Elena A. Nalivkina
AstraZeneca Pharmaceuticals LLC
Email: elena.nalivkina@astrazeneca.com
ORCID iD: 0009-0003-5412-9643
Russian Federation, Moscow
Victor A. Gombolevskiy
IRA Labs; Artificial Intelligence Research Institute
Email: gombolevskii@gmail.com
ORCID iD: 0000-0003-1816-1315
SPIN-code: 6810-3279
MD, Cand. Sci. (Med.)
Russian Federation, Moscow; Moscow
Yuriy A. Vasilev
Research and Practical Clinical Center for Diagnostics and Telemedicine Technologies
Email: VasilevYA1@zdrav.mos.ru
ORCID iD: 0000-0002-5283-5961
SPIN-code: 4458-5608
MD, Dr. Sci. (Medicine)
Russian Federation, Moscow
Mariya N. Dugova
IRA Labs
Email: m.dugova@ira-labs.com
ORCID iD: 0009-0004-5586-8015
MD
Russian Federation, Moscow
Valeria Yu. Chernina
IRA Labs
Email: v.chernina@ira-labs.com
ORCID iD: 0000-0002-0302-293X
SPIN-code: 8896-8051
MD
Russian Federation, Moscow
Olga V. Omelyanskaya
Research and Practical Clinical Center for Diagnostics and Telemedicine Technologies
Email: OmelyanskayaOV@zdrav.mos.ru
ORCID iD: 0000-0002-0245-4431
SPIN-code: 8948-6152
Russian Federation, Moscow
Roman V. Reshetnikov
Research and Practical Clinical Center for Diagnostics and Telemedicine Technologies
Email: reshetnikov@fbb.msu.ru
ORCID iD: 0000-0002-9661-0254
SPIN-code: 8592-0558
Cand. Sci. (Physics and Mathematics)
Russian Federation, Moscow
Ivan A. Blokhin
Research and Practical Clinical Center for Diagnostics and Telemedicine Technologies
Email: BlokhinIA@zdrav.mos.ru
ORCID iD: 0000-0002-2681-9378
SPIN-code: 3306-1387
MD, Cand. Sci. (Medicine)
Russian Federation, Moscow
Mikhail G. Belyaev
IRA Labs
Email: belyaevmichel@gmail.com
ORCID iD: 0000-0001-9906-6453
SPIN-code: 2406-1772
Cand. Sci. (Physics and Mathematics)
Russian Federation, Moscow
References
- Thandra KCh, Barsouk A, Saginala K, et al. Epidemiology of lung cancer. Contemporary Oncology. 2021;25(1):45–52. doi: 10.5114/wo.2021.103829
- Goldstraw P, Chansky K, Crowley J, et al. The IASLC lung cancer staging project: Proposals for revision of the TNM stage groupings in the forthcoming (Eighth) edition of the TNM classification for lung cancer. J Thorac Oncol. 2016;11(1):39–51. doi: 10.1016/j.jtho.2015.09.009
- Tanoue LT, Tanner NT, Gould MK, Silvestri GA. Lung cancer screening. Am J Respir Crit Care Med. 2015;191(1):19–33. doi: 10.1164/rccm.201410-1777CI
- Ettinger DS, Wood DE, Aggarwal C, et al. NCCN guidelines insights: Non small cell lung cancer, version 1.2020. J Natl Compr Canc Netw. 2019;17(12):1464–1472. doi: 10.6004/jnccn.2019.0059
- Planchard D, Popat S, Kerr K, et al. Metastatic non small cell lung cancer: ESMO clinical practice guidelines for diagnosis, treatment and follow up [published correction appears in Ann Oncol. 2019;30(5):863–870. doi: 10.1093/annonc/mdy474]. Ann Oncol. 2018;29(Suppl 4):iv192–iv237. doi: 10.1093/annonc/mdy275
- Heleno B, Siersma V, Brodersen J. Estimation of overdiagnosis of lung cancer in low dose computed tomography screening: A secondary analysis of the Danish lung cancer screening trial. JAMA Intern Med. 2018;178(10):1420–1422. doi: 10.1001/jamainternmed.2018.3056
- Lopes Pegna A, Picozzi G, Falaschi F, et al. Four year results of low dose CT screening and nodule management in the ITALUNG trial. J Thorac Oncol. 2013;8(7):866–875. doi: 10.1097/JTO.0b013e31828f68d6
- Infante M, Cavuto S, Lutman FR, et al. Long term follow up results of the DANTE trial, a randomized study of lung cancer screening with spiral computed tomography. Am J Respir Crit Care Med. 2015;191(10):1166–1175. doi: 10.1164/rccm.201408-1475OC
- De Koning H, van der Aalst C, de Jong P. Reduced lung cancer mortality with volume CT screening in a randomized trial. N Engl J Med. 2020;382(6):503–513. doi: 10.1056/NEJMoa1911793
- Pastorino U, Silva M, Sestini S, et al. Prolonged lung cancer screening reduced 10-year mortality in the MILD trial: new confirmation of lung cancer screening efficacy. Ann Oncol. 2019;30(10):1672. doi: 10.1093/annonc/mdz169
- Baldwin DR, Duffy SW, Wald NJ, et al. UK Lung Screen (UKLS) nodule management protocol: modelling of a single screen randomised controlled trial of low dose CT screening for lung cancer. Thorax. 2011;66(4):308–313. doi: 10.1136/thx.2010.152066
- Detterbeck FC, Boffa DJ, Kim AW, Tanoue LT. The eighth edition lung cancer stage classification. Chest. 2017;151(1):193–203. doi: 10.1016/j.chest.2016.10.010
- Nakajima T, Yasufuku K, Yoshino I. Current status and perspective of EBUS-TBNA. Gen Thorac Cardiovasc Surg. 2013;61(7):390–396. doi: 10.1007/s11748-013-0224-6
- Hartert M, Tripsky J, Huertgen M. Video-assisted mediastinoscopic lymphadenectomy (VAMLA) for staging & treatment of non small cell lung cancer (NSCLC). Mediastinum. 2020;4:3. doi: 10.21037/med.2019.09.06
- Ettinger DS, Wood DE, Aisner DL, et al. Non small cell lung cancer, version 5.2017, NCCN clinical practice guidelines in oncology. J Natl Compr Canc Netw. 2017;15(4):504–535. doi: 10.6004/jnccn.2017.0050
- Roberts PF, Follette DM, von Haag D, et al. Factors associated with false positive staging of lung cancer by positron emission tomography. Ann Thorac Surg. 2000;70(4):1154–1160. doi: 10.1016/s0003-4975(00)01769-0
- Kanzaki R, Higashiyama M, Fujiwara A, et al. Occult mediastinal lymph node metastasis in NSCLC patients diagnosed as clinical N0-1 by preoperative integrated FDG-PET/CT and CT: risk factors, pattern, and histopathological study. Lung Cancer. 2011;71(3):333–337. doi: 10.1016/j.lungcan.2010.06.008
- Verduzco-Aguirre HC, Lopes G, Soto Perez De Celis E. Implementation of diagnostic resources for cancer in developing countries: a focus on PET/CT. Ecancermedicalscience. 2019;13:ed87. doi: 10.3332/ecancer.2019.ed87
- LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436–444. doi: 10.1038/nature14539
- Guo D, Ye X, Ge J, et al. Deepstationing: thoracic lymph node station parsing in CT scans using anatomical context encoding and key organ auto search. In: International Conference on Medical Image Computing and Computer Assisted Intervention; 2021 September 27–October 1; Strasbourg. Available from: https://miccai2021.org/openaccess/paperlinks/2021/09/01/140-Paper0015.html
- Iuga AI, Carolus H, Höink AJ, et al. Automated detection and segmentation of thoracic lymph nodes from CT using 3D foveal fully convolutional neural networks. BMC Med Imaging. 2021;21(1):69. doi: 10.1186/s12880-021-00599-z
- Iuga AI, Lossau T, Caldeira LL, et al. Automated mapping and N-staging of thoracic lymph nodes in contrast-enhanced CT scans of the chest using a fully convolutional neural network. Eur J Radiol. 2021;139:109718. doi: 10.1016/j.ejrad.2021.109718
- Zhong Y, Yuan M, Zhang T, et al. Radiomics approach to prediction of occult mediastinal lymph node metastasis of lung adenocarcinoma. AJR Am J Roentgenol. 2018;211(1):109–113. doi: 10.2214/AJR.17.19074
- Liu Y, Kim J, Balagurunathan Y, et al. Prediction of pathological nodal involvement by CT-based Radiomic features of the primary tumor in patients with clinically node negative peripheral lung adenocarcinomas. Med Phys. 2018;45(6):2518–2526. doi: 10.1002/mp.12901
- Cong M, Yao H, Liu H, et al. Development and evaluation of a venous computed tomography radiomics model to predict lymph node metastasis from non small cell lung cancer. Medicine (Baltimore). 2020;99(18):e20074. doi: 10.1097/MD.0000000000020074
- Gu P, Zhao YZ, Jiang LY, et al. Endobronchial ultrasound guided transbronchial needle aspiration for staging of lung cancer: a systematic review and meta analysis. Eur J Cancer. 2009;45(8):1389–1396. doi: 10.1016/j.ejca.2008.11.043
- Brown G, Richards CJ, Bourne MW, et al. Morphologic predictors of lymph node status in rectal cancer with use of high spatial resolution MR imaging with histopathologic comparison. Radiology. 2003;227(2):371–377. doi: 10.1148/radiol.2272011747
- Som PM. Lymph nodes of the neck. Radiology. 1987;165(3):593–600. doi: 10.1148/radiology.165.3.3317494
- Curtin HD, Ishwaran H, Mancuso AA, et al. Comparison of CT and MR imaging in staging of neck metastases. Radiology. 1998;207(1):123–130. doi: 10.1148/radiology.207.1.9530307
- Loch FN, Asbach P, Haas M, et al. Accuracy of various criteria for lymph node staging in ductal adenocarcinoma of the pancreatic head by computed tomography and magnetic resonance imaging. World J Surg Oncol. 2020;18(1):213. doi: 10.1186/s12957-020-01951-3
- Elsholtz FH, Asbach P, Haas M, et al. Introducing the node reporting and data system 1.0 (Node-RADS): a concept for standardized assessment of lymph nodes in cancer [published correction appears in Eur Radiol. 2021;31(9):7217. doi: 10.1007/s00330-021-07795-z]. Eur Radiol. 2021;31(8):6116–6124. doi: 10.1007/s00330-020-07572-4
- Ceylan N, Doğan S, Kocaçelebi K, et al. Contrast enhanced CT versus integrated PET-CT in pre-operative nodal staging of non-small cell lung cancer. Diagn Interv Radiol. 2012;18(5):435–440. doi: 10.4261/1305-3825.DIR.5100-11.2
- Kamnitsas K, Ledig C, Newcombe VF, et al. Efficient multi-scale 3D CNN with fully connected CRF for accurate brain lesion segmentation. Med Image Anal. 2017;36:61–78. doi: 10.1016/j.media.2016.10.004
- Çiçek Ö, Abdulkadir A, Lienkamp SS, et al. 3D U-net: learning dense volumetric segmentation from sparse annotation. In: Medical Image Computing and Computer Assisted Intervention (MICCAI 2016), Part II: 19th International Conference; 2016 October 17–21; Athens. P. 424–432.
- Milletari F, Navab N, Ahmadi SA. V-net: fully convolutional neural networks for volumetric medical image segmentation. In: 2016 Fourth international conference on 3D vision (3DV): proceedings article. 2016 October 25–28; California. P. 565–571. doi: 10.1109/3DV.2016.79
- Van Ginneken B, Armato SG, de Hoop B, et al. Comparing and combining algorithms for computer aided detection of pulmonary nodules in computed tomography scans: the ANODE09 study. Med Image Anal. 2010;14(6):707–722. doi: 10.1016/j.media.2010.05.005
- Bakas S, Reyes M, Jakab A, et al. Identifying the best machine learning algorithms for brain tumor segmentation, progression assessment, and overall survival prediction in the BRATS challenge. The international multimodal brain tumor segmentation (BraTS) challenge. 2018. doi: 10.48550/arXiv.1811.02629
- Silva F, Pereira T, Frade J, et al. Pre training autoencoder for lung nodule malignancy assessment using CT images. Applied Sciences. 2020;10(21):7837. doi: 10.3390/app10217837
- Dubost F, Adams H, Yilmaz P, et al. Weakly supervised object detection with 2D and 3D regression neural networks. Med Image Anal. 2020;65:101767. doi: 10.1016/j.media.2020.101767
- Rusch VW, Asamura H, Watanabe H, et al. The IASLC lung cancer staging project: a proposal for a new international lymph node map in the forthcoming seventh edition of the TNM classification for lung cancer. J Thorac Oncol. 2009;4(5):568–577. doi: 10.1097/JTO.0b013e3181a0d82e
- Ronneberger O, Fischer P, Brox T. U-net: convolutional networks for biomedical image segmentation. In: Medical Image Computing and Computer Assisted Intervention (MICCAI 2015): 18th International Conference; 2015 May; Munich. P. 234–241. doi: 10.48550/arXiv.1505.04597
- He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition; 2016 June 27–30; Las Vegas. P. 770–778. doi: 10.48550/arXiv.1512.03385
- Ioffe S, Szegedy Ch. Batch normalization: accelerating deep network training by reducing internal covariate shift. ArXiv. 2015;1. doi: 10.48550/arXiv.1502.03167
- Nair V, Hinton GE. Rectified linear units improve restricted Boltzmann machines. In: Conference: proceedings of the 27th International Conference on Machine Learning (ICML-10); 2010 June 21–24; Haifa. Available from: https://icml.cc/Conferences/2010/papers/432.pdf
- Nair V, Hinton GE. Rectified linear units improve restricted Boltzmann machines. In: Conference: proceedings of the 27th International Conference on Machine Learning (ICML-10), June 21–24, 2010. Haifa, Israel; 2010. P. 807–814.
- Roth HR, Lu L, Seff A, et al. A new 2.5D representation for lymph node detection using random sets of deep convolutional neural network observations. Med Image Comput Comput Assist Interv. 2014;17(1):520–527. doi: 10.1007/978-3-319-10404-1_65
- Goncharov M, Pisov M, Shevtsov A, et al. CT-based COVID-19 triage: deep multi-task learning improves joint identification and severity quantification. Med Image Anal. 2021;71:102054. doi: 10.1016/j.media.2021.102054