New vision of HookEfficientNet deep neural network: Intelligent histopathological recognition system of non-small cell lung cancer.
Efficient and precise diagnosis of non-small cell lung cancer (NSCLC) is critical for subsequent targeted therapy and immunotherapy. Since the advent of whole slide images (WSIs), the transition from traditional histopathology to digital pathology has spurred the application of convolutional neural networks (CNNs) in histopathological recognition and diagnosis. HookNet can make full use of macroscopic and microscopic information for pathological diagnosis, but it cannot integrate other high-performing CNN structures. The new HookEfficientNet combines the HookNet structure with EfficientNet, which performs well in general object recognition. Here, a high-precision artificial intelligence-guided histopathological recognition system was established with HookEfficientNet to provide a basis for the intelligent differential diagnosis of NSCLC.
A total of 216 WSIs of lung adenocarcinoma (LUAD) and 192 WSIs of lung squamous cell carcinoma (LUSC) were collected from the First Affiliated Hospital of Zhengzhou University. Deep learning methods based on HookEfficientNet, HookNet and EfficientNet B4-B6 were developed and compared with each other using area under the curve (AUC) and the Youden index. Temperature scaling was used to calibrate the heatmap and highlight the cancer region of interest. Four pathologists of different experience levels blindly reviewed 108 WSIs of LUAD and LUSC, and their diagnostic results were compared with those of the various deep learning models.
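As an illustration of the temperature scaling step mentioned above, the following is a minimal sketch and not the authors' implementation. It assumes the standard formulation, in which class logits are divided by a single scalar T before the softmax, and it assumes T is chosen by a simple grid search on held-out data; the abstract does not say how T was fitted. The logits, labels, and function names are hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax of logits / T, computed in a numerically stable way."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def nll(logit_rows, labels, temperature):
    """Mean negative log-likelihood of the true labels at a given temperature."""
    losses = [-math.log(softmax(row, temperature)[y])
              for row, y in zip(logit_rows, labels)]
    return sum(losses) / len(losses)

def fit_temperature(logit_rows, labels, candidates=None):
    """Pick the T on a grid that minimises validation NLL (grid search is an assumption)."""
    if candidates is None:
        candidates = [0.25 + 0.05 * i for i in range(96)]  # roughly 0.25 to 5.0
    return min(candidates, key=lambda t: nll(logit_rows, labels, t))

# Hypothetical validation logits for three classes (LUAD, LUSC, normal).
val_logits = [[4.0, 0.5, 0.1], [3.5, 3.0, 0.2], [0.2, 5.0, 0.3],
              [0.1, 0.4, 2.0], [2.5, 0.3, 3.0]]
val_labels = [0, 1, 1, 2, 0]

T = fit_temperature(val_logits, val_labels)
calibrated = softmax(val_logits[0], T)  # calibrated class probabilities for one patch
```

Dividing by a positive T does not change which class has the highest probability; it only softens or sharpens the confidence, which is why it can recalibrate a heatmap without altering the predicted labels.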
The HookEfficientNet model outperformed HookNet and EfficientNet B4-B6. After temperature scaling, the HookEfficientNet model achieved AUCs of 0.973, 0.980, and 0.989 and Youden index values of 0.863, 0.899, and 0.922 for LUAD, LUSC and normal lung tissue, respectively, in the testing set. The accuracy of the model was better than the average accuracy from experienced pathologists, and the model was superior to pathologists in the diagnosis of LUSC.
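For readers unfamiliar with the Youden index reported above, it is defined as sensitivity plus specificity minus 1. The sketch below computes it from confusion counts and, as an assumed usage, scans candidate thresholds on one-vs-rest scores to find the cut-off that maximises it. The scores and labels are made-up toy values, not data from the study.

```python
def youden_index(tp, fn, tn, fp):
    """Youden's J = sensitivity + specificity - 1."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1.0

def best_threshold(scores, labels):
    """Return (threshold, J) maximising J; labels use 1 for the positive class."""
    best = (None, -1.0)
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        j = youden_index(tp, fn, tn, fp)
        if j > best[1]:  # strict comparison keeps the lowest threshold on ties
            best = (t, j)
    return best

# Hypothetical one-vs-rest scores (e.g. probability of LUSC) and true labels.
scores = [0.95, 0.80, 0.70, 0.40, 0.30, 0.10]
labels = [1, 1, 0, 1, 0, 0]
threshold, j = best_threshold(scores, labels)
```

For a three-class problem such as LUAD, LUSC and normal tissue, one value of J per class would be obtained by treating each class in turn as the positive class.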
HookEfficientNet can effectively recognize LUAD and LUSC with performance superior to that of senior pathologists, especially for LUSC. The model has great potential to facilitate the application of deep learning-assisted histopathological diagnosis for LUAD and LUSC in the future.
Yuan H, Kido T, Hirata M, Ueno K, Imai Y, Chen K, Ren W, Yang L, Chen K, Qu L, Wu Y, ...
《-》
Automatic discrimination between neuroendocrine carcinomas and grade 3 neuroendocrine tumors by deep learning of H&E images.
Neuroendocrine neoplasms (NENs) arise from diffuse neuroendocrine cells and are categorized as either well-differentiated and less proliferative Neuroendocrine Tumors (NETs), divided into low (G1), middle (G2), and high grades (G3), or poorly differentiated, and more proliferative Neuroendocrine Carcinomas (NECs). Low-grade NENs typically necessitate surgical intervention, whereas high-grade ones often require chemotherapy. However, low-grade NENs may exhibit aggressive behavior. Therefore, it is crucial to precisely refine the diagnosis of NENs. This refinement is achievable when differentiation/non-differentiation is evident or when the Ki-67 or mitosis index is low. The challenge arises in cases of morphologically undifferentiated instances with a high Ki-67 percentage and/or high mitotic index. To address this challenge, we developed a Deep Learning (DL) system named NEToC, designed to differentiate between NETs and NECs using exclusively morphological information from immunohistochemistry images, without relying on Ki-67 or mitosis assessments. NEToC was developed using 95 NEN cases from the period 2015 to 2018 at Parc Tauli Hospital in Spain, comprising 588 images. Implemented as a Graphical User Interface (GUI) system, NEToC is intended for deployment in pathological departments of hospitals to perform federated supervision. We tested the performance of NEToC with 119 images that were not used during the Artificial Neural Network (ANN) training phase, and evaluated its robustness across various resolutions: 64 × 64, 128 × 128, 256 × 256, and 512 × 512 pixels. The achieved accuracies for these resolutions were 74 %, 98 %, 98 %, and 100 %, respectively, for an underrepresented NET G3 experiment, and 66 %, 89 %, 95 % and 94 % for a represented NET G3 experiment. Based on several measured performance metrics, the optimal resolution appears to be between 128 × 128 and 256 × 256 pixels, considering computational resources and accuracy requirements. 
However, we found that the 256 × 256-pixel resolution is more robust for classifying classes that are underrepresented in the learning phase. These results imply that the information to discriminate between NECs and Grade 3 NETs needs to be resolved in regions with a pixel resolution of no more than 4 μm/pixel. Most of the misclassifications were false negatives, where NET G1-type images were erroneously classified as NEC-type. Our results demonstrate that a DL-based diagnostic algorithm provides a more accurate diagnosis in NEN cases where physicians face challenges. NEToC has been initially trained with and used to classify gastrointestinal NENs. Since the NEN morphology does not change among the different organs, the use of NEToC can be extrapolated to NENs from different organs. NEToC facilitates federated supervision, allowing pathologists to collect interchangeable files based on NEToC classification predictions. NEToC is an easy-to-use, adaptable software that integrates multiple ANNs to improve the standardization and accuracy of NEN diagnosis, opening up possibilities for combining DL and histological diagnosis in federated supervision systems. A future goal is to classify not only NETs, but also the three-tier system (NET G1, NET G2, and NET G3) based solely on tissue differentiation information.
Arrieta Legorburu A, Bohoyo Bengoetxea J, Gracia C, Ferreres JC, Bella-Cueto MR, Araúzo-Bravo MJ, ...
《-》
Comparison of Two Modern Survival Prediction Tools, SORG-MLA and METSSS, in Patients With Symptomatic Long-bone Metastases Who Underwent Local Treatment With Surgery Followed by Radiotherapy and With Radiotherapy Alone.
Survival estimation for patients with symptomatic skeletal metastases ideally should be made before a type of local treatment has already been determined. Currently available survival prediction tools, however, were generated using data from patients treated either operatively or with local radiation alone, raising concerns about whether they would generalize well to all patients presenting for assessment. The Skeletal Oncology Research Group machine-learning algorithm (SORG-MLA), trained with institution-based data of surgically treated patients, and the Metastases location, Elderly, Tumor primary, Sex, Sickness/comorbidity, and Site of radiotherapy model (METSSS), trained with registry-based data of patients treated with radiotherapy alone, are two of the most recently developed survival prediction models, but they have not been tested on patients whose local treatment strategy is not yet decided.
(1) Which of these two survival prediction models performed better in a mixed cohort made up both of patients who received local treatment with surgery followed by radiotherapy and who had radiation alone for symptomatic bone metastases? (2) Which model performed better among patients whose local treatment consisted of only palliative radiotherapy? (3) Are laboratory values used by SORG-MLA, which are not included in METSSS, independently associated with survival after controlling for predictions made by METSSS?
Between 2010 and 2018, we provided local treatment for 2113 adult patients with skeletal metastases in the extremities at an urban tertiary referral academic medical center using one of two strategies: (1) surgery followed by postoperative radiotherapy or (2) palliative radiotherapy alone. Every patient's survivorship status was ascertained either by their medical records or the national death registry from the Taiwanese National Health Insurance Administration. After applying a priori designated exclusion criteria, 91% (1920) were analyzed here. Among them, 48% (920) of the patients were female, and the median (IQR) age was 62 years (53 to 70 years). Lung was the most common primary tumor site (41% [782]), and 59% (1128) of patients had other skeletal metastases in addition to the treated lesion(s). In general, the indications for surgery were the presence of a complete pathologic fracture or an impending pathologic fracture, defined as having a Mirels score of ≥ 9, in patients with an American Society of Anesthesiologists (ASA) classification of less than or equal to IV and who were considered fit for surgery. The indications for radiotherapy were relief of pain, local tumor control, prevention of skeletal-related events, and any combination of the above. In all, 84% (1610) of the patients received palliative radiotherapy alone as local treatment for the target lesion(s), and 16% (310) underwent surgery followed by postoperative radiotherapy. Neither METSSS nor SORG-MLA was used at the point of care to aid clinical decision-making during the treatment period. Survival was retrospectively estimated by these two models to test their potential for providing survival probabilities. We first compared SORG-MLA with METSSS in the entire population. Then, we repeated the comparison in patients who received local treatment with palliative radiation alone.
We assessed model performance by area under the receiver operating characteristic curve (AUROC), calibration analysis, Brier score, and decision curve analysis (DCA). The AUROC measures discrimination, which is the ability to distinguish patients with the event of interest (such as death at a particular time point) from those without. AUROC typically ranges from 0.5 to 1.0, with 0.5 indicating random guessing and 1.0 a perfect prediction, and in general, an AUROC of ≥ 0.7 indicates adequate discrimination for clinical use. Calibration refers to the agreement between the predicted outcomes (in this case, survival probabilities) and the actual outcomes, with a perfect calibration curve having an intercept of 0 and a slope of 1. A positive intercept indicates that the actual survival is generally underestimated by the prediction model, and a negative intercept suggests the opposite (overestimation). When comparing models, an intercept closer to 0 typically indicates better calibration. Calibration can also be summarized as log(O:E), the logarithm scale of the ratio of observed (O) to expected (E) survivors. A log(O:E) > 0 signals an underestimation (the observed survival is greater than the predicted survival); and a log(O:E) < 0 indicates the opposite (the observed survival is lower than the predicted survival). A model with a log(O:E) closer to 0 is generally considered better calibrated. The Brier score is the mean squared difference between the model predictions and the observed outcomes, and it ranges from 0 (best prediction) to 1 (worst prediction). The Brier score captures both discrimination and calibration, and it is considered a measure of overall model performance. In Brier score analysis, the "null model" assigns a predicted probability equal to the prevalence of the outcome and represents a model that adds no new information. A prediction model should achieve a Brier score at least lower than the null-model Brier score to be considered as useful. 
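The metrics defined above can be sketched in a few lines. The following is an illustrative, dependency-free example, not the study's analysis code: it computes a pairwise AUROC, the Brier score, the null-model Brier score, and log(O:E) for predicted survival probabilities. The six predictions and outcomes are invented toy values, and all function names are hypothetical.

```python
import math

def auroc(pred, outcome):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    pos = [p for p, y in zip(pred, outcome) if y == 1]
    neg = [p for p, y in zip(pred, outcome) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

def brier_score(pred, outcome):
    """Mean squared difference between predicted probability and observed outcome."""
    return sum((p - y) ** 2 for p, y in zip(pred, outcome)) / len(pred)

def null_brier(outcome):
    """Brier score of a model that always predicts the outcome prevalence."""
    prevalence = sum(outcome) / len(outcome)
    return brier_score([prevalence] * len(outcome), outcome)

def log_observed_expected(pred_survival, survived):
    """log(O:E): observed survivors over the sum of predicted survival probabilities."""
    return math.log(sum(survived) / sum(pred_survival))

# Toy data: predicted 1-year survival probability and whether the patient survived.
pred_survival = [0.90, 0.80, 0.30, 0.40, 0.35, 0.10]
survived = [1, 1, 0, 0, 1, 0]

auc = auroc(pred_survival, survived)
brier = brier_score(pred_survival, survived)
brier_null = null_brier(survived)
log_oe = log_observed_expected(pred_survival, survived)
```

In this toy example the Brier score is below the null-model score, so the predictions add information, and log(O:E) is slightly above 0, which corresponds to a mild underestimation of survival in the sense described above.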
The DCA was developed as a method to determine whether using a model to inform treatment decisions would do more good than harm. It plots the net benefit of making decisions based on the model's predictions across all possible risk thresholds (or cost-to-benefit ratios) in relation to the two default strategies of treating all or no patients. The care provider can decide on an acceptable risk threshold for the proposed treatment in an individual and assess the corresponding net benefit to determine whether consulting with the model is superior to adopting the default strategies. Finally, we examined whether laboratory data, which were not included in the METSSS model, would have been independently associated with survival after controlling for the METSSS model's predictions by using the multivariable logistic and Cox proportional hazards regression analyses.
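The net benefit plotted in a decision curve can be written down compactly. The sketch below assumes the commonly used definition, net benefit = TP/n − (FP/n) × pt/(1 − pt) at risk threshold pt, with "treat none" fixed at 0; the abstract does not spell out the formula, so this should be read as an illustration rather than the authors' code. The predictions and outcomes are hypothetical.

```python
def net_benefit(pred, outcome, threshold):
    """Net benefit of acting on predictions at or above the risk threshold."""
    n = len(pred)
    tp = sum(1 for p, y in zip(pred, outcome) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(pred, outcome) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)

def net_benefit_treat_all(outcome, threshold):
    """Net benefit of the default strategy that treats every patient."""
    prevalence = sum(outcome) / len(outcome)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)

# Toy data: predicted probability of the event and the observed event (1 = occurred).
pred = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.6]
outcome = [1, 1, 0, 1, 0, 0, 0, 1]

nb_model = net_benefit(pred, outcome, 0.5)
nb_all = net_benefit_treat_all(outcome, 0.5)
nb_none = 0.0  # treating no one yields neither benefit nor harm

# A decision curve repeats this over a range of thresholds, e.g. 0.2 to 0.9.
curve = [(t / 10, net_benefit(pred, outcome, t / 10)) for t in range(2, 10)]
```

A model is useful at a given threshold when its net benefit exceeds both default strategies, which is the comparison the decision curve makes visually across thresholds.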
Between the two models, only SORG-MLA achieved adequate discrimination (an AUROC of > 0.7) in the entire cohort (of patients treated operatively or with radiation alone) and in the subgroup of patients treated with palliative radiotherapy alone. SORG-MLA outperformed METSSS by a wide margin on discrimination, calibration, and Brier score analyses in not only the entire cohort but also the subgroup of patients whose local treatment consisted of radiotherapy alone. In both the entire cohort and the subgroup, DCA demonstrated that SORG-MLA provided more net benefit compared with the two default strategies (of treating all or no patients) and compared with METSSS when risk thresholds ranged from 0.2 to 0.9 at both 90 days and 1 year, indicating that using SORG-MLA as a decision-making aid was beneficial when a patient's individualized risk threshold for opting for treatment was 0.2 to 0.9. Higher albumin, lower alkaline phosphatase, lower calcium, higher hemoglobin, lower international normalized ratio, higher lymphocytes, lower neutrophils, lower neutrophil-to-lymphocyte ratio, lower platelet-to-lymphocyte ratio, higher sodium, and lower white blood cells were independently associated with better 1-year and overall survival after adjusting for the predictions made by METSSS.
Based on these discoveries, clinicians might choose to consult SORG-MLA instead of METSSS for survival estimation in patients with long-bone metastases presenting for evaluation of local treatment. Basing a treatment decision on the predictions of SORG-MLA could be beneficial when a patient's individualized risk threshold for opting to undergo a particular treatment strategy ranged from 0.2 to 0.9. Future studies might investigate relevant laboratory items when constructing or refining a survival estimation model because these data demonstrated prognostic value independent of the predictions of the METSSS model, and future studies might also seek to keep these models up to date using data from diverse, contemporary patients undergoing both modern operative and nonoperative treatments.
Level III, diagnostic study.
Lee CC, Chen CW, Yen HK, Lin YP, Lai CY, Wang JL, Groot OQ, Janssen SJ, Schwab JH, Hsu FM, Lin WH, ...
《-》
State-of-the-Art of Breast Cancer Diagnosis in Medical Images via Convolutional Neural Networks (CNNs).
Early detection of breast cancer is crucial for a better prognosis. Various studies have been conducted where tumor lesions are detected and localized on images. This is a narrative review where the studies reviewed are related to five different image modalities: histopathological, mammogram, magnetic resonance imaging (MRI), ultrasound, and computed tomography (CT) images, making it different from other review studies where fewer image modalities are reviewed. The goal is to have the necessary information, such as pre-processing techniques and CNN-based diagnosis techniques for the five modalities, readily available in one place for future studies. Each modality has pros and cons: mammograms might give a high false positive rate for radiographically dense breasts, ultrasound has low soft tissue contrast, which can result in false detections at an early stage, and MRI provides a three-dimensional volumetric image but is expensive and cannot be used as a routine test. Various studies were manually reviewed using particular inclusion and exclusion criteria; as a result, 91 recent studies that classify and detect tumor lesions on breast cancer images from 2017 to 2022 related to the five image modalities were included. For histopathological images, the maximum accuracy achieved was around 99 %, and the maximum sensitivity achieved was 97.29 % by using DenseNet, ResNet34, and ResNet50 architecture. For mammogram images, the maximum accuracy achieved was 96.52 % using a customized CNN architecture. For MRI, the maximum accuracy achieved was 98.33 % using customized CNN architecture. For ultrasound, the maximum accuracy achieved was around 99 % by using DarkNet-53, ResNet-50, G-CNN, and VGG. For CT, the maximum sensitivity achieved was 96 % by using Xception architecture.
Histopathological and ultrasound images achieved a higher accuracy of around 99 % than the other modalities by using ResNet34, ResNet50, DarkNet-53, G-CNN, and VGG, for any of the following reasons: use of pre-trained architectures with pre-processing techniques, use of modified architectures with pre-processing techniques, use of two-stage CNN, and a higher number of studies available for Artificial Intelligence (AI)/machine learning (ML) researchers to reference. One of the gaps we found is that only a single image modality is used for CNN-based diagnosis; in the future, a multiple image modality approach can be used to design a CNN architecture with higher accuracy.
Harrison P, Hasan R, Park K
《-》