-
Real-time artificial intelligence for detection of upper gastrointestinal cancer by endoscopy: a multicentre, case-control, diagnostic study.
Upper gastrointestinal cancers (including oesophageal cancer and gastric cancer) are among the most common cancers worldwide. Artificial intelligence platforms using deep learning algorithms have made remarkable progress in medical imaging, but their application to upper gastrointestinal cancers has been limited. We aimed to develop and validate the Gastrointestinal Artificial Intelligence Diagnostic System (GRAIDS) for the diagnosis of upper gastrointestinal cancers through analysis of imaging data from clinical endoscopies.
This multicentre, case-control, diagnostic study was done in six hospitals of different tiers (ie, municipal, provincial, and national) in China. The images of consecutive participants, aged 18 years or older, who had not had a previous endoscopy were retrieved from all participating hospitals. All patients with upper gastrointestinal cancer lesions (including oesophageal cancer and gastric cancer) that were histologically proven malignancies were eligible for this study. Only images taken with standard white light were deemed eligible. The images from Sun Yat-sen University Cancer Center were randomly assigned (8:1:1) to the training and intrinsic verification datasets for developing GRAIDS, and to the internal validation dataset for evaluating its performance. Diagnostic performance was further assessed with a prospective validation set from Sun Yat-sen University Cancer Center (a national hospital) and additional external validation sets from five primary care hospitals. The performance of GRAIDS was also compared with that of endoscopists with three degrees of expertise: expert, competent, and trainee. The diagnostic accuracy, sensitivity, specificity, positive predictive value, and negative predictive value of GRAIDS and the endoscopists in identifying cancerous lesions were evaluated, with 95% CIs calculated using the Clopper-Pearson method.
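As an illustration of the interval estimate named above, a minimal Python sketch of the Clopper-Pearson (exact binomial) method, assuming SciPy is available; the counts are hypothetical and not taken from the study.

from scipy.stats import beta

def clopper_pearson(k, n, alpha=0.05):
    # Exact two-sided CI for a binomial proportion k/n (Clopper-Pearson).
    lower = 0.0 if k == 0 else beta.ppf(alpha / 2, k, n - k + 1)
    upper = 1.0 if k == n else beta.ppf(1 - alpha / 2, k + 1, n - k)
    return lower, upper

# Hypothetical counts: 942 cancerous lesions flagged out of 1000 (sensitivity 0.942).
print(clopper_pearson(942, 1000))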
1 036 496 endoscopy images from 84 424 individuals were used to develop and test GRAIDS. The diagnostic accuracy in identifying upper gastrointestinal cancers was 0·955 (95% CI 0·952-0·957) in the internal validation set, 0·927 (0·925-0·929) in the prospective validation set, and ranged from 0·915 (0·913-0·917) to 0·977 (0·977-0·978) in the five external validation sets. GRAIDS achieved diagnostic sensitivity similar to that of the expert endoscopist (0·942 [95% CI 0·924-0·957] vs 0·945 [0·927-0·959]; p=0·692) and superior sensitivity compared with competent (0·858 [0·832-0·880], p<0·0001) and trainee (0·722 [0·691-0·752], p<0·0001) endoscopists. The positive predictive value was 0·814 (95% CI 0·788-0·838) for GRAIDS, 0·932 (0·913-0·948) for the expert endoscopist, 0·974 (0·960-0·984) for the competent endoscopist, and 0·824 (0·795-0·850) for the trainee endoscopist. The negative predictive value was 0·978 (95% CI 0·971-0·984) for GRAIDS, 0·980 (0·974-0·985) for the expert endoscopist, 0·951 (0·942-0·959) for the competent endoscopist, and 0·904 (0·893-0·916) for the trainee endoscopist.
GRAIDS achieved high diagnostic accuracy in detecting upper gastrointestinal cancers, with sensitivity similar to that of expert endoscopists and superior to that of non-expert endoscopists. This system could help community-based hospitals improve the effectiveness of their upper gastrointestinal cancer diagnoses.
The National Key R&D Program of China, the Natural Science Foundation of Guangdong Province, the Science and Technology Program of Guangdong, the Science and Technology Program of Guangzhou, and the Fundamental Research Funds for the Central Universities.
Luo H, Xu G, Li C, He L, Luo L, Wang Z, Jing B, Deng Y, Jin Y, Li Y, Li B, Tan W, He C, Seeruttun SR, Wu Q, Huang J, Huang DW, Chen B, Lin SB, Chen QM, Yuan CM, Chen HX, Pu HY, Zhou F, He Y, Xu RH
... -
《-》
-
Highly sensitive detection platform-based diagnosis of oesophageal squamous cell carcinoma in China: a multicentre, case-control, diagnostic study.
Early detection and screening of oesophageal squamous cell carcinoma rely on upper gastrointestinal endoscopy, which is not feasible for population-wide implementation. Tumour marker-based blood tests offer a potential alternative. However, the sensitivity of current clinical protein detection technologies is inadequate for identifying low-abundance circulating tumour biomarkers, leading to poor discrimination between individuals with and without cancer. We aimed to develop a highly sensitive blood test tool to improve detection of oesophageal squamous cell carcinoma.
We designed a detection platform named SENSORS and validated its effectiveness by comparing its performance in detecting the selected serological biomarkers MMP13 and SCC against ELISA and electrochemiluminescence immunoassay (ECLIA). We then developed a SENSORS-based oesophageal squamous cell carcinoma adjunct diagnostic system (with potential applications in screening and triage under clinical supervision) to classify individuals with oesophageal squamous cell carcinoma and healthy controls in a retrospective study including participants (cohort I) from Sun Yat-sen University Cancer Center (SYSUCC; Guangzhou, China), Henan Cancer Hospital (HNCH; Zhengzhou, China), and Cancer Hospital of Shantou University Medical College (CHSUMC; Shantou, China). The inclusion criteria were age 18 years or older, pathologically confirmed primary oesophageal squamous cell carcinoma, and no cancer treatment before serum sample collection. Participants without oesophageal-related diseases were recruited from the health examination department as the control group. The SENSORS-based diagnostic system is based on a multivariable logistic regression model that takes the detection values of SENSORS as input and outputs a risk score for the predicted likelihood of oesophageal squamous cell carcinoma. We further evaluated the clinical utility of the system in an independent prospective multicentre study with different participants selected from the same three institutions. Patients with newly diagnosed oesophageal-related diseases and no previous cancer treatment were enrolled. The inclusion criteria for healthy controls were no obvious abnormalities in routine blood and tumour marker tests, no oesophageal-associated diseases, and no history of cancer. Finally, we assessed whether classification could be improved by integrating machine-learning algorithms with the system, combining baseline clinical characteristics, epidemiological risk factors, and serological tumour marker concentrations. Retrospective SYSUCC cohort I (randomly assigned [7:3] to a training set and an internal validation set) and three prospective validation sets (SYSUCC cohort II [internal validation], HNCH cohort II [external validation], and CHSUMC cohort II [external validation]) were used in this step. Six machine-learning algorithms were compared (least absolute shrinkage and selection operator [LASSO] regression, ridge regression, random forest, logistic regression, support vector machine, and neural network), and the best-performing algorithm was chosen as the final prediction model. Performance of SENSORS and the SENSORS-based diagnostic system was primarily assessed using accuracy, sensitivity, specificity, and area under the receiver operating characteristic curve (AUC).
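As a hedged sketch of the modelling step described above, the following Python code fits a two-marker logistic regression risk score on synthetic data with a 7:3 split; the stand-ins for MMP13 and SCC readings, the data, and all settings are assumptions for illustration, not the authors' implementation.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 1000
X = rng.lognormal(mean=1.0, sigma=0.5, size=(n, 2))        # stand-ins for MMP13 and SCC readings
y = (X.sum(axis=1) + rng.normal(0, 1, n) > 6).astype(int)  # synthetic case-control labels

# 7:3 split, mirroring the cohort I design described above
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
model = LogisticRegression().fit(X_tr, y_tr)

risk = model.predict_proba(X_te)[:, 1]   # risk score: predicted likelihood of disease
print("AUC:", roc_auc_score(y_te, risk))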
Between Oct 1, 2017, and April 30, 2020, 1051 participants were included in the retrospective study. In the prospective diagnostic study, 924 participants were included from April 2, 2022, to Feb 2, 2023. Compared with ELISA (108·90 pg/mL) and ECLIA (41·79 pg/mL), SENSORS (243·03 fg/mL) achieved 448-fold and 172-fold improvements in detection limit, respectively. In the three retrospective validation sets, the SENSORS-based diagnostic system achieved AUCs of 0·95 (95% CI 0·90-0·99) in the SYSUCC internal validation set, 0·93 (0·89-0·97) in the HNCH external validation set, and 0·98 (0·97-1·00) in the CHSUMC external validation set, sensitivities of 87·1% (79·3-92·3), 98·6% (94·4-99·8), and 93·5% (88·1-96·7), and specificities of 88·9% (75·2-95·8), 74·6% (61·3-84·6), and 92·1% (81·7-97·0), respectively, successfully distinguishing between patients with oesophageal squamous cell carcinoma and healthy controls. Additionally, in the three prospective validation cohorts, it yielded sensitivities of 90·9% (95% CI 86·1-94·2) for SYSUCC, 84·8% (76·1-90·8) for HNCH, and 95·2% (85·6-98·7) for CHSUMC. Of the six machine-learning algorithms compared, the random forest model showed the best performance. A feature selection step identified the five features that contributed most to predictions (SCC, age, MMP13, CEA, and NSE), and a simplified random forest model using these five features further improved classification, achieving sensitivities of 98·2% (95% CI 93·2-99·7) in the internal validation set from retrospective SYSUCC cohort I, 94·1% (89·9-96·7) in SYSUCC prospective cohort II, 88·6% (80·5-93·7) in HNCH prospective cohort II, and 98·4% (90·2-99·9) in CHSUMC prospective cohort II.
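A minimal sketch of a simplified five-feature random forest of the kind reported above, on synthetic data; the feature values, label rule, and hyperparameters are assumptions for illustration only.

import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

features = ["SCC", "age", "MMP13", "CEA", "NSE"]   # the five selected features
rng = np.random.default_rng(1)
X = pd.DataFrame(rng.normal(size=(500, 5)), columns=features)  # synthetic z-scored values
y = (X["SCC"] + X["MMP13"] + 0.2 * X["age"] + rng.normal(size=500) > 0).astype(int)

clf = RandomForestClassifier(n_estimators=200, random_state=1).fit(X, y)
print(dict(zip(features, clf.feature_importances_.round(3))))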
The SENSORS system facilitates highly sensitive detection of oesophageal squamous cell carcinoma tumour biomarkers, overcoming the limitations of detecting low-abundance circulating proteins, and could substantially improve oesophageal squamous cell carcinoma diagnostics. This method could act as a minimally invasive screening tool, potentially reducing the need for unnecessary endoscopies.
The National Key R&D Program of China, the National Natural Science Foundation of China, and the Enterprises Joint Fund-Key Program of Guangdong Province.
Wang Y, Xing S, Xu YW, Xu QX, Ji MF, Peng YH, Wu YX, Wu M, Xue N, Zhang B, Xie SH, Zhu RD, Ou XY, Huang Q, Tian BY, Li HL, Jiang Y, Yao XB, Li JP, Ling L, Cao SM, Zhong Q, Liu WL, Zeng MS
... -
《The Lancet Digital Health》
-
Artificial intelligence-based model for lymph node metastases detection on whole slide images in bladder cancer: a retrospective, multicentre, diagnostic study.
Accurate lymph node staging is important for the diagnosis and treatment of patients with bladder cancer. We aimed to develop a lymph node metastases diagnostic model (LNMDM) on whole slide images and to assess the clinical effect of an artificial intelligence (AI)-assisted workflow.
In this retrospective, multicentre, diagnostic study in China, we included, for model development, consecutive patients with bladder cancer who had radical cystectomy and pelvic lymph node dissection and from whom whole slide images of lymph node sections were available. We excluded patients with concurrent non-bladder cancer or concurrent surgery, and we excluded low-quality images. Patients from two hospitals (Sun Yat-sen Memorial Hospital of Sun Yat-sen University and Zhujiang Hospital of Southern Medical University, Guangzhou, Guangdong, China) treated before a cutoff date were assigned to the training set, and those treated after that date to internal validation sets for each hospital. Patients from three other hospitals (the Third Affiliated Hospital of Sun Yat-sen University, Nanfang Hospital of Southern Medical University, and the Third Affiliated Hospital of Southern Medical University, Guangzhou, Guangdong, China) were included as external validation sets. A validation subset of challenging cases from the five validation sets was used to compare performance between the LNMDM and pathologists, and two other datasets (breast cancer from the CAMELYON16 dataset and prostate cancer from the Sun Yat-sen Memorial Hospital of Sun Yat-sen University) were collected for a multi-cancer test. The primary endpoint was diagnostic sensitivity in the four prespecified groups (ie, the five validation sets, a single-lymph-node test set, the multi-cancer test set, and the subset for the performance comparison between the LNMDM and pathologists).
Between Jan 1, 2013, and Dec 31, 2021, 1012 patients with bladder cancer had radical cystectomy and pelvic lymph node dissection and were included (8177 images and 20 954 lymph nodes). We excluded 14 patients (165 images) with concurrent non-bladder cancer and also excluded 21 low-quality images. We included 998 patients and 7991 images (881 [88%] men; 117 [12%] women; median age 64 years [IQR 56-72]; ethnicity data not available; 268 [27%] with lymph node metastases) to develop the LNMDM. The area under the curve (AUC) for accurate diagnosis of the LNMDM ranged from 0·978 (95% CI 0·960-0·996) to 0·998 (0·996-1·000) in the five validation sets. Performance comparisons between the LNMDM and pathologists showed that the diagnostic sensitivity of the model (0·983 [95% CI 0·941-0·998]) substantially exceeded that of both junior pathologists (0·906 [0·871-0·934]) and senior pathologists (0·947 [0·919-0·968]), and that AI assistance improved sensitivity for both junior (from 0·906 without AI to 0·953 with AI) and senior (from 0·947 to 0·986) pathologists. In the multi-cancer test, the LNMDM maintained an AUC of 0·943 (95% CI 0·918-0·969) in breast cancer images and 0·922 (0·884-0·960) in prostate cancer images. In 13 patients, the LNMDM detected tumour micrometastases that had been missed by pathologists who had previously classified these patients' results as negative. Receiver operating characteristic curves showed that the LNMDM would enable pathologists to exclude 80-92% of negative slides while maintaining 100% sensitivity in clinical application.
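The triage figure above follows from a simple rule: operate at the threshold equal to the lowest model score among positive slides, so sensitivity stays at 100%, then count the negative slides that fall below it. A hedged Python sketch with hypothetical scores:

import numpy as np

def negatives_excludable(scores, labels):
    # Fraction of negative slides below the highest threshold that still
    # flags every positive slide (ie, operating at 100% sensitivity).
    threshold = scores[labels == 1].min()
    return float((scores[labels == 0] < threshold).mean())

# Hypothetical slide-level scores: 100 positive and 900 negative slides.
rng = np.random.default_rng(2)
scores = np.concatenate([rng.uniform(0.4, 1.0, 100), rng.uniform(0.0, 0.6, 900)])
labels = np.concatenate([np.ones(100, int), np.zeros(900, int)])
print(f"{negatives_excludable(scores, labels):.0%} of negative slides excludable")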
We developed an AI-based diagnostic model that performed well in detecting lymph node metastases, particularly micrometastases. The LNMDM showed substantial potential for clinical application in improving the accuracy and efficiency of pathologists' work.
National Natural Science Foundation of China, the Science and Technology Planning Project of Guangdong Province, the National Key Research and Development Programme of China, and the Guangdong Provincial Clinical Research Centre for Urological Diseases.
Wu S, Hong G, Xu A, Zeng H, Chen X, Wang Y, Luo Y, Wu P, Liu C, Jiang N, Dang Q, Yang C, Liu B, Shen R, Chen Z, Liao C, Lin Z, Wang J, Lin T
... -
《-》
-
Development and validation of a real-time artificial intelligence-assisted system for detecting early gastric cancer: A multicentre retrospective diagnostic study.
We aimed to develop and validate a real-time deep convolutional neural network (DCNN) system for detecting early gastric cancer (EGC).
In total, 45,240 endoscopic images from 1364 patients were divided into a training dataset (35,823 images from 1085 patients) and a validation dataset (9417 images from 279 patients). Another 1514 images from three other hospitals were used for external validation. We compared the diagnostic performance of the DCNN system with that of endoscopists, and then evaluated the performance of endoscopists with and without reference to the system. Thereafter, we evaluated the diagnostic ability of the DCNN system on video streams. Accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and Cohen's kappa coefficient were measured to assess detection performance.
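For reference, a minimal Python sketch computing the listed metrics from a confusion matrix, using scikit-learn for Cohen's kappa; the labels are hypothetical.

import numpy as np
from sklearn.metrics import cohen_kappa_score, confusion_matrix

y_true = np.array([1, 1, 1, 0, 0, 0, 1, 0, 1, 0])   # 1 = image contains EGC
y_pred = np.array([1, 1, 0, 0, 0, 1, 1, 0, 1, 0])

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print("accuracy   ", (tp + tn) / (tp + tn + fp + fn))
print("sensitivity", tp / (tp + fn))
print("specificity", tn / (tn + fp))
print("PPV        ", tp / (tp + fp))
print("NPV        ", tn / (tn + fn))
print("kappa      ", cohen_kappa_score(y_true, y_pred))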
The DCNN system showed good performance in EGC detection in the validation datasets, with accuracy of 85.1%-91.2%, sensitivity of 85.9%-95.5%, specificity of 81.7%-90.3%, and AUC of 0.887-0.940. The DCNN system showed better diagnostic performance than endoscopists and improved endoscopists' performance when used for reference. The DCNN system was able to process oesophagogastroduodenoscopy (OGD) video streams to detect EGC lesions in real time.
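A schematic frame-by-frame loop of the kind such real-time video analysis implies, using OpenCV; the stub model, input size, source path, and threshold are placeholders, not the authors' implementation.

import cv2

class StubModel:
    def predict(self, batch):
        return [0.0]   # placeholder; a real DCNN would return lesion probabilities

model = StubModel()
cap = cv2.VideoCapture("ogd_stream.mp4")   # or a camera index for a live feed
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    frame = cv2.resize(frame, (224, 224))        # match the network input size
    prob = model.predict(frame[None, ...])[0]    # hypothetical EGC probability
    if prob > 0.5:
        print("possible EGC lesion in this frame")
cap.release()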
We developed a real-time DCNN system for EGC detection with high accuracy and stability. Multicentre prospective validation is needed to acquire high-level evidence for its clinical application.
This work was supported by the National Natural Science Foundation of China (grant nos. 81672935 and 81871947), Jiangsu Clinical Medical Center of Digestive System Diseases and Gastrointestinal Cancer (grant no. YXZXB2016002), and Nanjing Science and Technology Development Foundation (grant no. 2017sb332019).
Tang D, Wang L, Ling T, Lv Y, Ni M, Zhan Q, Fu Y, Zhuang D, Guo H, Dou X, Zhang W, Xu G, Zou X
... -
《EBioMedicine》
-
Deep learning-based artificial intelligence model to assist thyroid nodule diagnosis and management: a multicentre diagnostic study.
Strategies for integrating artificial intelligence (AI) into thyroid nodule management require additional development and testing. We developed a deep-learning AI model (ThyNet) to differentiate between malignant tumours and benign thyroid nodules and aimed to investigate how ThyNet could help radiologists improve diagnostic performance and avoid unnecessary fine needle aspiration.
ThyNet was developed and trained on 18 049 images of 8339 patients (training set) from two hospitals (the First Affiliated Hospital of Sun Yat-sen University, Guangzhou, China, and Sun Yat-sen University Cancer Center, Guangzhou, China) and tested on 4305 images of 2775 patients (total test set) from seven hospitals (the First Affiliated Hospital of Guangzhou University of Chinese Medicine, Guangzhou, China; the Sixth Affiliated Hospital of Sun Yat-sen University, Guangzhou, China; the Guangzhou Army General Hospital, Guangzhou, China; the Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, China; the First Affiliated Hospital of Sun Yat-sen University; Sun Yat-sen University Cancer Center; and the First Affiliated Hospital of Guangxi Medical University, Nanning, China) in three stages. All nodules in the training and total test sets were pathologically confirmed. The diagnostic performance of ThyNet was first compared with that of 12 radiologists (test set A); a ThyNet-assisted strategy, in which ThyNet assisted diagnoses made by radiologists, was developed to improve the diagnostic performance of radiologists using images (test set B); the ThyNet-assisted strategy was then tested in a real-world clinical setting (using images and videos; test set C). In a simulated scenario, the number of unnecessary fine needle aspirations avoided by the ThyNet-assisted strategy was calculated.
The area under the receiver operating characteristic curve (AUROC) for accurate diagnosis of ThyNet (0·922 [95% CI 0·910-0·934]) was significantly higher than that of the radiologists (0·839 [0·834-0·844]; p<0·0001). Furthermore, the ThyNet-assisted strategy improved the pooled AUROC of the radiologists from 0·837 (0·832-0·842) when diagnosing without ThyNet to 0·875 (0·871-0·880; p<0·0001) with ThyNet for reviewing images, and from 0·862 (0·851-0·872) to 0·873 (0·863-0·883; p<0·0001) in the clinical test, which used images and videos. In the simulated scenario, the proportion of nodules referred for fine needle aspiration decreased from 61·9% to 35·2% with the ThyNet-assisted strategy, while the proportion of missed malignancies decreased from 18·9% to 17·0%.
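The abstract reports p values for these AUROC comparisons without naming the test; as one common approach for paired scores, the sketch below bootstraps a 95% CI for the difference between two AUROCs. All data here are synthetic stand-ins, not study data.

import numpy as np
from sklearn.metrics import roc_auc_score

def bootstrap_auc_diff(y, s_a, s_b, n_boot=2000, seed=0):
    # 95% CI for AUROC(s_a) - AUROC(s_b), resampling cases with replacement.
    rng = np.random.default_rng(seed)
    diffs = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y), len(y))
        if len(np.unique(y[idx])) < 2:   # an AUC needs both classes present
            continue
        diffs.append(roc_auc_score(y[idx], s_a[idx]) - roc_auc_score(y[idx], s_b[idx]))
    return np.percentile(diffs, [2.5, 97.5])

# Hypothetical scores for 200 nodules (1 = malignant).
rng = np.random.default_rng(3)
y = rng.integers(0, 2, 200)
s_model = y + rng.normal(0, 0.8, 200)     # stand-in for model outputs
s_readers = y + rng.normal(0, 1.2, 200)   # stand-in for pooled radiologist scores
print(bootstrap_auc_diff(y, s_model, s_readers))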
The ThyNet-assisted strategy can significantly improve the diagnostic performance of radiologists and help reduce unnecessary fine needle aspirations for thyroid nodules.
National Natural Science Foundation of China and Guangzhou Science and Technology Project.
Peng S, Liu Y, Lv W, Liu L, Zhou Q, Yang H, Ren J, Liu G, Wang X, Zhang X, Du Q, Nie F, Huang G, Guo Y, Li J, Liang J, Hu H, Xiao H, Liu Z, Lai F, Zheng Q, Wang H, Li Y, Alexander EK, Wang W, Xiao H
... -
《The Lancet Digital Health》