-
Galileo: an artificial intelligence tool for evaluating pre-implantation kidney biopsies.
Pre-transplant procurement biopsy interpretation is challenging, in part because of the shortage of renal pathology experts. Artificial intelligence (AI) can support pathologists in assessing kidney donor biopsies. Herein we present the "Galileo" AI tool, designed specifically to assist the on-call pathologist with interpreting pre-implantation kidney biopsies.
A multicenter cohort of whole slide images acquired from core-needle and wedge biopsies of the kidney was collected. A deep learning algorithm was trained to detect the main findings evaluated in the pre-implantation setting (normal glomeruli, globally sclerosed glomeruli, ischemic glomeruli, arterioles and arteries). The model, developed on the Aiforia Create platform, was validated on an external dataset by three independent pathologists to assess its performance.
In the training set, Galileo achieved a precision of 81.96%, a sensitivity of 94.39%, an F1 score of 87.74% and a total area error of 2.81%; in the validation set, the corresponding values were 74.05%, 71.03%, 72.5% and 2%. Galileo was significantly faster than pathologists, requiring 2 min overall in the validation phase (vs 25, 22 and 31 min for three separate human readers, p < 0.001). Galileo-assisted detection of renal structures and quantitative information was directly integrated in the final report.
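The reported F1 scores can be checked against the reported precision and sensitivity, since F1 is the harmonic mean of the two. The sketch below is only an illustration of that relationship; the function and variable names are hypothetical and nothing here reflects how the Aiforia Create platform computes its metrics.

```python
def f1_score(precision: float, sensitivity: float) -> float:
    """Harmonic mean of precision and sensitivity, both given as fractions in [0, 1]."""
    if precision + sensitivity == 0:
        return 0.0
    return 2 * precision * sensitivity / (precision + sensitivity)


# Precision and sensitivity as reported in the abstract, converted to fractions.
reported = {
    "training": (0.8196, 0.9439),
    "validation": (0.7405, 0.7103),
}

f1_by_set = {name: f1_score(p, s) for name, (p, s) in reported.items()}

for name, f1 in f1_by_set.items():
    print(f"{name}: F1 = {f1:.2%}")
```

Both recomputed values agree with the reported F1 scores to within rounding.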
The Galileo AI-assisted tool shows promise in speeding up pre-implantation kidney biopsy interpretation, as well as in reducing inter-observer variability. This tool may represent a starting point for further improvements based on hard endpoints such as graft survival.
Eccher A, L'Imperio V, Pantanowitz L, Cazzaniga G, Del Carro F, Marletta S, Gambaro G, Barreca A, Becker JU, Gobbo S, Della Mea V, Alberici F, Pagni F, Dei Tos AP
... -
《-》
-
Artificial intelligence applications for pre-implantation kidney biopsy pathology practice: a systematic review.
Transplant nephropathology is a highly specialized field of pathology comprising both the evaluation of organ donor biopsy for organ allocation and post-transplant graft biopsy for assessment of rejection or graft damage. The introduction of digital pathology with whole-slide imaging (WSI) in clinical research, trials and practice has catalyzed the application of artificial intelligence (AI) for histopathology, with development of novel machine-learning models for tissue interrogation and discovery. We aimed to review the literature for studies specifically applying AI algorithms to WSI-digitized pre-implantation kidney biopsy.
A systematic search was carried out in the electronic databases PubMed-MEDLINE and Embase up to 25 September 2021 with a combination of the key terms "kidney", "biopsy", "transplantation" and "artificial intelligence" and their aliases. Studies dealing with the application of AI algorithms coupled with WSI in pre-implantation kidney biopsies were included. The main theme addressed was detection and quantification of tissue components. Extracted data were: author, year and country of the study, type of biopsy features investigated, number of cases, type of algorithm deployed, main results of the study in terms of diagnostic outcome, and the main limitations of the study.
Of 5761 retrieved articles, 7 met our inclusion criteria. All studies focused largely on AI-based detection and classification of glomerular structures and to a lesser extent on tubular and vascular structures. Performance of AI algorithms was excellent and promising.
All studies highlighted the importance of expert pathologist annotation to reliably train models and the need to acknowledge clinical nuances of the pre-implantation setting. Close cooperation between computer scientists and both practicing and expert renal pathologists is needed to refine the performance of AI-based models for routine pre-implantation kidney biopsy clinical practice.
Girolami I, Pantanowitz L, Marletta S, Hermsen M, van der Laak J, Munari E, Furian L, Vistoli F, Zaza G, Cardillo M, Gesualdo L, Gambaro G, Eccher A
... -
《-》
-
Artificial intelligence for diagnosis and grading of prostate cancer in biopsies: a population-based, diagnostic study.
An increasing volume of prostate biopsies and a worldwide shortage of urological pathologists put a strain on pathology departments. Additionally, the high intra-observer and inter-observer variability in grading can result in overtreatment and undertreatment of prostate cancer. To alleviate these problems, we aimed to develop an artificial intelligence (AI) system with clinically acceptable accuracy for prostate cancer detection, localisation, and Gleason grading.
We digitised 6682 slides from needle core biopsies from 976 randomly selected participants aged 50-69 in the Swedish prospective and population-based STHLM3 diagnostic study done between May 28, 2012, and Dec 30, 2014 (ISRCTN84445406), and another 271 from 93 men from outside the study. The resulting images were used to train deep neural networks for assessment of prostate biopsies. The networks were evaluated by predicting the presence, extent, and Gleason grade of malignant tissue for an independent test dataset comprising 1631 biopsies from 246 men from STHLM3 and an external validation dataset of 330 biopsies from 73 men. We also evaluated grading performance on 87 biopsies individually graded by 23 experienced urological pathologists from the International Society of Urological Pathology. We assessed discriminatory performance by receiver operating characteristics and tumour extent predictions by correlating predicted cancer length against measurements by the reporting pathologist. We quantified the concordance between grades assigned by the AI system and the expert urological pathologists using Cohen's kappa.
The AI achieved an area under the receiver operating characteristics curve of 0·997 (95% CI 0·994-0·999) for distinguishing between benign (n=910) and malignant (n=721) biopsy cores on the independent test dataset and 0·986 (0·972-0·996) on the external validation dataset (benign n=108, malignant n=222). The correlation between cancer length predicted by the AI and assigned by the reporting pathologist was 0·96 (95% CI 0·95-0·97) for the independent test dataset and 0·87 (0·84-0·90) for the external validation dataset. For assigning Gleason grades, the AI achieved a mean pairwise kappa of 0·62, which was within the range of the corresponding values for the expert pathologists (0·60-0·73).
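To make the concordance measure concrete, the following is a minimal sketch of Cohen's kappa and of a mean pairwise kappa between one rater and a panel. It assumes the unweighted form of kappa (the abstract does not say whether a weighted variant was used), and the grades below are made-up toy values, not study data; all names are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length sequences of categorical grades."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must grade the same non-empty set of cases")
    n = len(rater_a)
    # Observed agreement: fraction of cases given the same grade by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: from each rater's marginal grade frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    grades = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[g] * counts_b[g] for g in grades) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical grade throughout
    return (observed - expected) / (1.0 - expected)


def mean_pairwise_kappa(target, panel):
    """Average kappa between one rater (e.g. an AI system) and each panel member."""
    return sum(cohen_kappa(target, other) for other in panel) / len(panel)


# Toy grades for six biopsies (illustrative only).
ai_grades = [1, 2, 2, 3, 1, 3]
panel_grades = [
    [1, 2, 3, 3, 1, 3],
    [1, 2, 2, 3, 2, 3],
]
print(round(mean_pairwise_kappa(ai_grades, panel_grades), 3))
```

Computing the same mean pairwise value for each pathologist against the rest of the panel gives the range against which the AI's value is compared.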
An AI system can be trained to detect and grade cancer in prostate needle biopsy samples at a ranking comparable to that of international experts in prostate pathology. Clinical application could reduce pathology workload by reducing the assessment of benign biopsies and by automating the task of measuring cancer length in positive biopsy cores. An AI system with expert-level grading performance might contribute a second opinion, aid in standardising grading, and provide pathology expertise in parts of the world where it does not exist.
Funding: Swedish Research Council, Swedish Cancer Society, Swedish eScience Research Center, EIT Health.
Ström P, Kartasalo K, Olsson H, Solorzano L, Delahunt B, Berney DM, Bostwick DG, Evans AJ, Grignon DJ, Humphrey PA, Iczkowski KA, Kench JG, Kristiansen G, van der Kwast TH, Leite KRM, McKenney JK, Oxley J, Pan CC, Samaratunga H, Srigley JR, Takahashi H, Tsuzuki T, Varma M, Zhou M, Lindberg J, Lindskog C, Ruusuvuori P, Wählby C, Grönberg H, Rantalainen M, Egevad L, Eklund M
... -
《-》
-
A large-scale retrospective study enabled deep-learning based pathological assessment of frozen procurement kidney biopsies to predict graft loss and guide organ utilization.
Lesion scores on procurement donor biopsies are commonly used to guide organ utilization for deceased-donor kidneys. However, frozen sections present challenges for histological scoring, leading to inter- and intra-observer variability and inappropriate discard. Therefore, we constructed deep-learning based models to recognize kidney tissue compartments in hematoxylin & eosin-stained sections from procurement needle biopsies performed nationwide from 2011 to 2020. To do this, we extracted whole-slide abnormality features from 2431 kidneys and correlated them with pathologists' scores and transplant outcomes. A Kidney Donor Quality Score (KDQS) was derived and used in combination with recipient demographic and peri-transplant characteristics to predict graft loss or assist organ utilization. The performance on wedge biopsies was additionally evaluated. Our model identified 96% of normal and 91% of sclerotic glomeruli, 94% of arteries/arterial intimal fibrosis, and 90% of tubules. Whole-slide features of Sclerotic Glomeruli (GS)%, Arterial Intimal Fibrosis (AIF)%, and Interstitial Space Abnormality (ISA)% demonstrated strong correlations with the corresponding pathologists' scores in all 2431 kidneys, but had superior associations with post-transplant estimated glomerular filtration rates in 2033 kidneys and with graft loss in 1560 kidneys. The combination of KDQS and other factors predicted one- and four-year graft loss in a discovery set of 520 kidneys and a validation set of 1040 kidneys. Applying the composite KDQS to 398 kidneys discarded due to "biopsy findings", we suggest that 110 of them, if transplanted, could have had survival similar to that of other transplanted kidneys. Thus, our composite KDQS and survival prediction models may facilitate risk stratification and organ utilization while potentially reducing unnecessary organ discard.
Yi Z, Xi C, Menon MC, Cravedi P, Tedla F, Soto A, Sun Z, Liu K, Zhang J, Wei C, Chen M, Wang W, Veremis B, Garcia-Barros M, Kumar A, Haakinson D, Brody R, Azeloglu EU, Gallon L, O'Connell P, Naesens M, Shapiro R, Colvin RB, Ward S, Salem F, Zhang W
... -
《-》
-
Development and validation of artificial intelligence-based prescreening of large-bowel biopsies taken in the UK and Portugal: a retrospective cohort study.
Histopathological examination is a crucial step in the diagnosis and treatment of many major diseases. Aiming to facilitate diagnostic decision making and reduce the workload of pathologists, we developed an artificial intelligence (AI)-based prescreening tool that analyses whole-slide images (WSIs) of large-bowel biopsies to identify typical, non-neoplastic, and neoplastic biopsies.
This retrospective cohort study was conducted with an internal development cohort of slides acquired from a hospital in the UK and three external validation cohorts of WSIs acquired from two hospitals in the UK and one clinical laboratory in Portugal. To learn the differential histological patterns from digitised WSIs of large-bowel biopsy slides, our proposed weakly supervised deep-learning model (Colorectal AI Model for Abnormality Detection [CAIMAN]) used slide-level diagnostic labels and no detailed cell or region-level annotations. The method was developed with an internal development cohort of 5054 biopsy slides from 2080 patients that were labelled with corresponding diagnostic categories assigned by pathologists. The three external validation cohorts, with a total of 1536 slides, were used for independent validation of CAIMAN. Each WSI was classified into one of three classes (ie, typical, atypical non-neoplastic, and atypical neoplastic). Prediction scores of image tiles were aggregated into three prediction scores for the whole slide: one for its likelihood of being typical, one for its likelihood of being non-neoplastic, and one for its likelihood of being neoplastic. The assessment of the external validation cohorts was conducted with the trained and frozen CAIMAN model. To evaluate model performance, we calculated area under the convex hull of the receiver operating characteristic curve (AUROC), area under the precision-recall curve, and specificity compared with our previously published iterative draw and rank sampling (IDaRS) algorithm. We also generated heat maps and saliency maps to analyse and visualise the relationship between the WSI diagnostic labels and spatial features of the tissue microenvironment. The main outcome of this study was the ability of CAIMAN to accurately identify typical and atypical WSIs of colon biopsies, which could potentially facilitate automatic removal of typical biopsies from the diagnostic workload in clinics.
A randomly selected subset of all large bowel biopsies was obtained between Jan 1, 2012, and Dec 31, 2017. The AI training, validation, and assessments were done between Jan 1, 2021, and Sept 30, 2022. WSIs with diagnostic labels were collected between Jan 1 and Sept 30, 2022. Our analysis showed no statistically significant differences across prediction scores from CAIMAN for typical and atypical classes based on anatomical sites of the biopsy. At 0·99 sensitivity, CAIMAN (specificity 0·5592) was more accurate than an IDaRS-based weakly supervised WSI-classification pipeline (0·4629) in identifying typical and atypical biopsies on cross-validation in the internal development cohort (p<0·0001). At 0·99 sensitivity, CAIMAN was also more accurate than IDaRS for two external validation cohorts (p<0·0001), but not for a third external validation cohort (p=0·10). CAIMAN provided higher specificity than IDaRS at some high-sensitivity thresholds (0·7763 vs 0·6222 for 0·95 sensitivity, 0·7126 vs 0·5407 for 0·97 sensitivity, and 0·5615 vs 0·3970 for 0·99 sensitivity on one of the external validation cohorts) and showed high classification performance in distinguishing between neoplastic biopsies (AUROC 0·9928, 95% CI 0·9927-0·9929), inflammatory biopsies (0·9658, 0·9655-0·9661), and atypical biopsies (0·9789, 0·9786-0·9792). On the three external validation cohorts, CAIMAN had AUROC values of 0·9431 (95% CI 0·9165-0·9697), 0·9576 (0·9568-0·9584), and 0·9636 (0·9615-0·9657) for the detection of atypical biopsies. Saliency maps supported the representation of disease heterogeneity in model predictions and its association with relevant histological features.
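Reporting specificity at a fixed sensitivity means choosing the score threshold that still flags the required fraction of atypical slides and then counting how many typical slides fall below it. The sketch below illustrates that procedure under stated assumptions: higher scores mean "more likely atypical", labels are 1 for atypical and 0 for typical, and the scores are toy values. It is not the CAIMAN or IDaRS evaluation code, and the function name is hypothetical.

```python
import math


def specificity_at_sensitivity(scores, labels, target_sensitivity):
    """Return (threshold, specificity) for the highest threshold that reaches the target sensitivity.

    Slides with score >= threshold are flagged as atypical (positive).
    """
    positives = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative slide")
    # Minimum number of atypical slides that must be flagged to reach the target.
    needed = max(1, math.ceil(target_sensitivity * len(positives)))
    threshold = positives[needed - 1]
    # Typical slides scoring below the threshold are correctly left unflagged.
    true_negatives = sum(s < threshold for s in negatives)
    return threshold, true_negatives / len(negatives)


# Toy slide-level scores (illustrative only): four atypical, four typical slides.
toy_scores = [0.9, 0.8, 0.7, 0.2, 0.1, 0.3, 0.6, 0.05]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]

threshold, specificity = specificity_at_sensitivity(toy_scores, toy_labels, 1.0)
print(f"threshold={threshold}, specificity={specificity}")
```

In a prescreening setting the specificity at such a threshold is the fraction of typical biopsies that could be removed from the workload while missing at most the allowed fraction of atypical ones.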
CAIMAN, with its high sensitivity in detecting atypical large-bowel biopsies, might be a promising improvement in clinical workflow efficiency and diagnostic decision making in prescreening of typical colorectal biopsies.
Funding: The Pathology Image Data Lake for Analytics, Knowledge and Education Centre of Excellence; the UK Government's Industrial Strategy Challenge Fund; and Innovate UK on behalf of UK Research and Innovation.
Bilal M, Tsang YW, Ali M, Graham S, Hero E, Wahab N, Dodd K, Sahota H, Wu S, Lu W, Jahanifar M, Robinson A, Azam A, Benes K, Nimir M, Hewitt K, Bhalerao A, Eldaly H, Raza SEA, Gopalakrishnan K, Minhas F, Snead D, Rajpoot N
... -
《The Lancet Digital Health》