Deep learning for classifying fibrotic lung disease on high-resolution computed tomography: a case-cohort study.
Based on international diagnostic guidelines, high-resolution CT plays a central part in the diagnosis of fibrotic lung disease. In the correct clinical context, when high-resolution CT appearances are those of usual interstitial pneumonia, a diagnosis of idiopathic pulmonary fibrosis can be made without surgical lung biopsy. We investigated the use of a deep learning algorithm for provision of automated classification of fibrotic lung disease on high-resolution CT according to criteria specified in two international diagnostic guideline statements: the 2011 American Thoracic Society (ATS)/European Respiratory Society (ERS)/Japanese Respiratory Society (JRS)/Latin American Thoracic Association (ALAT) guidelines for diagnosis and management of idiopathic pulmonary fibrosis and the Fleischner Society diagnostic criteria for idiopathic pulmonary fibrosis.
In this case-cohort study, for algorithm development and testing, a database of 1157 anonymised high-resolution CT scans showing evidence of diffuse fibrotic lung disease was generated from two institutions. We separated the scans into three non-overlapping cohorts (training set, n=929; validation set, n=89; and test set A, n=139) and classified them using 2011 ATS/ERS/JRS/ALAT idiopathic pulmonary fibrosis diagnostic guidelines. For each scan, the lungs were segmented and resampled to create a maximum of 500 unique four-slice combinations, which we converted into image montages. The final training dataset consisted of 420 096 unique montages. We evaluated algorithm performance, reported as accuracy, prognostic accuracy, and weighted κ coefficient (κw) of interobserver agreement, on test set A and on a cohort of 150 high-resolution CT scans with fibrotic lung disease (test set B), compared with the majority vote of 91 specialist thoracic radiologists drawn from multiple international thoracic imaging societies. We then reclassified high-resolution CT scans according to Fleischner Society diagnostic criteria for idiopathic pulmonary fibrosis. We retrained the algorithm using these criteria and evaluated its performance on 75 fibrotic lung disease-specific high-resolution CT scans compared with four specialist thoracic radiologists using weighted κ coefficient of interobserver agreement.
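The montage step described above can be illustrated with a minimal sketch. The abstract does not say how slices were chosen or how montages were laid out, so random sampling without replacement and a 2x2 tiling are assumptions here, and the function names `sample_slice_combinations` and `make_montage` are hypothetical.

```python
import random
from math import comb


def sample_slice_combinations(n_slices, k=4, max_combos=500, seed=0):
    """Return up to max_combos unique, sorted k-tuples of slice indices.

    Assumption: combinations are drawn uniformly at random; the source
    only states that at most 500 unique four-slice combinations were used.
    """
    rng = random.Random(seed)
    # Never ask for more combinations than actually exist.
    target = min(max_combos, comb(n_slices, k))
    combos = set()
    while len(combos) < target:
        combos.add(tuple(sorted(rng.sample(range(n_slices), k))))
    return sorted(combos)


def make_montage(volume, combo):
    """Tile four equally sized 2D slices (lists of rows) into a 2x2 montage.

    Assumption: a 2x2 layout; the source does not describe the layout.
    """
    a, b, c, d = (volume[i] for i in combo)
    top = [row_a + row_b for row_a, row_b in zip(a, b)]
    bottom = [row_c + row_d for row_c, row_d in zip(c, d)]
    return top + bottom


# Usage with ten dummy 2x2 slices, each filled with its own index.
volume = [[[i] * 2 for _ in range(2)] for i in range(10)]
combos = sample_slice_combinations(len(volume))
montage = make_montage(volume, combos[0])
```

With real data, each slice would be a segmented lung image rather than a small nested list, but the sampling cap and the tiling logic would stay the same.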
The accuracy of the algorithm on test set A was 76·4%, with 92·7% of diagnoses within one category. The algorithm took 2·31 s to evaluate 150 four-slice montages (each montage representing a single case from test set B). The median accuracy of the thoracic radiologists on test set B was 70·7% (IQR 65·3-74·7), and the accuracy of the algorithm was 73·3% (93·3% were within one category), outperforming 60 (66%) of 91 thoracic radiologists. Median interobserver agreement between each of the thoracic radiologists and the radiologists' majority opinion was good (κw=0·67 [IQR 0·58-0·72]). Interobserver agreement between the algorithm and the radiologists' majority opinion was good (κw=0·69), outperforming 56 (62%) of 91 thoracic radiologists. The algorithm provided prognostic discrimination between usual interstitial pneumonia and non-usual interstitial pneumonia diagnoses (hazard ratio 2·88, 95% CI 1·79-4·61, p<0·0001) comparable to that of the majority opinion of the thoracic radiologists (2·74, 1·67-4·48, p<0·0001). For Fleischner Society high-resolution CT criteria for usual interstitial pneumonia, median interobserver agreement between the radiologists was moderate (κw=0·56 [IQR 0·55-0·58]), but was good between the algorithm and the radiologists (κw=0·64 [0·55-0·72]).
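For readers unfamiliar with the two headline statistics, the following sketch shows how accuracy "within one category" and a weighted κ can be computed for ordinal diagnostic categories. It assumes categories are coded as integers 0 to K-1 and uses linear weights; the abstract does not state which weighting scheme was used, and the function names are hypothetical.

```python
def accuracy_within(predicted, reference, tolerance=0):
    """Fraction of cases whose predicted category is within `tolerance`
    ordinal steps of the reference category (tolerance=0 is plain accuracy)."""
    hits = sum(abs(p - r) <= tolerance for p, r in zip(predicted, reference))
    return hits / len(reference)


def weighted_kappa(rater_a, rater_b, n_categories):
    """Linearly weighted kappa for two raters on integer-coded ordinal labels.

    Assumption: linear disagreement weights |i - j| / (K - 1).
    Undefined (division by zero) if both raters use one identical category
    for every case.
    """
    n = len(rater_a)
    # Observed joint proportions of (rater_a category, rater_b category).
    observed = [[0.0] * n_categories for _ in range(n_categories)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1.0 / n
    row = [sum(observed[i]) for i in range(n_categories)]
    col = [sum(observed[i][j] for i in range(n_categories))
           for j in range(n_categories)]
    disagree_obs = 0.0
    disagree_exp = 0.0
    for i in range(n_categories):
        for j in range(n_categories):
            weight = abs(i - j) / (n_categories - 1)
            disagree_obs += weight * observed[i][j]
            disagree_exp += weight * row[i] * col[j]
    return 1.0 - disagree_obs / disagree_exp


# Usage on six made-up cases with three ordinal categories (0, 1, 2).
reference = [0, 0, 1, 1, 2, 2]   # e.g. a majority opinion
predicted = [0, 1, 1, 1, 2, 0]   # e.g. an algorithm's output
exact = accuracy_within(predicted, reference)
near = accuracy_within(predicted, reference, tolerance=1)
kappa_w = weighted_kappa(predicted, reference, n_categories=3)
```

Weighting matters here because a prediction one category away from the reference is penalised less than one two categories away, which plain accuracy cannot express.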
High-resolution CT evaluation by a deep learning algorithm might provide low-cost, reproducible, near-instantaneous classification of fibrotic lung disease with human-level accuracy. These methods could be of benefit to centres at which thoracic imaging expertise is scarce, as well as for stratification of patients in clinical trials.
None.
Walsh SLF, Calandriello L, Silva M, Sverzellati N
... -
《-》
Multicentre evaluation of multidisciplinary team meeting agreement on diagnosis in diffuse parenchymal lung disease: a case-cohort study.
Diffuse parenchymal lung disease represents a diverse and challenging group of pulmonary disorders. A consistent diagnostic approach to diffuse parenchymal lung disease is crucial if clinical trial data are to be applied to individual patients. We aimed to evaluate inter-multidisciplinary team agreement for the diagnosis of diffuse parenchymal lung disease.
We did a multicentre evaluation of clinical data of patients who presented to the interstitial lung disease unit of the Royal Brompton and Harefield NHS Foundation Trust (London, UK; host institution) and required multidisciplinary team meeting (MDTM) characterisation between March 1, 2010, and Aug 31, 2010. Only patients whose baseline clinical, radiological, and (if a biopsy was taken) pathological assessments were undertaken at the host institution were included. Seven MDTMs, each consisting of at least one clinician, radiologist, and pathologist, from seven countries (Denmark, France, Italy, Japan, Netherlands, Portugal, and the UK) evaluated cases of diffuse parenchymal lung disease in a two-stage process between Jan 1 and Oct 15, 2015. First, the clinician, radiologist, and pathologist (if lung biopsy was completed) independently evaluated each case, selected up to five differential diagnoses from a choice of diffuse lung diseases, and chose likelihoods (censored at 5% and summing to 100% in each case) for each of their differential diagnoses, without inter-disciplinary consultation. Second, these specialists convened at an MDTM and reviewed all data, selected up to five differential diagnoses, and chose diagnosis likelihoods. We compared inter-observer and inter-MDTM agreements on patient first-choice diagnoses using Cohen's kappa coefficient (κ). We then estimated inter-observer and inter-MDTM agreement on the probability of diagnosis using weighted kappa coefficient (κw). We compared inter-observer and inter-MDTM confidence of patient first-choice diagnosis. Finally, we evaluated the prognostic significance of a first-choice diagnosis of idiopathic pulmonary fibrosis (IPF) versus not IPF for MDTMs, clinicians, and radiologists, using univariate Cox regression analysis.
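The agreement analysis on first-choice diagnoses can be sketched as below. Cohen's κ is defined for two raters, and the abstract does not say how the seven MDTMs were combined into one overall figure, so summarising all pairwise κ values by their median is an assumption. The team names and diagnoses are made up, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations
from statistics import median


def cohen_kappa(labels_a, labels_b):
    """Unweighted Cohen's kappa for two raters on nominal labels.

    Undefined (division by zero) if chance agreement equals 1.
    """
    n = len(labels_a)
    p_obs = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement from the two raters' marginal label frequencies.
    p_exp = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (p_obs - p_exp) / (1 - p_exp)


def pairwise_kappas(ratings_by_team):
    """Kappa for every unordered pair of teams (assumed summary strategy)."""
    return {
        (t1, t2): cohen_kappa(ratings_by_team[t1], ratings_by_team[t2])
        for t1, t2 in combinations(sorted(ratings_by_team), 2)
    }


# Usage: three hypothetical teams, four hypothetical patients.
first_choice = {
    "team_1": ["IPF", "IPF", "NSIP", "HP"],
    "team_2": ["IPF", "NSIP", "NSIP", "HP"],
    "team_3": ["IPF", "IPF", "NSIP", "HP"],
}
kappas = pairwise_kappas(first_choice)
summary = median(kappas.values())
```

The same pairwise structure would apply to individual clinicians or radiologists by replacing the team keys with observer identifiers.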
70 patients were included in the final study cohort. Clinicians, radiologists, pathologists, and the MDTMs assigned their patient diagnoses between Jan 1 and Oct 15, 2015. IPF made up 88 (18%) of all 490 MDTM first-choice diagnoses. Inter-MDTM agreement for first-choice diagnoses overall was moderate (κ=0·50). Inter-MDTM agreement on diagnostic likelihoods was good for IPF (κw=0·71 [IQR 0·64-0·77]) and connective tissue disease-related interstitial lung disease (κw=0·73 [0·68-0·78]); moderate for non-specific interstitial pneumonia (NSIP; κw=0·42 [0·37-0·49]); and fair for hypersensitivity pneumonitis (κw=0·29 [0·24-0·40]). High-confidence diagnoses (>65% likelihood) of IPF were given in 68 (77%) of 88 cases by MDTMs, in 62 (65%) of 96 cases by clinicians, and in 57 (66%) of 86 cases by radiologists. Greater prognostic separation was shown for an MDTM diagnosis of IPF than for an individual clinician's diagnosis of this disease in five of seven MDTMs, and than for a radiologist's diagnosis of IPF in four of seven MDTMs.
Agreement between MDTMs for diagnosis in diffuse lung disease is acceptable, and good for a diagnosis of IPF, as validated by the non-significantly greater prognostic separation of an IPF diagnosis made by MDTMs than of a diagnosis made by individual clinicians or radiologists. Furthermore, MDTMs made the diagnosis of IPF with higher confidence and more frequently than did clinicians or radiologists. This difference is of particular importance, because accurate and consistent diagnoses of IPF are needed if clinical outcomes are to be optimised. Inter-multidisciplinary team agreement for a diagnosis of hypersensitivity pneumonitis is low, highlighting an urgent need for standardised diagnostic guidelines for this disease.
National Institute of Health Research, Imperial College London.
Walsh SLF, Wells AU, Desai SR, Poletti V, Piciucchi S, Dubini A, Nunes H, Valeyre D, Brillet PY, Kambouchner M, Morais A, Pereira JM, Moura CS, Grutters JC, van den Heuvel DA, van Es HW, van Oosterhout MF, Seldenrijk CA, Bendstrup E, Rasmussen F, Madsen LB, Gooptu B, Pomplun S, Taniguchi H, Fukuoka J, Johkoh T, Nicholson AG, Sayer C, Edmunds L, Jacob J, Kokosi MA, Myers JL, Flaherty KR, Hansell DM
... -
《-》
Novel 3D-based deep learning for classification of acute exacerbation of idiopathic pulmonary fibrosis using high-resolution CT.
Acute exacerbation of idiopathic pulmonary fibrosis (AE-IPF) is the primary cause of death in patients with IPF, characterised by diffuse, bilateral ground-glass opacification on high-resolution CT (HRCT). This study proposes a three-dimensional (3D)-based deep learning algorithm for classifying AE-IPF using HRCT images.
A novel 3D-based deep learning algorithm, SlowFast, was developed using a database of 306 HRCT scans obtained from two centres. The scans were divided into four separate subsets (training set, n=105; internal validation set, n=26; temporal test set 1, n=79; and geographical test set 2, n=96). The final training data set consisted of 1050 samples with 33 600 images. Algorithm performance was evaluated using accuracy, sensitivity, specificity, positive predictive value, negative predictive value, receiver operating characteristic (ROC) curve and weighted κ coefficient.
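The threshold-based metrics listed above all derive from the same four confusion-matrix counts. The sketch below assumes binary labels coded as 1 (AE-IPF) and 0 (not AE-IPF); this coding and the function name `diagnostic_metrics` are illustrative assumptions, not the study's implementation.

```python
def diagnostic_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, PPV and NPV for binary labels.

    Assumes 1 = positive and 0 = negative. Raises ZeroDivisionError if a
    denominator is empty (for example, no positive cases in y_true).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }


# Usage on eight made-up cases.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]
metrics = diagnostic_metrics(y_true, y_pred)
```

Unlike sensitivity and specificity, the predictive values depend on how common the positive class is in the evaluated set, so they are not directly comparable between test sets with different case mixes.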
The accuracy of the algorithm in classifying AE-IPF on test sets 1 and 2 was 93.9% and 86.5%, respectively. Interobserver agreement between the algorithm and the majority opinion of the radiologists was good (κw=0.90 for test set 1 and κw=0.73 for test set 2). The ROC accuracy of the algorithm for classifying AE-IPF on test sets 1 and 2 was 0.96 and 0.92, respectively. The algorithm performance was superior to visual analysis in accurately diagnosing radiological findings. Furthermore, the algorithm's categorisation was a significant predictor of IPF progression.
The deep learning algorithm provides high auxiliary diagnostic efficiency in patients with AE-IPF and may serve as a useful clinical aid for diagnosis.
Huang X, Si W, Ye X, Zhao Y, Gu H, Zhang M, Wu S, Shi Y, Gui X, Xiao Y, Cao M
... -
《BMJ Open Respiratory Research》