Testing the performance, adequacy, and applicability of an artificial intelligence model for pediatric pneumonia diagnosis.

来自 PUBMED

摘要:

Community-acquired Pneumonia (CAP) is a common childhood infectious disease. Deep learning models show promise in X-ray interpretation and diagnosis, but their validation should be extended due to limitations in the current validation workflow. To extend the standard validation workflow we propose doing a pilot test with the next characteristics. First, the assumption of perfect ground truth (100% sensitive and specific) is unrealistic, as high intra and inter-observer variability have been reported. To address this, we propose using Bayesian latent class models (BLCA) to estimate accuracy during the pilot. Additionally, assessing only the performance of a model without considering its applicability and acceptance by physicians is insufficient if we hope to integrate AI systems into day-to-day clinical practice. Therefore, we propose employing explainable artificial intelligence (XAI) methods during the pilot test to involve physicians and evaluate how well a Deep Learning model is accepted and how helpful it is for routine decisions as well as analyze its limitations by assessing the etiology. This study aims to apply the proposed pilot to test a deep Convolutional Neural Network (CNN)-based model for identifying consolidation in pediatric chest-X-ray (CXR) images already validated using the standard workflow. For the standard validation workflow, a total of 5856 public CXRs and 950 private CXRs were used to train and validate the performance of the CNN model. The performance of the model was estimated assuming a perfect ground truth. For the pilot test proposed in this article, a total of 190 pediatric chest-X-ray (CXRs) images were used to test the CNN model support decision tool (SDT). The performance of the model on the pilot test was estimated using extensions of the two-test Bayesian Latent-Class model (BLCA). The sensitivity, specificity, and accuracy of the model were also assessed. The clinical characteristics of the patients were compared according to the model performance. The adequacy and applicability of the SDT was tested using XAI techniques. The adequacy of the SDT was assessed by asking two senior physicians the agreement rate with the SDT. The applicability was tested by asking three medical residents before and after using the SDT and the agreement between experts was calculated using the kappa index. The CRXs of the pilot test were labeled by the panel of experts into consolidation (124/176, 70.4%) and no-consolidation/other infiltrates (52/176, 29.5%). A total of 31/176 (17.6%) discrepancies were found between the model and the panel of experts with a kappa index of 0.6. The sensitivity and specificity reached a median of 90.9 (95% Credible Interval (CrI), 81.2-99.9) and 77.7 (95% CrI, 63.3-98.1), respectively. The senior physicians reported a high agreement rate (70%) with the system in identifying logical consolidation patterns. The three medical residents reached a higher agreement using SDT than alone with experts (0.66±0.1 vs. 0.75±0.2). Through the pilot test, we have successfully verified that the deep learning model was underestimated when a perfect ground truth was considered. Furthermore, by conducting adequacy and applicability tests, we can ensure that the model is able to identify logical patterns within the CXRs and that augmenting clinicians with automated preliminary read assistants could accelerate their workflows and enhance accuracy in identifying consolidation in pediatric CXR images.

收起

展开

DOI:

10.1016/j.cmpb.2023.107765

被引量:

0

年份:

1970

SCI-Hub (全网免费下载) 发表链接

通过 文献互助 平台发起求助,成功后即可免费获取论文全文。

查看求助

求助方法1:

知识发现用户

每天可免费求助50篇

求助

求助方法1:

关注微信公众号

每天可免费求助2篇

求助方法2:

求助需要支付5个财富值

您现在财富值不足

您可以通过 应助全文 获取财富值

求助方法2:

完成求助需要支付5财富值

您目前有 1000 财富值

求助

我们已与文献出版商建立了直接购买合作。

你可以通过身份认证进行实名认证,认证成功后本次下载的费用将由您所在的图书馆支付

您可以直接购买此文献,1~5分钟即可下载全文,部分资源由于网络原因可能需要更长时间,请您耐心等待哦~

身份认证 全文购买

相似文献(242)

参考文献(0)

引证文献(0)

来源期刊

-

影响因子:暂无数据

JCR分区: 暂无

中科院分区:暂无

研究点推荐

关于我们

zlive学术集成海量学术资源,融合人工智能、深度学习、大数据分析等技术,为科研工作者提供全面快捷的学术服务。在这里我们不忘初心,砥砺前行。

友情链接

联系我们

合作与服务

©2024 zlive学术声明使用前必读