Classifying cancer pathology reports with hierarchical self-attention networks.
摘要:
We introduce a deep learning architecture, hierarchical self-attention networks (HiSANs), designed for classifying pathology reports and show how its unique architecture leads to a new state-of-the-art in accuracy, faster training, and clear interpretability. We evaluate performance on a corpus of 374,899 pathology reports obtained from the National Cancer Institute's (NCI) Surveillance, Epidemiology, and End Results (SEER) program. Each pathology report is associated with five clinical classification tasks - site, laterality, behavior, histology, and grade. We compare the performance of the HiSAN against other machine learning and deep learning approaches commonly used on medical text data - Naive Bayes, logistic regression, convolutional neural networks, and hierarchical attention networks (the previous state-of-the-art). We show that HiSANs are superior to other machine learning and deep learning text classifiers in both accuracy and macro F-score across all five classification tasks. Compared to the previous state-of-the-art, hierarchical attention networks, HiSANs not only are an order of magnitude faster to train, but also achieve about 1% better relative accuracy and 5% better relative macro F-score.
收起
展开
DOI:
10.1016/j.artmed.2019.101726
被引量:
年份:
1970


通过 文献互助 平台发起求助,成功后即可免费获取论文全文。
求助方法1:
知识发现用户
每天可免费求助50篇
求助方法1:
关注微信公众号
每天可免费求助2篇
求助方法2:
完成求助需要支付5财富值
您目前有 1000 财富值
相似文献(134)
参考文献(0)
引证文献(21)
来源期刊
影响因子:暂无数据
JCR分区: 暂无
中科院分区:暂无