High-throughput and efficient multilocus genome-wide association study on longitudinal outcomes.
摘要:
With the emerging of high-dimensional genomic data, genetic analysis such as genome-wide association studies (GWAS) have played an important role in identifying disease-related genetic variants and novel treatments. Complex longitudinal phenotypes are commonly collected in medical studies. However, since limited analytical approaches are available for longitudinal traits, these data are often underutilized. In this article, we develop a high-throughput machine learning approach for multilocus GWAS using longitudinal traits by coupling Empirical Bayesian Estimates from mixed-effects modeling with a novel ℓ0-norm algorithm. Extensive simulations demonstrated that the proposed approach not only provided accurate selection of single nucleotide polymorphisms (SNPs) with comparable or higher power but also robust control of false positives. More importantly, this novel approach is highly scalable and could be approximately >1000 times faster than recently published approaches, making genome-wide multilocus analysis of longitudinal traits possible. In addition, our proposed approach can simultaneously analyze millions of SNPs if the computer memory allows, thereby potentially allowing a true multilocus analysis for high-dimensional genomic data. With application to the data from Alzheimer's Disease Neuroimaging Initiative, we confirmed that our approach can identify well-known SNPs associated with AD and were much faster than recently published approaches (≥6000 times). The source code and the testing datasets are available at https://github.com/Myuan2019/EBE_APML0. Supplementary data are available at Bioinformatics online.
收起
展开
DOI:
10.1093/bioinformatics/btaa120
被引量:
年份:
2020


通过 文献互助 平台发起求助,成功后即可免费获取论文全文。
求助方法1:
知识发现用户
每天可免费求助50篇
求助方法1:
关注微信公众号
每天可免费求助2篇
求助方法2:
完成求助需要支付5财富值
您目前有 1000 财富值
相似文献(177)
参考文献(0)
引证文献(2)
来源期刊
影响因子:暂无数据
JCR分区: 暂无
中科院分区:暂无