SpLitteR: diploid genome assembly using TELL-Seq linked-reads and assembly graphs.
摘要:
Recent advances in long-read sequencing technologies enabled accurate and contiguous assemblies of large genomes and metagenomes. However, even long and accurate high-fidelity (HiFi) reads do not resolve repeats that are longer than the read lengths. This limitation negatively affects the contiguity of diploid genome assemblies since two haplomes share many long identical regions. To generate the telomere-to-telomere assemblies of diploid genomes, biologists now construct their HiFi-based phased assemblies and use additional experimental technologies to transform them into more contiguous diploid assemblies. The barcoded linked-reads, generated using an inexpensive TELL-Seq technology, provide an attractive way to bridge unresolved repeats in phased assemblies of diploid genomes. We developed the SpLitteR tool for diploid genome assembly using linked-reads and assembly graphs and benchmarked it against state-of-the-art linked-read scaffolders ARKS and SLR-superscaffolder using human HG002 genome and sheep gut microbiome datasets. The benchmark showed that SpLitteR scaffolding results in 1.5-fold increase in NGA50 compared to the baseline LJA assembly and other scaffolders while introducing no additional misassemblies on the human dataset. We developed the SpLitteR tool for assembly graph phasing and scaffolding using barcoded linked-reads. We benchmarked SpLitteR on assembly graphs produced by various long-read assemblers and have demonstrated that TELL-Seq reads facilitate phasing and scaffolding in these graphs. This benchmarking demonstrates that SpLitteR improves upon the state-of-the-art linked-read scaffolders in the accuracy and contiguity metrics. SpLitteR is implemented in C++ as a part of the freely available SPAdes package and is available at https://github.com/ablab/spades/releases/tag/splitter-preprint.
收起
展开
关键词:
DOI:
10.7717/peerj.18050
被引量:
年份:
1970


通过 文献互助 平台发起求助,成功后即可免费获取论文全文。
求助方法1:
知识发现用户
每天可免费求助50篇
求助方法1:
关注微信公众号
每天可免费求助2篇
求助方法2:
完成求助需要支付5财富值
您目前有 1000 财富值
相似文献(0)
参考文献(0)
引证文献(0)
来源期刊
影响因子:3.058
JCR分区: 暂无
中科院分区:暂无