Optimizing de novo transcriptome assembly from short-read RNA-Seq data: a comparative study.-Z研学术

Optimizing de novo transcriptome assembly from short-read RNA-Seq data: a comparative study.

来自 PUBMED

作者：

Zhao QY ， Wang Y ， Kong YM ， Luo D ， Li X ， Hao P

展开 

摘要：

With the fast advances in nextgen sequencing technology, high-throughput RNA sequencing has emerged as a powerful and cost-effective way for transcriptome study. De novo assembly of transcripts provides an important solution to transcriptome analysis for organisms with no reference genome. However, there lacked understanding on how the different variables affected assembly outcomes, and there was no consensus on how to approach an optimal solution by selecting software tool and suitable strategy based on the properties of RNA-Seq data. To reveal the performance of different programs for transcriptome assembly, this work analyzed some important factors, including k-mer values, genome complexity, coverage depth, directional reads, etc. Seven program conditions, four single k-mer assemblers (SK: SOAPdenovo, ABySS, Oases and Trinity) and three multiple k-mer methods (MK: SOAPdenovo-MK, trans-ABySS and Oases-MK) were tested. While small and large k-mer values performed better for reconstructing lowly and highly expressed transcripts, respectively, MK strategy worked well for almost all ranges of expression quintiles. Among SK tools, Trinity performed well across various conditions but took the longest running time. Oases consumed the most memory whereas SOAPdenovo required the shortest runtime but worked poorly to reconstruct full-length CDS. ABySS showed some good balance between resource usage and quality of assemblies. Our work compared the performance of publicly available transcriptome assemblers, and analyzed important factors affecting de novo assembly. Some practical guidelines for transcript reconstruction from short-read RNA-Seq data were proposed. De novo assembly of C. sinensis transcriptome was greatly improved using some optimized methods.

收起

展开 

DOI：

10.1186/1471-2105-12-S14-S2

被引量：

254

年份：

1970

全部来源

SCI-Hub (全网免费下载)

发表链接

ResearchGate (全网免费下载)

钛学术 (全网免费下载)

通过文献互助平台发起求助，成功后即可免费获取论文全文。

查看求助

求助方法1：

知识发现用户

每天可免费求助50篇

求助

求助方法1：

关注微信公众号

每天可免费求助2篇

求助方法2：

求助需要支付5个财富值

您现在财富值不足

您可以通过应助全文获取财富值

求助方法2：

完成求助需要支付5财富值

您目前有 1000 财富值

求助

我们已与文献出版商建立了直接购买合作。

你可以通过身份认证进行实名认证，认证成功后本次下载的费用将由您所在的图书馆支付

您可以直接购买此文献，1~5分钟即可下载全文，部分资源由于网络原因可能需要更长时间，请您耐心等待哦~

身份认证全文购买

相似文献(246)

参考文献(21)

引证文献(254)

Optimizing de novo transcriptome assembly from short-read RNA-Seq data: a comparative study.

With the fast advances in nextgen sequencing technology, high-throughput RNA sequencing has emerged as a powerful and cost-effective way for transcriptome study. De novo assembly of transcripts provides an important solution to transcriptome analysis for organisms with no reference genome. However, there lacked understanding on how the different variables affected assembly outcomes, and there was no consensus on how to approach an optimal solution by selecting software tool and suitable strategy based on the properties of RNA-Seq data. To reveal the performance of different programs for transcriptome assembly, this work analyzed some important factors, including k-mer values, genome complexity, coverage depth, directional reads, etc. Seven program conditions, four single k-mer assemblers (SK: SOAPdenovo, ABySS, Oases and Trinity) and three multiple k-mer methods (MK: SOAPdenovo-MK, trans-ABySS and Oases-MK) were tested. While small and large k-mer values performed better for reconstructing lowly and highly expressed transcripts, respectively, MK strategy worked well for almost all ranges of expression quintiles. Among SK tools, Trinity performed well across various conditions but took the longest running time. Oases consumed the most memory whereas SOAPdenovo required the shortest runtime but worked poorly to reconstruct full-length CDS. ABySS showed some good balance between resource usage and quality of assemblies. Our work compared the performance of publicly available transcriptome assemblers, and analyzed important factors affecting de novo assembly. Some practical guidelines for transcript reconstruction from short-read RNA-Seq data were proposed. De novo assembly of C. sinensis transcriptome was greatly improved using some optimized methods.

Zhao QY ，Wang Y ，Kong YM ，Luo D ，Li X ，Hao P ... - 《BMC BIOINFORMATICS》

被引量: 254 发表:1970年
Comparison of De Novo Transcriptome Assemblers and k-mer Strategies Using the Killifish, Fundulus heteroclitus.

Rana SB ，Zadlock FJ 4th ，Zhang Z ，Murphy WR ，Bentivegna CS ... - 《PLoS One》

被引量: 23 发表:1970年
Comprehensive evaluation of de novo transcriptome assembly programs and their effects on differential gene expression analysis.

Wang S ，Gribskov M 《-》

被引量: 37 发表:2017年
Optimizing de novo assembly of short-read RNA-seq data for phylogenomics.

Yang Y ，Smith SA 《BMC GENOMICS》

被引量: 103 发表:1970年
Optimizing de novo common wheat transcriptome assembly using short-read RNA-Seq data.

Duan J ，Xia C ，Zhao G ，Jia J ，Kong X ... - 《BMC GENOMICS》

被引量: 66 发表:1970年

加载更多

来源期刊

BMC BIOINFORMATICS

影响因子：3.304

JCR分区：暂无

中科院分区：暂无