-
Multifaceted biological insights from a draft genome sequence of the tobacco hornworm moth, Manduca sexta.
Manduca sexta, known as the tobacco hornworm or Carolina sphinx moth, is a lepidopteran insect that is used extensively as a model system for research in insect biochemistry, physiology, neurobiology, development, and immunity. One important benefit of this species as an experimental model is its extremely large size, reaching more than 10 g in the larval stage. M. sexta larvae feed on solanaceous plants and thus must tolerate a substantial challenge from plant allelochemicals, including nicotine. We report the sequence and annotation of the M. sexta genome, and a survey of gene expression in various tissues and developmental stages. The Msex_1.0 genome assembly resulted in a total genome size of 419.4 Mbp. Repetitive sequences accounted for 25.8% of the assembled genome. The official gene set comprises 15,451 protein-coding genes, of which 2498 were manually curated. Extensive RNA-seq data from many tissues and developmental stages were used to improve gene models and for insights into gene expression patterns. Genome-wide synteny analysis indicated a high level of macrosynteny in the Lepidoptera. Annotation and analyses were carried out for gene families involved in a wide spectrum of biological processes, including apoptosis, vacuole sorting, growth and development, structures of exoskeleton, egg shells, and muscle, vision, chemosensation, ion channels, signal transduction, neuropeptide signaling, neurotransmitter synthesis and transport, nicotine tolerance, lipid metabolism, and immunity. This genome sequence, annotation, and analysis provide an important new resource from a well-studied model insect species and will facilitate further biochemical and mechanistic experimental studies of many biological systems in insects.
Kanost MR, Arrese EL, Cao X, Chen YR, Chellapilla S, Goldsmith MR, Grosse-Wilde E, Heckel DG, Herndon N, Jiang H, Papanicolaou A, Qu J, Soulages JL, Vogel H, Walters J, Waterhouse RM, Ahn SJ, Almeida FC, An C, Aqrawi P, Bretschneider A, Bryant WB, Bucks S, Chao H, Chevignon G, Christen JM, Clarke DF, Dittmer NT, Ferguson LCF, Garavelou S, Gordon KHJ, Gunaratna RT, Han Y, Hauser F, He Y, Heidel-Fischer H, Hirsh A, Hu Y, Jiang H, Kalra D, Klinner C, König C, Kovar C, Kroll AR, Kuwar SS, Lee SL, Lehman R, Li K, Li Z, Liang H, Lovelace S, Lu Z, Mansfield JH, McCulloch KJ, Mathew T, Morton B, Muzny DM, Neunemann D, Ongeri F, Pauchet Y, Pu LL, Pyrousis I, Rao XJ, Redding A, Roesel C, Sanchez-Gracia A, Schaack S, Shukla A, Tetreau G, Wang Y, Xiong GH, Traut W, Walsh TK, Worley KC, Wu D, Wu W, Wu YQ, Zhang X, Zou Z, Zucker H, Briscoe AD, Burmester T, Clem RJ, Feyereisen R, Grimmelikhuijzen CJP, Hamodrakas SJ, Hansson BS, Huguet E, Jermiin LS, Lan Q, Lehman HK, Lorenzen M, Merzendorfer H, Michalopoulos I, Morton DB, Muthukrishnan S, Oakeshott JG, Palmer W, Park Y, Passarelli AL, Rozas J, Schwartz LM, Smith W, Southgate A, Vilcinskas A, Vogt R, Wang P, Werren J, Yu XQ, Zhou JJ, Brown SJ, Scherer SE, Richards S, Blissard GW
《-》
-
Integrated modeling of protein-coding genes in the Manduca sexta genome using RNA-Seq data from the biochemical model insect.
The genome sequence of Manduca sexta was recently determined using 454 technology. Cufflinks and MAKER2 were used to establish gene models in the genome assembly based on the RNA-Seq data and other species' sequences. Aided by the extensive RNA-Seq data from 50 tissue samples at various life stages, annotators around the world (including the present authors) have manually confirmed and improved a small percentage of the models after months of effort. While such collaborative efforts are highly commendable, many of the predicted genes still have problems which may hamper future research on this insect species. As a biochemical model representing lepidopteran pests, M. sexta has been used extensively to study insect physiological processes for over five decades. In this work, we assembled Manduca datasets Cufflinks 3.0, Trinity 4.0, and Oases 4.0 to assist the manual annotation efforts and development of Official Gene Set (OGS) 2.0. To further improve annotation quality, we developed methods to evaluate gene models in the MAKER2, Cufflinks, Oases and Trinity assemblies and selected the best ones to constitute MCOT 1.0 after thorough crosschecking. MCOT 1.0 has 18,089 genes encoding 31,666 proteins: 32.8% match OGS 2.0 models perfectly or near perfectly, 11,747 differ considerably, and 29.5% are absent in OGS 2.0. Future automation of this process is anticipated to greatly reduce human efforts in generating comprehensive, reliable models of structural genes in other genome projects where extensive RNA-Seq data are available.
Cao X, Jiang H
《-》
-
De novo genome assembly of the tobacco hornworm moth (Manduca sexta).
The tobacco hornworm, Manduca sexta, is a lepidopteran insect that is used extensively as a model system for studying insect biology, development, neuroscience, and immunity. However, current studies rely on the highly fragmented reference genome Msex_1.0, which was created using now-outdated technologies and is hindered by a variety of deficiencies and inaccuracies. We present a new reference genome for M. sexta, JHU_Msex_v1.0, applying a combination of modern technologies in a de novo assembly to increase continuity, accuracy, and completeness. The assembly is 470 Mb and is ∼20× more continuous than the original assembly, with scaffold N50 > 14 Mb. We annotated the assembly by lifting over existing annotations and supplementing with additional supporting RNA-based data for a total of 25,256 genes. The new reference assembly is accessible in annotated form for public use. We demonstrate that improved continuity of the M. sexta genome improves resequencing studies and benefits future research on M. sexta as a model organism.
Gershman A, Romer TG, Fan Y, Razaghi R, Smith WA, Timp W
《G3-Genes Genomes Genetics》
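The scaffold N50 quoted in the abstract above is a standard continuity metric: the length L such that scaffolds of length L or longer contain at least half of all assembled bases. A minimal sketch of that calculation follows, assuming a plain list of scaffold lengths as input. The function name `n50` and the toy lengths are illustrative only and are not taken from the JHU_Msex_v1.0 assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least half of the total bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy scaffold lengths (hypothetical, arbitrary units), not real assembly data.
toy_scaffolds = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_scaffolds))
```

A higher N50 at a similar total assembly size means the same bases sit in fewer, longer scaffolds, which is the sense in which the abstract describes the new assembly as more continuous.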
-
An analysis of 67 RNA-seq datasets from various tissues at different stages of a model insect, Manduca sexta.
Manduca sexta is a large lepidopteran insect widely used as a model to study the biochemistry of insect physiological processes. As a part of its genome project, over 50 cDNA libraries have been analyzed to profile gene expression in different tissues and life stages. While the RNA-seq data were used to study genes related to cuticle structure, chitin metabolism and immunity, a vast amount of the information has not yet been mined for understanding the basic molecular biology of this model insect. In fact, the basic features of these data, such as composition of the RNA-seq reads and lists of library-correlated genes, are unclear. Viewed more broadly, clear-cut spatiotemporal expression data are rare for insects, the largest group of animals, including Drosophila and mosquitoes, mainly because of their small size.
We obtained the transcriptome data, analyzed the raw reads in relation to the assembled genome, and generated heatmaps for clustered genes. Library characteristics (tissues, stages), number of mapped bases, and sequencing methods affected the observed percentages of genome transcription. While up to 40% of the reads were not mapped to the genome in the initial Cufflinks gene modeling, we identified the causes for the mapping failure and reduced the number of non-mappable reads to <8%. Similarities between libraries, measured based on library-correlated genes, clearly identified differences among tissues or life stages. We calculated gene expression levels and analyzed the most abundantly expressed genes in the libraries. Furthermore, we analyzed tissue-specific gene expression and identified 18 groups of genes with distinct expression patterns.
We performed a thorough analysis of the 67 RNA-seq datasets to characterize new genomic features of M. sexta. Integrated knowledge of gene functions and expression features will facilitate future functional studies in this biochemical model insect.
Cao X, Jiang H
《BMC Genomics》
-
Extensive conserved synteny of genes between the karyotypes of Manduca sexta and Bombyx mori revealed by BAC-FISH mapping.
Genome sequencing projects have been completed for several species representing four highly diverged holometabolous insect orders: Diptera, Hymenoptera, Coleoptera, and Lepidoptera. The striking evolutionary diversity of insects argues for efficient methods to apply genome information from such models to genetically uncharacterized species. Constructing conserved synteny maps plays a crucial role in this task. Here, we demonstrate the use of fluorescence in situ hybridization with bacterial artificial chromosome probes (BAC-FISH) as a powerful tool for physical mapping of genes and comparative genome analysis in Lepidoptera, which have numerous and morphologically uniform holokinetic chromosomes.
We isolated 214 clones containing 159 orthologs of well conserved single-copy genes of a sequenced lepidopteran model, the silkworm, Bombyx mori, from a BAC library of a sphingid with an unexplored genome, the tobacco hornworm, Manduca sexta. We then constructed a BAC-FISH karyotype identifying all 28 chromosomes of M. sexta by mapping 124 loci using the corresponding BAC clones. BAC probes from three M. sexta chromosomes also generated clear signals on the corresponding chromosomes of the convolvulus hawk moth, Agrius convolvuli, which belongs to the same subfamily, Sphinginae, as M. sexta.
Comparison of the M. sexta BAC physical map with the linkage map and genome sequence of B. mori pointed to extensive conserved synteny including conserved gene order in most chromosomes. Only a few rearrangements, including three inversions, three translocations, and two fission/fusion events were estimated to have occurred after the divergence of Bombycidae and Sphingidae. These results add to accumulating evidence for the stability of lepidopteran genomes. Generating signals on A. convolvuli chromosomes using heterologous M. sexta probes demonstrated that BAC-FISH with orthologous sequences can be used for karyotyping a wide range of related and genetically uncharacterized species, significantly extending the ability to develop synteny maps for comparative and functional genomics.
Yasukochi Y, Tanaka-Okuyama M, Shibata F, Yoshido A, Marec F, Wu C, Zhang H, Goldsmith MR, Sahara K
《PLoS One》