-
A chromosome-level reference genome of a Convolvulaceae species Ipomoea cairica.
Ipomoea cairica is a perennial creeper that has been widely introduced as a garden ornamental across tropical, subtropical, and temperate regions. Because it grows extremely fast and spreads easily, it has been listed as an invasive species in many countries. Here, we constructed a chromosome-level reference genome of Ipomoea cairica using Pacific Biosciences HiFi and Hi-C sequencing, with an assembly size of 733.0 Mb, a contig N50 of 43.8 Mb, a scaffold N50 of 45.7 Mb, and a Benchmarking Universal Single-Copy Orthologs complete rate of 98.0%. Hi-C scaffolding assigned 97.9% of the contigs to 15 pseudo-chromosomes. Telomeric repeat analysis revealed that 7 of the 15 pseudo-chromosomes are gapless and telomere-to-telomere. The transposable element content of Ipomoea cairica is 73.4%, markedly higher than that of other Ipomoea species. A total of 38,115 protein-coding genes were predicted, with a Benchmarking Universal Single-Copy Orthologs complete rate of 98.5%, comparable to that of the genome assembly, and 92.6% of the genes were functionally annotated. In addition, we identified 3,039 tRNA genes and 2,403 rRNA genes in the assembled genome. Phylogenetic analysis showed that Ipomoea cairica forms a clade with Ipomoea aquatica, and that the two species diverged 8.1 million years ago. Through comparative genome analysis, we reconfirmed that a whole-genome triplication event specific to the Convolvulaceae family occurred in the common ancestor of the genera Ipomoea and Cuscuta. This high-quality reference genome of Ipomoea cairica will greatly facilitate studies of the molecular mechanisms underlying its rapid growth and invasiveness.
Jiang F, Wang S, Wang H, Wang A, Xu D, Liu H, Yang B, Yuan L, Lei L, Chen R, Li W, Fan W
... -
《G3-Genes Genomes Genetics》
-
Chromosome-level assembly of Dictyophora rubrovolvata genome using third-generation DNA sequencing and Hi-C analysis.
Dictyophora rubrovolvata, a rare edible mushroom with both nutritional and medicinal value, is regarded as the "queen of the mushroom" for its attractive appearance. Dictyophora rubrovolvata has been widely cultivated in China in recent years, and many researchers have focused on its nutrition, culture conditions, and artificial cultivation. Owing to a lack of genomic information, research on its bioactive substances, cross breeding, lignocellulose degradation, and molecular biology is limited. In this study, we report a chromosome-level reference genome of D. rubrovolvata obtained using PacBio single-molecule real-time sequencing and high-throughput chromosome conformation capture (Hi-C) technologies. A total of 1.83 Gb of circular consensus sequencing reads representing ∼983.34× coverage of the D. rubrovolvata genome were generated. The final genome was assembled into 136 contigs with a total length of 32.89 Mb. The scaffold and contig N50 lengths were 2.71 and 2.48 Mb, respectively. After chromosome-level scaffolding, 11 chromosomes with a total length of 28.24 Mb were constructed. Genome annotation further revealed that 9.86% of the genome was composed of repetitive sequences, and a total of 508 noncoding RNAs (rRNA: 329, tRNA: 150, ncRNA: 29) were annotated. In addition, 9,725 protein-coding genes were predicted, among which 8,830 (90.79%) genes were predicted using homology or RNA-seq. Benchmarking Universal Single-Copy Orthologs results further revealed that there were 80.34% complete single-copy fungal orthologs. A total of 360 genes were annotated as belonging to the carbohydrate-active enzymes family. Further analysis also predicted 425 cytochrome P450 genes, which can be classified into 41 families. This highly accurate, chromosome-level reference genome of D. rubrovolvata will provide essential genomic information for understanding the molecular mechanism of its fruiting body formation during morphological development and will facilitate the exploitation of medicinal compounds produced by this mushroom.
Ma L, Yang C, Xiao D, Liu X, Jiang X, Lin H, Ying Z, Lin Y
... -
《G3-Genes Genomes Genetics》
-
EndHiC: assemble large contigs into chromosome-level scaffolds using the Hi-C links from contig ends.
The application of PacBio HiFi and ultra-long ONT reads has enabled huge progress in contig-level assembly, but it is still challenging to assemble large contigs into chromosomes with available Hi-C scaffolding tools, which count Hi-C links between contigs using the whole or a large part of each contig. Because the Hi-C links between two adjacent contigs concentrate only at the neighboring ends of the contigs, larger contig sizes reduce the power to differentiate adjacent (signal) from non-adjacent (noise) contig linkages, leading to a higher rate of mis-assembly.
We designed and developed a novel Hi-C-based scaffolding tool, EndHiC, which is suited to assembling large contigs into chromosome-level scaffolds. The core idea behind EndHiC, which distinguishes it from other Hi-C scaffolding tools, is to use Hi-C links only from the most effective regions at the contig ends. In this way, the signal (neighboring contig linkages) and the noise (non-neighboring contig linkages) are separated more clearly. Benefiting from the increased signal-to-noise ratio, the reciprocal-best requirement, and the robustness evaluation, EndHiC achieves higher accuracy for scaffolding large contigs than existing tools. EndHiC has been successfully applied to the Hi-C scaffolding of simulated data from human, rice, and Arabidopsis, and of real data from human, great burdock, water spinach, chicory, endive, yacon, and Ipomoea cairica, suggesting that EndHiC can be applied to a broad range of plant and animal genomes.
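The reciprocal-best requirement mentioned above can be illustrated with a toy sketch. This is not EndHiC's actual implementation: the function name `reciprocal_best_pairs`, the `min_links` threshold, the input layout (a dictionary of link counts keyed by pairs of contig ends), and the toy counts are all assumptions made for illustration.

```python
def reciprocal_best_pairs(end_links, min_links=10):
    """Return pairs of contig ends that are each other's best-linked partner.

    end_links maps ((contig_a, end_a), (contig_b, end_b)) -> Hi-C link count,
    where each end is "head" or "tail". Links below min_links (a hypothetical
    noise cutoff) and links within a single contig are ignored.
    """
    best = {}  # contig end -> (best partner end, link count)
    for (a, b), count in end_links.items():
        if count < min_links or a[0] == b[0]:
            continue
        for end, partner in ((a, b), (b, a)):
            if end not in best or count > best[end][1]:
                best[end] = (partner, count)

    pairs = set()
    for end, (partner, _) in best.items():
        # Keep the pair only if the partner's best choice points back.
        if partner in best and best[partner][0] == end:
            pairs.add(tuple(sorted((end, partner))))
    return sorted(pairs)


# Toy link counts: two strong neighbor signals and two weak noise links.
toy_links = {
    (("ctg1", "tail"), ("ctg2", "head")): 120,
    (("ctg2", "tail"), ("ctg3", "head")): 95,
    (("ctg1", "tail"), ("ctg3", "head")): 8,
    (("ctg1", "head"), ("ctg3", "tail")): 5,
}
adjacent = reciprocal_best_pairs(toy_links)
print(adjacent)
```

In this toy input only the two strong links survive, which orders the contigs as ctg1, ctg2, ctg3. A real tool would additionally need to decide how large a contig-end region to count links from, which the sketch takes as given.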
EndHiC is a novel Hi-C scaffolding tool that is suitable for scaffolding contig assemblies with a contig N50 near or over 10 Mb and an N90 near or over 1 Mb. EndHiC is efficient in both time and memory, and it is user-friendly. As more genome projects are launched and contig continuity steadily improves, we believe EndHiC has the potential to make a great contribution to the genomics field and to free scientists from labor-intensive manual curation work.
Wang S, Wang H, Jiang F, Wang A, Liu H, Zhao H, Yang B, Xu D, Zhang Y, Fan W
... -
《BMC BIOINFORMATICS》
-
Genome sequence of the barred knifejaw Oplegnathus fasciatus (Temminck & Schlegel, 1844): the first chromosome-level draft genome in the family Oplegnathidae.
The barred knifejaw (Oplegnathus fasciatus), a member of the Oplegnathidae family of the Centrarchiformes, is a commercially important rocky reef fish native to East Asia. Oplegnathus fasciatus has become an important fishery resource for offshore cage aquaculture and fish stocking of marine ranching in China, Japan, and Korea. Recently, sexual dimorphism in growth associated with a neo-sex chromosome and widespread biotic diseases in O. fasciatus have been of increasing concern in the industry. However, adequate genome resources for gaining insight into sex-determining mechanisms and establishing genetically resistant breeding systems for O. fasciatus are lacking. Here, we analyzed the entire genome of a female O. fasciatus fish using long-read sequencing and Hi-C data to generate chromosome-length scaffolds and a highly contiguous genome assembly.
We assembled the O. fasciatus genome with a total of 245.0 Gb of raw reads that were generated using both Pacific Biosciences (PacBio) Sequel and Illumina HiSeq 2000 platforms. The final draft genome assembly was approximately 778.7 Mb, which reached a high level of continuity with a contig N50 of 2.1 Mb. The assembly size was consistent with the estimated genome size (777.5 Mb) based on k-mer analysis. We combined Hi-C data with the draft genome assembly to generate chromosome-length scaffolds. Twenty-four scaffolds corresponding to the 24 chromosomes were assembled to a final size of 768.8 Mb with a contig N50 of 2.1 Mb and a scaffold N50 of 33.5 Mb using 1,372 contigs. The identified repeat sequences accounted for 33.9% of the entire genome, and 24,003 protein-coding genes with an average of 10.1 exons per gene were annotated using de novo methods, RNA sequencing data, and homologies to other teleosts. According to phylogenetic analysis using protein-coding genes, O. fasciatus is closely related to Larimichthys crocea, with O. fasciatus diverging from their common ancestor approximately 70.5-88.5 million years ago.
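The contig and scaffold N50 values quoted in these abstracts follow the standard definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. A minimal sketch of that calculation, with a hypothetical function name `n50` and toy lengths rather than data from any of the assemblies above:

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("at least one sequence length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy example: lengths in arbitrary units, not taken from the papers.
toy_contigs = [50, 40, 30, 20, 10]
toy_n50 = n50(toy_contigs)
print(toy_n50)
```

Because scaffolding joins contigs into longer sequences, the same function applied to scaffold lengths gives a scaffold N50 that is typically much larger than the contig N50, as in the 2.1 Mb versus 33.5 Mb figures reported above.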
We generated a high-quality draft genome for O. fasciatus using long-read PacBio sequencing technology, which represents the first chromosome-level reference genome for Oplegnathidae species. Assembly of this genome assists research into fish sex-determining mechanisms and can serve as a resource for accelerating genome-assisted improvements in resistant breeding systems.
Xiao Y, Xiao Z, Ma D, Liu J, Li J
... -
《GigaScience》
-
Chromosome-level assembly of the mustache toad genome using third-generation DNA sequencing and Hi-C analysis.
The mustache toad, Vibrissaphora ailaonica, is endemic to China and belongs to the Megophryidae family. Like other mustache toad species, V. ailaonica males temporarily develop keratinized nuptial spines on their upper jaw during each breeding season, which fall off at the end of the breeding season. This feature is likely the result of the reversal of sexual dimorphism in body size, with males being larger than females. A high-quality reference genome for the mustache toad would be invaluable for investigating the genetic mechanism underlying these repeatedly developing keratinized spines.
To construct the mustache toad genome, we generated 225 Gb of short reads and 277 Gb of long reads using Illumina and Pacific Biosciences (PacBio) sequencing technologies, respectively. Sequencing data were assembled into a 3.53-Gb genome assembly, with a contig N50 length of 821 kb. We also used high-throughput chromosome conformation capture (Hi-C) technology to identify contacts between contigs, then assembled contigs into scaffolds and obtained a genome with 13 chromosomes and a scaffold N50 length of 412.42 Mb. Based on the 26,227 protein-coding genes annotated in the genome, we analyzed phylogenetic relationships between the mustache toad and other chordate species. The mustache toad has a relatively high evolutionary rate and separated from a common ancestor of the marine toad, bullfrog, and Tibetan frog 206.1 million years ago. Furthermore, we identified 201 expanded gene families in the mustache toad, which were mainly enriched in immune pathways, keratin filaments, and metabolic processes.
Using Illumina, PacBio, and Hi-C technologies, we constructed the first high-quality chromosome-level mustache toad genome. This work not only offers a valuable reference genome for functional studies of mustache toad traits but also provides important chromosomal information for wider genome comparisons.
Li Y, Ren Y, Zhang D, Jiang H, Wang Z, Li X, Rao D
... -
《GigaScience》