Thirteen complete chloroplast genomes of the Costaceae family: insights into genome structure, selective pressure and phylogenetic relationships.
Costaceae, commonly known as the spiral ginger family, consists of approximately 120 species distributed in the tropical regions of South America, Africa, and Southeast Asia, some of which have important ornamental, medicinal and ecological value. Previous studies of the phylogeny and taxonomy of Costaceae based on nuclear internal transcribed spacer (ITS) and chloroplast genome fragment data had low resolution. Additionally, the structure, variation and molecular evolution of complete chloroplast genomes in Costaceae remain unclear. Herein, a total of 13 complete chloroplast genomes of Costaceae, including 8 newly sequenced and 5 from the NCBI GenBank database and representing all three distribution regions of the family, were comprehensively analyzed for comparative genomics and phylogenetic relationships.
The 13 complete chloroplast genomes of Costaceae possessed typical quadripartite structures with lengths from 166,360 to 168,966 bp, comprising a large single copy (LSC, 90,802 - 92,189 bp), a small single copy (SSC, 18,363 - 20,124 bp) and a pair of inverted repeats (IRs, 27,982 - 29,203 bp). These genomes encoded 111 - 113 different genes, including 79 protein-coding genes, 4 rRNA genes and 28 - 30 tRNA genes. The gene orders, gene contents, amino acid frequencies and codon usage within Costaceae were highly conserved, but several variations in intron loss, long repeats, simple sequence repeats (SSRs) and gene expansion at the IR/SC boundaries were also found among these 13 genomes. Comparative genomics within Costaceae identified five highly divergent regions: ndhF, ycf1-D2, ccsA-ndhD, rps15-ycf1-D2 and rpl16-exon2-rpl16-exon1. Five combined DNA regions (ycf1-D2 + ndhF, ccsA-ndhD + rps15-ycf1-D2, rps15-ycf1-D2 + rpl16-exon2-rpl16-exon1, ccsA-ndhD + rpl16-exon2-rpl16-exon1, and ccsA-ndhD + rps15-ycf1-D2 + rpl16-exon2-rpl16-exon1) could be used as potential markers for future phylogenetic analyses and species identification in Costaceae. Positive selection was found in eight protein-coding genes: cemA, clpP, ndhA, ndhF, petB, psbD, rps12 and ycf1. Maximum likelihood and Bayesian phylogenetic trees built from chloroplast genome sequences consistently revealed identical tree topologies with high support between species of Costaceae. Costaceae was divided into three clades: the Asian clade, the Costus clade and the South American clade. Tapeinochilos was sister to Hellenia, and Parahellenia was sister to the cluster of Tapeinochilos + Hellenia with strong support in the Asian clade. The results of molecular dating showed that the crown age of Costaceae was about 30.5 Mya (95% HPD: 14.9 - 49.3 Mya), and that the Costus clade and Asian clade started to diverge around 23.8 Mya (95% HPD: 10.1 - 41.5 Mya).
The Asian clade diverged into Hellenia and Parahellenia at approximately 10.7 Mya (95% HPD: 3.5 - 25.1 Mya).
The complete chloroplast genomes can resolve the phylogenetic relationships of Costaceae and provide new insights into genome structures, variations and evolution. The identified DNA divergent regions would be useful for species identification and phylogenetic inference in Costaceae.
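The SSR survey mentioned in the abstract can be illustrated with a minimal Python sketch that scans a sequence for perfect tandem repeats. The function name `find_ssrs` and the minimum repeat counts (10, 5, 4, 3, 3, 3 for mono- to hexanucleotide motifs) are assumptions chosen for illustration; the abstract does not state which tool or thresholds the authors used.

```python
import re

# Minimum number of repeats per motif length.
# These thresholds are illustrative assumptions, not taken from the abstract.
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return a list of (start, motif, repeat_count) for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-base motif followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ATAT"),
            # so each locus is reported once under its shortest motif.
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return hits


if __name__ == "__main__":
    toy = "GC" + "A" * 12 + "CGCG" + "AT" * 6 + "CCG"
    for start, motif, count in find_ssrs(toy):
        print(start, motif, count)
```

Start positions are 0-based. The sketch only detects perfect repeats; compound or interrupted SSRs would need extra logic.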
Li DM, Pan YG, Liu HL, Yu B, Huang D, Zhu GF, ...
《BMC GENOMICS》
Complete Chloroplast Genome Analysis of Two Important Medicinal Alpinia Species: Alpinia galanga and Alpinia kwangsiensis.
Most Alpinia species are valued as foods, ornamental plants, or plants with medicinal properties. However, morphological characteristics and commonly used DNA barcode fragments are not sufficient for accurately identifying Alpinia species. Difficulties in species identification have led to confusion in the sale and medicinal use of Alpinia. To mine resources and improve the molecular methods for distinguishing among Alpinia species, we report the complete chloroplast (CP) genomes of Alpinia galanga and Alpinia kwangsiensis, obtained via high-throughput Illumina sequencing. The CP genomes of A. galanga and A. kwangsiensis exhibited a typical circular quadripartite structure, including a large single-copy region (87,565 and 87,732 bp, respectively), a small single-copy region (17,909 and 15,181 bp, respectively), and a pair of inverted repeats (27,313 and 29,705 bp, respectively). The guanine-cytosine content of the CP genomes is 36.26 and 36.15%, respectively. Furthermore, each CP genome contained 133 genes, including 87 protein-coding genes, 38 distinct tRNA genes, and 8 distinct rRNA genes. We identified 110 and 125 simple sequence repeats in the CP genomes of A. galanga and A. kwangsiensis, respectively. We then combined these data with publicly available CP genome data from four other Alpinia species (A. hainanensis, A. oxyphylla, A. pumila, and A. zerumbet) and analyzed their sequence characteristics. Nucleotide diversity was analyzed based on the alignment of the complete CP genome sequences, and five candidate highly variable markers (trnS-trnG, trnC-petN, rpl32-trnL, psaC-ndhE, and ndhC-trnV) were found. Twenty-eight complete CP genome sequences belonging to Alpinieae species were used to construct phylogenetic trees. The results resolved the phylogenetic relationships among the genera of the Alpinieae and further supported that Alpinia is a non-monophyletic group.
The complete CP genomes of the two medicinal Alpinia species lay the foundation for the use of CP genomes in species identification and phylogenetic analyses of Alpinia species.
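As a small worked example of the quadripartite structure described above, the following Python sketch adds up the region lengths quoted in the abstract (LSC + SSC + 2 × IR) and shows how a GC percentage is computed from a sequence. It assumes the quoted inverted-repeat length is that of a single IR copy. The resulting totals are simple arithmetic on the quoted numbers, not values reported in the abstract, and the helper names are hypothetical.

```python
def quadripartite_length(lsc, ssc, ir):
    """Total length of a quadripartite plastome: LSC + SSC + two IR copies.

    Assumes `ir` is the length of ONE inverted-repeat copy (an assumption
    about how the abstract reports the IR length).
    """
    return lsc + ssc + 2 * ir


def gc_content(seq):
    """GC percentage of a sequence, ignoring characters other than A, C, G, T."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    return 100.0 * gc / acgt if acgt else 0.0


# Region lengths (LSC, SSC, IR) in bp, as quoted in the abstract.
regions = {
    "A. galanga": (87565, 17909, 27313),
    "A. kwangsiensis": (87732, 15181, 29705),
}

totals = {name: quadripartite_length(*r) for name, r in regions.items()}

if __name__ == "__main__":
    for name, total in totals.items():
        print(f"{name}: {total} bp")
```

Under that assumption the sums come to 160,100 bp and 162,323 bp; if the deposited genome lengths differ, the IR figure may have been reported differently.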
Zhang Y, Song MF, Li Y, Sun HF, Tang DY, Xu AS, Yin CY, Zhang ZL, Zhang LX, ...
《Frontiers in Plant Science》