Several angiosperm plant genomes, including Arabidopsis (Arabidopsis thaliana), rice (Oryza sativa), poplar (Populus trichocarpa), and grapevine (Vitis vinifera), have been sequenced, but the lack of reference genomes in gymnosperm phyla reduces our understanding of plant evolution and restricts the potential impacts of genomics research. A gene catalog was developed for the conifer tree Picea glauca (white spruce) through large-scale expressed sequence tag sequencing and full-length cDNA sequencing to facilitate genome characterizations, comparative genomics, and gene mapping. The resource incorporates new and publicly available sequences into 27,720 cDNA clusters, 23,589 of which are represented by full-length insert cDNAs. Expressed sequence tags, mate-pair cDNA clone analysis, and custom sequencing were integrated through an iterative process to improve the accuracy of clustering outcomes. The entire catalog spans 30 Mb of unique transcribed sequence. We estimated that the P. glauca nuclear genome contains up to 32,520 transcribed genes owing to incomplete, partially sequenced, and unsampled transcripts and that its transcriptome could span up to 47 Mb. These estimates are in the same range as the Arabidopsis and rice transcriptomes. Next-generation methods confirmed and enhanced the catalog by providing deeper coverage for rare transcripts, by extending many incomplete clusters, and by augmenting the overall transcriptome coverage to 38 Mb of unique sequence. Genomic sample sequencing at 8.5% of the 19.8-Gb P. glauca genome identified 1,495 clusters representing highly repeated sequences among the cDNA clusters. With a conifer transcriptome in full view, functional and protein domain annotations clearly highlighted the divergences between conifers and angiosperms, likely reflecting their respective evolutionary paths.

Original publication

Journal article

Plant Physiol

Pages

14 - 28

Keywords

Coniferophyta; DNA, Complementary; Evolution, Molecular; Expressed Sequence Tags; Genome, Plant; Multigene Family; RNA, Messenger