The Norway spruce genome sequence and conifer genome evolution.
Nystedt B., Street NR., Wetterbom A., Zuccolo A., Lin Y-C., Scofield DG., Vezzi F., Delhomme N., Giacomello S., Alexeyenko A., Vicedomini R., Sahlin K., Sherwood E., Elfstrand M., Gramzow L., Holmberg K., Hällman J., Keech O., Klasson L., Koriabine M., Kucukoglu M., Käller M., Luthman J., Lysholm F., Niittylä T., Olson A., Rilakovic N., Ritland C., Rosselló JA., Sena J., Svensson T., Talavera-López C., Theißen G., Tuominen H., Vanneste K., Wu Z-Q., Zhang B., Zerbe P., Arvestad L., Bhalerao R., Bohlmann J., Bousquet J., Garcia Gil R., Hvidsten TR., de Jong P., MacKay J., Morgante M., Ritland K., Sundberg B., Thompson SL., Van de Peer Y., Andersson B., Nilsson O., Ingvarsson PK., Lundeberg J., Jansson S.
Conifers have dominated forests for more than 200 million years and are of huge ecological and economic importance. Here we present the draft assembly of the 20-gigabase genome of Norway spruce (Picea abies), the first available for any gymnosperm. The number of well-supported genes (28,354) is similar to that in the >100 times smaller genome of Arabidopsis thaliana, and there is no evidence of a recent whole-genome duplication in the gymnosperm lineage. Instead, the large genome size seems to result from the slow and steady accumulation of a diverse set of long terminal repeat transposable elements, possibly owing to the lack of an efficient elimination mechanism. Comparative sequencing of Pinus sylvestris, Abies sibirica, Juniperus communis, Taxus baccata and Gnetum gnemon reveals that the transposable element diversity is shared among extant conifers. Expression of 24-nucleotide small RNAs, previously implicated in transposable element silencing, is tissue-specific and much lower than in other plants. We further identify numerous long (>10,000 base pairs) introns, gene-like fragments, uncharacterized long non-coding RNAs and short RNAs. This opens up new genomic avenues for conifer forestry and breeding.