Evolution of gene structure in the conifer Picea glauca: a comparative analysis of the impact of intron size.
Stival Sena J., Giguère I., Boyle B., Rigault P., Birol I., Zuccolo A., Ritland K., Ritland C., Bohlmann J., Jones S., Bousquet J., Mackay J.
BACKGROUND: A positive relationship between genome size and intron length is observed across eukaryotes including Angiosperms plants, indicating a co-evolution of genome size and gene structure. Conifers have very large genomes and longer introns on average than most plants, but impacts of their large genome and longer introns on gene structure has not be described. RESULTS: Gene structure was analyzed for 35 genes of Picea glauca obtained from BAC sequencing and genome assembly, including comparisons with A. thaliana, P. trichocarpa and Z. mays. We aimed to develop an understanding of impact of long introns on the structure of individual genes. The number and length of exons was well conserved among the species compared but on average, P. glauca introns were longer and genes had four times more intronic sequence than Arabidopsis, and 2 times more than poplar and maize. However, pairwise comparisons of individual genes gave variable results and not all contrasts were statistically significant. Genes generally accumulated one or a few longer introns in species with larger genomes but the position of long introns was variable between plant lineages. In P. glauca, highly expressed genes generally had more intronic sequence than tissue preferential genes. Comparisons with the Pinus taeda BACs and genome scaffolds showed a high conservation for position of long introns and for sequence of short introns. A survey of 1836 P. glauca genes obtained by sequence capture mostly containing introns <1 Kbp showed that repeated sequences were 10× more abundant in introns than in exons. CONCLUSION: Conifers have large amounts of intronic sequence per gene for seed plants due to the presence of few long introns and repetitive element sequences are ubiquitous in their introns. Results indicate a complex landscape of intron sizes and distribution across taxa and between genes with different expression profiles.