Recent advances in both next-generation sequencing and assembly programmes have made it feasible to construct transcriptome datasets for non-model species at low cost, and such datasets can yield a wealth of information even for less well-transcribed genes. Here we present the results of assemblies performed on a 51-bp paired-end Illumina dataset derived from a mixed larval sample of the annelid Pomatoceros lamarckii at 24, 48 and 72 h post-fertilization. We used Oases to assemble 36.5 million paired-end reads with k-mer sizes from 21 to 29, followed by amalgamation of the assemblies, redundancy removal with Vmatch and TGICL, and removal of contigs shorter than 500 bp. This resulted in a final assembly of 50,151 contigs, with a mean length of 1,221 bp and covering 61.3 Mbp. A total of 34,846 (69.4%) of these returned a BlastX hit at an E-value cutoff of 1.0e-3, and 17,967 (35.8%) were assigned at least one GO annotation using Blast2GO. We used the assembly to identify genes belonging to the homeobox superclass and the Fox, Sox and Tbx classes, recovering 37, 16, 4 and 3 genes, respectively. These included orthologues of genes previously unidentified in lophotrochozoans and protostomes. Our study illustrates the utility of such transcriptomic assembly methods as a gene discovery tool and greatly expands our knowledge of transcription factor genes in annelids in general and in this species in particular.
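The final filtering step and the summary figures quoted above (contig count, mean length, total span) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `read_fasta` and `filter_and_summarise`, the `min_length` parameter, and the toy sequences are all assumptions made for illustration, and the sketch covers only the length filter, not the Oases, Vmatch or TGICL stages.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; emit the previous one if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_and_summarise(handle, min_length=500):
    """Drop contigs shorter than min_length and report basic statistics.

    min_length=500 mirrors the 500 bp threshold in the abstract; the
    function itself is a hypothetical helper, not part of any named tool.
    """
    kept = [(h, s) for h, s in read_fasta(handle) if len(s) >= min_length]
    total_bp = sum(len(s) for _, s in kept)
    mean_bp = total_bp / len(kept) if kept else 0.0
    stats = {"contigs": len(kept), "total_bp": total_bp, "mean_bp": mean_bp}
    return kept, stats


# Toy input (invented): contigs of 600 bp, 100 bp, and 800 bp split over two lines.
toy = StringIO(
    ">a\n" + "A" * 600 + "\n"
    ">b\n" + "C" * 100 + "\n"
    ">c\n" + "G" * 400 + "\n" + "T" * 400 + "\n"
)
kept, stats = filter_and_summarise(toy)
print(stats)
```

On the toy input, the 100 bp contig is discarded and the two remaining contigs are summarised. Applied to a real assembly, the same three statistics correspond to the contig count, mean length and total span reported in the abstract.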

Type: Journal article

Journal: Dev Genes Evol

Pages: 325 - 339

Keywords: Amino Acid Sequence, Animals, Homeodomain Proteins, Humans, Molecular Sequence Data, Phylogeny, Polychaeta, Sequence Alignment, Transcription Factors, Transcriptome