Using multilocus sequence typing to study bacterial variation: prospects in the genomic era.
Jolley KA., Maiden MCJ.
Multilocus sequence typing (MLST) indexes the sequence variation present in a small number (usually seven) of housekeeping gene fragments located around the bacterial genome. Each unique allele at these loci is assigned an arbitrary integer identifier, so that the variation present in several thousand base pairs of genome sequence is summarized as a short series of numbers. Comparing bacterial isolates with allele-based methods corrects for the effects of lateral gene transfer present in many bacterial populations and is computationally efficient. This 'gene-by-gene' approach can be applied to larger collections of loci, such as the ribosomal protein genes used in ribosomal MLST (rMLST), up to and including the complete set of coding sequences present in a genome (whole-genome MLST, wgMLST), providing scalable, efficient and readily interpreted genome analysis.
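The allele-numbering idea can be illustrated with a minimal Python sketch. It is not taken from the paper or from any MLST software: the locus names (locusA, locusB, locusC), the 8-bp toy sequences, and the helper functions are all hypothetical, and real schemes take allele numbers from a curated database rather than assigning them on the fly. The sketch shows how a lateral gene transfer event that changes several nucleotides at one locus counts as a single allelic difference, the same as a point mutation.

```python
from typing import Dict, List, Tuple

# Hypothetical toy data: three isolates, three loci (real fragments are ~400-500 bp).
ISOLATES: Dict[str, Dict[str, str]] = {
    "iso1": {"locusA": "ACGTACGT", "locusB": "TTGACCAA", "locusC": "GGCATCGA"},
    # iso2 differs from iso1 by one point mutation in locusC.
    "iso2": {"locusA": "ACGTACGT", "locusB": "TTGACCAA", "locusC": "GGCATCGT"},
    # iso3 differs from iso1 by a recombined locusB (four nucleotide changes).
    "iso3": {"locusA": "ACGTACGT", "locusB": "TAGTCGAT", "locusC": "GGCATCGA"},
}


def assign_alleles(
    isolates: Dict[str, Dict[str, str]]
) -> Tuple[Dict[str, Tuple[int, ...]], List[str]]:
    """Give each unique sequence at each locus an arbitrary integer ID
    (in order of first appearance) and return one profile per isolate."""
    loci = sorted({locus for seqs in isolates.values() for locus in seqs})
    allele_ids: Dict[str, Dict[str, int]] = {locus: {} for locus in loci}
    profiles: Dict[str, Tuple[int, ...]] = {}
    for name, seqs in isolates.items():
        profile = []
        for locus in loci:
            table = allele_ids[locus]
            seq = seqs[locus]
            if seq not in table:
                table[seq] = len(table) + 1  # next unused integer
            profile.append(table[seq])
        profiles[name] = tuple(profile)
    return profiles, loci


def allelic_distance(a: Tuple[int, ...], b: Tuple[int, ...]) -> int:
    """Number of loci at which two profiles carry different alleles."""
    return sum(x != y for x, y in zip(a, b))


def nucleotide_distance(a: Dict[str, str], b: Dict[str, str], loci: List[str]) -> int:
    """Number of differing bases across equal-length, aligned loci."""
    return sum(x != y for locus in loci for x, y in zip(a[locus], b[locus]))


profiles, loci = assign_alleles(ISOLATES)
for name, profile in profiles.items():
    print(name, profile)

for other in ("iso2", "iso3"):
    print(
        "iso1 vs", other,
        "allelic:", allelic_distance(profiles["iso1"], profiles[other]),
        "nucleotide:", nucleotide_distance(ISOLATES["iso1"], ISOLATES[other], loci),
    )
```

In this toy data, iso2 and iso3 are each one allele away from iso1, although iso3 differs at four nucleotides and iso2 at only one. Comparing short integer tuples instead of aligned sequences is also what keeps the approach inexpensive as the number of loci grows towards rMLST or wgMLST scale.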