Cookies on this website
We use cookies to ensure that we give you the best experience on our website. If you click 'Continue' we'll assume that you are happy to receive all cookies and you won't see this message again. Click 'Find out more' for information on how to change your cookie settings.

The prediction of transmembrane (TM) helices plays an important role in the study of membrane proteins, given the relatively small number (approximately 0.5% of the PDB) of high-resolution structures for such proteins. We used two datasets (one redundant and one non-redundant) of high-resolution structures of membrane proteins to evaluate and analyse TM helix prediction. The redundant (non-redundant) dataset contains structure of 434 (268) TM helices, from 112 (73) polypeptide chains. Of the 434 helices in the dataset, 20 may be classified as 'half-TM' as they are too short to span a lipid bilayer. We compared 13 TM helix prediction methods, evaluating each method using per segment, per residue and termini scores. Four methods consistently performed well: SPLIT4, TMHMM2, HMMTOP2 and TMAP. However, even the best methods were in error by, on average, about two turns of helix at the TM helix termini. The best and worst case predictions for individual proteins were analysed. In particular, the performance of the various methods and of a consensus prediction method, were compared for a number of proteins (e.g. SecY, ClC, KvAP) containing half-TM helices. The difficulties of predicting half-TM helices suggests that current prediction methods successfully embody the two-state model of membrane protein folding, but do not accommodate a third stage in which, e.g., short helices and re-entrant loops fold within a bundle of stable TM helices.

Original publication

DOI

10.1093/protein/gzi032

Type

Journal article

Journal

Protein Eng Des Sel

Publication Date

06/2005

Volume

18

Pages

295 - 308

Keywords

Amino Acid Sequence, Artificial Intelligence, Databases, Factual, Internet, Membrane Proteins, Models, Chemical, Molecular Sequence Data, Protein Structure, Secondary