A novel methodology for large-scale phylogeny partition.
Prosperi MCF., Ciccozzi M., Fanti I., Saladini F., Pecorari M., Borghi V., Di Giambenedetto S., Bruzzone B., Capetti A., Vivarelli A., Rusconi S., Re MC., Gismondo MR., Sighinolfi L., Gray RR., Salemi M., Zazzi M., De Luca A., ARCA collaborative group.
Understanding the determinants of virus transmission is a fundamental step in the effective design of screening and intervention strategies to control viral epidemics. Phylogenetic analysis can be a valid approach for identifying transmission chains, and very large data sets can be analysed through parallel computation. Here we propose and validate a new methodology for the partition of large-scale phylogenies and the inference of transmission clusters. This approach, based on a depth-first search algorithm, combines the evaluation of node reliability, tree topology and patristic distance analysis. The method has been applied to identify transmission clusters in a phylogeny of 11,541 human immunodeficiency virus-1 subtype B pol gene sequences from a large Italian cohort. Molecular transmission chains were characterized by means of different clinical/demographic factors, such as the interaction between male homosexuals and male heterosexuals. Our method takes advantage of a flexible notion of transmission cluster and can become a general framework to analyse other epidemics.
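As an illustration only, the following Python sketch shows one way a depth-first search could combine node reliability with patristic distances to partition a tree into clusters. It is not the implementation used in the study. The `Node` class, the `find_clusters` function, the use of the median pairwise patristic distance as the cluster criterion, and the default thresholds (`min_support=0.9`, `max_median_dist=0.05`) are all assumptions made for this example, as is the toy tree at the end.

```python
from dataclasses import dataclass, field
from itertools import combinations
from statistics import median


@dataclass
class Node:
    """Hypothetical tree node; not a structure taken from the paper."""
    name: str = ""
    length: float = 0.0    # branch length to the parent
    support: float = 0.0   # node reliability (e.g. bootstrap), in [0, 1]
    children: list = field(default_factory=list)


def leaf_depths(node):
    """Return (leaf name, distance from `node` to that leaf) for each leaf."""
    if not node.children:
        return [(node.name, 0.0)]
    out = []
    for child in node.children:
        for name, dist in leaf_depths(child):
            out.append((name, dist + child.length))
    return out


def pairwise_patristic(node):
    """Return patristic distances for all leaf pairs in the subtree."""
    if not node.children:
        return []
    dists = []
    per_child = []
    for child in node.children:
        # Pairs that lie entirely within one child subtree.
        dists.extend(pairwise_patristic(child))
        per_child.append([d + child.length for _, d in leaf_depths(child)])
    # Pairs whose connecting path passes through `node`.
    for left, right in combinations(per_child, 2):
        dists.extend(x + y for x in left for y in right)
    return dists


def find_clusters(root, min_support=0.9, max_median_dist=0.05):
    """Depth-first partition: report a subtree as a cluster when its root
    is reliable and its leaves are close; otherwise descend into children.
    Both thresholds are illustrative assumptions."""
    clusters = []
    stack = [root]
    while stack:
        node = stack.pop()
        if not node.children:
            continue  # a single leaf is not a cluster
        dists = pairwise_patristic(node)
        if node.support >= min_support and median(dists) <= max_median_dist:
            clusters.append(sorted(name for name, _ in leaf_depths(node)))
            continue  # do not search inside an accepted cluster
        stack.extend(reversed(node.children))  # keep left-to-right order
    return clusters


# Toy tree (made-up values): one well-supported tight clade, one poorly
# supported tight clade, and one distant singleton.
root = Node(children=[
    Node(length=0.10, support=0.95,
         children=[Node("a1", 0.01), Node("a2", 0.01)]),
    Node(length=0.10, support=0.50,
         children=[Node("b1", 0.01), Node("b2", 0.01)]),
    Node("c", 0.20),
])
clusters = find_clusters(root)
print(clusters)
```

With the default thresholds, only the clade that is both well supported and tightly knit is reported. Lowering `min_support` also admits the second clade, which shows how the notion of a cluster depends on the chosen criteria. Recomputing pairwise distances at every visited node is simple but costly, so a tree of many thousands of sequences would need cached or incremental distance summaries.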