Statistical tests for discrete cross-species data
Grafen A., Ridley M.
Four methods have been proposed for testing associations between the states of discrete characters in cross-species data without the non-independence that arises from overcounting data points. The tests are those of Ridley (1983), Burt (1989), Grafen (1989), and a new test called the ICDE test. This paper measures the Type I error rates of these methods on simulated null distributions of discrete characters. The null data are generated by a model of discrete character evolution, using three shapes of phylogeny: tetratomous, dichotomous, and realistic. Ridley's and Burt's tests are both reasonably valid with the realistic phylogeny but biased with the tetratomous and dichotomous phylogenies. Grafen's phylogenetic regression is reasonably valid with all tree shapes. One version of the ICDE test is valid, the other less so. The invalid results are explained in terms of two kinds of statistical non-independence that arise in discrete data: non-independence due to the reconstruction of character states by parsimony, and the 'family problem', in which similar patterns appear in null data across many separate radiations because all the radiations began from the same ancestral state.
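
The general idea of estimating a Type I error rate from simulated null data can be sketched as follows. This is an illustration only, not the paper's model or any of the four tests: it assumes a balanced dichotomous tree, a symmetric per-branch flip probability, and a naive chi-square test that treats every species as an independent data point (the overcounting that the four methods are designed to avoid). The names `evolve_tips`, `chi_square_2x2`, and `naive_type_i_rate`, and all parameter values, are hypothetical.

```python
import random


def evolve_tips(depth, flip_prob, rng, root_state=0):
    """Evolve one binary character down a balanced dichotomous tree.

    Assumption: each daughter lineage flips state with probability
    flip_prob, independently. Returns the 2**depth tip states.
    """
    states = [root_state]
    for _ in range(depth):
        next_states = []
        for s in states:
            for _ in range(2):  # two daughter lineages per node
                child = 1 - s if rng.random() < flip_prob else s
                next_states.append(child)
        states = next_states
    return states


def chi_square_2x2(xs, ys):
    """Pearson chi-square for the 2x2 table of two binary characters.

    No continuity correction. Returns 0.0 when a row or column of the
    table is empty, so that such data sets are never counted as rejections.
    """
    n = len(xs)
    a = sum(1 for x, y in zip(xs, ys) if x == 0 and y == 0)
    b = sum(1 for x, y in zip(xs, ys) if x == 0 and y == 1)
    c = sum(1 for x, y in zip(xs, ys) if x == 1 and y == 0)
    d = n - a - b - c
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def naive_type_i_rate(depth=6, flip_prob=0.05, n_sims=2000, seed=1):
    """Fraction of null simulations in which the naive test rejects.

    The two characters evolve independently, so every rejection is a
    Type I error. All default values are illustrative assumptions.
    """
    rng = random.Random(seed)
    critical = 3.841  # chi-square critical value, 1 df, alpha = 0.05
    rejections = 0
    for _ in range(n_sims):
        x = evolve_tips(depth, flip_prob, rng)
        y = evolve_tips(depth, flip_prob, rng)
        if chi_square_2x2(x, y) > critical:
            rejections += 1
    return rejections / n_sims


if __name__ == "__main__":
    print("Estimated Type I error rate:", naive_type_i_rate())
```

A valid test should reject in about 5% of such null data sets. Because related species inherit shared states, a species-level test of this kind typically rejects more often than that, which is the motivation for methods that count independent evolutionary events rather than species.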