Accuracy assessment of global and local atrophy measurement techniques with realistic simulated longitudinal Alzheimer's disease images.
Camara O., Schnabel JA., Ridgway GR., Crum WR., Douiri A., Scahill RI., Hill DLG., Fox NC.
The evaluation of atrophy quantification methods based on magnetic resonance imaging has usually been hindered by the lack of realistic gold standard data against which to judge these methods or to help refine them. Recently [Camara, O., Schweiger, M., Scahill, R., Crum, W., Sneller, B., Schnabel, J., Ridgway, G., Cash, D., Hill, D., Fox, N., 2006. Phenomenological model of diffuse global and regional atrophy using finite-element methods. IEEE Trans. Med. Imaging 25, 1417-1430], we presented a technique in which atrophy is realistically simulated in different tissue compartments or neuroanatomical structures with a phenomenological model. In this study, we used an improved version of our atrophy simulation methodology to generate a cohort of realistic simulated Alzheimer's disease (AD) images with known amounts of atrophy, mimicking a set of 19 real controls and 27 probable AD subjects. This database was then used to assess the accuracy of several well-known computational anatomy methods that provide global (BSI and SIENA) or local (Jacobian integration) estimates of longitudinal atrophy in brain structures from MR images. SIENA and BSI results correlated very well with the gold standard data (Pearson coefficients of 0.962 and 0.969, respectively), with small mean absolute differences from the gold standard (in percentage change from baseline volume): BSI, 0.23% ± 0.26%; SIENA, 0.22% ± 0.28%. Jacobian integration was guided by both fluid and FFD-based registration techniques, and the resulting deformation fields and associated Jacobians were compared, region by region, with the gold standard ones. The FFD-based technique outperformed the fluid one in all evaluated structures (mean absolute differences from the gold standard, in percentage change from baseline volume): whole brain, FFD=0.31%, fluid=0.58%; lateral ventricles, FFD=0.79%, fluid=1.45%; left hippocampus, FFD=0.82%, fluid=1.42%; right hippocampus, FFD=0.95%, fluid=1.62%.
The largest errors for both local techniques occurred in the sulcal CSF regions (FFD=2.27%, fluid=3.55%). For large structures such as the whole brain, these mean absolute differences, expressed relative to the applied atrophy, represented similar percentages for the BSI, SIENA and FFD techniques (controls/patients): BSI, 51.99%/16.36%; SIENA, 62.34%/21.59%; FFD, 41.02%/24.95%. For small structures such as the hippocampi, these percentages were larger, especially for controls, where errors were approximately equal to the small applied changes (controls/patients): FFD, 92.82%/43.61%. However, these apparently large relative errors did not prevent the global or hippocampal measures from finding significant group separation in our study. The evaluation framework presented here will help in quantifying whether the accuracy of future methodological developments is sufficient for analysing change in smaller or less atrophied local brain regions. Results obtained in our experiments with realistic simulated data confirm previously published estimates of accuracy for both evaluated global techniques. Regarding Jacobian integration methods, the FFD-based one demonstrated promising results and potential for use in clinical studies alongside (or in place of) the more common global methods. The generated gold standard data has also allowed us to identify stages and sets of parameters in the evaluated techniques that are critical for their accuracy: the brain extraction step in the global techniques, and the number of multi-resolution levels and the stopping criteria in the registration-based methods.
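As an illustration of the local measure discussed above, the following is a minimal sketch of Jacobian integration over a region of interest. It is not the implementation evaluated in this study. It assumes a dense displacement field u defined on the baseline grid, so that the baseline-to-follow-up transformation is T(x) = x + u(x), and a binary region mask in baseline space. The function names, the array layout and the `spacing` parameter are hypothetical choices made for the example.

```python
import numpy as np


def jacobian_determinant(disp, spacing=(1.0, 1.0, 1.0)):
    """Voxel-wise Jacobian determinant of T(x) = x + u(x).

    disp    : array of shape (3, X, Y, Z), displacement u in physical units
              (assumed layout, not taken from the paper).
    spacing : voxel size along each axis, in the same physical units.
    """
    # grads[i][j] approximates d u_i / d x_j by finite differences.
    grads = [np.gradient(disp[i], *spacing) for i in range(3)]
    jac = np.empty(disp.shape[1:] + (3, 3))
    for i in range(3):
        for j in range(3):
            # Jacobian of T is the identity plus the gradient of u.
            jac[..., i, j] = grads[i][j] + (1.0 if i == j else 0.0)
    return np.linalg.det(jac)


def percent_volume_change(disp, mask, spacing=(1.0, 1.0, 1.0)):
    """Percentage volume change of a baseline region, from the mean
    Jacobian determinant inside the boolean mask (negative = atrophy)."""
    det_j = jacobian_determinant(disp, spacing)
    return 100.0 * (det_j[mask].mean() - 1.0)


def mean_absolute_difference(measured, gold):
    """Mean absolute difference between measured and gold standard changes."""
    measured = np.asarray(measured, dtype=float)
    gold = np.asarray(gold, dtype=float)
    return np.abs(measured - gold).mean()
```

For a uniform 1% contraction along each axis, u(x) = -0.01 x, every determinant equals 0.99^3, so `percent_volume_change` returns about -2.97%. Comparing such per-subject estimates with the known simulated changes via `mean_absolute_difference` gives the kind of accuracy figure reported above.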