The importance of correcting for sampling bias in MaxEnt species distribution models
Kramer-Schadt S., Niedballa J., Lindenborn J., Reinfelder V., Stillfried M., Heckmann I., Scharf AK., Hofer H., Wilting A., Pilgrim JD., Schröder B., Augeri DM., Cheyne SM., Hearn AJ., Ross J., Macdonald DW., Mathai J., Eaton J., Marshall AJ., Semiadi G., Rustam R., Bernard H., Alfred R., Samejima H., Duckworth JW., Breitenmoser-Wuersten C., Belant JL.
Aim: Advances in ecological methods for predicting species distributions are a crucial precondition for deriving sound management actions. Maximum entropy (MaxEnt) models are a popular tool for predicting species distributions, as they are considered able to cope well with sparse, irregularly sampled data and minor location errors. Although a fundamental assumption of MaxEnt is that the entire area of interest has been systematically sampled, in practice MaxEnt models are usually built from occurrence records that are spatially biased towards better-surveyed areas. Two common strategies for coping with uneven sampling effort, which have not previously been compared, are spatial filtering of occurrence data and background manipulation using environmental data with the same spatial bias as the occurrence data. We tested these strategies using simulated data and a recently collated dataset on Malay civet Viverra tangalunga in Borneo.
Location: Borneo, Southeast Asia.
Methods: We collated 504 occurrence records of Malay civets from Borneo, of which 291 records were from 2001 to 2011, and used them in the MaxEnt analysis (baseline scenario) together with 25 environmental input variables. We simulated datasets for two virtual species (similar to a range-restricted highland species and a lowland species) using the same number of records for model building. As occurrence records were biased towards north-eastern Borneo, we investigated the efficacy of spatial filtering versus background manipulation in reducing overprediction or underprediction in specific areas.
Results: Spatial filtering minimized both omission errors (false negatives) and commission errors (false positives). We recommend that, when sample size is insufficient to allow spatial filtering, manipulation of the background dataset is preferable to not correcting for sampling bias, although predictions were comparatively weak and commission errors increased.
Main Conclusions: We conclude that a substantial improvement in the quality of model predictions can be achieved if uneven sampling effort is taken into account, thereby improving the efficacy of species conservation planning. © 2013 John Wiley & Sons Ltd.
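The two bias-correction strategies compared in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`haversine_km`, `spatial_filter`, `biased_background`), the greedy thinning rule, the `min_dist_km` threshold, and the per-cell `effort` weights are all assumptions made here for illustration. Spatial filtering is shown as dropping any record that lies too close to one already kept, and background manipulation as drawing background cells with probability proportional to an assumed measure of sampling effort.

```python
import math
import random


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two lon/lat points (degrees)."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def spatial_filter(records, min_dist_km):
    """Greedy thinning of (lon, lat) occurrence records.

    A record is kept only if it lies at least min_dist_km from every
    record kept so far, so the result depends on input order.
    (Hypothetical rule; the threshold is an assumption.)
    """
    kept = []
    for lon, lat in records:
        if all(haversine_km(lon, lat, k_lon, k_lat) >= min_dist_km
               for k_lon, k_lat in kept):
            kept.append((lon, lat))
    return kept


def biased_background(cells, effort, n, seed=0):
    """Draw n background cells (with replacement), weighted by effort.

    cells  : list of cell identifiers or coordinates
    effort : non-negative weights, one per cell, standing in for
             sampling effort (an assumed proxy, not from the source)
    """
    rng = random.Random(seed)
    return rng.choices(cells, weights=effort, k=n)


# Usage: thin clustered records, then sample an effort-weighted background.
occurrences = [(117.00, 5.0), (117.01, 5.0), (117.50, 5.0), (110.00, 1.0)]
thinned = spatial_filter(occurrences, min_dist_km=10.0)

grid_cells = ["north_east", "interior", "south_west"]
survey_effort = [8.0, 1.0, 1.0]
background = biased_background(grid_cells, survey_effort, n=100)
```

Because the thinning is greedy, shuffling the records first or repeating the filter over several random orders would be a reasonable refinement. The weighted background deliberately gives the background sample the same spatial bias as the occurrence records, so that the bias is present on both sides of the comparison the model makes.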