How to overcome the rare species modelling paradox? Frank Breiner

How to overcome the rare species modelling paradox?
A test on 109 species of conservation concern
Frank Breiner1*, Michael Nobis1, Antoine Guisan2, Ariel Bergamini1
1Swiss Federal Research Institute WSL, Birmensdorf, Switzerland; 2University of Lausanne, Lausanne, Switzerland
* Contact: Frank Breiner, Swiss Federal Research Institute WSL, Zürcherstr. 111, 8903 Birmensdorf, [email protected], +41 44 7392 418
Methods cont.
• 3 SDM techniques (GLM, GBM,
Maxent) and their AUC-weighted
mean Ensemble Forecast (EF)
Species Distribution Models (SDMs) are
widely used in basic and applied
research. In nature conservation, SDMs for
rare species, which are characterised by low
sample sizes, are of particular interest.
Unfortunately, model accuracy usually
decreases with decreasing sample size, i.e.
few occurrences combined with many predictor
variables promote model over-fitting. The species
most in need of conservation are thus
the most difficult to model (‘Rare Species
Modelling Paradox’).
Recently, Lomba et al. (2010) introduced a
new SDM strategy, the so-called ‘Ensemble
Bivariate Modelling’ (EBM), to overcome the
‘Rare Species Modelling Paradox’.
Here, we test different SDM strategies on
109 rare and under-sampled vascular plant
• 3 SDM strategies:
− ‘Reduced set’: standard strategy for
variable selection (e.g. BIC-based stepwise selection for GLMs)
− ‘PCA set’: using the first 6
components of a PCA as predictors
− ‘EBM’: new strategy that fits all
possible bivariate models for a
given set of environmental variables
and averages them using an AUC-
weighted average (Fig. 1)
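
The EBM strategy above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the variable names (`v1` to `v11`) are hypothetical, the weights are taken to be the raw AUC values (the poster does not state the exact weighting scheme), and fitting each bivariate model is left out.

```python
from itertools import combinations

def auc(scores, labels):
    """Rank-based AUC: share of presence/absence pairs in which the
    presence gets the higher score (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def auc_weighted_ensemble(predictions, aucs):
    """Average per-model predictions site by site, weighting each model
    by its AUC (assumption: raw AUC values are used as weights)."""
    total = sum(aucs)
    n_sites = len(predictions[0])
    return [
        sum(w * p[i] for p, w in zip(predictions, aucs)) / total
        for i in range(n_sites)
    ]

# Hypothetical predictor names; one bivariate model would be fitted per pair.
variables = [f"v{i}" for i in range(1, 12)]
pairs = list(combinations(variables, 2))
```

With 11 candidate variables this enumeration yields 55 pairs, so the ensemble would average 55 small models per technique, each of which uses only two predictors and is therefore less prone to over-fitting on few occurrences.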
Fig. 1: Framework of the ‘Ensemble Bivariate
• 109 species with number of occurrences
ranging between 10 and 377 (median 37)
EBM clearly outperforms the other SDM strategies (Fig. 2).
For very rare species (n < 25), EBM is especially advantageous.
While there is considerable variation between SDM techniques for the ‘Reduced’ and
the ‘PCA’ sets, there is hardly any variation for the EBM strategy.
• Red List categories are represented in our
target species as follows: 5 critically
endangered, 25 endangered, 21 vulnerable, 26 near threatened and 32 least concern
• 50-fold split sampling with 90% training and
10% test data, evaluated with AUC
and the Boyce Index
• 11 climatic, topological and geological
variables with low pairwise correlation
(|r| ≤ 0.7) and their quadratic terms were
used to calibrate the models
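
The predictor preparation described above (a |r| ≤ 0.7 correlation threshold plus quadratic terms) might look like the following sketch. The poster does not say how the 11 variables were chosen among correlated candidates, so the greedy, order-dependent filter here is an assumption, as are the example variable names.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def drop_correlated(columns, threshold=0.7):
    """Greedy filter (assumed procedure): keep a variable only if its
    |r| with every variable kept so far does not exceed the threshold."""
    kept = {}
    for name, values in columns.items():
        if all(abs(pearson(values, v)) <= threshold for v in kept.values()):
            kept[name] = values
    return kept

def add_quadratic_terms(columns):
    """Return the columns plus one squared term per variable."""
    out = dict(columns)
    for name, values in columns.items():
        out[name + "^2"] = [v * v for v in values]
    return out

# Hypothetical toy data: 'temp_copy' is perfectly correlated with 'temp'.
candidates = {
    "temp": [1, 2, 3, 4, 5],
    "temp_copy": [2, 4, 6, 8, 10],
    "slope": [3, 1, 4, 1, 5],
}
predictors = add_quadratic_terms(drop_correlated(candidates))
```

In this toy case `temp_copy` is dropped, and the remaining two variables each gain a squared term, which lets a GLM fit unimodal response curves.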
Fig. 2: Model evaluation using AUC (top) and Boyce Index (bottom) for three groups with increasing
sample size. The EBM strategy outperforms the standard strategies, especially for rare species.
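
The 50-fold split-sampling evaluation behind these scores can be sketched as repeated random 90/10 partitions of the records. This is an illustrative sketch only: the function name, the fixed seed, and the unstratified shuffling are assumptions, since the poster does not say whether presences and absences were split separately.

```python
import random

def split_samples(n_records, n_splits=50, train_fraction=0.9, seed=0):
    """Yield (train, test) index lists for repeated random splits."""
    rng = random.Random(seed)  # fixed seed (assumption) for reproducibility
    n_train = round(train_fraction * n_records)
    for _ in range(n_splits):
        indices = list(range(n_records))
        rng.shuffle(indices)
        yield indices[:n_train], indices[n_train:]

# 37 is the median number of occurrences reported for the 109 species.
splits = list(split_samples(37))
```

Each model would be calibrated on the training indices and scored on the held-out test indices, and the evaluation metrics would then be summarised over the 50 repetitions.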
All predictor variables can be included in the
EBM strategy, i.e. more information is
available for modelling species’ distributions.
Interestingly, variability between SDM techniques is marginal for the EBM strategy, in
contrast to the standard strategies. The choice
of technique is thus of minor importance.
A ‘double-ensemble framework’ (EBM and
EF) does not improve model performance
and is therefore not recommended, also because of
its high computational effort and complexity.
Lomba, A., Pellissier, L., Randin, C., Vicente, J., Moreira, F., Honrado, J., Guisan,
A., 2010. Overcoming the rare species modelling paradox: A novel hierarchical
framework applied to an Iberian endemic plant. Biological Conservation 143,
We recommend using the ‘EBM strategy’,
especially for rare species
(approx. < 25 presences). We are
convinced that its use in future
modelling is worth the higher
implementation effort.
Session Spatial Ecology | 43rd Annual Meeting of the Ecological Society GFÖ 2013 | Potsdam