
Evaluating Weights of Evidence method for habitat suitability modeling: a comparison to Maximum Entropy in case of few presence records

Andre Carvalho Silveira, Daniel Fernandes Mamede Teixeira Lopes and Britaldo Silveira Soares Filho

Abstract

Studies of habitat suitability and species distribution aided by spatially explicit modeling are widely applied in ecology because they enable exploratory analysis of the relationship between a species and its environmental context, and thus prediction of the likelihood of species occurrence (Guisan & Zimmermann, 2000). Suitability maps can therefore be seen as an operational application of niche theory: they use environmental variables to indicate, within a quantitative range, how suitable each area is for the presence of a species (Hirzel & Le Lay, 2008). They are a useful tool for investigating ecological dynamics, even for species without pseudo-absence records and with only a few presence records available (Pearson et al., 2007). This article reports the application of two modeling methods to produce suitability maps for Cotinga maculata (Cotingidae). The comparison of results reaffirms maximum entropy as an efficient approach for cases with only a few presence records (Pearson et al., 2007), and also shows that the weights of evidence method performs satisfactorily according to ROC analysis and a similarity index. The availability of the weights of evidence method in the free platform Dinamica EGO (Soares-Filho et al., 2013) offers an alternative for habitat suitability modeling.

Keywords

Habitat suitability, Cotinga maculata, Maximum Entropy, Weights of Evidence, Maxent, Dinamica EGO, ROC analysis.

Introduction

Spatially explicit models are computational representations of systems that have a spatial expression (Wu & David, 2002). Expressed in terms of environmental requirements, the ecological niche of a given species is a natural object for spatial modeling (Guisan & Zimmermann, 2000; Hirzel & Le Lay, 2008). Based on the relationship between environmental variables and records of species occurrence, it is possible to build a spatial model of habitat suitability. In theory, such models make it possible to revise what is known about the potential spatial distribution of a species (Franklin, 2011), even in areas not yet visited, by using the habitat suitability projections that the modeling provides (Pearson et al., 2007).

Cotinga maculata is a species of the order Passeriformes, family Cotingidae, endemic to narrow remnants of the Brazilian Atlantic Forest between the south of Bahia state and Rio de Janeiro state. It occurs in lowland rainforest up to 200 meters in altitude, in primary vegetation or vegetation in an advanced stage of regeneration. Occasionally the species visits small forest patches in search of the small fruits that make up its staple food. Considered rare by experts, it is difficult to observe because it remains motionless and silent in trees for long periods. The few occurrence records available are concentrated in Conservation Units in the south of Bahia state and the north of Espírito Santo state (MMA, 2008). This article used 18 records from the Conservation International Brazil database.

Methods

Maximum entropy modeling

The maximum entropy method makes inferences from incomplete information by defining a probability distribution that satisfies all the constraints imposed by a given dataset while avoiding any commitment beyond those constraints, i.e., by keeping the entropy of the distribution at its maximum. Applying the method assumes that there are features, expressed by environmental variables, distributed over a dataset (raster grid); the constraints on these features derive from overlaying them with the species occurrence points (also organized as a raster grid). Entropy, which can be understood as a measure of the “inner amount of choice”, is maximized to return a result that satisfies as many constraints as possible. In this sense the method avoids making any unsupported assumption (Phillips et al., 2006). The final product is a map that indicates, for each raster cell, the suitability of the area for species occurrence.
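The principle described above can be illustrated with a minimal sketch. It assumes a tiny set of raster cells, features rescaled to the 0:1 range, and constraints that require the expected value of each feature under the fitted distribution to match its mean at the presence cells. The function names (`gibbs_distribution`, `fit_maxent`), the learning rate and the number of steps are hypothetical choices for illustration; the Maxent software adds regularization and several feature classes that are not reproduced here.

```python
import math

def gibbs_distribution(features, lambdas):
    """Probability of each cell, proportional to exp(sum_j lambda_j * f_j(cell))."""
    scores = [sum(l * v for l, v in zip(lambdas, f)) for f in features]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fit_maxent(features, presence_idx, steps=2000, learning_rate=1.0):
    """Fit lambdas so that expected features match the presence-cell means.

    features: one list of feature values (assumed to lie in 0:1) per raster cell.
    presence_idx: indices of the cells that hold presence records.
    """
    n_features = len(features[0])
    empirical = [
        sum(features[i][j] for i in presence_idx) / len(presence_idx)
        for j in range(n_features)
    ]
    lambdas = [0.0] * n_features
    for _ in range(steps):
        probs = gibbs_distribution(features, lambdas)
        expected = [
            sum(p * f[j] for p, f in zip(probs, features))
            for j in range(n_features)
        ]
        # Gradient ascent on the log-likelihood: empirical mean minus model mean.
        lambdas = [
            l + learning_rate * (e - m)
            for l, e, m in zip(lambdas, empirical, expected)
        ]
    return lambdas, gibbs_distribution(features, lambdas)

# Hypothetical example: five cells, one feature, presences in the last two cells.
cells = [[0.0], [0.25], [0.5], [0.75], [1.0]]
lambdas, suitability = fit_maxent(cells, presence_idx=[3, 4])
```

Among all distributions that satisfy the mean constraints, the fitted exponential-family distribution is the one with maximum entropy, which is why the sketch fits it by maximum likelihood.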

Weights of evidence modeling

The weights of evidence method applies Bayesian probability to weigh the influence of each explanatory variable on the behavior of the response variable (Bonham-Carter, 1994; Soares-Filho et al., 2004). The approach uses categorical and binary explanatory variables to assess how attractive or repulsive they are in relation to species occurrence (the response variable). Continuous variables included in the study are therefore categorized, and each category is evaluated in terms of its attractiveness (positive weight) or repulsiveness (negative weight) for species occurrence. The resulting suitability map expresses how suitable the environmental context of each portion of the area is for species occurrence. The weights of each variable are explicit and can be edited in the Dinamica EGO platform.

The explanatory variables initially selected were altitude, annual precipitation, and maximum, minimum and mean annual temperature, all obtained online from the WorldClim database (Hijmans et al., 2005). All rasters for these variables were resampled to a spatial resolution of 1000 × 1000 meters. All variables and their intervals were submitted to a statistical significance test. The same variables were used on both platforms: Maxent (for the maximum entropy method) and Dinamica EGO (for the weights of evidence method).
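As a rough illustration of how a positive or negative weight arises, the sketch below computes one weight per category as the natural logarithm of the ratio between the probability of the category among presence cells and among the remaining cells, which is the general form of the positive weight in the weights of evidence literature. The additive smoothing term, the function names and the example counts are assumptions made here for illustration; this is not the Dinamica EGO implementation.

```python
import math

def weights_of_evidence(cells_per_category, presences_per_category, smoothing=0.5):
    """Return one weight per category of a categorized explanatory variable.

    cells_per_category: category -> total number of raster cells.
    presences_per_category: category -> number of cells with a presence record.
    smoothing: assumed additive constant that avoids log(0) for empty categories.
    """
    categories = list(cells_per_category)
    k = len(categories)
    total_presence = sum(presences_per_category.get(c, 0) for c in categories)
    total_absence = sum(cells_per_category.values()) - total_presence
    weights = {}
    for c in categories:
        presence = presences_per_category.get(c, 0)
        absence = cells_per_category[c] - presence
        p_given_presence = (presence + smoothing) / (total_presence + smoothing * k)
        p_given_absence = (absence + smoothing) / (total_absence + smoothing * k)
        # Positive when the category attracts occurrence, negative when it repels.
        weights[c] = math.log(p_given_presence / p_given_absence)
    return weights

def posterior_probability(prior_probability, cell_weights):
    """Combine the prior with the weights of the categories found in one cell.

    Weights are summed on the logit scale, which assumes the explanatory
    variables are conditionally independent.
    """
    logit = math.log(prior_probability / (1.0 - prior_probability)) + sum(cell_weights)
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical counts for one variable (altitude) split into two categories.
weights = weights_of_evidence(
    {"0-100 m": 100, "100-200 m": 100},
    {"0-100 m": 8, "100-200 m": 2},
)
```

In this hypothetical example the lower altitude class receives a positive weight and the higher class a negative one, so a cell in the lower class ends up with a posterior probability above the prior.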

Suitability maps, congruence and divergence

Figure 1 shows the suitability maps obtained by both methods. The maps converge over the areas with higher suitability for the species under analysis. The coastal area in the extreme northeast of the study area is the main region concentrating high suitability values. However, traces of each approach can be identified in the respective maps. A substantial difference between the methods is that maximum entropy handles continuous variables directly, whereas the weights of evidence method categorizes all continuous variables and treats each category as a secondary binary variable. As a result, the gradient of the map produced by weights of evidence shows steps that correspond to the previously created categories. This is the main feature that distinguishes the two gradients.

Figure 01: Suitability maps: (a) Weights of Evidence, (b) Maximum Entropy. Both the gradients were normalized to 0:100 range.

One way to explore the agreement between different methods of building a suitability surface is to generate congruence and divergence maps. These make it possible to observe spatially the areas predicted as suitable by both methods, the areas predicted as suitable by only one method, and the agreement by ranges of suitability. Both maps can also be evaluated with the Dinamica EGO reciprocal similarity functor (Calc Reciprocal Similarity Map), which calculates a two-way fuzzy similarity index between two maps, as illustrated in Figure 2.
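A congruence and divergence map of the kind described above can be sketched as follows. The sketch assumes that each map is first rescaled to the 0:100 range and that a cell counts as suitable when its value reaches a threshold of 50; the threshold, the class codes and the function names are illustrative assumptions. It does not reproduce the fuzzy index computed by the Calc Reciprocal Similarity Map functor.

```python
def normalize_0_100(grid):
    """Rescale a grid (list of rows) linearly so its values span 0 to 100."""
    values = [v for row in grid for v in row]
    low, high = min(values), max(values)
    if high == low:  # constant map: no gradient to rescale
        return [[0.0 for _ in row] for row in grid]
    return [[(v - low) / (high - low) * 100.0 for v in row] for row in grid]

def congruence_map(map_a, map_b, threshold=50.0):
    """Classify each cell by which maps predict it as suitable.

    0 = neither map, 1 = only map A, 2 = only map B, 3 = both maps.
    """
    return [
        [(1 if a >= threshold else 0) + (2 if b >= threshold else 0)
         for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(map_a, map_b)
    ]

# Hypothetical 2 x 2 suitability grids, already in the 0:100 range.
weights_of_evidence_map = [[80, 20], [60, 10]]
maximum_entropy_map = [[90, 70], [30, 5]]
classes = congruence_map(weights_of_evidence_map, maximum_entropy_map)
# classes == [[3, 2], [1, 0]]
```

Cells coded 3 form the congruence area, while cells coded 1 or 2 form the divergence areas attributable to each method.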

Figure 02: Congruence and divergence maps comparing Maximum Entropy and Weights of Evidence methods + similarity index.

ROC performance evaluation

The Receiver Operating Characteristic (ROC) is a method for evaluating the similarity between a prediction and a fixed binary reference pattern. ROC weighs the true positive rate against the false positive rate through incremental binary classifications (Mas et al., 2013a). Although the method has been applied in many fields, in GIS it is commonly used to evaluate model predictions against observed data. This work therefore uses ROC metrics to evaluate the performance of each method individually, as well as to compare the predictions of the two methods.

Figure 03: ROC curve and respective metrics.

The main ROC metrics used to evaluate the results were the area under the curve (AUC) and the partial area under the curve (pAUC). Figure 03 presents the standard ROC chart, which plots the true positive rate against the false positive rate. The red diagonal line represents a low-skill prediction, i.e., a hypothetical model that produces as many false alarms as hits. On the ROC chart, each suitability map is interpreted as a prediction to be compared with this fixed diagonal, and each map evaluated adds a new curve to the same chart. Any portion of the prediction curve lying above the fixed diagonal is interpreted as a performance gain. The overall gain offered by the prediction (i.e., by the suitability map) is summarized by the AUC. The same reading can be applied to a restricted range of hit rate or error rate; this partial measure is called pAUC, as illustrated in Figure 03.
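The reading of the ROC chart described above can be sketched in a few lines. The sketch assumes suitability scores in the 0:100 range, binary labels (1 for presence, 0 for the reference absence or background cells), thresholds spaced by a fixed step, and a pAUC restricted to a maximum false positive rate; the function names and the choice of restricting the false positive rate rather than the hit rate are assumptions made for illustration.

```python
def roc_curve(scores, labels, step=10):
    """Return (false positive rate, true positive rate) points.

    One binary classification is made per threshold, from above the maximum
    score (nothing predicted suitable) down to 0 (everything predicted suitable).
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for threshold in range(100 + step, -1, -step):
        tp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 1)
        fp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 0)
        points.append((fp / negatives, tp / positives))
    if points[-1] != (1.0, 1.0):  # close the curve if the step skipped threshold 0
        points.append((1.0, 1.0))
    return points

def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

def partial_auc(points, max_fpr):
    """Area under the curve for false positive rates from 0 to max_fpr."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:  # cut the last segment by linear interpolation
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

# The fixed diagonal of the chart has an AUC of 0.5.
diagonal = [(0.0, 0.0), (1.0, 1.0)]
reference_auc = auc(diagonal)
```

A prediction whose curve lies above the diagonal yields an AUC greater than this reference value of 0.5, which corresponds to the performance gain described in the text.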

Results and Discussions

The suitability maps generated by maximum entropy and weights of evidence were compared by sampling, to keep the analysis feasible in terms of computational effort. The most costly comparison took around 15 hours to complete on a computer with 64 GB of RAM. A total of 469 bootstraps were executed, each generating a curve based on binary classifications incremented by 10% (i.e., 10 points composing the ROC curve). The methods were compared considering the whole area under the curve (AUC) and also the partial area under the curve (pAUC).

The maximum entropy method reached AUC = 0.92, while the weights of evidence method reached AUC = 0.81. The comparison between the methods through multiple sampling yielded a p-value of 0.030. The comparison restricted to high hit rates, as suggested by Pearson et al. (2007), resulted in a p-value of 0.045. For the partial area under the curve comparison, 50 bootstraps were used because of computational limitations. The p-value of 0.030 obtained from the comparison between the methods points to a statistical correlation between the two projections. This indicates that the weights of evidence method has enough skill for habitat suitability modeling, even with small samples, given that maximum entropy is considered a highly skilled method for such cases.
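A bootstrap comparison of two AUC values can be sketched as below. This is one common formulation, not necessarily the procedure used by the tools applied in this study: cells are resampled with replacement, the AUC difference is recomputed for each resample, and the observed difference divided by the standard deviation of the bootstrap differences is compared with a standard normal distribution. The AUC is computed here with the rank-based (Mann-Whitney) formula instead of thresholded classifications, and all function names and example data are hypothetical.

```python
import math
import random

def auc_rank(scores, labels):
    """AUC as the share of (presence, absence) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_test(scores_a, scores_b, labels, n_boot=469, seed=0):
    """Return (observed AUC difference, two-sided p-value)."""
    rng = random.Random(seed)
    n = len(labels)
    observed = auc_rank(scores_a, labels) - auc_rank(scores_b, labels)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        lab = [labels[i] for i in idx]
        if sum(lab) in (0, n):  # resample lost one of the classes: draw again
            continue
        a = [scores_a[i] for i in idx]
        b = [scores_b[i] for i in idx]
        diffs.append(auc_rank(a, lab) - auc_rank(b, lab))
    mean = sum(diffs) / n_boot
    sd = math.sqrt(sum((d - mean) ** 2 for d in diffs) / (n_boot - 1))
    if sd == 0.0:  # no variation across resamples
        return observed, (1.0 if observed == 0.0 else 0.0)
    z = observed / sd
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return observed, p_value

# Hypothetical data: 10 presence cells followed by 10 absence cells.
labels = [1] * 10 + [0] * 10
map_a = [90] * 10 + [10] * 10              # separates the classes perfectly
map_b = [(i % 2) * 50 for i in range(20)]  # unrelated to the classes
difference, p_value = bootstrap_auc_test(map_a, map_b, labels, n_boot=200)
```

The pairwise AUC costs time proportional to the number of presence cells times the number of absence cells per resample, which helps explain why comparing full maps with hundreds of bootstraps is computationally expensive and why sampling is used.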

Figure 04: ROC curve and p-value for AUC comparison between maximum entropy and weights of evidence.

Figure 05: ROC curve and p-value for pAUC comparison between maximum entropy and weights of evidence.

Besides the direct comparison, the suitability maps of both methods were normalized to the 0:100 range and then compared by ROC. In this case the p-value was 0.117. However, this result was obtained from a less exhaustive analysis because of computational limitations: 50 bootstraps and a 10% increment.

The results show that the performance of weights of evidence is close enough to that of maximum entropy. Since maximum entropy is recognized as an appropriate approach for modeling habitat suitability with small samples (Pearson et al., 2007), weights of evidence emerges as an alternative for such studies. Because the weights of evidence method is available in the spatially explicit environment Dinamica EGO, it becomes an alternative to Maxent, the main software used to apply the maximum entropy method. It is still important to note that both modeling methods could be improved by more sophisticated configurations (heuristic search, knowledge-driven adjustments, etc.). In this work the two methods were compared using only direct calibration by sampling.

References

  1. Bonham-Carter, G. 1994. Geographic information systems for geoscientists: modelling with GIS. Pergamon, Oxford, UK.
  2. Franklin, J. 2011. Mapping Species Distribution: Spatial Inference and Prediction. Cambridge University Press, Cambridge, UK.
  3. Mas, J.F. 2013b. Tools for ROC analysis of spatial models: installation instructions and application examples. Centro de Investigaciones en Geografía Ambiental, Universidad Nacional Autónoma de México (UNAM).
  4. Ministério do Meio Ambiente. 2008. Livro Vermelho da Fauna Brasileira Ameaçada de Extinção. V(2). Brasília.

Resources

The links below point to the models and datasets used in this article:

Inputs Models Outputs