Getting Familiar with DIVA GIS

ESS 221 Tutorial 2

Tutorial Purpose: (1) Set up DIVA GIS software and climate data on lab machines (if not already available) and on personal laptops (where available) and (2) complete a short exercise in DIVA GIS.

Content: Species modeling (niche modeling) plays an important role in the prediction of species distributions, enabling the study of biodiversity past and present, as well as proposing scenarios and strategies for sustainable use and preservation in the future.

It is critical to understand what ecological niche modeling is. This refers to the process of using algorithms (there are many different algorithms available) to predict the distribution of species in geographic space on the basis of a mathematical representation of their known distribution in environmental space (ecological niche as interpreted from climate data). These models allow for interpolation between limited numbers of species occurrence points to generate a continuous surface of possible distribution. Different algorithms can be used; we will be focusing on two: Bioclim and Domain.

Bioclim models are heuristic based, working with presence-only data. The output of this model is a surface predicting potential distributions. The surface represents the climatic suitability across the geography in question. By default the output is broken into 6 classes: Excellent, Very High, High, Medium, Low, and Not Suitable.
Domain models are also heuristic based, working with presence-only data. The output of this model is a surface having values from 0-100, representing the probability that the species will be found in that geography. A value of 100 indicates that there is a 100% chance the species will be found in that geography, while a value of 0 indicates that there is a 0% chance the species will be found there. By default the output is broken into 6 classes: 100, 98-99, 96-97, 91-95, 51-90, and 0-50.
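To make the Domain scores more concrete, the sketch below computes a 0-100 similarity between one grid cell and the nearest known occurrence, using a Gower-type distance of the kind the Domain algorithm is built on. It is an illustration only and not the DIVA GIS code: the function name `domain_score`, the two-variable toy data, and the choice to clamp negative similarities to 0 are all assumptions made for this example.

```python
def domain_score(cell, occurrences):
    """Return a 0-100 similarity between one grid cell and the nearest occurrence.

    cell        -- climate values for the cell, one per variable
    occurrences -- list of climate-value lists, one per known occurrence
    """
    n_vars = len(cell)
    # Range of each variable across the occurrence points (used to standardise).
    ranges = []
    for k in range(n_vars):
        values = [occ[k] for occ in occurrences]
        ranges.append((max(values) - min(values)) or 1.0)  # avoid division by zero

    best = 0.0
    for occ in occurrences:
        # Gower-type distance: mean of range-standardised absolute differences.
        distance = sum(abs(cell[k] - occ[k]) / ranges[k] for k in range(n_vars)) / n_vars
        similarity = max(0.0, 1.0 - distance)  # assumption: clamp at 0
        best = max(best, similarity)
    return 100.0 * best


# Hypothetical occurrences: [annual mean temperature (C), annual precipitation (mm)]
points = [[22.0, 1200.0], [24.0, 1500.0], [26.0, 1000.0]]
print(domain_score([24.0, 1500.0], points))  # 100.0, identical to an occurrence
print(domain_score([40.0, 100.0], points))   # 0.0, far from every occurrence
```

A cell whose climate matches a known occurrence scores 100, and the score drops as the cell's climate moves away from all known occurrences.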

These models are concerned only with the environmental envelope. In other words, their output surface is based solely on comparing climatic data of known occurrences to climatic data of other geographies. The climate data we are using incorporates the following 19 factors:

1)Annual Mean Temperature

2)Mean Monthly Temperature Range (mean diurnal range)

3)Isothermality

4)Temperature Seasonality

5)Max Temperature of Warmest Month

6)Min Temperature of Coldest Month

7)Temperature Annual Range

8)Mean Temperature of Wettest Quarter

9)Mean Temperature of Driest Quarter

10)Mean Temperature of Warmest Quarter

11)Mean Temperature of Coldest Quarter

12)Annual Precipitation

13)Precipitation of Wettest Month

14)Precipitation of Driest Month

15)Precipitation Seasonality

16)Precipitation of Wettest Quarter

17)Precipitation of Driest Quarter

18)Precipitation of Warmest Quarter

19)Precipitation of Coldest Quarter
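Several of these factors are simple summaries of monthly values. The sketch below derives a handful of them from twelve monthly minimum temperatures, maximum temperatures, and precipitation totals. The function name `bioclim_subset` and the monthly values are made up for illustration, and the seasonality figure is a plain coefficient of variation; the climate files used in this tutorial may compute it slightly differently.

```python
from statistics import mean, pstdev


def bioclim_subset(tmin, tmax, prec):
    """Derive a few of the 19 variables from 12 monthly values each."""
    tavg = [(lo + hi) / 2 for lo, hi in zip(tmin, tmax)]
    return {
        "annual_mean_temp": mean(tavg),                  # factor 1
        "max_temp_warmest_month": max(tmax),             # factor 5
        "min_temp_coldest_month": min(tmin),             # factor 6
        "temp_annual_range": max(tmax) - min(tmin),      # factor 7 (5 minus 6)
        "annual_precip": sum(prec),                      # factor 12
        "precip_wettest_month": max(prec),               # factor 13
        "precip_driest_month": min(prec),                # factor 14
        # factor 15, assumed here to be a simple coefficient of variation (%)
        "precip_seasonality_cv": 100 * pstdev(prec) / mean(prec),
    }


# Hypothetical monthly data for one location (Celsius and mm)
tmin = [18, 18, 17, 15, 13, 11, 11, 12, 14, 16, 17, 18]
tmax = [30, 30, 29, 27, 25, 23, 23, 24, 26, 28, 29, 30]
prec = [200, 180, 150, 80, 40, 20, 10, 20, 60, 120, 160, 190]

summary = bioclim_subset(tmin, tmax, prec)
for name, value in summary.items():
    print(name, round(value, 1))
```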

Unlike the Bioclim and Domain models, the Ecocrop model incorporates physiological data about the species, provided that the species is included in DIVA GIS. Note that all species being considered for the Video Project are included in DIVA GIS. Physiological data for each species includes:

1)Temperature at which the crop will die in Celsius

2)Minimum temperature at which the crop will grow in Celsius

3)Minimum optimum temperature at which the crop grows in Celsius

4)Maximum optimum temperature at which the crop grows in Celsius

5)Maximum temperature at which the crop will grow in Celsius

6)Minimum amount of rain water required for the crop to grow in mm

7)Minimum optimum amount of rain water required for the crop to grow in mm

8)Maximum optimum amount of rain water required for the crop to grow in mm

9)Maximum amount of rain water below which the crop grows in mm

10)Minimum length of the growing season in days

11)Maximum length of the growing season in days

Using these parameters, the model computes a suitability index for temperature and rainfall separately; these indexes are combined to evaluate the suitability of a certain place (and its known climate) to hold a certain species (and its known climate parameters).
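One way to picture this calculation is a trapezoid-shaped suitability curve for each of temperature and rainfall: zero outside the absolute limits, one inside the optimum range, and a straight line in between. The sketch below is a simplified illustration under several assumptions, not the Ecocrop implementation. It ignores the killing temperature and the growing season length, it combines the two indexes by multiplying them (the combination rule is assumed here), and the crop parameters are invented rather than taken from an Ecocrop record.

```python
def trapezoid(value, minimum, opt_min, opt_max, maximum):
    """Suitability from 0 to 1: 0 outside [minimum, maximum], 1 inside the optimum."""
    if value <= minimum or value >= maximum:
        return 0.0
    if value < opt_min:
        return (value - minimum) / (opt_min - minimum)
    if value > opt_max:
        return (maximum - value) / (maximum - opt_max)
    return 1.0


def ecocrop_suitability(temp_c, rain_mm, crop):
    """Combine separate temperature and rainfall indexes into one score (0 to 1)."""
    t = trapezoid(temp_c, crop["tmin"], crop["topt_min"], crop["topt_max"], crop["tmax"])
    r = trapezoid(rain_mm, crop["rmin"], crop["ropt_min"], crop["ropt_max"], crop["rmax"])
    return t * r  # assumed combination rule


# Hypothetical parameters, NOT the values stored in DIVA GIS
crop = {
    "tmin": 10, "topt_min": 22, "topt_max": 32, "tmax": 45,          # Celsius
    "rmin": 400, "ropt_min": 600, "ropt_max": 1500, "rmax": 4000,    # mm
}

print(ecocrop_suitability(25, 1000, crop))  # both inside the optimum range
print(ecocrop_suitability(16, 1000, crop))  # temperature below the optimum
print(ecocrop_suitability(5, 1000, crop))   # too cold to grow
```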

------

In this tutorial, we will first set up DIVA GIS on your machines. We will set up climate data appropriately as well. Then, we will get familiar with DIVA basics and finally go through how to run these models. We will only begin to understand the results.

Step 1: Setting up Data from UWC’s NISL

All data for this tutorial is available at the course NISL within the Video Project/Tutorial2 directory.

1)Contents of NISL include:

Climate Data
ClimateData.zip
GIS Data
GISData.zip
Setup.exe
Tutorial2_Getting Familiar with DIVA GIS.doc

2)Create two folders in your course folder: Climate Data and GIS Data

3)Download the contents of each folder on NISL into the respective directory:

Climate Data – extract the ClimateData.zip into this folder in your course folder. You will use this climate data in your final Video Projects so do not lose it.
The zip contains 12 climate files – related to current and future climate
Current climate data is worldclim_2-5m
Future climate data is wc_ccm3_2-5m
Extract this zip file by right clicking it and selecting ‘Extract Files…’. Then navigate in the pop up to the Climate Data directory and select ‘OK’:

GIS Data – extract the GISData.zip into this folder.
The zip contains 7 files related to two shapefiles:
Countries – countries of the world
Peanuts – peanut locations
Setup.exe
This is the DIVA GIS setup installation

Step 2: Set up DIVA GIS with Climate Data

Installing DIVA GIS:

1)Double-click on the setup.exe file to launch it

You will be prompted with an automated installation wizard

2)Proceed through the installation process selecting the defaults

3)Launch DIVA GIS – the icon looks like:

Setting up climate data:

1)Climate data will be found in the Climate Data directory of the extract from Step 1.

2)Within DIVA GIS

Navigate to Tools – Options

Within the Options menu, select the Climate tab at the top. The menu should appear as follows:

Note, if there is climate data already on your machine in the BCB 5th Floor lab, the text boxes will be filled in with appropriate information. If there is no climate data linked to your DIVA software, these text boxes will be blank.
We are using 2.5 minute resolution data, so you should use only the climate data downloaded from NISL. If these boxes are already filled in, we will be changing them.

Select the Folder button in the top left
Within the ‘Browse for Files or Folders’ dialog, select the Climate Data directory inside the Climate Data directory that you have just downloaded and extracted. You should be selecting the folder itself.
Press ‘OK’
All text boxes will update with the new information
This information should be similar to that seen in the screenshot above
Note* If the text boxes are still empty after selecting a directory, press the Apply button at the bottom
Ensure that the drop down in the top left has worldclim_2-5m selected
This corresponds to the current climate data

Press ‘Make Default’. This makes the current climate data the default selection for any modeling exercise.
Press ‘Apply’
Press ‘OK’

Step 3: Complete a DIVA GIS Exercise

1)Launch DIVA GIS

2)Add the countries and peanuts shapefiles to the map

Layer – Add Layer…
Navigate to the GIS Data directory (from the .zip file you have downloaded and extracted)
You will have to add each layer separately – peanuts.shp and countries.shp

3)Your screen should look similar to:

4)Distribution Modeling

Frequency Distributions.
Now we are going to explore the climate data associated with the peanut accessions that we have imported. Make the “peanuts” layer the active layer.
Use Modeling – Bioclim / Domain
Go to the second tab (Frequency) and press Apply

Note that there are 205 non-duplicate observations. Points that are on exactly the same location are included only once. In this case, points that are at different locations but in the same cell of the climate grid are also included only once. This is an option that can be changed on the first tab (Remove duplicates – From same grid cell); with it changed, there are 224 unique observations.
Explore this tab by changing the climate variables. This is helpful to get an idea about the general characteristics of the distribution of these points in ecological space (as opposed to geographical space). It may also reveal ecological outliers, perhaps caused by incorrect coordinates. Check the Show percentile and Show IQR (interquartile range) boxes.
If you click on a point in the graph, different things can happen, depending on the state of the checkboxes above the graph. Try this out.
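The two kinds of duplicate removal mentioned above (same location versus same climate grid cell) can be sketched as follows. This is a minimal illustration and not how DIVA GIS does it internally: the function names, the sample coordinates, and the assumption that the 2.5 minute grid starts at longitude -180 and latitude 90 are all hypothetical.

```python
import math

CELLS_PER_DEGREE = 24  # 60 arc-minutes per degree / 2.5 arc-minutes per cell


def grid_cell(lon, lat, cells_per_degree=CELLS_PER_DEGREE):
    """Column/row index of the cell containing a point (origin assumed at -180, 90)."""
    col = math.floor((lon + 180.0) * cells_per_degree)
    row = math.floor((90.0 - lat) * cells_per_degree)
    return col, row


def unique_locations(points):
    """Drop points with exactly the same coordinates, keeping the first of each."""
    return list(dict.fromkeys(points))


def unique_cells(points):
    """Keep only the first point that falls in each climate grid cell."""
    seen = {}
    for lon, lat in points:
        seen.setdefault(grid_cell(lon, lat), (lon, lat))
    return list(seen.values())


# Hypothetical (longitude, latitude) records
records = [(-55.10, -20.10), (-55.10, -20.10), (-55.11, -20.11), (-50.02, -15.02)]

print(len(unique_locations(records)))  # exact repeats removed
print(len(unique_cells(records)))      # one point per grid cell
```

Removing duplicates by grid cell can never leave more points than removing exact duplicates, which is why the two settings give different observation counts.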

Environmental Envelope
Go to the Envelope tab. Here you can look at the distribution of the points in two dimensions of the ecological space.
Define an “Envelope” using the Percentile button and adjacent text box. The points inside the multidimensional (of all climatic variables) envelope are colored green; the other ones are colored red. Of course there are never green points outside the two-dimensional envelope that can be seen on the graph.
The percentiles are calculated for each variable individually, and then combined. The percentage of the points that are inside a multidimensional envelope at a certain percentile will be different for each data set.
The points that are inside the (two-dimensional) envelope on the graph are now colored blue on the map. In this way we have established a link between the distributions in ecological and geographic space.
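The idea of calculating percentiles per variable and then combining them can be sketched in a few lines. The example below is an illustration under assumptions, not the DIVA GIS implementation: the helper names, the linear-interpolation percentile, and the five two-variable occurrence records are all made up. A point counts as inside the multidimensional envelope only if it lies between the lower and upper percentile for every variable.

```python
def percentile(sorted_values, p):
    """Linear-interpolation percentile (p in 0..100) of a pre-sorted list."""
    if len(sorted_values) == 1:
        return sorted_values[0]
    pos = (len(sorted_values) - 1) * p / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])


def envelope(occurrences, p):
    """Per-variable (low, high) limits at percentiles p and 100 - p."""
    limits = []
    for k in range(len(occurrences[0])):
        values = sorted(occ[k] for occ in occurrences)
        limits.append((percentile(values, p), percentile(values, 100 - p)))
    return limits


def inside(point, limits):
    """True only if the point is within the limits for every variable."""
    return all(lo <= v <= hi for v, (lo, hi) in zip(point, limits))


# Hypothetical records: [annual mean temperature (C), annual precipitation (mm)]
occurrences = [[20, 800], [22, 1000], [24, 1200], [26, 1400], [28, 3000]]

limits_10 = envelope(occurrences, 10)
inside_count = sum(inside(occ, limits_10) for occ in occurrences)
print(limits_10)
print(inside_count, "of", len(occurrences), "points are inside the envelope")
```

Because each variable trims its own tails, the share of points left inside the combined envelope depends on how the variables are related in a given data set.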

Modeled distributions
Go to the Predict tab.
Within Modeling – Bioclim/Domain menu
Note, for this menu to appear, the peanuts must be the active layer because these models work for point data.
First enter the area for which you want to make a prediction. Use these values: MinX = -71; MaxX = -39; MinY = -33; MaxY = -4
Select the Bioclim output, choose an output filename, and press Apply.
The result should look like the image below. Note that this result suggests that there is a large area in south central Brazil where the climate appears to be suitable for peanuts but where no peanuts were collected (or even occur?).
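As a quick check on the size of the prediction, the arithmetic below works out how many 2.5 minute grid cells the area above covers. It is a back-of-the-envelope sketch that assumes the extent lines up exactly with cell boundaries; the function name `grid_shape` is hypothetical.

```python
CELLS_PER_DEGREE = 24  # 60 arc-minutes per degree / 2.5 arc-minutes per cell


def grid_shape(min_x, max_x, min_y, max_y, cells_per_degree=CELLS_PER_DEGREE):
    """Number of columns and rows covering the extent (assumed cell-aligned)."""
    cols = round((max_x - min_x) * cells_per_degree)
    rows = round((max_y - min_y) * cells_per_degree)
    return cols, rows


cols, rows = grid_shape(-71, -39, -33, -4)
print(cols, rows, cols * rows)  # 768 696 534528
```

So each output surface for this area holds roughly half a million cell values, each scored against the peanut occurrences.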

Now make distribution models with different output types (use “Domain”, “Bioclim True/False”, “Domain True/False”, “Bioclim Most Limiting Factor”, and “Domain Most Limiting Factor”). Note that the Bioclim models are all representations of the same algorithm, and the Domain models are all representations of the same algorithm.
Note that the True/False surface has two possible values, 0 (False or Not suitable) or 1 (True or Suitable), and that the result depends on the percentile cut-off you choose (explore this cut-off in the graph of the Envelope tab).
The “Most Limiting Factor” output gives the variable with the lowest score in each grid cell for which there is a prediction.
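The relationship between these output types can be sketched for a single grid cell. In the example below, the per-variable scores, the cut-off value, and the function name `summarise_cell` are all hypothetical, and the sketch assumes that the overall score of a cell equals the score of its weakest variable. It is meant only to show how one set of scores can yield both a True/False value and a most limiting factor.

```python
def summarise_cell(scores, cutoff):
    """scores: mapping of variable name -> suitability score for one grid cell."""
    limiting = min(scores, key=scores.get)  # variable with the lowest score
    overall = scores[limiting]              # assumption: overall score is the minimum
    return {
        "most_limiting_factor": limiting,
        "true_false": 1 if overall >= cutoff else 0,
    }


# Hypothetical scores for one grid cell
cell_scores = {
    "Annual Mean Temperature": 0.40,
    "Annual Precipitation": 0.05,
    "Precipitation Seasonality": 0.30,
}

print(summarise_cell(cell_scores, cutoff=0.025))
print(summarise_cell(cell_scores, cutoff=0.10))
```

Raising the cut-off turns this cell from True to False, while the most limiting factor stays the same.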

Ex) Bioclim Output

Ex) Bioclim True/False Output

5)Climate Change

Before we consider future climates, let’s first produce a final Bioclim model output with the current climate data.
With peanuts as the active layer, navigate to Modeling – Bioclim/Domain
In the Input tab, ensure ‘One Class’ is selected
In the Predict tab, use the same coordinates as we have been using.
Ensure the Bioclim type is selected
Save the layer as an output in your directory
Press ‘Apply’
The output layer will be a grid of climatic suitability for the occurrences (all species in the input peanut file) colored green (low suitability) to red (high suitability)

Now let’s consider the potential effect of climate change on the distributions of wild Arachis.
With peanuts as the active layer, navigate to Modeling – Bioclim/Domain
In the Input tab, ensure ‘One Class’ is selected
Also ensure the DIVA Climate data is set to worldclim_2-5m as climatic adaptation is inferred from this data set

In the Predict tab
Ensure the Climate database (output) is set to the FUTURE climate data – set to wc_ccm3_2-5m
Use the same coordinates as we have been using
Ensure the Bioclim type is selected
Save the layer as an output in your directory

Press ‘Apply’
The output layer will be a grid of climatic suitability for the occurrences (all species in the input peanut file) colored green (low suitability) to red (high suitability)

If you compare this future map with the current map, or simply with the observed localities, you will note that the areas that are predicted suitable in the future are almost completely shifted away from where the Arachis species are now. One must be very careful not to jump to conclusions from this result. There are a number of reasons for this. Particularly important is that we do not know the real climatic adaptation of the genus. All we know is where it currently occurs, and that is a result of past climate change, current climate, dispersal limitation, and interactions with other species including humans. Whereas it is quite reasonable to assume a steady state primarily determined by the climate for the current distribution, it is highly speculative to extend that to the climatic conditions predicted for the future. Nevertheless, this type of predictive modeling may be of some use, for example, to identify the areas or species that are most likely to be affected.
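One simple way to make such a comparison explicit is to count, cell by cell, where suitability is kept, lost, or gained. The sketch below assumes both maps have already been reduced to True/False grids of the same size; the function name `compare_suitability` and the tiny example grids are made up for illustration and are not DIVA GIS output.

```python
def compare_suitability(current, future):
    """Count cells by change category; both grids hold 0 (unsuitable) or 1 (suitable)."""
    counts = {"stable": 0, "lost": 0, "gained": 0, "never": 0}
    for row_now, row_later in zip(current, future):
        for now, later in zip(row_now, row_later):
            if now and later:
                counts["stable"] += 1
            elif now:
                counts["lost"] += 1
            elif later:
                counts["gained"] += 1
            else:
                counts["never"] += 1
    return counts


# Hypothetical 2 x 3 grids
current_grid = [[1, 1, 0],
                [1, 0, 0]]
future_grid = [[0, 1, 1],
               [0, 0, 1]]

print(compare_suitability(current_grid, future_grid))
```

A small "stable" count next to large "lost" and "gained" counts corresponds to the kind of shift described above, and is subject to the same cautions about interpretation.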

It is probably better, when making predictions for the future potential distribution, to select fewer climate variables, and perhaps not all “tails” of these variables. Ideally, to predict future distributions one would use models that are not simply based on current distribution but that are based on physiological knowledge about the taxon. Such an approach, albeit rudimentary, can be taken using the Ecocrop module.

6)ECOCROP – Physiological Inputs

Go to Modeling – Ecocrop
On the first tab (‘Select’) search for scientific names that start with Arachis. None of the species we were dealing with is in the list. So let’s use the cultivated peanut (groundnut) Arachis hypogaea. Select that record.
Go to the Predict tab
Use the coordinates used for the Bioclim models
Ensure the Climate database is set to the current climate data
Set the output to your working directory

Note that the area predicted to be highly suitable for cultivated peanut includes all of the area of our wild Arachis records, except for a few records of A. batizocoi. If you look at this species in the Frequency tab of the Bioclim/Domain module you will see that these records are outliers in the rainfall distribution: it is much drier where they are than in the other locations where the species has been observed. Again, it would be good first to check the validity of the geographic coordinates of the data. If these appear to be correct, then these populations might be of special interest.

Ecocrop Model Output with Current Climate Data

Now repeat the above steps in the Ecocrop module, but use the future climate data to model the climatic suitability for cultivated peanut in this area. The modeling is in both cases based on the same physiological parameters that are shown in the second tab of the Ecocrop window. Note how little overall change there is between the two results (although there are some major changes in the western end of the suitable area).

Ecocrop Output with Future Climate Data

7)Single Species Models

So far we have only considered the peanuts layer as a whole, for the Arachis genus. However, if you investigate the underlying attribute table, you will see that several species are present within this shapefile.

Now let’s go to the species level. Go to the Input tab and select “Many Classes” and select the “TAXON” field. Go over the list of taxa to ensure that it contains no errors. A typical problem is that the same species occurs more than once because of spelling mistakes.
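If the list of taxa is long, near-identical names can also be flagged automatically outside DIVA GIS. The sketch below uses Python's standard `difflib` module to report pairs of names that are almost the same. The function name `similar_names`, the 0.9 threshold, and the sample list (including its deliberately misspelled entry) are assumptions for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similar_names(names, threshold=0.9):
    """Return pairs of distinct names whose spelling is suspiciously close."""
    unique = sorted(set(names))
    pairs = []
    for a, b in combinations(unique, 2):
        ratio = SequenceMatcher(None, a.lower(), b.lower()).ratio()
        if ratio >= threshold:
            pairs.append((a, b))
    return pairs


# Hypothetical TAXON values; the second entry is a deliberate misspelling
taxa = [
    "Arachis kuhlmannii",
    "Arachis kuhlmanii",
    "Arachis stenosperma",
    "Arachis batizocoi",
]

for pair in similar_names(taxa):
    print("Check spelling:", pair)
```

Flagged pairs still need a human decision, since two valid species can have similar names.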

Now go to the Frequency tab and select one or two classes (species in our case). In the example below we compare the annual rainfall for two species. A. kuhlmannii is clearly distributed in drier areas than A. stenosperma. Note that there is one huge outlier in the rainfall distribution of A. kuhlmannii.

Click on that outlier on the graph, to find out what record it belongs to, what its associated climate data are, and where it is located on the map (use the checkboxes above the graph).
Examine this point geographically by comparing it to the other observations of this species.
First select the species, using Layer – Select Records so that the records for that species are colored blue on the map. Then click on the outlier on the 'frequency' graph, and see to which point on the map it corresponds. It is clearly a geographical outlier as well. No members of the species (or any other Arachis species) have been observed in this area. In this case it would be prudent to check the coordinates against the locality description. However, that information is not provided here, so there is not much we can do to assess whether this record is valid or not.
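Outliers like this one can also be flagged numerically. The sketch below applies the common 1.5 × IQR rule to a list of annual rainfall values. It is only an illustration: the rule is a general statistical convention rather than something DIVA GIS is known to apply, and the function name `iqr_outliers` and the rainfall values are made up.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the first or third quartile."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


# Hypothetical annual rainfall (mm) at the records of one species
rainfall = [900, 950, 1000, 1050, 1100, 1150, 1200, 2600]

print(iqr_outliers(rainfall))
```

A flagged value is a prompt to check the record, as described above, not proof that the record is wrong.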

Go to the Predict tab and make modeled predictions (Bioclim) for the same area as before
Set the coordinates again as before OR make the previously modeled distribution the active layer and press the “read from layer” button to get the coordinates
In this case, however, use the batch option. This will create a prediction for each species. Note that in order for the Batch option to appear in the Predict tab, ‘Many Classes’ must be selected in the Input tab with a proper field selected (TAXON).
Save the output file in a new folder – You must create this folder (you can create it while you are in the ‘Save As’ window)
Note that the output file is a “stack”. A stack is a collection of 1 or more gridfiles of the same origin, extent, and resolution.
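The condition that defines a stack can be written as a small check. The sketch below is a hypothetical illustration and has nothing to do with the DIVA GIS file format: the `Grid` class, its fields, and the example grids are assumptions. It simply tests whether a set of grids share the same origin, dimensions (and therefore extent), and resolution.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grid:
    name: str
    origin: tuple      # (min_x, max_y) of the upper-left corner
    shape: tuple       # (columns, rows)
    cell_size: float   # degrees


def can_stack(grids):
    """True if all grids share origin, shape (and hence extent) and resolution."""
    if not grids:
        return False
    first = grids[0]
    return all(
        (g.origin, g.shape, g.cell_size) == (first.origin, first.shape, first.cell_size)
        for g in grids
    )


# Hypothetical per-species prediction grids for the tutorial area
cell = 2.5 / 60
a = Grid("species_a", (-71, -4), (768, 696), cell)
b = Grid("species_b", (-71, -4), (768, 696), cell)
c = Grid("coarser", (-71, -4), (192, 174), 10 / 60)

print(can_stack([a, b]))     # True
print(can_stack([a, b, c]))  # False
```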