How to use data cleaning, EOO estimation and niche of occurrence features in ModestR
Describes data cleaning; the EOO estimation methods supported in ModestR (convex hull, alpha shape, kernel density); and the niche of occurrence in ModestR.
MODESTR QUICK TUTORIALS HTTP://WWW.IPEZ.ES/MODESTR/
Step by step tutorial: Using data cleaning and EOO
estimation in ModestR
What do you need for this tutorial:
1. ModestR v2.0
2. Environmental data already integrated in ModestR
3. Internet connection
4. About 30 minutes
ModestR software can be freely downloaded from http://www.ipez.es/ModestR
We’ll describe how to clean occurrence data in ModestR and how to estimate the Extent Of Occurrence (EOO) using different hulling algorithms, as well as the niche of occurrence approach. Follow the next steps!
Let’s suppose you want to download occurrence records from GBIF. To do so, select File/Import/Samples from online GBIF database
Other options also exist, such as importing occurrence data from a CSV file, from KML files, shapefiles, etc. All those options are explained in the ModestR tutorial available on the ModestR website.
You can also retrieve a map stored in a ModestR database, as explained in the step-by-step tutorial “How to create a ModestR Database”, available on the ModestR website.
In the dialog box that appears, we will enter the species name “isurus paucus”.
Type the species name “isurus paucus” in the
textbox.
Click on the Accept button.
The first step will download all synonyms of the species found in the GBIF online database. For this example, just accept downloading data for all of them and continue.
Click on the Continue button to accept
downloading occurrence data
including all synonyms of the species.
This list shows all synonyms of the
species found in the GBIF online database.
If you wish, you can select which synonyms to include when downloading occurrence data. By default, all are included.
The next step queries the GBIF database for the number of occurrence records available for the selected species. It displays this information and asks you to confirm that you want to continue downloading. The occurrence records are then downloaded; this may take a few minutes. When the download has finished, click on Accept and continue.
Just click on the Accept button to
continue downloading occurrence data from
GBIF.
Select the valid habitats for the species. In this case, it is the Sea habitat. Once imported, MapMaker will display a brief summary of the imported, valid and invalid samples.
Check the Sea habitat and click on the Accept button to continue.
There are other options to clean imported data. For now we’ll just use the default settings, but all those options are explained in the ModestR Tutorial available on the ModestR website.
By default samples will be autochecked. That is, ModestR will automatically check the validity of the samples with regard to the habitat.
Now the imported data are shown on the map. By default samples are automatically checked against the species’ valid habitats and displayed using different colors (in this example, usually all samples of Isurus paucus are correctly located).
Let’s suppose you want to automatically clean the current data by detecting and removing outliers. To do that, click on the Process/Data cleaning/Automatic environmental-based cleaning menu item.
Click on Process/Data cleaning/Automatic
environmental-based cleaning
The first step is to select which environmental variables will be used to perform data cleaning. Environmental data must have been integrated beforehand by importing ASC raster files into ModestR. This is explained in the ModestR Quick start tutorial.
Besides the environmental variables you may have integrated in ModestR, there are another three variables, applicable only to samples, that allow data cleaning using geographical dispersion criteria (see the ModestR Quick start tutorial for more details).
Note: for more details on how to integrate environmental data in ModestR, see the step-by-step tutorials or the Quick start tutorial.
For this example we will select all “Dispersal capacity” variables. You can also select another variable if you have it integrated in ModestR (for this example, you should select a variable with data for marine areas). Then click on the Continue button.
In the next step, ModestR will detect the value of each variable at the positions of the current samples. It will then apply different outlier detection methods to determine acceptance ranges for each variable, and select the “best” of them by default (the best being the one that excludes the fewest data). You can manually select another method, or even set the range limits by hand using the Custom method. This and the other settings you can configure in this step are explained in more detail in the ModestR Quick start tutorial.
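The selection rule described here can be sketched in a few lines of Python. This is only an illustration and not ModestR's own code: the two outlier rules (interquartile range and mean plus or minus two standard deviations), the function names and the temperature values are all assumptions made for the example.

```python
import statistics

def iqr_range(values, k=1.5):
    """Acceptance range from the interquartile range (Tukey's rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    return q1 - k * spread, q3 + k * spread

def sigma_range(values, k=2.0):
    """Acceptance range as mean +/- k standard deviations."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return mean - k * sd, mean + k * sd

def count_excluded(values, limits):
    """Number of values falling outside the (low, high) limits."""
    low, high = limits
    return sum(1 for v in values if v < low or v > high)

def best_method(values, methods):
    """Return (name, limits) of the method that excludes the fewest values."""
    ranges = {name: fn(values) for name, fn in methods.items()}
    name = min(ranges, key=lambda n: count_excluded(values, ranges[n]))
    return name, ranges[name]

# Hypothetical variable values (e.g. temperature) read at the sample positions.
temps = [18.2, 19.1, 18.7, 20.3, 19.8, 18.9, 19.5, 2.1]
methods = {"iqr": iqr_range, "sigma": sigma_range}  # assumed method set
name, (low, high) = best_method(temps, methods)
flagged = [t for t in temps if t < low or t > high]
print(name, round(low, 2), round(high, 2), flagged)
```

With these values both rules reject only the 2.1 reading, so the sample at that position would be the one marked as invalid.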
You can select any variable in the list to see/modify the validation mode (i.e. outlier detection mode) that will be used to detect outliers for this variable.
For this example we will just use defaults.
Click on the Continue button.
Once the data cleaning task has finished, a brief summary will be shown, indicating the number of processed samples and how many were considered invalid (cleaned).
Click on the Accept button.
You can also save a detailed report of the cleaning process to a CSV file, easily readable using a spreadsheet program such as Excel.
In the generated report you can check the calculated acceptance ranges for each environmental variable used, the samples cleaned (invalidated), and the rule that caused each invalidation. You can therefore see which environmental variable was out of range at the sample location and led to its invalidation.
Note: this report only contains the trace of the cleaning task on samples (occurrences). It doesn’t keep a trace of the cleaning on areas, if there is any.
As you can see, cleaned samples are not permanently deleted. They are marked using a placemark and another color. More importantly, even though they are preserved, they will not be taken into account for any subsequent task (presence calculation, hulling, EOO estimation, etc.). So why are they preserved? First of all, because you may later decide to re-validate those samples. And second, because this way you can keep a trace of the original data and of the cleaning process.
Cleaned samples are marked by default using a placemark and another color.
Now we’ll see how you can estimate the Extent Of Occurrence in MapMaker using different algorithms. To do that, click on the Process/Hull transformation menu item. Let’s start with the Convex Hull option.
Select the Convex Hull option.
By default, the visual simulation option is selected, so you can preview the resulting hull as a visual template without modifying the distribution data.
The convex hull calculation can also include areas, but this option is NOT recommended and is error-prone, so it should be avoided unless you clearly understand its implications. See the ModestR Quick start tutorial for more details.
In ModestR a visual template is a polygonal shape that is
shown on the map, but that doesn’t affect presence data. It is just for displaying purposes, and it will not be stored with
the map. It is usually shown as a semi transparent shape over
the map.
Just click on the OK button. A visual template will appear on the map, as a preview of the EOO calculated using convex hull.
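To make the idea of a convex hull concrete, here is a minimal Python sketch using Andrew's monotone chain algorithm. It is not how ModestR is implemented (the tutorial does not say); the coordinates are hypothetical and are treated as plain planar x/y values, so the area is in squared coordinate units rather than km².

```python
def cross(o, a, b):
    """Z-component of the cross product OA x OB (positive for a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other one.
    return lower[:-1] + upper[:-1]

def polygon_area(vertices):
    """Shoelace formula for the area of a simple polygon."""
    total = 0.0
    for i, (x1, y1) in enumerate(vertices):
        x2, y2 = vertices[(i + 1) % len(vertices)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

# Hypothetical occurrence coordinates (x, y).
occurrences = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 1), (1, 2)]
hull = convex_hull(occurrences)
print(hull, polygon_area(hull))
```

The two interior points do not appear in the hull: a convex hull is always a single shape that encloses every occurrence.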
If you want to really add a presence area to the map, just change the option to “Add as presence area”.
Change the option to “Add as presence area” and click on the OK button.
A presence area will be added to the map. Unlike visual templates, this area is added to the map and will be considered as an area of presence of the species.
As you can see, ModestR only adds the presence area in the species’ currently valid habitats.
In this example, the area is only added on the sea areas, because it’s the only habitat we indicated as valid for the species when we imported occurrences from GBIF (see previous steps).
Well, as we want to test other EOO estimation methods, we’ll undo last changes to remove the convex hull presence area we have added.
Click on the Undo button to go back to the map as it was before the convex hull was added, with occurrences only.
Now we’ll test another method to estimate the Extent Of Occurrence in MapMaker. Go to the Process/Hull transformation menu item and now click on the Alpha shape option.
Select the Alpha shape option.
As with convex hull, the visual simulation option is selected by default, so you can preview the resulting hull as a visual template without modifying the distribution data.
Just click on the OK button. A visual template will appear on the map, as a preview of the EOO calculated using alpha shape.
As you can see, while with convex hull we’ll obtain a single shape that contains all occurrences, with alpha shape we can obtain several separated shapes.
Alpha shape is more “fine-grained” in this way than convex hull.
The number and size of the shapes depend on the alpha parameter.
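The following Python sketch illustrates why an alpha shape can split into several shapes. It is a brute-force illustration for a handful of points, not ModestR's algorithm: it keeps the Delaunay triangles (those whose circumscribed circle contains no other point) whose circumradius is at most `max_radius`. Expressing the alpha parameter as a maximum circumradius is an assumption made here; the tutorial does not define how ModestR parameterises alpha. The coordinates are hypothetical.

```python
from itertools import combinations
import math

def circumcircle(a, b, c):
    """Return (center, radius) of the circle through a, b, c; None if collinear."""
    d = 2 * (a[0] * (b[1] - c[1]) + b[0] * (c[1] - a[1]) + c[0] * (a[1] - b[1]))
    if abs(d) < 1e-12:
        return None
    a2, b2, c2 = (p[0] ** 2 + p[1] ** 2 for p in (a, b, c))
    ux = (a2 * (b[1] - c[1]) + b2 * (c[1] - a[1]) + c2 * (a[1] - b[1])) / d
    uy = (a2 * (c[0] - b[0]) + b2 * (a[0] - c[0]) + c2 * (b[0] - a[0])) / d
    return (ux, uy), math.dist((ux, uy), a)

def alpha_triangles(points, max_radius):
    """Delaunay triangles (as index triples) with circumradius <= max_radius."""
    kept = []
    for tri in combinations(range(len(points)), 3):
        circle = circumcircle(*(points[i] for i in tri))
        if circle is None:
            continue
        center, radius = circle
        if radius > max_radius:
            continue
        # Delaunay condition: no other point strictly inside the circumcircle.
        if all(math.dist(center, points[k]) >= radius - 1e-9
               for k in range(len(points)) if k not in tri):
            kept.append(tri)
    return kept

def count_shapes(triangles):
    """Count groups of triangles connected through shared vertices (union-find)."""
    parent = {}
    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for a, b, c in triangles:
        parent[find(a)] = find(b)
        parent[find(b)] = find(c)
    return len({find(v) for v in list(parent)})

# Two hypothetical clusters of occurrences, far apart from each other.
points = [(0, 0), (1, 0), (0.5, 0.8), (1.3, 0.9),
          (10, 10), (11, 10), (10.4, 10.9)]
print(count_shapes(alpha_triangles(points, max_radius=1.5)))       # tight threshold
print(count_shapes(alpha_triangles(points, max_radius=math.inf)))  # no threshold
```

With the tight threshold the two clusters stay separate, while removing the threshold joins everything into one shape, as a convex hull would.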
For this example, we’ll not add a presence area (the procedure would be the same as for convex hull). We’ll just close the alpha shape dialog box and clear the visual templates that were added.
Use this button to clear visual templates from the map
We’ll now test the last method to estimate the Extent Of Occurrence in MapMaker. Go to the Process/Hull transformation menu item and now click on the Kernel density estimation option.
Select Kernel density estimation option
The operation is almost the same as for the previously explained options. By default, the visual simulation option is selected, so you can preview the resulting EOO as a visual template without modifying the distribution data.
Just click on the OK button. A visual template will appear on the map, as a preview of the EOO calculated using Kernel density.
The kernel density method is based on the density of occurrences of a species.
The kernel density “smoothing” parameter can be modified, even if the default value has been calculated to perform well in most situations.
The cell width parameter controls the precision used to calculate densities in the map. A value of 5’ is a good compromise between precision and performance. For a fast preview, you can increase this value (e.g. to 20’).
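The role of the two parameters can be illustrated with a small Python sketch. It evaluates a Gaussian kernel density on a regular grid; the Gaussian kernel, the base bandwidth of one degree, the function name and the coordinates are assumptions for the example and are not taken from ModestR.

```python
import math

def density_grid(points, bounds, cell_minutes=5.0, bandwidth_deg=1.0, smoothing=1.0):
    """Gaussian kernel density evaluated at the centre of each grid cell.

    cell_minutes is the cell width in arc-minutes (60 arc-minutes = 1 degree);
    smoothing multiplies the (assumed) base bandwidth.
    """
    min_x, min_y, max_x, max_y = bounds
    cell = cell_minutes / 60.0
    h = bandwidth_deg * smoothing
    ncols = max(1, round((max_x - min_x) / cell))
    nrows = max(1, round((max_y - min_y) / cell))
    norm = 1.0 / (2.0 * math.pi * h * h * len(points))
    grid = []
    for r in range(nrows):
        cy = min_y + (r + 0.5) * cell
        row = []
        for c in range(ncols):
            cx = min_x + (c + 0.5) * cell
            total = sum(math.exp(-((cx - x) ** 2 + (cy - y) ** 2) / (2.0 * h * h))
                        for x, y in points)
            row.append(norm * total)
        grid.append(row)
    return grid

# Hypothetical occurrences: two close together, one isolated.
points = [(1.0, 1.0), (1.2, 1.1), (3.0, 3.0)]
bounds = (0.0, 0.0, 4.0, 4.0)
fine = density_grid(points, bounds, cell_minutes=5)     # 48 x 48 cells
coarse = density_grid(points, bounds, cell_minutes=20)  # 12 x 12 cells
print(len(fine) * len(fine[0]), len(coarse) * len(coarse[0]))
```

Going from 5’ to 20’ cells divides the number of cells to evaluate by 16, which is why a larger cell width gives a faster but coarser preview.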
Besides calculating EOO, kernel density estimation can be useful to visualize the density of the occurrences of a species. MapMaker can easily show a kernel density map. To do that, click on the View/Show kernel density map as raster menu item.
Select View/Show kernel density map as raster
Put plainly, a raster is a bitmap image. That is, an image composed from a
matrix or grid where each cell has a specific color.
MapMaker can calculate kernel density and show it as a raster on the map.
The operation is similar to the one previously explained for kernel density-based EOO. Just adjust the parameters and click on Apply. The density map will be calculated and shown on the map without modifying the distribution data.
Just click on the Apply button. A visual template will appear on the map, as a preview of the EOO calculated using Kernel density.
The kernel density map is based on the density of occurrences of a species.
The kernel density “smoothing” parameter can be modified to produce more “appealing” and visual maps. Here we increased it to x5.
The cell width parameter controls the precision used to calculate densities in the map. A value of 5’ is a good compromise between precision and performance. For a fast preview, you can increase this value (e.g. to 20’).
You can export this map as an image, perhaps after hiding the samples first, for example.
Then export the current view (which includes the density map) as an image.
You can temporarily hide the samples from the map using these checkboxes, to get a clearer view of the density map.
To clear the density map, you just have to use the Clear visual raster button.
Use the Clear visual raster button to remove any raster from the map (the kernel density map is a raster).
So far we have seen several methods to calculate EOO and add it to the map. But ModestR can refine this approach by applying environmental filtering to the potential EOO. That’s what is called the niche of occurrence in ModestR. Let’s see how to proceed:
Select Process/Niche calculation/Niche of occurrence option
The first step to calculate niche of occurrence is choosing the method to be used to calculate potential EOO. The available methods are the ones we have explained previously: convex hull, alpha shape, and kernel density.
Select the method to be used to calculate potential EOO. In this example, we are using alpha shape.
You can use the Preview button to see a visual
template of the potential EOO on the map.
Then click on the Continue button
In the next step we’ll select the environmental variables to be used to calculate the niche of occurrence. This requires that you have previously integrated environmental data into ModestR; to do so, you can follow our tutorial “How integrate and use environmental data in ModestR”, available on our website.
Select the environmental variables to be used to calculate niche of occurrence.
Environmental variables will be used to determine acceptable ranges of each variable for the species.
This is done by detecting the values of those variables where there are occurrences of the species.
Then click on the Continue button
In the next step the % coverage of each variable is shown. This is the % of the EOO area for which data exist for this variable. Too low a coverage indicates that a variable may not be suitable for calculating the niche of occurrence of this species, because there are not enough data for this variable within the EOO. You can also optionally calculate the VIF (variance inflation factor) of the selected variables to detect possible multicollinearity and, if necessary, remove some variables.
Click on the Continue button to go to the next step
To calculate VIF, click on this button. VIF will be shown in the list, beside each variable. You may then decide to remove a variable that has a high VIF by selecting it and using this button.
Remember that this step is optional. You can just continue without calculating VIF or modifying the selected variable list.
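For readers unfamiliar with VIF, the sketch below computes it in plain Python using a standard identity: the VIF of each variable is the corresponding diagonal element of the inverse of the correlation matrix. The variable names and values are hypothetical, and this is not necessarily how ModestR computes it internally.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting (small matrices only)."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def vif(columns):
    """VIF per variable: diagonal of the inverse correlation matrix."""
    names = list(columns)
    corr = [[pearson(columns[a], columns[b]) for b in names] for a in names]
    inverse = invert(corr)
    return {name: inverse[i][i] for i, name in enumerate(names)}

# Hypothetical values of three variables at six occurrence positions.
variables = {
    "temp_mean": [10, 12, 14, 16, 18, 20],
    "temp_max": [12.1, 13.9, 16.2, 17.8, 20.1, 21.9],  # nearly temp_mean + 2
    "salinity": [35.0, 34.4, 34.9, 35.2, 34.3, 35.1],
}
result = vif(variables)
for name, value in result.items():
    print(f"{name}: VIF = {value:.1f}")
```

The two temperature variables carry almost the same information, so both get a very large VIF, whereas salinity stays close to 1; in that situation one of the temperature variables is a natural candidate for removal.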
In the next step you can optionally calculate a contribution index for the selected variables. This contribution index measures the relative importance of each variable in determining the presence/absence of the species within its potential EOO. Once it is calculated, you can set a minimal contribution % to achieve; ModestR will then automatically select the variables with the highest contributions until this % is reached (more details about the contribution index calculation can be found in the ModestR documentation).
Click on the Continue button to go to the next step
To calculate the contribution index, click on this button. It will be shown in the list, beside each variable. You may then set a minimal contribution % to achieve, using this checkbox and entering the desired %. In this case, the selected variables will appear highlighted in the list (here in green).
Remember that this step is optional. You can just continue
without calculating variable contribution. In this case all
variables will be used to calculate niche of occurrence.
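The automatic selection by minimal contribution % can be pictured as a simple cumulative rule, sketched below in Python. The contribution values and variable names are hypothetical, and how ModestR computes the index itself is not covered here (see the ModestR documentation).

```python
def select_by_contribution(contributions, minimum_percent):
    """Take variables in descending order of contribution until the
    cumulative contribution reaches minimum_percent."""
    selected, cumulative = [], 0.0
    ranked = sorted(contributions.items(), key=lambda kv: kv[1], reverse=True)
    for name, value in ranked:
        if cumulative >= minimum_percent:
            break
        selected.append(name)
        cumulative += value
    return selected, cumulative

# Hypothetical contribution indexes (in %) for four variables.
contributions = {"sst": 45.0, "salinity": 30.0, "depth": 15.0, "chlorophyll": 10.0}
selected, reached = select_by_contribution(contributions, minimum_percent=70.0)
print(selected, reached)
```

Here a minimal contribution of 70% is reached with the two top-ranked variables, so the remaining two would not be highlighted.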
In the last step you can set how each variable will be used to determine the niche of occurrence. By default, a tolerance of 1% is allowed on the range detected for each environmental variable. This range is determined from the values of each variable in the current presence areas (in this case, that is where there are occurrences of the species). More details about the available settings can be found in the ModestR documentation.
Click on the Continue button to start niche of occurrence calculation
Selecting a variable in the left list, you can see the allowed range on the right panel (minimum and maximum accepted values). This range results from applying the set tolerance % to the range detected using the current presence data for the species.
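As a rough illustration of how a tolerance can widen a detected range, consider the Python sketch below. The exact rule ModestR applies is not stated in this tutorial; here we assume that each end of the range is extended by the tolerance % of the range width, and the values are hypothetical.

```python
def accepted_range(values, tolerance_percent=1.0):
    """Detected min/max widened at both ends by tolerance_percent of the
    range width (assumed rule, for illustration only)."""
    low, high = min(values), max(values)
    margin = (high - low) * tolerance_percent / 100.0
    return low - margin, high + margin

# Hypothetical variable values found where the species occurs.
observed = [14.0, 18.5, 24.0]
low, high = accepted_range(observed, tolerance_percent=1.0)
print(low, high)
```

With a detected range of 14 to 24 and a 1% tolerance, the accepted range under this assumption becomes roughly 13.9 to 24.1.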
Once the niche of occurrence has been calculated, you’ll see new presence areas added to the map.
New presence areas resulting from the niche of occurrence calculation are added to the map.
Those areas result from the selected EOO (alpha shape in this example), where areas that don’t comply with environmental restrictions (that is, where some variable has a value out of
the acceptance range) have been removed.
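The filtering idea can be sketched in Python as follows. The grid-cell representation, the cell identifiers, the variable names and the ranges are hypothetical, and dropping cells that lack a value for a variable is an assumption of this sketch rather than documented ModestR behaviour.

```python
def filter_cells(cells, ranges):
    """Keep the ids of cells where every variable lies within its accepted range.

    Cells with no value for a variable are dropped (assumption of this sketch).
    """
    kept = []
    for cell in cells:
        env = cell["env"]
        if all(name in env and low <= env[name] <= high
               for name, (low, high) in ranges.items()):
            kept.append(cell["id"])
    return kept

# Hypothetical cells of a potential EOO with their environmental values.
eoo_cells = [
    {"id": "A1", "env": {"sst": 19.0, "salinity": 35.0}},
    {"id": "A2", "env": {"sst": 9.5, "salinity": 35.1}},   # too cold
    {"id": "A3", "env": {"sst": 21.0, "salinity": 31.0}},  # salinity too low
    {"id": "A4", "env": {"sst": 20.0}},                    # no salinity data
]
ranges = {"sst": (13.9, 24.1), "salinity": (33.5, 36.5)}
print(filter_cells(eoo_cells, ranges))
```

Only cell A1 satisfies every range, so it is the only part of this potential EOO that would remain in the niche of occurrence.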
At first sight it may seem that there are no differences from the EOO obtained using alpha shape. But if we take a closer look, we’ll see differences:
Using the Zoom tool, we’ll select an area to zoom in on.
As you can see, niche of occurrence results in a filtered EOO, where some areas have been removed because the environmental conditions were out of the determined range.
Here, for example, we can see that some small areas were removed from the potential EOO because some of the environmental variables (among those selected for the niche of occurrence calculation) took values outside the range accepted for this species.
Of course, here we are showing a particular example. The results you’ll obtain will depend on the species and the environmental variables used to calculate the niche of occurrence.
Once the niche of occurrence has been added to the map, you may want to save it into a ModestR database.
Click on this button or go to menu File/Save/To Modestrmaps database to save the current map into a ModestR database.
You can find a step-by-step tutorial about how to create a
ModestR database in the ModestR website.
You can select the database that MapMaker works with in
menu File/Select database. That will be the database
where the maps will be saved by default.
As this is a new map, MapMaker will search in the database for a species with the same name and ask you to confirm the whole taxonomy before saving it.
In this example we created a new map from GBIF data. That’s why MapMaker will
ask you to confirm taxonomy before saving it
into a database.
But if you loaded the map from a database, it will be automatically saved with the current taxonomy.
So far we have shown how to clean data and calculate EOO and niche of occurrence for a single map in MapMaker. This can be useful when we’re working with few maps, or when we want to test several methods and see the results interactively. But if we want to clean or calculate EOO for many maps, it can be very tedious to do it one by one. In that case, you can use DataManager! In DataManager you can do the same things as in MapMaker, but for a whole set of maps: for example, for a full order, a family, etc. All in a single task! (Of course, the more maps that have to be processed, the more time it will take.)
Here, for example, we are adding EOO to all maps of a particular family, calculated using convex hull or alpha shape.
This was the step-by-step tutorial: Using data cleaning and EOO estimation in ModestR
Thank you for your interest.
You can find this and other tutorials in PDF format at http://www.ipez.es/ModestR (see the Documentation section).