
Downloading biodiversity records from iNaturalist

Automatically download biodiversity records from iNaturalist, one of the most widely recognised citizen science initiatives.

iNaturalist is a global platform where naturalists, citizen scientists, and biologists post their observations with photographs. Observations can be curated by the network of users to provide valuable open data to scientific research projects, conservation agencies and the general public. In particular, the data has been used to describe the global distribution of species, address niche-based questions, support biodiversity and ecosystem-based conservation, and to understand correlations between anthropogenic pressures and population extinctions.

Data can be accessed through the website, the mobile application or the API. Here I provide a few lines of Python code that use the API to download multiple records from iNaturalist automatically. In this case, the code retrieves all records for the Parque Marinho Luiz Saldanha, a Marine Protected Area in Portugal.
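Each request to the API is a paged query against the observations endpoint. As a minimal sketch, the query string can be assembled from a dictionary of parameters instead of by string concatenation. The helper name `build_observations_url` is hypothetical (it is not part of any library), and the parameters are the ones used in this post.

```python
from urllib.parse import urlencode

API_ROOT = "https://api.inaturalist.org/v1/observations"


def build_observations_url(place_id, page, per_page=200):
    """Return the URL for one page of species-level observations in a place.

    Hypothetical helper for illustration; the parameter set mirrors the
    query used in this post.
    """
    params = {
        "geo": "true",             # only georeferenced observations
        "identified": "true",      # only observations with an identification
        "place_id": place_id,
        "rank": "species",
        "order": "desc",
        "order_by": "created_at",
        "page": page,
        "per_page": per_page,
    }
    return API_ROOT + "?" + urlencode(params)


print(build_observations_url("128892", 1))
```

Keeping the parameters in a dictionary makes it easier to change the place or the page size without editing a long string.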

# import dependencies
import os
import urllib.request

import pandas as pd
import requests
from pandas import json_normalize

# Get all records for Parque Marinho Luiz Saldanha
outputFolder = '/theFolderToSaveData'
placeID = '128892'
os.makedirs(outputFolder, exist_ok=True)

pages = []      # one data frame per page of results
page = 0
dataSize = 1
while dataSize > 0:
    page = page + 1
    url = ('https://api.inaturalist.org/v1/observations'
           '?geo=true&identified=true&place_id=' + placeID +
           '&rank=species&order=desc&order_by=created_at'
           '&page=' + str(page) + '&per_page=200')
    response = requests.get(url)
    response.raise_for_status()   # stop on HTTP errors
    recs = response.json()['results']
    dataSize = len(recs)
    # An empty page means there are no more records to fetch
    if dataSize > 0:
        pages.append(json_normalize(recs))

# Combine all pages into a single data frame
df = pd.concat(pages, ignore_index=True) if pages else pd.DataFrame()

# Save images to the folder
for x in range(0, len(df)):
    obsID = df['id'].values[x]
    photos = df['photos'].values[x]
    # Skip observations without any photo
    if not isinstance(photos, list) or len(photos) == 0:
        continue
    # The API returns the URL of the square thumbnail of the first photo;
    # the other sizes share the same URL pattern
    imageURL = photos[0]['url']
    imageURLM = imageURL.replace("square", "medium")
    imageURLO = imageURL.replace("square", "original")
    urllib.request.urlretrieve(imageURL, outputFolder+'/'+str(obsID)+'Sq.jpg')
    urllib.request.urlretrieve(imageURLM, outputFolder+'/'+str(obsID)+'M.jpg')
    urllib.request.urlretrieve(imageURLO, outputFolder+'/'+str(obsID)+'O.jpg')

# Prepare pandas data.frame and export to csv
print(list(df.columns))   # inspect the available fields
# 'location' holds "latitude,longitude" as a single string
coords = df["location"].str.split(",", expand=True)
dfLatitude = coords[0]
dfLongitude = coords[1]
df1 = df[['id', 'taxon.name', 'quality_grade', 'time_observed_at',
          'species_guess', 'license_code', 'observed_on',
          'community_taxon_id']]
df1 = pd.concat([df1, dfLongitude, dfLatitude], axis=1)
df1.columns = ['id', 'taxon.name', 'quality_grade', 'time_observed_at',
               'species_guess', 'license_code', 'observed_on',
               'community_taxon_id', 'Longitude', 'Latitude']
# Note: the file is tab-separated despite the .csv extension
df1.to_csv(outputFolder+'/dataFrame.csv', sep='\t', header=True)
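The image and coordinate steps of the script both rely on simple string handling, which can be isolated and checked without calling the API. The sketch below is one possible way to do that. The function names `photo_size_urls` and `parse_location` are hypothetical, and the example URL and coordinates are made up for illustration. It assumes, as the script does, that photo sizes differ only by the size name in the file name and that `location` is a "latitude,longitude" string.

```python
def photo_size_urls(square_url, sizes=("square", "medium", "original")):
    """Map each size name to a URL, swapping the size in the file name only.

    Hypothetical helper: replacing only in the last path segment avoids
    touching a "square" that might appear elsewhere in the URL.
    """
    head, _, filename = square_url.rpartition("/")
    return {size: head + "/" + filename.replace("square", size, 1)
            for size in sizes}


def parse_location(location):
    """Split a "latitude,longitude" string into a pair of floats."""
    latitude, longitude = location.split(",")
    return float(latitude), float(longitude)


# Made-up example values, for illustration only
example_url = "https://example.org/photos/123/square.jpg"
print(photo_size_urls(example_url)["medium"])
print(parse_location("38.45,-9.02"))
```

Converting the coordinates to floats at this stage also means later numeric work on the exported table does not have to parse strings again.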

biodiversityDS.

Jorge Assis [PhD, Associate Researcher]
Centre of Marine Sciences, University of Algarve [Faro, Portugal]
© 2023 Biodiversity Data Science, All Rights Reserved