Choosing an atlas

Martin Westgate, Dax Kellie

2023-01-12

The GBIF network consists of a series of ‘node’ organisations that collate biodiversity data from their own countries, with GBIF acting as an umbrella organisation to store data from all nodes. Several nodes have their own APIs, often built from the ‘living atlas’ codebase developed by the ALA. At present, galah supports the following functions and atlases:

Set Organisation

Set which atlas you want to use by changing the atlas argument in galah_config(). The atlas argument accepts a full name, an acronym, or a region, all of which are listed by show_all(atlases). Once a value is provided, galah’s server configuration updates automatically to the selected atlas. The default atlas is Australia.

If you intend to download records, you may need to register a user profile with the relevant atlas first.

galah_config(atlas = "GBIF.es", email = "your_email_here")
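As a sketch of the selection options described above, the calls below choose the United Kingdom atlas first by region and then by acronym. The acronym "NBN" is taken from the selection message that galah prints for this atlas; that it is accepted in exactly this form is an assumption, so check it against the output of show_all(atlases).

```r
library(galah)

# List the supported atlases; this table holds the names, acronyms
# and regions that the `atlas` argument is documented to accept
show_all(atlases)

# Select an atlas by region
galah_config(atlas = "United Kingdom")

# Select the same atlas by acronym
# (assumed to be "NBN"; confirm against the table above)
galah_config(atlas = "NBN")

# Return to the default atlas
galah_config(atlas = "Australia")
```

Resetting the atlas at the end of a session like this avoids later queries being sent to an atlas you did not intend.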

Look up Information

You can use the same look-up functions to find useful information about the atlas you have set. The information available may vary between Living Atlases.

galah_config(atlas = "Guatemala")
## Atlas selected: Sistema Nacional de Información sobre Diversidad Biológica de Guatemala (SNIBgt) [Guatemala]
show_all(datasets)
## # A tibble: 100 × 3
##    name                                                                                                               uri   uid  
##    <chr>                                                                                                              <chr> <chr>
##  1 A Distribution and Taxonomic Reference Dataset of Geranium (Geraniaceae) in the New World                          http… dr321
##  2 A global database for the distributions of crop wild relatives                                                     http… dr12 
##  3 A matrix-based revision of the genus Hypogena Dejean, 1834 (Coleoptera Tenebrionidae)                              http… dr467
##  4 A new Mexican species of Chrysina Kirby (Coleoptera: Scarabaeidae: Rutelinae), with nomenclatural changes, new re… http… dr554
##  5 A new species of Setostylus Matile and new records of Setostylus bellulus (Williston) (Diptera: Keroplatidae)      http… dr538
##  6 A revision of the genus Bromeloecia Spuler (Diptera: Sphaeroceridae: Limosininae)                                  http… dr481
##  7 A synopsis of American Caraphia Gahan, 1906 (Coleoptera: Cerambycidae: Lepturinae) with description of two new sp… http… dr539
##  8 A synopsis of the Neotropical genus Protoneura (Odonata: Coenagrionidae)                                           http… dr448
##  9 A systematic revision of the genus Archocentrus (Perciformes: Cichlidae), with the description of two new genera … http… dr362
## 10 A taxonomic monograph of the genus Tylodinus Champion (Coleoptera: Curculionidae: Cryptorhynchinae: Tylodina) of … http… dr564
## # … with 90 more rows
show_all(fields)
## # A tibble: 129 × 4
##    id                   description                                                                                   type  link 
##    <chr>                <chr>                                                                                         <chr> <chr>
##  1 all_image_url        Image URLs for this record                                                                    fiel… <NA> 
##  2 assertion_user_id    User ID of the person who has made an assertion about this record                             fiel… <NA> 
##  3 assertions           A list of all assertions (user and system supplied) for a record resulting from data quality… fiel… <NA> 
##  4 assertions_missing   Assertion indicating missing field values for a record. E.g. missing basis of record          fiel… <NA> 
##  5 assertions_passed    Assertion indicating a data quality test has been passed by this record                       fiel… <NA> 
##  6 assertions_unchecked Assertion indicating a data quality test was not performed for this record typically due to … fiel… <NA> 
##  7 basis_of_record      The basis of record e.g. Specimen, Observation http://rs.tdwg.org/dwc/terms/basisOfRecord     fiel… <NA> 
##  8 catalogue_number     http://rs.tdwg.org/dwc/terms/catalogNumber                                                    fiel… <NA> 
##  9 class                The class the Atlas has matched this record to in the NSL http://rs.tdwg.org/dwc/terms/class  fiel… <NA> 
## 10 collection_code      The collection code for this record. This will be populated if the data has come from a muse… fiel… <NA> 
## # … with 119 more rows
search_all(fields, "year")
## # A tibble: 3 × 4
##   id              description                                                                                         type  link 
##   <chr>           <chr>                                                                                               <chr> <chr>
## 1 year            http://rs.tdwg.org/dwc/terms/year                                                                   fiel… <NA> 
## 2 occurrence_year Year ranges for a search. Calculated based on the unique values for a query.                        fiel… <NA> 
## 3 date_precision  The precision of the date information for the record. Values include 'Day', 'Month', 'Year', 'Year… fiel… <NA>
search_taxa("lagomorpha")
## # A tibble: 1 × 12
##   search_term taxon_concept_id scientific_name scientific_name_authorship rank  kingdom  phylum  class order family genus species
##   <chr>       <chr>            <chr>           <chr>                      <chr> <chr>    <chr>   <chr> <chr> <chr>  <chr> <chr>  
## 1 lagomorpha  785              Lagomorpha      <NA>                       order Animalia Chorda… Mamm… Lago… <NA>   <NA>  <NA>

Download data

You can build queries as you normally would in galah. For taxonomic queries, use search_taxa() to make sure your searches are returning the correct taxonomic data.

galah_config(atlas = "United Kingdom")
## Atlas selected: National Biodiversity Network (NBN) [United Kingdom]
search_taxa("vlps")   # Returns no data due to misspelling
## # A tibble: 1 × 1
##   search_term
##   <chr>      
## 1 vlps
search_taxa("vulpes") # Returns data
## # A tibble: 1 × 13
##   search_term taxon_concept_id scientific_name scientific_name_au…¹ rank  kingdom super…² order class genus phylum family species
##   <chr>       <chr>            <chr>           <chr>                <chr> <chr>   <chr>   <chr> <chr> <chr> <chr>  <chr>  <chr>  
## 1 vulpes      NBNSYS0000138878 Vulpes          Frisch, 1775         genus Animal… Tetrap… Carn… Mamm… Vulp… Chord… Canid… <NA>   
## # … with abbreviated variable names ¹​scientific_name_authorship, ²​superclass
galah_call() |>
  galah_identify("vulpes") |>
  galah_filter(year > 2010) |>
  atlas_counts()
## # A tibble: 1 × 1
##   count
##   <int>
## 1 98000
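The two search_taxa() results printed above differ in shape: the misspelled term returns only a search_term column, whereas the successful search also returns taxon_concept_id. A minimal sketch of a check built on that difference follows. Note that taxa_matched() is a hypothetical helper, not a galah function, and the data frames are stand-ins that mimic the printed output rather than live queries.

```r
# Hypothetical helper (not part of galah): returns TRUE for each row of a
# search_taxa()-style result that carries a taxon identifier.
taxa_matched <- function(result) {
  # A failed search has no `taxon_concept_id` column at all
  if (!"taxon_concept_id" %in% names(result)) {
    return(rep(FALSE, nrow(result)))
  }
  # Otherwise a row counts as matched when its identifier is not missing
  !is.na(result$taxon_concept_id)
}

# Stand-ins that mimic the two results shown above (assumed shapes)
no_match <- data.frame(search_term = "vlps")
match <- data.frame(
  search_term = "vulpes",
  taxon_concept_id = "NBNSYS0000138878"
)

taxa_matched(no_match)
taxa_matched(match)
```

Running a check of this kind before galah_identify() makes it less likely that a misspelled name silently changes what a query returns.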

Download species occurrence records from other atlases with atlas_occurrences().

galah_config(atlas = "Guatemala")
## Atlas selected: Sistema Nacional de Información sobre Diversidad Biológica de Guatemala (SNIBgt) [Guatemala]
galah_call() |>
  galah_identify("Lagomorpha") |>
  galah_filter(year <= 1980) |>
  galah_select(taxon_name, year) |>
  atlas_occurrences()
## # A tibble: 39 × 2
##    scientificName                                         year
##    <chr>                                                 <dbl>
##  1 Sylvilagus floridanus (J. A. Allen, 1890)              1968
##  2 Sylvilagus brasiliensis subsp. truei (J. Allen, 1890)  1947
##  3 Sylvilagus floridanus (J. A. Allen, 1890)              1968
##  4 Sylvilagus floridanus (J. A. Allen, 1890)              1968
##  5 Sylvilagus floridanus subsp. aztecus (J. Allen, 1890)  1924
##  6 Sylvilagus floridanus subsp. aztecus (J. Allen, 1890)  1924
##  7 Sylvilagus floridanus (J. A. Allen, 1890)              1906
##  8 Sylvilagus floridanus (J. A. Allen, 1890)              1906
##  9 Sylvilagus floridanus (J. A. Allen, 1890)              1960
## 10 Sylvilagus brasiliensis subsp. truei (J. Allen, 1890)  1960
## # … with 29 more rows

Complex queries with multiple Atlases

It is also possible to create more complex queries that return data from multiple Living Atlases. As an example, calling galah_config() inside purrr::map() sets each atlas in turn, which lets us return the total number of occurrence records held by each Living Atlas in one table.

library(purrr)
library(tibble)
library(dplyr)
library(gt)

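# Retrieve the table of supported atlases
atlases <- show_all(atlases)

# Switch to each atlas in turn and count all of its records
counts <- map(atlases$region, 
  function(x){
    galah_config(atlas = x)
    atlas_counts()
})

# Combine the counts into one table, largest first
tibble(
  atlas = atlases$region, 
  n = unlist(counts)) |> 
  filter(n > 0) |>
  arrange(desc(n)) |>
  gt() |>
  fmt_number(columns = n)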
atlas                          n
Global          1,803,221,416.00
United Kingdom    207,672,900.00
France            128,490,040.00
Australia         112,998,157.00
Sweden            103,417,222.00
Spain              38,279,454.00
Brazil             23,739,468.00
Portugal           16,043,865.00
Austria             8,075,385.00
Estonia             7,127,695.00
Guatemala           3,586,634.00
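Because the loop above queries every atlas in sequence, an error from a single atlas would stop map() before the table is built. One possible way to guard against this, sketched below in base R, is to wrap each query in tryCatch() and record NA for any atlas that fails. Here count_by_atlas() and fake_counts() are hypothetical names introduced for illustration; the stub stands in for a live query so that the example runs without network access.

```r
# Hypothetical helper: apply `count_fun` to each region, recording NA
# when a query raises an error instead of stopping the whole loop.
count_by_atlas <- function(regions, count_fun) {
  n <- vapply(regions, function(region) {
    tryCatch(
      as.numeric(count_fun(region)),
      error = function(e) NA_real_
    )
  }, numeric(1), USE.NAMES = FALSE)

  out <- data.frame(atlas = regions, n = n, stringsAsFactors = FALSE)
  # Drop failed and empty atlases, then sort from largest to smallest
  out <- out[!is.na(out$n) & out$n > 0, , drop = FALSE]
  out <- out[order(out$n, decreasing = TRUE), , drop = FALSE]
  rownames(out) <- NULL
  out
}

# Stub standing in for a live query (made-up numbers). With galah, the
# equivalent function could plausibly be:
#   function(x) { galah_config(atlas = x); atlas_counts()$count }
fake_counts <- function(region) {
  if (region == "Offline") stop("service unavailable")
  c(Australia = 10, Spain = 30, Empty = 0)[[region]]
}

result <- count_by_atlas(
  c("Australia", "Offline", "Spain", "Empty"),
  fake_counts
)
print(result)
```

Passing the counting function as an argument keeps the error handling separate from the galah calls, so the same helper can be reused with a filtered query in place of a total count.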