The atlas_ functions are used to return data from the atlas chosen using galah_config(). They are:
atlas_counts
atlas_occurrences
atlas_species
atlas_media
atlas_taxonomy
The final atlas_ function, atlas_citation, is unusual in that it does not return any new data. Instead it provides a citation for an existing dataset (downloaded using atlas_occurrences) that has an associated DOI. The other functions are described below.
atlas_counts() provides summary counts on records in the specified atlas, without needing to download all the records.
galah_config(atlas = "Australia")
# Total number of records in the ALA
atlas_counts()
## # A tibble: 1 × 1
## count
## <int>
## 1 112555050
In addition to the filter arguments, it has an optional group_by argument, which provides counts binned by the requested field.
galah_call() |>
galah_group_by(kingdom) |>
atlas_counts()
## # A tibble: 10 × 2
## kingdom count
## <chr> <int>
## 1 Animalia 85432793
## 2 Plantae 23468480
## 3 Fungi 2076156
## 4 Chromista 853644
## 5 Protista 144729
## 6 Bacteria 71362
## 7 Protozoa 3211
## 8 Eukaryota 1340
## 9 Archaea 1106
## 10 Virus 486
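Filtering and grouping can be combined in a single call. The following is a minimal sketch rather than output from the original text: it assumes that galah_filter() (used for the other atlas_ functions in this document) can be piped ahead of galah_group_by(), and that year is a valid field in the chosen atlas. It needs a network connection, and the counts returned will change as the atlas grows.

```r
library(galah)

galah_config(atlas = "Australia")

# Count records from 2020 onwards, binned by year.
# `year` is assumed to be a valid field name in this atlas.
recent_counts <- galah_call() |>
  galah_filter(year >= 2020) |>
  galah_group_by(year) |>
  atlas_counts()

recent_counts
```

As in the kingdom example, the result should be a tibble with one column for the grouping field and one for the count.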
A common use case of atlas data is to identify which species occur in a specified region, time period, or taxonomic group. atlas_species() is similar to search_taxa, in that it returns taxonomic information and unique identifiers in a tibble. Unlike search_taxa, it cannot return information on taxonomic levels other than species, but it is more flexible because it supports filtering:
species <- galah_call() |>
  galah_identify("Rodentia") |>
  galah_filter(stateProvince == "Northern Territory") |>
  atlas_species()
species |> head()
## # A tibble: 6 × 10
## kingdom phylum class order family genus species author species_guid verna…¹
## <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
## 1 Animalia Chordata Mammalia Rodentia Muridae Mesembriomys Mesembriomys gouldii (J.E. Gray, 1843) https://biodivers… Black-…
## 2 Animalia Chordata Mammalia Rodentia Muridae Zyzomys Zyzomys argurus (Thomas, 1889) https://biodivers… Common…
## 3 Animalia Chordata Mammalia Rodentia Muridae Pseudomys Pseudomys hermannsburgensis (Waite, 1896) https://biodivers… Sandy …
## 4 Animalia Chordata Mammalia Rodentia Muridae Notomys Notomys alexis Thomas, 1922 https://biodivers… Spinif…
## 5 Animalia Chordata Mammalia Rodentia Muridae Melomys Melomys burtoni (Ramsay, 1887) https://biodivers… Grassl…
## 6 Animalia Chordata Mammalia Rodentia Muridae Mus Mus musculus Linnaeus, 1758 https://biodivers… House …
## # … with abbreviated variable name ¹vernacular_name
To download occurrence data you will need to specify your email in galah_config(). This email must be associated with an active ALA account. See more information in the config section.
galah_config(email = "your_email@email.com", atlas = "Australia")
Download occurrence records for Eolophus roseicapilla
occ <- galah_call() |>
  galah_identify("Eolophus roseicapilla") |>
  galah_filter(
    stateProvince == "Australian Capital Territory",
    year >= 2010,
    profile = "ALA"
  ) |>
  galah_select(institutionID, group = "basic") |>
  atlas_occurrences()
## Warning: One or more parsing issues, call `problems()` on your data frame for details, e.g.:
## dat <- vroom(...)
## problems(dat)
occ |> head()
## # A tibble: 6 × 9
## decimalLatitude decimalLongitude eventDate scientificName taxonConceptID recor…¹ dataR…² occur…³ insti…⁴
## <dbl> <dbl> <dttm> <chr> <chr> <chr> <chr> <chr> <lgl>
## 1 -35.9 149. 2020-09-12 14:00:00 Eolophus roseicapilla https://biodiversity.… 17f46d… eBird … PRESENT NA
## 2 -35.9 149. 2021-09-27 14:00:00 Eolophus roseicapilla https://biodiversity.… dbb711… eBird … PRESENT NA
## 3 -35.9 149. 2012-01-18 13:00:00 Eolophus roseicapilla https://biodiversity.… 4f7cd7… BirdLi… PRESENT NA
## 4 -35.9 149. 2017-03-17 13:00:00 Eolophus roseicapilla https://biodiversity.… 3236c4… eBird … PRESENT NA
## 5 -35.9 149. 2020-11-14 13:00:00 Eolophus roseicapilla https://biodiversity.… ef2b90… eBird … PRESENT NA
## 6 -35.8 149. 2021-04-02 13:00:00 Eolophus roseicapilla https://biodiversity.… 45a589… eBird … PRESENT NA
## # … with abbreviated variable names ¹recordID, ²dataResourceName, ³occurrenceStatus, ⁴institutionID
In addition to text data describing individual occurrences and their attributes, ALA stores images, sounds and videos associated with a given record. Metadata on these records can be downloaded to R using atlas_media() and the same set of filters as the other data download functions.
media_data <- galah_call() |>
  galah_identify("Eolophus roseicapilla") |>
  galah_filter(
    year == 2020,
    cl22 == "Australian Capital Territory") |>
  atlas_media()
media_data |> head()
## # A tibble: 6 × 20
## decimalLati…¹ decim…² eventDate scien…³ taxon…⁴ recor…⁵ dataR…⁶ occur…⁷ multi…⁸ media…⁹ mime_…˟ size_…˟ date_…˟ date_…˟
## <dbl> <dbl> <dttm> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <int> <chr> <chr>
## 1 -35.6 149. 2020-08-04 01:50:00 Eoloph… https:… 063bb0… iNatur… PRESENT Image 2f4d32… image/… 2654217 2020-0… 2020-0…
## 2 -35.6 149. 2020-08-04 01:50:00 Eoloph… https:… 063bb0… iNatur… PRESENT Image 734074… image/… 2422643 2020-0… 2020-0…
## 3 -35.6 149. 2020-08-04 01:50:00 Eoloph… https:… 063bb0… iNatur… PRESENT Image 89171c… image/… 2212660 2020-0… 2020-0…
## 4 -35.6 149. 2020-08-04 01:50:00 Eoloph… https:… 063bb0… iNatur… PRESENT Image e681d3… image/… 3414736 2020-0… 2020-0…
## 5 -35.5 149. 2020-08-26 01:53:00 Eoloph… https:… 286841… iNatur… PRESENT Image 1295c2… image/… 863158 2021-0… 2021-0…
## 6 -35.5 149. 2020-10-14 02:34:00 Eoloph… https:… 064a39… iNatur… PRESENT Image f97686… image/… 955916 2020-1… 2020-1…
## # … with 6 more variables: height <int>, width <int>, creator <chr>, license <chr>, data_resource_uid <chr>, occurrence_id <chr>,
## # and abbreviated variable names ¹decimalLatitude, ²decimalLongitude, ³scientificName, ⁴taxonConceptID, ⁵recordID,
## # ⁶dataResourceName, ⁷occurrenceStatus, ⁸multimedia, ⁹media_id, ˟mime_type, ˟size_in_bytes, ˟date_uploaded, ˟date_taken
To actually download the media files to your computer, use collect_media().
atlas_taxonomy provides a way to build taxonomic trees from one clade down to another using ALA’s internal taxonomy. Specify which taxonomic level your tree will go down to with galah_down_to.
classes <- galah_call() |>
  galah_identify("chordata") |>
  galah_down_to(class) |>
  atlas_taxonomy()
This is the only function in galah that returns a data.tree rather than a tibble.
## levelName
## 1 Chordata
## 2 ¦--Cephalochordata
## 3 ¦ °--Amphioxi
## 4 ¦--Craniata
## 5 ¦ °--Agnatha
## 6 ¦ ¦--Cephalasipidomorphi
## 7 ¦ °--Myxini
## 8 ¦--Tunicata
## 9 ¦ ¦--Appendicularia
## 10 ¦ ¦--Ascidiacea
## 11 ¦ °--Thaliacea
## 12 °--Vertebrata
## 13 °--Gnathostomata
## 14 ¦--Amphibia
## 15 ¦--Aves
## 16 ¦--Mammalia
## 17 ¦--Pisces
## 18 ¦ ¦--Actinopterygii
## 19 ¦ ¦--Chondrichthyes
## 20 ¦ ¦--Cephalaspidomorphi
## 21 ¦ °--Sarcopterygii
## 22 °--Reptilia
Although the tree format is useful, converting to a data.frame is straightforward.
data.tree::ToDataFrameTypeCol(classes, type = "rank") |> head()
## rank_phylum rank_subphylum rank_superclass rank_informal rank_class
## 1 Chordata Cephalochordata <NA> <NA> Amphioxi
## 2 Chordata Craniata Agnatha <NA> Cephalasipidomorphi
## 3 Chordata Craniata Agnatha <NA> Myxini
## 4 Chordata Tunicata <NA> <NA> Appendicularia
## 5 Chordata Tunicata <NA> <NA> Ascidiacea
## 6 Chordata Tunicata <NA> <NA> Thaliacea
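To see how this conversion behaves without querying the atlas, the sketch below builds a small tree by hand with the data.tree package and flattens it in the same way. The tree is a hypothetical stand-in for the object returned by atlas_taxonomy(), on the assumption that each node there carries a rank field, as the type = "rank" argument above suggests.

```r
library(data.tree)

# Build a small tree by hand; `rank` is a custom field stored on each node.
chordata <- Node$new("Chordata", rank = "phylum")
vertebrata <- chordata$AddChild("Vertebrata", rank = "subphylum")
vertebrata$AddChild("Aves", rank = "class")
vertebrata$AddChild("Mammalia", rank = "class")

# Flatten to one row per leaf, with one column per distinct value of `rank`.
flat <- ToDataFrameTypeCol(chordata, type = "rank")
flat
```

Each column name is the type field plus the rank it holds, which is why the atlas output above has columns such as rank_phylum and rank_class.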
galah
Various aspects of the galah package can be customized. To preserve configuration for future sessions, set profile_path to the location of a .Rprofile file.
To download occurrence records, you will need to provide an email address registered with the ALA. You can create an account on the ALA website. Once an email is registered with the ALA, it should be stored in the config:
galah_config(email = "myemail@gmail.com")
galah can cache most results to local files. This means that if the same code is run multiple times, the second and subsequent iterations will be faster.
By default, this caching is session-based, meaning that the local files are stored in a temporary directory that is automatically deleted when the R session is ended. This behaviour can be altered so that caching is permanent, by setting the caching directory to a non-temporary location.
galah_config(cache_directory = "example/dir")
By default, caching is turned off. To turn caching on, run
galah_config(caching = TRUE)
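The two caching options can be set together. This is a minimal sketch that assumes galah_config() accepts several options in one call; the directory name is the placeholder used above, and it is created first in case galah expects the cache directory to exist already (also an assumption).

```r
library(galah)

# Placeholder path taken from the example above; replace with your own.
cache_dir <- "example/dir"

# Create the directory if it is missing (assumed to be required by galah).
dir.create(cache_dir, recursive = TRUE, showWarnings = FALSE)

# Turn caching on and keep cached files in a non-temporary location.
galah_config(caching = TRUE, cache_directory = cache_dir)
```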
ALA requires that you provide a reason when downloading occurrence data (via the galah atlas_occurrences() function). The reason is set as “scientific research” by default, but you can change this using galah_config(). See show_all_reasons() for valid download reasons.
galah_config(download_reason_id = your_reason_id)
If things aren’t working as expected, more detail (particularly about web requests and caching behaviour) can be obtained by setting the verbose configuration option:
galah_config(verbose = TRUE)