The fqar package

Andrew Gard and Alexia Myers

Introduction

The \({\tt fqar}\) package provides tools for downloading and analyzing floristic quality assessment (FQA) data from universalFQA.org. Two sample data sets, chicago and missouri, are also provided.

Functions in this package fall into four general categories: indexing functions, which produce data frames of current public databases and FQAs from various regions; downloading functions, which download the FQAs themselves; tidying functions, which convert downloaded assessments into a standard format; and analytic functions, which compare species across assessments.

Indexing functions

Each floristic quality assessment is tied to a specific database of native plants that has been compiled by experts in local flora. A listing of all databases accepted by universalFQA.org can be viewed with the index_fqa_databases() function.

library(fqar)

databases <- index_fqa_databases()
head(databases)
#> # A tibble: 6 × 4
#>   database_id region                                                year descr…¹
#>         <dbl> <chr>                                                <dbl> <chr>  
#> 1         206 "Allegheny Plateau, Glaciated"                        2021 Faber-…
#> 2          70 "Appalachian Mtn (EPA Ecoregions 66,67,68,69) of KY…  2013 Gianop…
#> 3         108 "Atlantic Coastal Pine Barrens (8.5.4) "              2017 NEIWPC…
#> 4         136 "Atlantic Coastal Pine Barrens (8.5.4)"               2018 Nature…
#> 5         204 "Atlantic Coastal Pine Barrens (8.5.4)"               2021 Faber-…
#> 6           1 "Chicago Region"                                      1994 Swink,…
#> # … with abbreviated variable name ¹​description

To see a listing of all public floristic quality assessments using a given database, use the index_fqa_assessments() function.

missouri_fqas <- index_fqa_assessments(database_id = 63)
head(missouri_fqas)
#> # A tibble: 6 × 5
#>      id assessment                          date       site              pract…¹
#>   <dbl> <chr>                               <date>     <chr>             <chr>  
#> 1 28176 Nodaway River                       2022-12-08 Nodaway River     CHESTE…
#> 2 27647 MLTNS Prairie                       2022-10-27 MLTNS             Nathan…
#> 3 28275 Schmidt Prairie Species List - 2022 2022-10-27 Schmidt Prairie   Justin…
#> 4 27618 US 60                               2022-10-25 DuckettCreekTP4   Alex T…
#> 5 27518 Seed list                           2022-10-20 Shaw Nature Rese… Mike S…
#> 6 26735 Oakwood Bottoms Survey              2022-08-18 Oakwood Bottoms   Jonath…
#> # … with abbreviated variable name ¹​practitioner

Similarly, the index_fqa_transects() function returns a listing of all public transect assessments using the specified database.

missouri_transects <- index_fqa_transects(database_id = 63)
head(missouri_transects)
#> # A tibble: 6 × 5
#>      id assessment            date       site          practitioner 
#>   <dbl> <chr>                 <date>     <chr>         <chr>        
#> 1  7753 Pinery Plot 91 - 2022 2022-09-13 Pinery (MTNF) Justin Thomas
#> 2  7755 Pinery Plot 8 - 2022  2022-09-13 Pinery (MTNF) Justin Thomas
#> 3  7756 Pinery Plot 10 - 2022 2022-09-13 Pinery (MTNF) Justin Thomas
#> 4  7757 Pinery Plot 97 - 2022 2022-09-13 Pinery (MTNF) Justin Thomas
#> 5  7758 Pinery Plot 58 - 2022 2022-09-13 Pinery (MTNF) Justin Thomas
#> 6  7762 Pinery Plot 87 - 2022 2022-09-13 Pinery (MTNF) Justin Thomas

Downloading functions

Floristic quality assessments can be downloaded individually by ID number with download_assessment(), or in batches matching specified search criteria with download_assessment_list().

The first of these accepts an assessment ID number as its sole input and returns a data frame. For instance, the Grasshopper Hollow survey has assessment_id = 25961 according to the listing obtained using index_fqa_assessments(). The following code downloads this assessment.

grasshopper <- download_assessment(assessment_id = 25961)

Multiple assessments from a specified database can be downloaded simultaneously using download_assessment_list(), which makes use of dplyr::filter syntax on the variables id, assessment, date, site and practitioner. For instance, the following code downloads all assessments performed using the 2015 Missouri database at the Ambrose Farm site.

ambrose <- download_assessment_list(database_id = 63,
                                    site == "Ambrose Farm")

Even mid-sized requests may run slowly because of the limited speed of the universalFQA.org website. For this reason, download_assessment_list() displays a progress bar when the request covers five or more assessments (\(n\ge 5\)).

As the name suggests, the output of download_assessment_list() is a list of data frames.

class(ambrose)
#> [1] "list"
length(ambrose)
#> [1] 3

Transect assessment data stored on universalFQA.org is accessible to analysts using the \({\tt fqar}\) package via the functions download_transect() and download_transect_list(), which work exactly like their counterparts, download_assessment() and download_assessment_list().

rock_garden <- download_transect(transect_id = 6875)
golden <- download_transect_list(database_id = 63,
                                 site == "Golden Prairie")
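
Before starting a slow batch download, it can help to preview which records a filter condition would select. The following is a minimal sketch, assuming the condition is evaluated against the index columns named above (id, assessment, date, site and practitioner). It uses a hand-built stand-in for the first rows of the Missouri transect index printed earlier and base R subset() rather than the package's own code; transect_index and matches are illustrative names, not part of fqar.

```r
# Stand-in for the first rows of index_fqa_transects(database_id = 63),
# copied from the output printed earlier (nothing is downloaded here).
transect_index <- data.frame(
  id = c(7753, 7755, 7756, 7757, 7758, 7762),
  assessment = c("Pinery Plot 91 - 2022", "Pinery Plot 8 - 2022",
                 "Pinery Plot 10 - 2022", "Pinery Plot 97 - 2022",
                 "Pinery Plot 58 - 2022", "Pinery Plot 87 - 2022"),
  date = as.Date(rep("2022-09-13", 6)),
  site = rep("Pinery (MTNF)", 6),
  practitioner = rep("Justin Thomas", 6),
  stringsAsFactors = FALSE
)

# A condition written in terms of the index columns, analogous to the
# expression passed to download_transect_list().
matches <- subset(transect_index,
                  site == "Pinery (MTNF)" & id > 7756)

# The ids that such a condition selects in this stand-in index.
matches$id
```

With a real index, the same condition could be tried on the output of index_fqa_transects() or index_fqa_assessments() to gauge the size of a request before downloading.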

Tidying functions

The data frames obtained from these downloading functions are all highly untidy, respecting the default structure of the website from which they are obtained. The \({\tt fqar}\) package provides tools for efficiently re-formatting these sets.

Each floristic quality assessment on universalFQA.org includes two types of information: details about the species observed during data collection and summary information about the assessment as a whole. The \({\tt fqar}\) functions assessment_inventory() and assessment_glance() extract and tidy these two types of information.

For instance, the following code creates a data frame of species found in the 2021 Grasshopper Hollow survey downloaded earlier.

grasshopper_species <- assessment_inventory(grasshopper)
dplyr::glimpse(grasshopper_species)
#> Rows: 317
#> Columns: 9
#> $ scientific_name <chr> "Acer rubrum var. rubrum", "Acer saccharum subsp. sacc…
#> $ family          <chr> "Sapindaceae", "Sapindaceae", "Asteraceae", "Acoraceae…
#> $ acronym         <chr> "ACERUR", "ACESUG", "ACHMIL", "ACOCAL", "ACTPAC", "AGE…
#> $ nativity        <chr> "native", "native", "native", "non-native", "native", …
#> $ c               <dbl> 5, 5, 1, 0, 8, 2, 5, 4, 4, 0, 2, 7, 6, 4, 5, 4, 8, 5, …
#> $ w               <dbl> 0, 3, 3, -5, 3, 3, -3, 5, 3, -3, 3, -5, 3, 0, 0, 3, 5,…
#> $ physiognomy     <chr> "tree", "tree", "forb", "forb", "forb", "forb", "forb"…
#> $ duration        <chr> "perennial", "perennial", "perennial", "perennial", "p…
#> $ common_name     <chr> "red maple", "sugar maple", "yarrow", "sweet flag", "w…

A tidy summary of the assessment can be obtained with assessment_glance(). The output is a data frame with a single row and 52 columns, including native_mean_c, native_species, and native_fqi.

grasshopper_summary <- assessment_glance(grasshopper)
names(grasshopper_summary)
#>  [1] "title"                     "date"                     
#>  [3] "site_name"                 "city"                     
#>  [5] "county"                    "state"                    
#>  [7] "country"                   "fqa_db_region"            
#>  [9] "fqa_db_publication_year"   "fqa_db_description"       
#> [11] "custom_fqa_db_name"        "custom_fqa_db_description"
#> [13] "practitioner"              "latitude"                 
#> [15] "longitude"                 "weather_notes"            
#> [17] "duration_notes"            "community_type_notes"     
#> [19] "other_notes"               "private_public"           
#> [21] "total_mean_c"              "native_mean_c"            
#> [23] "total_fqi"                 "native_fqi"               
#> [25] "adjusted_fqi"              "c_value_zero"             
#> [27] "c_value_low"               "c_value_mid"              
#> [29] "c_value_high"              "native_tree_mean_c"       
#> [31] "native_shrub_mean_c"       "native_herbaceous_mean_c" 
#> [33] "total_species"             "native_species"           
#> [35] "non_native_species"        "mean_wetness"             
#> [37] "native_mean_wetness"       "tree"                     
#> [39] "shrub"                     "vine"                     
#> [41] "forb"                      "grass"                    
#> [43] "sedge"                     "rush"                     
#> [45] "fern"                      "bryophyte"                
#> [47] "annual"                    "perennial"                
#> [49] "biennial"                  "native_annual"            
#> [51] "native_perennial"          "native_biennial"

The tidy format provided by assessment_glance() is most useful when applied to multiple data sets at once, for instance in the situation where the analyst wants to consider statistics from many different assessments simultaneously. The assessment_list_glance() function provides a shortcut when those data frames are housed in a list like that returned by download_assessment_list(). For instance, the following code returns a data frame with 52 columns and 3 rows, one per assessment.

ambrose_summary <- assessment_list_glance(ambrose)

The \({\tt fqar}\) package also provides functions for handling transect assessment data. transect_inventory(), transect_glance(), and transect_list_glance() work just like their counterparts, assessment_inventory(), assessment_glance(), and assessment_list_glance().

rock_garden_species <- transect_inventory(rock_garden)
rock_garden_summary <- transect_glance(rock_garden)
golden_summary <- transect_list_glance(golden)

Additionally, transect assessments usually include physiognomic metrics like relative frequency and relative coverage. These can be extracted with the transect_phys() function.

rock_garden_phys <- transect_phys(rock_garden)
dplyr::glimpse(rock_garden_phys)
#> Rows: 6
#> Columns: 6
#> $ physiognomy                       <chr> "Native forb", "Native grass", "Nati…
#> $ frequency                         <dbl> 115, 53, 20, 6, 4, 1
#> $ coverage                          <dbl> 628, 413, 180, 125, 78, 1
#> $ relative_frequency_percent        <dbl> 51.6, 23.8, 9.0, 2.7, 1.8, 0.4
#> $ relative_coverage_percent         <dbl> 26.1, 17.2, 7.5, 5.2, 3.2, 0.0
#> $ relative_importance_value_percent <dbl> 38.9, 20.5, 8.3, 4.0, 2.5, 0.2
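
In the six rows printed above, the relative importance value looks like the average of the relative frequency and the relative coverage. The sketch below checks that reading against the printed numbers only. It is an observation about this output, not a documented definition from the package, and rock_garden_phys_demo, computed and max_gap are illustrative names.

```r
# Values copied from the transect_phys() output printed above.
rock_garden_phys_demo <- data.frame(
  relative_frequency_percent = c(51.6, 23.8, 9.0, 2.7, 1.8, 0.4),
  relative_coverage_percent = c(26.1, 17.2, 7.5, 5.2, 3.2, 0.0),
  relative_importance_value_percent = c(38.9, 20.5, 8.3, 4.0, 2.5, 0.2)
)

# Assumption being checked: importance is the mean of the two
# relative measures.
computed <- with(rock_garden_phys_demo,
                 (relative_frequency_percent + relative_coverage_percent) / 2)

# Largest gap between the computed and the reported values; a gap no
# larger than the rounding of the printed figures supports the assumption.
max_gap <- max(abs(computed -
                     rock_garden_phys_demo$relative_importance_value_percent))
max_gap
```

Because the printed values are rounded to one decimal place, the comparison uses a gap rather than exact equality.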

Analytic functions

The \({\tt fqar}\) package provides tools for analyzing species co-occurrence across multiple floristic quality assessments. A typical workflow consists of downloading a list of assessments, extracting inventories from each, then enumerating and summarizing co-occurrences of species of interest.

# Download all public assessments that use the 1995 Southern Ontario database:
ontario <- download_assessment_list(database_id = 2)
#> Downloading...

# Extract inventories as a list:
ontario_invs <- assessment_list_inventory(ontario)

# Enumerate all co-occurrences in this database:
ontario_cooccurrences <- assessment_cooccurrences(ontario_invs)

# Summarize co-occurrences in this database, one row per target species:
ontario_cooccurrences_summary <- assessment_cooccurrences_summary(ontario_invs)

Of particular note is the species_profile() function, which returns the frequency distribution of C-values of co-occurring species for a given target species. Users may specify the optional native argument to include only native species in the profile. The species_profile_plot() function takes identical arguments but returns a plot instead of a data frame.

For instance, Aster lateriflorus (C=3) has the following native profile in the Southern Ontario database.

aster_profile <- species_profile("Aster lateriflorus", 
                                 ontario_invs,
                                 native = TRUE)
aster_profile
#> # A tibble: 11 × 4
#>    species            target_c cospecies_c cospecies_n
#>    <chr>                 <dbl>       <dbl>       <dbl>
#>  1 Aster lateriflorus        3           0         176
#>  2 Aster lateriflorus        3           1          58
#>  3 Aster lateriflorus        3           2         139
#>  4 Aster lateriflorus        3           3         209
#>  5 Aster lateriflorus        3           4         212
#>  6 Aster lateriflorus        3           5         186
#>  7 Aster lateriflorus        3           6         127
#>  8 Aster lateriflorus        3           7          83
#>  9 Aster lateriflorus        3           8          26
#> 10 Aster lateriflorus        3           9           9
#> 11 Aster lateriflorus        3          10          15

species_profile_plot("Aster lateriflorus", 
                     ontario_invs,
                     native = TRUE)
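
A profile like the one printed above can be condensed into a few summary numbers. The following is a minimal sketch that works from values copied out of the printed table rather than from a live download. It assumes cospecies_n counts the co-occurrences at each C-value, and the names aster_profile_demo, total_n, mean_cospecies_c and share_at_or_above_target are illustrative.

```r
# C-values and counts copied from the printed aster_profile table.
aster_profile_demo <- data.frame(
  cospecies_c = 0:10,
  cospecies_n = c(176, 58, 139, 209, 212, 186, 127, 83, 26, 9, 15)
)

# Total number of co-occurrences in the profile.
total_n <- sum(aster_profile_demo$cospecies_n)

# Mean C-value of co-occurring species, weighted by count.
mean_cospecies_c <- with(aster_profile_demo,
                         weighted.mean(cospecies_c, cospecies_n))

# Share of co-occurrences whose C-value is at least the target's (C = 3).
share_at_or_above_target <- with(
  aster_profile_demo,
  sum(cospecies_n[cospecies_c >= 3]) / total_n
)

c(total = total_n,
  mean_c = mean_cospecies_c,
  share_c_ge_3 = share_at_or_above_target)
```

The same calculation should apply directly to the tibble returned by species_profile(), since it carries the cospecies_c and cospecies_n columns shown above.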

Data sets

Two tidy data sets of floristic quality data, chicago and missouri, are included with the \({\tt fqar}\) package. Produced with assessment_list_glance(), these show summary information for every floristic quality assessment that used databases 149 and 63, respectively, prior to August 14, 2022. These sets may be useful for visualization or machine-learning purposes. For instance, one might consider the relationship between richness and native mean C in sites assessed using the 2015 Missouri database:

library(ggplot2)

ggplot(missouri, aes(x = native_species, 
                     y = native_mean_c)) +
  geom_point() +
  geom_smooth() +
  scale_x_continuous(trans = "log10") +
  labs(x = "Native Species (logarithmic scale)",
       y = "Native Mean C") +
  theme_minimal()
#> `geom_smooth()` using method = 'loess' and formula = 'y ~ x'