
Checks whether a data.frame or tibble conforms to the Darwin Core Standard and, if it does not meet the minimum Darwin Core requirements, suggests how to standardise it. This is intended as the go-to function for users working out how to start standardising their data.

The output summarises which column names match valid Darwin Core terms, lists the minimum required terms (flagging any that are missing), and suggests a workflow for adding the missing terms.
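The matching step described above can be pictured with a few lines of base R. This is only a rough sketch, not corella's implementation: `dwc_terms` is a hypothetical, heavily shortened stand-in for the full list of Darwin Core terms, and the column names are taken from the example further down this page.

```r
# Hypothetical, shortened list of Darwin Core terms (assumption for illustration;
# the real standard defines many more terms).
dwc_terms <- c(
  "scientificName", "eventDate", "occurrenceID",
  "basisOfRecord", "decimalLatitude", "decimalLongitude"
)

# Column names of the data frame being checked
col_names <- c("scientificName", "latitude", "longitude", "eventDate", "status")

matched   <- intersect(col_names, dwc_terms) # columns that are valid terms
unmatched <- setdiff(col_names, dwc_terms)   # columns that are not
missing   <- setdiff(dwc_terms, col_names)   # listed terms absent from the data

cat(sprintf(
  "Matched %d of %d column names to DwC terms\n",
  length(matched), length(col_names)
))
```

Set operations keep the sketch short; the real function additionally groups the required terms by type (identifier, record type, location and so on) before reporting them.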

Usage

suggest_workflow(.df)

Arguments

.df

A data.frame/tibble against which checks should be run

Value

Invisibly returns the input data.frame/tibble. The function is called primarily for its side effect of running check functions on that input.
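For readers unfamiliar with invisible return values, the pattern looks roughly like the base R sketch below. `report_columns()` is a hypothetical stand-in written for this illustration; it is not part of corella and does not run any Darwin Core checks.

```r
# Hypothetical helper (assumption): prints a message as its side effect,
# then hands the input back invisibly so it is not auto-printed.
report_columns <- function(.df) {
  message("Columns: ", paste(names(.df), collapse = ", "))
  invisible(.df)
}

df <- data.frame(
  scientificName = "Eolophus roseicapilla",
  status = "present"
)

# Called on its own, only the message appears; the data frame is not printed.
report_columns(df)

# The return value can still be captured or piped onwards.
result <- report_columns(df)
identical(result, df)
```

Returning the input invisibly is what lets a checking function sit in the middle of a pipe without altering the data or cluttering the console.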

Examples

df <- tibble::tibble(
  scientificName = c("Callocephalon fimbriatum", "Eolophus roseicapilla"),
  latitude = c(-35.310, "-35.273"), # deliberate error for demonstration purposes
  longitude = c(149.125, 149.133),
  eventDate = c("14-01-2023", "15-01-2023"),
  status = c("present", "present")
)

# Summarise whether your data conforms to Darwin Core Standard.
# See a suggested workflow to amend or add missing information.
df |>
  suggest_workflow()
#> ── Matching Darwin Core terms ──────────────────────────────────────────────────
#> 
#> Matched 2 of 5 column names to DwC terms:
#> 
#>  Matched: eventDate, scientificName
#>  Unmatched: latitude, longitude, status
#> 
#> ── Minimum required Darwin Core terms ──────────────────────────────────────────
#> 
#>  Type                      Matched term(s)  Missing term(s)
#>  Scientific name           scientificName   -
#>  Date/Time                 eventDate        -
#>  Identifier (at least one) -                occurrenceID, catalogNumber, recordNumber
#>  Record type               -                basisOfRecord
#>  Location                  -                decimalLatitude, decimalLongitude, geodeticDatum, coordinateUncertaintyInMeters
#> 
#> ── Suggested workflow ──────────────────────────────────────────────────────────
#> 
#> To make your data Darwin Core compliant, use the following workflow:
#> 
#> df |>
#>   set_occurrences() |> 
#>   set_coordinates()
#> 
#> ── Additional functions 
#>  See all `set_` functions at
#>   http://corella.ala.org.au/reference/index.html#add-rename-or-edit-columns-to-match-darwin-core-terms
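Acting on the suggested workflow might look something like the sketch below. It is one possible way to fill in the two suggested calls, not a prescribed one, and several choices are assumptions made for illustration: the `record_id` column and its "obs-" prefix are hypothetical, `"humanObservation"` is an assumed record type for these data, and the `latitude` column is converted to numeric first because the deliberate error in the example leaves it as character. The argument names follow the missing terms listed in the output above.

```r
library(corella)
library(dplyr)

df <- tibble::tibble(
  scientificName = c("Callocephalon fimbriatum", "Eolophus roseicapilla"),
  latitude = c(-35.310, "-35.273"), # deliberate error, as in the example above
  longitude = c(149.125, 149.133),
  eventDate = c("14-01-2023", "15-01-2023"),
  status = c("present", "present")
)

result <- df |>
  mutate(
    latitude = as.numeric(latitude),               # repair the character column
    record_id = paste0("obs-", row_number())       # hypothetical unique identifier
  ) |>
  set_occurrences(
    occurrenceID = record_id,
    basisOfRecord = "humanObservation"             # assumed record type
  ) |>
  set_coordinates(
    decimalLatitude = latitude,
    decimalLongitude = longitude
  )

# Re-run the summary to see which required terms are still missing.
result |>
  suggest_workflow()
```

Running `suggest_workflow()` again after each step is a convenient way to track progress, since it reports the remaining missing terms.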