
Suggest a workflow to make data comply with Darwin Core Standard
Source: R/suggest_workflow.R
suggest_workflow.Rd
Checks whether a data.frame or tibble conforms to Darwin Core Standard and suggests how to standardise a data frame that does not yet meet the minimum Darwin Core requirements. It is intended as the go-to function for working out how to get started standardising your data.
The output summarises which column names match valid Darwin Core terms, which of the minimum required terms are present and which are missing, and a suggested workflow for adding any missing terms.
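The first two parts of that summary amount to a set comparison between column names and Darwin Core terms. The following is a minimal base R sketch of that idea and not corella's actual implementation: the helper name `summarise_terms()` is hypothetical, and `known_terms` holds only the terms that appear in the example output on this page, not the full Darwin Core vocabulary.

```r
# Hypothetical helper: illustrates the idea only, not corella's internals.
# `known_terms` is a partial list taken from the example output on this page.
known_terms <- c(
  "scientificName", "eventDate",
  "occurrenceID", "catalogNumber", "recordNumber",
  "basisOfRecord",
  "decimalLatitude", "decimalLongitude",
  "geodeticDatum", "coordinateUncertaintyInMeters"
)

summarise_terms <- function(df, terms = known_terms) {
  cols <- colnames(df)
  list(
    matched   = sort(intersect(cols, terms)), # column names that are valid terms
    unmatched = setdiff(cols, terms),         # column names that are not
    missing   = setdiff(terms, cols)          # terms with no matching column
  )
}

df <- data.frame(
  scientificName = c("Callocephalon fimbriatum", "Eolophus roseicapilla"),
  latitude = c(-35.310, -35.273),
  longitude = c(149.125, 149.133),
  eventDate = c("14-01-2023", "15-01-2023"),
  status = c("present", "present")
)

result <- summarise_terms(df)
result$matched
result$unmatched
result$missing
```

This sketch treats every listed term as individually required. The real check is grouped by type, so, for example, only one of the identifier terms is needed.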
Value
Invisibly returns the input data.frame/tibble. The function is called primarily for its side effect of running check functions on that input.
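To illustrate what an invisible return means in practice, here is a small base R sketch. `report_columns()` is a hypothetical stand-in and not part of corella. It reports on its input as a side effect and hands the input back with `invisible()`, so nothing is auto-printed at the console but the value can still be assigned or passed on.

```r
# Hypothetical stand-in for a function that checks its input as a side effect
# and returns the input invisibly.
report_columns <- function(df) {
  message("Columns: ", paste(colnames(df), collapse = ", "))
  invisible(df)
}

df <- data.frame(
  scientificName = "Eolophus roseicapilla",
  status = "present"
)

# The message is emitted, the data frame itself is not printed,
# and the returned value is the unchanged input.
out <- report_columns(df)
identical(out, df)
```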
Examples
df <- tibble::tibble(
  scientificName = c("Callocephalon fimbriatum", "Eolophus roseicapilla"),
  latitude = c(-35.310, "-35.273"), # deliberate error for demonstration purposes
  longitude = c(149.125, 149.133),
  eventDate = c("14-01-2023", "15-01-2023"),
  status = c("present", "present")
)
# Summarise whether your data conforms to Darwin Core Standard.
# See a suggested workflow to amend or add missing information.
df |>
  suggest_workflow()
#>
#> ── Matching Darwin Core terms ──────────────────────────────────────────────────
#>
#> Matched 2 of 5 column names to DwC terms:
#>
#> ✔ Matched: eventDate scientificName
#> ✖ Unmatched: latitude, longitude, status
#>
#> ── Minimum required Darwin Core terms ──────────────────────────────────────────
#>
#>   Type                         Matched term(s)  Missing term(s)
#> ✔ Scientific name              scientificName   -
#> ✔ Date/Time                    eventDate        -
#> ✖ Identifier (at least one)    -                occurrenceID, catalogNumber, recordNumber
#> ✖ Record type                  -                basisOfRecord
#> ✖ Location                     -                decimalLatitude, decimalLongitude, geodeticDatum, coordinateUncertaintyInMeters
#>
#>
#> ── Suggested workflow ──────────────────────────────────────────────────────────
#>
#> To make your data Darwin Core compliant, use the following workflow:
#>
#> df |>
#>   set_occurrences() |>
#>   set_coordinates()
#>
#> ── Additional functions
#> ℹ See all `set_` functions at
#> http://corella.ala.org.au/reference/index.html#add-rename-or-edit-columns-to-match-darwin-core-terms