Set, create or modify columns with occurrence-specific information

Format fields that uniquely identify each occurrence record and specify the type of record. occurrenceID and basisOfRecord are required fields for occurrence records, and should be added to a dataset so that it conforms to the Darwin Core Standard prior to submission.

In practice, this is no different from using mutate(), but it gives informative errors and serves as a useful lookup for fields in the Darwin Core Standard.
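
To illustrate the comparison with mutate(), the sketch below adds a basisOfRecord column using dplyr::mutate() directly. The data are illustrative only. The point is that plain mutate() adds the column without any Darwin Core checks, whereas set_occurrences() is described above as doing the same job with informative errors.

```r
library(dplyr)
library(tibble)

# Illustrative data only; not taken from a real dataset
obs <- tibble(
  scientificName = c("Crinia signifera", "Litoria peronii"),
  eventDate = c("2010-10-14", "2010-10-14")
)

# Plain mutate(): adds the column, but performs no Darwin Core checks
obs_dwc <- obs |>
  mutate(basisOfRecord = "humanObservation")
```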

Usage

set_occurrences(
  .df,
  occurrenceID = NULL,
  basisOfRecord = NULL,
  occurrenceStatus = NULL,
  .keep = "unused",
  .keep_composite = "all",
  .messages = TRUE
)

Arguments

.df

A data.frame or tibble that the columns should be appended to.

occurrenceID

A character string. Every occurrence should have an occurrenceID entry. Ideally IDs should be persistent to avoid being lost in future updates. They should also be unique, both within the dataset, and (ideally) across all other datasets.
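
The requirements above (every record has an ID, and IDs are unique within the dataset) can be checked by hand before calling the function. This is a minimal base R sketch using made-up identifiers; it is not the check that set_occurrences() itself performs.

```r
# Hypothetical identifiers; one is deliberately duplicated and one is missing
ids <- c("occ-0001", "occ-0002", "occ-0002", NA)

# TRUE when every identifier is present (no NA or empty strings)
all_present <- !anyNA(ids) && all(nzchar(ids))

# TRUE when no identifier is repeated within the dataset
all_unique <- anyDuplicated(ids) == 0

# The values that occur more than once
repeated <- unique(ids[duplicated(ids)])
```

Uniqueness across other datasets cannot be checked locally, which is one reason globally unique identifiers such as UUIDs are commonly used.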

basisOfRecord

Record type. Only accepts camelCase, for consistency with field names. Accepted basisOfRecord values are one of:

"humanObservation", "machineObservation", "livingSpecimen", "preservedSpecimen", "fossilSpecimen", "materialCitation"
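
Because only the exact camelCase spellings are accepted, it can help to screen a column for non-conforming values in advance. The sketch below is one way to do so in base R, assuming a hypothetical input vector named basis; it is not the package's internal validation.

```r
# Accepted values, copied from the list above
accepted <- c(
  "humanObservation", "machineObservation", "livingSpecimen",
  "preservedSpecimen", "fossilSpecimen", "materialCitation"
)

# Hypothetical input: the second value is PascalCase, the third is unknown
basis <- c("humanObservation", "HumanObservation", "photo")

# Values that would not be accepted
invalid <- basis[!basis %in% accepted]
```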

occurrenceStatus

Either "present" or "absent".
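
If a dataset stores counts rather than a status, the two accepted values can be derived from them. This is a sketch that assumes a hypothetical vector of survey counts in which zero means the species was looked for but not found.

```r
# Hypothetical survey counts; zero means searched for but not found
counts <- c(3, 0, 12, 0)

# Map counts to the two accepted occurrenceStatus values
status <- ifelse(counts > 0, "present", "absent")
```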

.keep

Controls which columns from .df are retained in the output. Note that unlike dplyr::mutate(), which defaults to "all", this defaults to "unused"; i.e. it keeps only Darwin Core columns, and not the columns used to generate them.
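
The difference between "unused" and "all" can be seen with dplyr::mutate() itself, which takes the same argument. The column names record_number and site are hypothetical, and the sketch assumes that set_occurrences() treats these two values in the same way as mutate() does.

```r
library(dplyr)
library(tibble)

raw <- tibble(
  record_number = 1:3,        # hypothetical source column
  site = c("A", "A", "B")
)

# .keep = "unused" drops record_number because it was used to build occurrenceID
unused <- raw |>
  mutate(occurrenceID = paste0("occ-", record_number), .keep = "unused")

# .keep = "all" (the dplyr default) retains every column
all_cols <- raw |>
  mutate(occurrenceID = paste0("occ-", record_number), .keep = "all")
```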

.keep_composite

Control which columns from .df are kept when composite_id() is used to assign values to occurrenceID, defaulting to "all". This has a different default from .keep because composite identifiers often contain information that is valuable in other contexts, meaning that deleting these columns by default is typically unwise.
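
To see why the source columns of a composite identifier are usually worth keeping, consider an identifier built by hand from a site code and a date. This base R sketch only illustrates the idea; it is not how composite_id() is implemented, and the site codes are hypothetical.

```r
# Hypothetical columns that together describe each record
site <- c("A", "A", "B")
eventDate <- c("2010-10-14", "2010-10-14", "2010-10-14")

# seq_along() supplies a running number so that rows sharing a site and date
# still receive distinct identifiers
occurrenceID <- paste(site, eventDate, seq_along(site), sep = "-")
```

Here site and eventDate are still needed for mapping and filtering, so dropping them once the identifier has been built would discard useful information.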

.messages

Logical: should progress messages be shown? Defaults to TRUE.

Value

A tibble with the requested columns added/reformatted.

Details

Examples of occurrenceID values:

000866d2-c177-4648-a200-ead4007051b9
http://arctos.database.museum/guid/MSB:Mamm:233627

Examples

df <- tibble::tibble(
  scientificName = c("Crinia signifera", "Crinia signifera", "Litoria peronii"),
  latitude = c(-35.27, -35.24, -35.83),
  longitude = c(149.33, 149.34, 149.34),
  eventDate = c("2010-10-14", "2010-10-14", "2010-10-14")
  )

# Add occurrence information
df |>
  set_occurrences(
    occurrenceID = composite_id(random_id(), eventDate), # add composite ID
    basisOfRecord = "humanObservation"
    )
#> ✔ Checking 2 columns: occurrenceID and basisOfRecord [631ms]
#> 
#> # A tibble: 3 × 6
#>   scientificName   latitude longitude eventDate  occurrenceID      basisOfRecord
#>   <chr>               <dbl>     <dbl> <chr>      <chr>             <chr>        
#> 1 Crinia signifera    -35.3      149. 2010-10-14 066cc3d0-4689-11… humanObserva…
#> 2 Crinia signifera    -35.2      149. 2010-10-14 066cc3da-4689-11… humanObserva…
#> 3 Litoria peronii     -35.8      149. 2010-10-14 066cc3db-4689-11… humanObserva…