set_events#
For this function, we are only checking the events
dataframe. This function will check specifically for:
eventID: ID of all your events. Can be constructed, or can be generated by corella using uuid.
parentEventID: linked ID of all your events. Events are in a hierarchy, which we will discuss below.
eventType: what type of event it is (e.g. Survey, BioBlitz, Site Visit)
eventDate: date of the event
Event: name of the event
samplingProtocol: how you recorded your data (e.g. Observation)
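If you would rather construct the eventID values yourself, Python's standard uuid module is one option. The snippet below is a minimal sketch, not corella's own implementation: the rows are plain dictionaries standing in for the events dataframe, and the name and type values are hypothetical.

```python
import uuid

# Hypothetical event rows standing in for the events dataframe.
events = [
    {"name": "bird survey local park", "type": "Site Visit"},
    {"name": "bird survey local park", "type": "Sample"},
]

# Give every event its own random (version 4) UUID, stored as a string.
for row in events:
    row["eventID"] = str(uuid.uuid4())

for row in events:
    print(row["eventID"], row["type"])
```

Because the IDs are random, they will differ on every run; save them with your data if other tables need to refer to them.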
Adding event-specific information#
For events, we will start by specifying the information we know in the events data: eventType, samplingProtocol and the Event itself.
>>> corella.set_events(dataframe=events,
... eventType='type',
... samplingProtocol='Observation',
... Event='name')
Traceback (most recent call last):
File "/Users/buy003/Documents/GitHub/corella-python/docs/source/corella_user_guide/longitudinal_studies/events_workflow.py", line 39, in <module>
new_events = corella.set_events(dataframe=events,eventType='type',samplingProtocol='Observation',
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/buy003/anaconda3/envs/galaxias-dev/lib/python3.11/site-packages/corella/set_events.py", line 116, in set_events
raise ValueError("Please provide column names for eventID and parentEventID. Or, provide an event_hierarchy dictionary for automatic ID generation.")
ValueError: Please provide column names for eventID and parentEventID. Or, provide an event_hierarchy dictionary for automatic ID generation.
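However, this call fails. As the error message says, set_events() also needs column names for eventID and parentEventID, or an event_hierarchy dictionary so that the IDs can be generated automatically.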
event hierarchy#
>>> corella.set_events(dataframe=events,
... eventType='type',
... samplingProtocol='Observation',
... Event='name',
... event_hierarchy={1: "Site Visit", 2: "Sample", 3: "Observation"})
eventID parentEventID eventType location date Event samplingProtocol
0 594ad0b3-2fb7-4741-9200-2e53d4be89c3 Site Visit Cannonvale 3/1/2023 bird survey local park honeyeater lookout point Observation
1 1f798ae6-eab9-4b28-be20-d41e7ac3ca4c 594ad0b3-2fb7-4741-9200-2e53d4be89c3 Sample Cannonvale 3/1/2023 bird survey local park honeyeater lookout point Observation
2 19f8541e-ddbd-47e7-83f1-e2aa400dc39b 1f798ae6-eab9-4b28-be20-d41e7ac3ca4c Observation Cannonvale 3/1/2023 bird survey local park honeyeater lookout point Observation
3 5c412291-f247-4161-b98e-0f87a080febf Site Visit Cannonvale 17/1/2023 bird survey local park honeyeater lookout point Observation
4 ce612b80-d659-41fa-8e5e-c09b309fee89 5c412291-f247-4161-b98e-0f87a080febf Sample Cannonvale 17/1/2023 bird survey local park honeyeater lookout point Observation
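In the output above, each Sample points to the Site Visit before it, and each Observation points to the Sample before it. The sketch below reproduces that linking pattern with the standard library only. It rests on an assumption: that an event's parent is the most recent event one level higher in the hierarchy. It is not taken from corella's source code, and link_events is a hypothetical helper name.

```python
import uuid

# Same mapping as in the call above: level number -> event type.
event_hierarchy = {1: "Site Visit", 2: "Sample", 3: "Observation"}
level_of = {event_type: level for level, event_type in event_hierarchy.items()}


def link_events(event_types):
    """Assign eventID and parentEventID to event types given in row order.

    Assumption: the parent of an event is the most recent event one level up.
    """
    latest_id = {}  # level -> eventID of the most recent event at that level
    rows = []
    for event_type in event_types:
        level = level_of[event_type]
        event_id = str(uuid.uuid4())
        # Top-level events have no parent, so they get an empty string.
        parent_id = latest_id.get(level - 1, "")
        latest_id[level] = event_id
        # A new event closes any deeper levels left over from earlier rows.
        for deeper in [lvl for lvl in latest_id if lvl > level]:
            del latest_id[deeper]
        rows.append(
            {"eventID": event_id, "parentEventID": parent_id, "eventType": event_type}
        )
    return rows


rows = link_events(["Site Visit", "Sample", "Observation", "Site Visit", "Sample"])
for row in rows:
    print(row["eventType"], "->", row["parentEventID"] or "(no parent)")
```

The five event types passed in match the five rows shown above, so the second Site Visit starts a new chain with an empty parentEventID.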
what do check_data and suggest_workflow say now?#
Note: each of the set_* functions checks your data for compliance with the Darwin Core standard, but it’s always good to double-check your data.
Now we can check that our data columns comply with the Darwin Core standard.
>>> corella.check_data(occurrences=occurrences,
... events=events)
Number of Errors Pass/Fail Column name
------------------ ----------- ----------------
0 ✓ eventID
0 ✓ parentEventID
0 ✓ eventType
0 ✓ Event
0 ✓ samplingProtocol
══ Results ════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════
Errors: 0 | Passes: 5
✗ Data does not meet minimum Darwin core requirements
Use corella.suggest_workflow()
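All five checked columns pass, yet the data is still reported as not meeting the minimum requirements. A quick way to see why is to compare the column names against the event terms listed at the top of this section. This is a minimal sketch in plain Python, not a replacement for suggest_workflow(); the column list is copied from the set_events() output shown earlier.

```python
# Event terms listed at the top of this section.
REQUIRED_EVENT_TERMS = [
    "eventID",
    "parentEventID",
    "eventType",
    "eventDate",
    "Event",
    "samplingProtocol",
]

# Column names of the events table after set_events(), as shown earlier.
event_columns = [
    "eventID",
    "parentEventID",
    "eventType",
    "location",
    "date",
    "Event",
    "samplingProtocol",
]

matched = [term for term in REQUIRED_EVENT_TERMS if term in event_columns]
missing = [term for term in REQUIRED_EVENT_TERMS if term not in event_columns]

print("Matched:", matched)
print("Missing:", missing)  # only eventDate is absent
```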
Every column we have set passes, but we still don’t have all of the required columns, so we can run suggest_workflow() again to see what remains to be done.
>>> corella.suggest_workflow(events=events)
── Darwin Core terms ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
── All DwC terms ──
Matched 5 of 12 column names to DwC terms:
✓ Matched: eventID, parentEventID, eventType, Event, samplingProtocol
✗ Unmatched: Latitude, Collection_date, Species, Longitude, number_birds, location, date
── Minimum required DwC terms occurrences ──
Type Matched term(s) Missing term(s)
------------------------- ----------------- ------------------------------------------------
Identifier (at least one) - occurrenceID OR catalogNumber OR recordNumber
Record type - basisOfRecord
Scientific name - scientificName
Location - decimalLatitude, decimalLongitude, geodeticDatum
Date/Time - eventDate
Associated event ID - eventID
── Minimum required DwC terms events ──
Type Matched term(s) Missing term(s)
--------------------- ----------------- -----------------
Identifier eventID -
Linking identifier parentEventID -
Type of Event eventType -
Name of Event Event -
How data was acquired samplingProtocol -
Date of Event - eventDate
── Suggested workflow ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
── Occurrences ──
To make your occurrences Darwin Core compliant, use the following workflow:
corella.set_occurrences()
corella.set_scientific_name()
corella.set_coordinates()
corella.set_datetime()
Additional functions: set_abundance(), set_collection(), set_individual_traits(), set_license(), set_locality(), set_taxonomy()
── Events ──
To make your events Darwin Core compliant, use the following workflow:
corella.set_datetime()
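The only term still missing for events is eventDate, and the existing date column holds values such as 3/1/2023 and 17/1/2023. Before setting the date, it can help to confirm that those strings parse cleanly. The sketch below uses only the standard library and assumes the dates are written day first (17/1/2023 cannot be month first). It does not show how corella.set_datetime() works or which arguments it takes, and to_iso_date is a hypothetical helper.

```python
from datetime import datetime

# Values taken from the date column shown earlier.
raw_dates = ["3/1/2023", "17/1/2023"]


def to_iso_date(text):
    """Parse a day/month/year string and return it as YYYY-MM-DD."""
    # Assumption: day comes before month in the source data.
    return datetime.strptime(text, "%d/%m/%Y").date().isoformat()


event_dates = [to_iso_date(value) for value in raw_dates]
print(event_dates)  # ['2023-01-03', '2023-01-17']
```

A value that does not fit the day/month/year pattern raises a ValueError, which makes malformed dates easy to spot early.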
Other functions#
To learn more about how to use these functions, go to
Optional functions: