
curated_metadata_to_bold

Info

Data contract for the metadata_to_bold table in the curated layer of the Biocloud. Contains specimen metadata transformed to BOLD's schema format for export, including taxonomic, geographic, and collection information.
name: Metadata to BOLD
version: 0.0.1
status: active

Terms of Use

Purpose

Data contract for the metadata_to_bold table in the curated layer of the Biocloud. Contains specimen metadata transformed to BOLD's schema format for export, including taxonomic, geographic, and collection information.

Servers

Name Type Attributes
production databricks No description.
environment: production
roles: Admins (Access to all the data and settings)
catalog: dna_production
host: dbc-2030845a-6c3b.cloud.databricks.com
schema_: curated
development databricks No description.
environment: development
roles: Admins (Access to all the data and settings)
catalog: dna_development
host: dbc-03ed8bbb-c0ec.cloud.databricks.com
schema_: curated
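
The server entries above can be combined into a fully qualified table name using the Databricks three-level `catalog.schema.table` convention. The following is a minimal sketch only: the `Server` dataclass, the `SERVERS` mapping, and the `qualified_table` helper are hypothetical names introduced for illustration and are not part of the contract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    """Hypothetical container for one server entry of the contract."""
    environment: str
    catalog: str
    schema: str
    host: str


# Values copied from the Servers section; the structure itself is an assumption.
SERVERS = {
    "production": Server(
        environment="production",
        catalog="dna_production",
        schema="curated",
        host="dbc-2030845a-6c3b.cloud.databricks.com",
    ),
    "development": Server(
        environment="development",
        catalog="dna_development",
        schema="curated",
        host="dbc-03ed8bbb-c0ec.cloud.databricks.com",
    ),
}


def qualified_table(server_name: str, table: str = "metadata_to_bold") -> str:
    """Return the catalog.schema.table name for the given server entry."""
    server = SERVERS[server_name]
    return f"{server.catalog}.{server.schema}.{table}"


if __name__ == "__main__":
    for name in SERVERS:
        print(name, qualified_table(name))
```

Keeping the environment-specific values in one mapping means a query can switch between production and development by changing only the server name.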

Schema

metadata_to_bold

Contains specimen metadata transformed to BOLD's upload template schema.

Field Type Attributes
SAMPLEID string Sample identifier, derived from catalog_number. Primary key for BOLD metadata records.
primaryKey
primaryKeyPosition: 1
required
FIELDID string Field identifier. Currently empty placeholder.
MUSEUMID string Museum identifier, derived from catalog_number.
required
COLLECTION_CODE string Collection code. Currently empty placeholder.
INST string Institution name, always "Naturalis Biodiversity Center".
required
FUNDING_SRC string Funding source, always "Dutch Research Council (NWO)".
PHYLUM string Phylum name from resolved taxonomy.
required
CLASS string Class name from resolved taxonomy.
ORDER string Order name from resolved taxonomy.
FAMILY string Family name from resolved taxonomy.
SUBFAMILY string Subfamily name. Currently empty placeholder.
TRIBE string Tribe name. Currently empty placeholder.
GENUS string Genus name from resolved taxonomy.
SPECIES string Species name from verbatim identification (only species-level matches included).
required
SUBSPECIES string Subspecies name. Currently empty placeholder.
IDENTIFIED_BY string Person or entity who identified the specimen.
IDENTIFICATION_METHOD string Method used for identification (e.g. Morphology).
TAXONOMY_NOTES string Additional taxonomy notes, including the original verbatim identification from the collector.
SEX string Sex of the specimen.
examples: ['male', 'female', 'hermaphrodite', 'mixed']
REPRODUCTION string Reproduction information. Currently empty placeholder.
LIFE_STAGE string Life stage of the specimen.
examples: ['larva', 'pupa', 'juvenile', 'adult', 'imago']
SHORT_NOTE string Short notes. Currently empty placeholder.
NOTES string Additional notes. Currently empty placeholder.
VOUCHER_TYPE string Type of voucher, always "Vouchered:Registered Collection".
required
TISSUE_TYPE string Type of tissue sample. Currently empty placeholder.
SPECIMEN_LINKOUT string External link to specimen record. Currently empty placeholder.
ASSOCIATED_TAXA string Associated taxa from synecology data, concatenation of type and name.
ASSOCIATED_SPECIMENS string Associated specimens. Currently empty placeholder.
COLLECTORS string Name(s) of the collector(s), derived from recorded_by field.
COLLECTION_DATE_START string Start date of collection event in ISO format.
COLLECTION_DATE_END string End date of collection event in ISO format.
COUNTRY/OCEAN string Country or ocean where specimen was collected.
required
PROVINCE/STATE string Province or state. Currently empty placeholder.
REGION string Geographic region. Currently empty placeholder.
SECTOR string Geographic sector. Currently empty placeholder.
SITE string Collection site, derived from locality or locality_verbatim.
COORD:LAT string Latitude in decimal degrees.
COORD:LON string Longitude in decimal degrees.
ELEV string Elevation in meters, represented as single value or range.
DEPTH string Depth in meters, represented as single value or range.
ELEV_ACCURACY string Elevation accuracy. Currently empty placeholder.
DEPTH_ACCURACY string Depth accuracy. Currently empty placeholder.
COORD_SOURCE string Source of coordinates. Currently empty placeholder.
COORD_ACCURACY string Coordinate accuracy/uncertainty in meters.
COLLECTION_TIME string Collection time, represented as single value or range.
HABITAT string Habitat description where specimen was collected.
examples: ['inclosure dune-area', 'exclosure dune-area', 'sandy beach', 'forest']
SAMPLING_PROTOCOL string Sampling protocol or method used.
examples: ['malaise trap', 'pitfall trap', 'leaf litter']
COLLECTION_NOTES string Additional collection notes. Currently empty placeholder.
SITE_CODE string Site code identifier. Currently empty placeholder.
COLLECTION_EVENT_ID string Collection event identifier. Currently empty placeholder.
verbatim_kingdom string Kingdom name in lowercase. Used to partition exports and for grouping/filtering in ClickHouse; not included in the BOLD export file.
required
exported_at string Timestamp at which the record was marked for export, in the format yyyy-MM-ddTHH-mm-ss. Used for grouping/filtering in ClickHouse; not included in the BOLD export file.
required
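
The constraints in the field table (required fields, the lowercase `verbatim_kingdom`, and the `exported_at` timestamp format) can be checked per record. The sketch below is illustrative and not the pipeline's actual validation: `validate_record` and `to_bold_row` are hypothetical helpers, and treating an empty string as "missing" is an assumption. The `strptime` pattern `%Y-%m-%dT%H-%M-%S` is the Python equivalent of the documented `yyyy-MM-ddTHH-mm-ss`.

```python
from datetime import datetime

# Fields marked "required" in the schema table.
REQUIRED_FIELDS = (
    "SAMPLEID", "MUSEUMID", "INST", "PHYLUM", "SPECIES",
    "VOUCHER_TYPE", "COUNTRY/OCEAN", "verbatim_kingdom", "exported_at",
)

# Fields documented as not included in the BOLD export file.
INTERNAL_FIELDS = ("verbatim_kingdom", "exported_at")

# Python equivalent of yyyy-MM-ddTHH-mm-ss (hyphens in the time part).
EXPORTED_AT_FORMAT = "%Y-%m-%dT%H-%M-%S"


def validate_record(record: dict) -> list:
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    for field in REQUIRED_FIELDS:
        # Assumption: None and empty strings both count as missing.
        if not record.get(field):
            problems.append(f"missing required field: {field}")

    exported_at = record.get("exported_at")
    if exported_at:
        try:
            datetime.strptime(exported_at, EXPORTED_AT_FORMAT)
        except ValueError:
            problems.append(f"exported_at has unexpected format: {exported_at}")

    kingdom = record.get("verbatim_kingdom")
    if kingdom and kingdom != kingdom.lower():
        problems.append("verbatim_kingdom is not lowercase")
    return problems


def to_bold_row(record: dict) -> dict:
    """Drop the internal grouping/filtering fields before export."""
    return {k: v for k, v in record.items() if k not in INTERNAL_FIELDS}


if __name__ == "__main__":
    # Illustrative values only; not taken from real data.
    example = {
        "SAMPLEID": "EXAMPLE.0001",
        "MUSEUMID": "EXAMPLE.0001",
        "INST": "Naturalis Biodiversity Center",
        "PHYLUM": "Arthropoda",
        "SPECIES": "Apis mellifera",
        "VOUCHER_TYPE": "Vouchered:Registered Collection",
        "COUNTRY/OCEAN": "Netherlands",
        "verbatim_kingdom": "animalia",
        "exported_at": "2024-01-02T03-04-05",
    }
    print(validate_record(example))
    print(sorted(to_bold_row(example)))
```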

SLA Properties