
curated_sequences_to_bold

Info

Data contract for the sequences_to_bold table in the curated layer of the Biocloud. Contains DNA sequences formatted in FASTA format for BOLD export, linked to exported metadata records.

- name: Sequences to BOLD
- version: 0.0.2
- status: active

Terms of Use

Purpose

Data contract for the sequences_to_bold table in the curated layer of the Biocloud. Contains DNA sequences formatted in FASTA format for BOLD export, linked to exported metadata records.

Servers

Name Type Attributes
production databricks No description.
    environment: production
    roles: [{'role': 'Admins', 'description': 'Access to all the data and settings'}]
    catalog: dna_production
    host: dbc-2030845a-6c3b.cloud.databricks.com
    schema_: curated
development databricks No description.
    environment: development
    roles: [{'role': 'Admins', 'description': 'Access to all the data and settings'}]
    catalog: dna_development
    host: dbc-03ed8bbb-c0ec.cloud.databricks.com
    schema_: curated
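
To illustrate how the server entries map to a table address, the sketch below builds the three-level catalog.schema.table name for each environment. Only the catalog, schema, and table values come from the contract. The `SERVERS` dictionary layout and the `qualified_table_name` helper are hypothetical names introduced for this example.

```python
# Hypothetical helper: the catalog and schema values are copied from the
# contract's server entries; the dictionary layout and function name are
# illustrative assumptions, not part of the contract.
SERVERS = {
    "production": {"catalog": "dna_production", "schema": "curated"},
    "development": {"catalog": "dna_development", "schema": "curated"},
}

TABLE = "sequences_to_bold"


def qualified_table_name(environment: str) -> str:
    """Return the catalog.schema.table name for the given environment."""
    server = SERVERS[environment]  # raises KeyError for unknown environments
    return f"{server['catalog']}.{server['schema']}.{TABLE}"


if __name__ == "__main__":
    for env in SERVERS:
        print(env, "->", qualified_table_name(env))
```

Keeping the environment as the only input means the same query text can be pointed at either catalog without editing the table name by hand.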

Schema

sequences_to_bold

Contains DNA sequences in FASTA format for BOLD export. Only includes sequences whose metadata has been marked for export.

Field Type Attributes
consensus_sequence_id bigint Unique identifier for the consensus sequence. Not included in the BOLD export file.
    primaryKey
    primaryKeyPosition: 1
    required
fasta_sequence string DNA sequence in FASTA format, with catalog_number as the description line, followed by a newline and the consensus sequence.
    required
    examples: ['>RMNH.5143366\nATGCGTACGTA...']
COLLECTORS string Name(s) of the collector(s). Used for grouping/filtering in ClickHouse; not included in the BOLD export file.
SAMPLEID string Sample identifier matching the metadata_to_bold table, derived from catalog_number. Used for grouping/filtering in ClickHouse; not included in the BOLD export file.
    required
verbatim_kingdom string Kingdom in lowercase. Used for grouping/filtering in ClickHouse; not included in the BOLD export file.
    required
exported_at string Timestamp when the record was marked for export, in the format yyyy-MM-ddTHH-mm-ss. Used for grouping/filtering in ClickHouse; not included in the BOLD export file.
    required
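
The field descriptions above can be sketched in code as follows. This is a minimal illustration, not the pipeline's actual implementation. The function names are hypothetical, and the `strptime` pattern assumes that the contract's `yyyy-MM-ddTHH-mm-ss` uses Java/Spark-style pattern letters with hyphens between the time components.

```python
from datetime import datetime

# Assumed strptime equivalent of the contract's yyyy-MM-ddTHH-mm-ss pattern.
EXPORTED_AT_FORMAT = "%Y-%m-%dT%H-%M-%S"


def build_fasta_sequence(catalog_number: str, consensus_sequence: str) -> str:
    """Join the description line and the sequence as the fasta_sequence field describes."""
    return f">{catalog_number}\n{consensus_sequence}"


def parse_fasta_sequence(fasta_sequence: str) -> tuple[str, str]:
    """Split a fasta_sequence value into (catalog_number, consensus_sequence)."""
    header, newline, sequence = fasta_sequence.partition("\n")
    if not header.startswith(">") or not newline or not sequence:
        raise ValueError("expected a '>description' line followed by a sequence line")
    return header[1:], sequence


def parse_exported_at(value: str) -> datetime:
    """Parse an exported_at string into a naive datetime."""
    return datetime.strptime(value, EXPORTED_AT_FORMAT)


if __name__ == "__main__":
    # "ACGT" is a placeholder sequence; the catalog number is the one from
    # the contract's example value.
    record = build_fasta_sequence("RMNH.5143366", "ACGT")
    print(parse_fasta_sequence(record))
    print(parse_exported_at("2024-01-02T03-04-05"))
```

Because `exported_at` is stored as a string, parsing it explicitly in this way is one option for catching malformed values before they are used for grouping or filtering.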

SLA Properties