curated_sequences_to_bold
Info
Data contract for the sequences_to_bold table in the curated layer of the Biocloud. Contains DNA sequences formatted in FASTA format for BOLD export, linked to exported metadata records.
- name: Sequences to BOLD
- version: 0.0.2
- status: active
Terms of Use
Purpose
Data contract for the sequences_to_bold table in the curated layer of the Biocloud. Contains DNA sequences formatted in FASTA format for BOLD export, linked to exported metadata records.
Servers
| Name | Type | Attributes |
|---|---|---|
| production | databricks | No description. • environment: production • roles: Admins (Access to all the data and settings) • catalog: dna_production • host: dbc-2030845a-6c3b.cloud.databricks.com • schema: curated |
| development | databricks | No description. • environment: development • roles: Admins (Access to all the data and settings) • catalog: dna_development • host: dbc-03ed8bbb-c0ec.cloud.databricks.com • schema: curated |
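As a minimal sketch of how the server attributes might be used, the snippet below builds a fully qualified table name from the catalog and schema values listed in the Servers table. It assumes Databricks' three-level `catalog.schema.table` naming; the `SERVERS` mapping and the `qualified_table_name` helper are hypothetical names introduced for illustration and are not part of the contract.

```python
# Hypothetical lookup: catalog and schema values copied from the Servers table.
SERVERS = {
    "production": {"catalog": "dna_production", "schema": "curated"},
    "development": {"catalog": "dna_development", "schema": "curated"},
}


def qualified_table_name(environment: str, table: str = "sequences_to_bold") -> str:
    """Return catalog.schema.table for the given environment (assumed naming)."""
    server = SERVERS[environment]
    return f"{server['catalog']}.{server['schema']}.{table}"


if __name__ == "__main__":
    for env in SERVERS:
        print(env, "->", qualified_table_name(env))
```

Keeping the environment as the only argument means the same query code can be pointed at development or production without editing table names by hand.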
Schema
sequences_to_bold
Contains DNA sequences in FASTA format for BOLD export. Only includes sequences whose metadata has been marked for export.
| Field | Type | Attributes |
|---|---|---|
| consensus_sequence_id | bigint | Unique identifier for the consensus sequence. Not included in BOLD export file. • primaryKey • primaryKeyPosition: 1 • required |
| fasta_sequence | string | DNA sequence in FASTA format with catalog_number as description line, followed by newline and consensus sequence. • required • examples: ['>RMNH.5143366\nATGCGTACGTA...'] |
| COLLECTORS | string | Name(s) of the collector(s). Used for grouping/filtering in ClickHouse, not included in BOLD export file. |
| SAMPLEID | string | Sample identifier matching the metadata_to_bold table, derived from catalog_number. Used for grouping/filtering in ClickHouse, not included in BOLD export file. • required |
| verbatim_kingdom | string | Kingdom in lowercase. Used for grouping/filtering in ClickHouse, not included in BOLD export file. • required |
| exported_at | string | Timestamp when the record was marked for export, in the format yyyy-MM-ddTHH-mm-ss. Used for grouping/filtering in ClickHouse, not included in BOLD export file. • required |
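The field rules above can be illustrated with a small sketch that assembles a `fasta_sequence` value and checks one row against the contract. It is an illustration only, not the pipeline's actual implementation: `build_fasta`, `split_fasta`, and `validate_row` are hypothetical helpers, the sample values are invented, and the `strptime` pattern is an assumed Python equivalent of `yyyy-MM-ddTHH-mm-ss`.

```python
from __future__ import annotations

from datetime import datetime

# Fields marked "required" in the schema table; COLLECTORS is optional.
REQUIRED_FIELDS = (
    "consensus_sequence_id",
    "fasta_sequence",
    "SAMPLEID",
    "verbatim_kingdom",
    "exported_at",
)
# Assumed Python rendering of yyyy-MM-ddTHH-mm-ss (hyphens in the time part).
EXPORTED_AT_FORMAT = "%Y-%m-%dT%H-%M-%S"


def build_fasta(catalog_number: str, consensus_sequence: str) -> str:
    """Description line with the catalog number, a newline, then the sequence."""
    return f">{catalog_number}\n{consensus_sequence}"


def split_fasta(fasta_sequence: str) -> tuple[str, str]:
    """Split a fasta_sequence value into (catalog_number, sequence)."""
    header, _, sequence = fasta_sequence.partition("\n")
    if not header.startswith(">") or not sequence:
        raise ValueError("expected a '>' description line followed by a sequence")
    return header[1:], sequence


def validate_row(row: dict) -> list[str]:
    """Return a list of contract violations; an empty list means the row passes."""
    problems = [
        f"missing required field: {name}"
        for name in REQUIRED_FIELDS
        if row.get(name) in (None, "")
    ]
    if problems:
        return problems
    try:
        split_fasta(row["fasta_sequence"])
    except ValueError as exc:
        problems.append(f"fasta_sequence: {exc}")
    if row["verbatim_kingdom"] != row["verbatim_kingdom"].lower():
        problems.append("verbatim_kingdom: must be lowercase")
    try:
        datetime.strptime(row["exported_at"], EXPORTED_AT_FORMAT)
    except ValueError:
        problems.append("exported_at: does not match yyyy-MM-ddTHH-mm-ss")
    return problems


if __name__ == "__main__":
    # Invented sample row for illustration only.
    sample = {
        "consensus_sequence_id": 1,
        "fasta_sequence": build_fasta("RMNH.5143366", "ATGCGTACGTA"),
        "SAMPLEID": "RMNH.5143366",
        "verbatim_kingdom": "animalia",
        "exported_at": "2024-05-01T12-30-00",
    }
    print(validate_row(sample))
```

Returning a list of problems rather than raising on the first failure makes it easier to report every violation in a row at once.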