
raw_consensuses

Info

Data contract for ingesting Nanopore consensuses data from the landing zone into raw.

- name: Consensuses
- version: 0.0.6
- status: active

Terms of Use

Purpose

Data contract for ingesting Nanopore consensuses data from the landing zone into raw.

Servers

| Name | Type | Environment | Catalog | Schema | Host | Roles |
| --- | --- | --- | --- | --- | --- | --- |
| production | databricks | production | raw_production | nanopore | dbc-2030845a-6c3b.cloud.databricks.com | Admins: Access to all the data and settings |
| development | databricks | development | raw_development | nanopore | dbc-03ed8bbb-c0ec.cloud.databricks.com | Admins: Access to all the data and settings |
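
As a rough illustration of how the server entries above might be used, the sketch below builds a three-level Databricks table name (catalog.schema.table) per environment. The `SERVERS` dictionary and the `qualified_table_name` helper are hypothetical, and using the model name `consensuses` as the table name is an assumption rather than something the contract states.

```python
# Hypothetical lookup built from the server entries in this contract.
SERVERS = {
    "production": {
        "catalog": "raw_production",
        "schema": "nanopore",
        "host": "dbc-2030845a-6c3b.cloud.databricks.com",
    },
    "development": {
        "catalog": "raw_development",
        "schema": "nanopore",
        "host": "dbc-03ed8bbb-c0ec.cloud.databricks.com",
    },
}


def qualified_table_name(environment: str, table: str = "consensuses") -> str:
    """Return catalog.schema.table for the given environment.

    Assumption: the table is named after the schema model (`consensuses`).
    """
    try:
        server = SERVERS[environment]
    except KeyError:
        raise ValueError(f"Unknown environment: {environment!r}") from None
    return f"{server['catalog']}.{server['schema']}.{table}"


print(qualified_table_name("production"))   # raw_production.nanopore.consensuses
print(qualified_table_name("development"))  # raw_development.nanopore.consensuses
```

Keeping the environment-specific values in one mapping means a job only needs the environment name to resolve the right catalog.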

Schema

consensuses

Table storing validated Nanopore consensuses data in the raw layer of the Biocloud.

| Field | Type | Description | Attributes |
| --- | --- | --- | --- |
| amplicon | string | Target DNA region amplified | required |
| barcoding_run | string | Identifier of the barcoding run | |
| base_count | bigint | Number of bases in the sequence | |
| created_at | string | Timestamp when record was created | |
| demultiplexed_file | string | Path to demultiplexed read file | |
| generated_at | string | Timestamp when sequence was generated | |
| id | string | Unique identifier of the sequence | primaryKey, primaryKeyPosition: 1, required |
| is_valid_protein | string | 'yes' if the sequence translates to a valid protein (no stop codons), 'no' if it contains a stop codon, 'not translated' if the sequence was not translated | Data quality gate check: value should be in ['yes', 'no', 'not translated']. Rows that fail this check are written to Quarantine. |
| label | string | Descriptive label for the sequence | |
| modified_at | string | Timestamp when record was last modified | |
| read_ratio | double | Ratio of reads supporting the sequence | |
| sequence | string | Nucleotide sequence string | required |
| sha256_hexdigest | string | SHA-256 checksum of the sequence | |
| supporting_read_count | bigint | Number of reads supporting this sequence | |
| url | string | URL to access the sequence data | |
| inserted_ts_utc | timestamp | UTC timestamp when inserted | |
| updated_ts_utc | timestamp | UTC timestamp when last updated | |
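
The quality gate on `is_valid_protein` and the `required` markers in the schema above can be sketched in plain Python as follows. This is a minimal illustration, not the pipeline's actual implementation: the function names, the `quarantine_reasons` key, and the sample rows are made up for the example. Two behaviours are assumptions: a missing `is_valid_protein` passes the gate because the field is not marked required, and rows with missing required fields are routed to quarantine as well (the contract only states this for the quality gate).

```python
from typing import Any

# Values allowed by the quality gate on is_valid_protein (from the contract).
ALLOWED_IS_VALID_PROTEIN = {"yes", "no", "not translated"}
# Fields marked "required" in the schema.
REQUIRED_FIELDS = ("amplicon", "id", "sequence")


def gate_failures(row: dict[str, Any]) -> list[str]:
    """Return the reasons a row fails the checks; an empty list means it passes."""
    reasons = []
    for field in REQUIRED_FIELDS:
        # Assumption: None and empty string both count as missing.
        if row.get(field) in (None, ""):
            reasons.append(f"missing required field: {field}")
    value = row.get("is_valid_protein")
    # Assumption: a missing value is tolerated because the field is optional.
    if value is not None and value not in ALLOWED_IS_VALID_PROTEIN:
        reasons.append(f"is_valid_protein not allowed: {value!r}")
    return reasons


def split_rows(rows: list[dict[str, Any]]):
    """Partition rows into (accepted, quarantined) lists."""
    accepted, quarantined = [], []
    for row in rows:
        reasons = gate_failures(row)
        if reasons:
            # Keep the original row and record why it was quarantined.
            quarantined.append({**row, "quarantine_reasons": reasons})
        else:
            accepted.append(row)
    return accepted, quarantined


# Made-up sample rows for illustration only.
rows = [
    {"id": "a1", "amplicon": "ampA", "sequence": "ACGT", "is_valid_protein": "yes"},
    {"id": "a2", "amplicon": "ampA", "sequence": "ACGT", "is_valid_protein": "maybe"},
    {"id": "a3", "amplicon": None, "sequence": "ACGT", "is_valid_protein": "not translated"},
]
accepted, quarantined = split_rows(rows)
print(len(accepted), len(quarantined))  # 1 2
```

Recording the failure reasons alongside each quarantined row makes it easier to see which check rejected it.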

SLA Properties