raw_blast_results
Info
Data contract for ingesting blast_results data from the landing zone into raw.
- name: Blast_results
- version: 0.0.6
- status: active
Terms of Use
Purpose
Data contract for ingesting blast_results data from the landing zone into raw.
Servers
| Name | Type | Attributes |
|---|---|---|
| production | databricks | No description. • environment: production • roles: Admins (Access to all the data and settings) • catalog: raw_production • host: dbc-2030845a-6c3b.cloud.databricks.com • schema: bioinformaticians |
| development | databricks | No description. • environment: development • roles: Admins (Access to all the data and settings) • catalog: raw_development • host: dbc-03ed8bbb-c0ec.cloud.databricks.com • schema: bioinformaticians |
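
Each server entry lists a catalog and a schema, which suggests the table is addressed with a three-level `catalog.schema.table` name in each environment. The following is a minimal sketch under that assumption. The `SERVERS` dictionary layout and the `qualified_table_name` helper are hypothetical names introduced for illustration, and the table name `blast_results` is taken from the Schema section.

```python
# Server attributes copied from the Servers table above.
# The dictionary layout itself is an assumption made for this sketch.
SERVERS = {
    "production": {
        "host": "dbc-2030845a-6c3b.cloud.databricks.com",
        "catalog": "raw_production",
        "schema": "bioinformaticians",
    },
    "development": {
        "host": "dbc-03ed8bbb-c0ec.cloud.databricks.com",
        "catalog": "raw_development",
        "schema": "bioinformaticians",
    },
}


def qualified_table_name(environment: str, table: str = "blast_results") -> str:
    """Build a catalog.schema.table name for the given environment (hypothetical helper)."""
    server = SERVERS[environment]
    return f"{server['catalog']}.{server['schema']}.{table}"


print(qualified_table_name("production"))
print(qualified_table_name("development"))
```

Keeping the environment as the only input makes it harder to mix a production catalog with a development host by accident.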
Schema
blast_results
Table storing validated blast_results data in the raw layer of the Biocloud.
| Field | Type | Attributes |
|---|---|---|
| id | string | Biocloud generated hash as unique identifier for the blast_results • primaryKey |
| consensus_sequence_id | string | Identifier of the consensus sequence |
| identification | string | Identification assigned based on the BLAST result |
| is_valid | boolean | Indicates whether the BLAST result is valid |
| blast_hit_bitscore | long | Bitscore of the BLAST hit |
| blast_hit_blast_hit_id | string | Identifier of the BLAST hit |
| blast_hit_confidence | double | Confidence score of the BLAST hit |
| blast_hit_coverage | long | Coverage value of the BLAST hit |
| blast_hit_evalue | double | E-value of the BLAST hit |
| blast_hit_identity_percentage | double | Identity percentage of the BLAST hit |
| blast_hit_lineage_class | string | Taxonomic class of the BLAST hit lineage |
| blast_hit_lineage_family | string | Taxonomic family of the BLAST hit lineage |
| blast_hit_lineage_genus | string | Taxonomic genus of the BLAST hit lineage |
| blast_hit_lineage_kingdom | string | Taxonomic kingdom of the BLAST hit lineage |
| blast_hit_lineage_order | string | Taxonomic order of the BLAST hit lineage |
| blast_hit_lineage_phylum | string | Taxonomic phylum of the BLAST hit lineage |
| blast_hit_lineage_scientific_name | string | Scientific name of the BLAST hit lineage |
| blast_hit_lineage_species | string | Taxonomic species of the BLAST hit lineage |
| blast_hit_lineage_taxon_id | string | Taxon identifier of the BLAST hit lineage |
| blast_hit_sequence_id | string | Identifier of the matched BLAST hit sequence |
| blast_hit_source | string | Source of the BLAST hit |
| query_config_db | string | Database used for the BLAST query configuration |
| query_config_mi | bigint | Parameter mi used in the BLAST query configuration |
| timestamp | string | Timestamp associated with the BLAST result record |
| inserted_ts_utc | timestamp | UTC timestamp when the record was inserted |
| updated_ts_utc | timestamp | UTC timestamp when the record was last updated |
| nuc | string | Nucleotide sequence associated with the BLAST query |
| blast_hit_taxon_id | string | Taxon identifier for the BLAST hit |
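
The field table above can be turned into a lightweight type check for records before they are written to the raw layer. The sketch below is illustrative only: `FIELD_TYPES` mirrors the Field and Type columns, while the mapping from contract types to Python types, the rule that `id` must be present and non-null (inferred from `primaryKey`), the treatment of all other fields as nullable, and the `validate_record` helper are assumptions not stated in the contract.

```python
from datetime import datetime

# Field name -> contract type, copied from the schema table above.
FIELD_TYPES = {
    "id": "string",
    "consensus_sequence_id": "string",
    "identification": "string",
    "is_valid": "boolean",
    "blast_hit_bitscore": "long",
    "blast_hit_blast_hit_id": "string",
    "blast_hit_confidence": "double",
    "blast_hit_coverage": "long",
    "blast_hit_evalue": "double",
    "blast_hit_identity_percentage": "double",
    "blast_hit_lineage_class": "string",
    "blast_hit_lineage_family": "string",
    "blast_hit_lineage_genus": "string",
    "blast_hit_lineage_kingdom": "string",
    "blast_hit_lineage_order": "string",
    "blast_hit_lineage_phylum": "string",
    "blast_hit_lineage_scientific_name": "string",
    "blast_hit_lineage_species": "string",
    "blast_hit_lineage_taxon_id": "string",
    "blast_hit_sequence_id": "string",
    "blast_hit_source": "string",
    "query_config_db": "string",
    "query_config_mi": "bigint",
    "timestamp": "string",
    "inserted_ts_utc": "timestamp",
    "updated_ts_utc": "timestamp",
    "nuc": "string",
    "blast_hit_taxon_id": "string",
}

# Assumed mapping from contract types to Python types (not part of the contract).
PYTHON_TYPES = {
    "string": str,
    "boolean": bool,
    "long": int,
    "bigint": int,
    "double": float,
    "timestamp": datetime,
}


def validate_record(record: dict) -> list[str]:
    """Return a list of type problems found in one record (hypothetical helper)."""
    errors = []
    # Assumption: the primary key must be present and non-null.
    if record.get("id") is None:
        errors.append("id: primary key is missing or null")
    for name, value in record.items():
        contract_type = FIELD_TYPES.get(name)
        if contract_type is None:
            errors.append(f"{name}: not defined in the contract")
            continue
        if value is None:
            continue  # assumption: every field other than id is nullable
        expected = PYTHON_TYPES[contract_type]
        is_bool = isinstance(value, bool)  # bool is a subclass of int in Python
        if expected is bool:
            ok = is_bool
        elif expected is float:
            ok = isinstance(value, (int, float)) and not is_bool
        else:
            ok = isinstance(value, expected) and not is_bool
        if not ok:
            errors.append(
                f"{name}: expected {contract_type}, got {type(value).__name__}"
            )
    return errors
```

For example, `validate_record({"id": "abc", "blast_hit_bitscore": "250"})` reports that `blast_hit_bitscore` was given as a string instead of a `long`, while a record whose values match the declared types yields an empty list.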