Gene
This element represents an identifier for a gene. It can be used to indicate that the gene is thought to play a causative role in the disease phenotypes being described when the exact variant cannot be transmitted, either for privacy reasons or because it is unknown.
Data model
Field | Type | Status | Description |
---|---|---|---|
id | string | required | Official identifier of the gene |
alternate_ids | repeated string | optional | Alternative identifier(s) of the gene |
symbol | string | required | Official gene symbol |
Example
{
"id": "HGNC:347",
"symbol": "ETF1"
}
Optionally, with alternative identifiers:
{
"id": "HGNC:347",
"alternate_ids": ["ensembl:ENSG00000120705", "ncbigene:2107"],
"symbol": "ETF1"
}
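The data model and examples above can be mirrored in code. The following is a minimal sketch, assuming a hypothetical `Gene` dataclass (not the official schema bindings), that enforces the two required fields and omits `alternate_ids` from the JSON output when it is empty:

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Gene:
    """Hypothetical in-memory mirror of the Gene element (illustration only)."""

    id: str  # required: official identifier of the gene
    symbol: str  # required: official gene symbol
    alternate_ids: List[str] = field(default_factory=list)  # optional

    def __post_init__(self):
        # Reject empty values for the fields the table marks as required.
        if not self.id:
            raise ValueError("id is required")
        if not self.symbol:
            raise ValueError("symbol is required")

    def to_json(self) -> str:
        data = asdict(self)
        # Leave out the optional field when no alternative identifiers are given.
        if not data["alternate_ids"]:
            del data["alternate_ids"]
        return json.dumps(data)


gene = Gene(id="HGNC:347", symbol="ETF1")
print(gene.to_json())
```

Dropping the empty optional field keeps the serialised form close to the first example; a real implementation may prefer to rely on the schema's own serialisation rules.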
id
The id represents the accession number or comparable identifier for the gene.
It SHOULD be a CURIE identifier with a prefix that is used by the official organism gene nomenclature committee. In the case of humans, this is the HGNC, e.g. HGNC:347.
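A consumer might want to check that an id is CURIE-shaped before using it. The sketch below assumes a simplified pattern (a prefix, a colon, and a non-empty reference); the regular expression and the `split_curie` helper are illustrative assumptions, not the full CURIE grammar and not part of the schema:

```python
import re

# Simplified CURIE shape: prefix, colon, non-empty local reference.
# This pattern is an assumption for illustration, not the complete CURIE syntax.
CURIE_PATTERN = re.compile(r"^(?P<prefix>[A-Za-z_][\w.\-]*):(?P<reference>\S+)$")


def split_curie(identifier: str):
    """Return (prefix, reference), or raise ValueError if not CURIE-shaped."""
    match = CURIE_PATTERN.match(identifier)
    if match is None:
        raise ValueError(f"not a CURIE: {identifier!r}")
    return match.group("prefix"), match.group("reference")


prefix, reference = split_curie("HGNC:347")
print(prefix, reference)
```

A bare symbol such as `ETF1` has no prefix and is rejected, which helps catch a symbol placed in the id field by mistake.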
alternate_ids
This field can be used to provide identifiers from alternative resources where this gene is used or catalogued. For example, NCBI and Ensembl both provide alternative identifiers for genes where they catalogue the transcripts for a gene, e.g. ncbigene:2107 and ensembl:ENSG00000120705. These identifiers SHOULD be represented in CURIE form with a corresponding Resource.
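The requirement that each alternative identifier has a corresponding Resource can be checked by comparing CURIE prefixes. This is a rough sketch: `missing_resource_prefixes` is a hypothetical helper, and `declared_prefixes` is assumed to be the set of prefixes taken from whatever Resource entries accompany the data.

```python
def missing_resource_prefixes(alternate_ids, declared_prefixes):
    """Return the sorted CURIE prefixes in alternate_ids that have no declared Resource."""
    used = set()
    for identifier in alternate_ids:
        # Split on the first colon only; the reference part may contain more colons.
        prefix, sep, reference = identifier.partition(":")
        if not sep or not prefix or not reference:
            raise ValueError(f"not a CURIE: {identifier!r}")
        used.add(prefix)
    return sorted(used - set(declared_prefixes))


alternate_ids = ["ncbigene:2107", "ensembl:ENSG00000120705"]
# Assumed input: only the ncbigene prefix has a Resource declared.
print(missing_resource_prefixes(alternate_ids, declared_prefixes={"ncbigene"}))
```

An empty result means every prefix used in `alternate_ids` is covered by a declared Resource.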
symbol
This SHOULD be the official gene symbol as designated by the organism gene nomenclature committee. In the case of humans, this is the HUGO Gene Nomenclature Committee (HGNC), e.g. ETF1.
Model Organisms
Genes from model organisms represented by the Alliance of Genome Resources should use the primary identifier and symbol provided by the Alliance. For example, for the Mus musculus gene eukaryotic translation termination factor 1:
{
"id": "MGI:2385071",
"alternate_ids": ["ensembl:ENSMUSG00000024360", "ncbigene:225363"],
"symbol": "Etf1"
}