Biosample¶
A Biosample refers to a unit of biological material from which the substrate molecules (e.g. genomic DNA, RNA, proteins) for molecular analyses (e.g. sequencing, array hybridisation, mass-spectrometry) are extracted. Examples would be a tissue biopsy, a single cell from a culture for single cell genome sequencing or a protein fraction from a gradient centrifugation. Several instances (e.g. technical replicates) or types of experiments (e.g. genomic array as well as RNA-seq experiments) may refer to the same Biosample.
Data model¶
Field Type Multiplicity Description id string 1..1 Arbitrary identifier. REQUIRED. individual_id string 0..1 Arbitrary identifier. RECOMMENDED. derived_from_id string 0..1 id of the biosample from which the current biosample was derived (if applicable) description string 0..1 arbitrary text sampled_tissue OntologyClass 0..1 Tissue from which the sample was taken sample_type OntologyClass 0..1 type of material, e.g., RNA, DNA, Cultured cells phenotypic_features PhenotypicFeature (List) 0..* List of phenotypic abnormalities of the sample. RECOMMENDED. measurements Measurement (List) 0..* List of measurements of the sample taxonomy OntologyClass 0..1 Species of the sampled individual time_of_collection TimeElement 0..1 Age of the proband at the time the sample was taken. RECOMMENDED. histological_diagnosis OntologyClass 0..1 Disease diagnosis that was inferred from the histological examination. RECOMMENDED. tumor_progression OntologyClass 0..1 Indicates primary, metastatic, recurrent. RECOMMENDED. tumor_grade OntologyClass 0..1 Term representing the tumor grade pathological_stage OntologyClass 0..1 Pathological stage, if applicable. RECOMMENDED. pathological_tnm_finding OntologyClass (List) 0..* Pathological TNM findings, if applicable. RECOMMENDED. diagnostic_markers OntologyClass (List) 0..* Clinically relevant biomarkers. RECOMMENDED. procedure Procedure 0..1 The procedure used to extract the biosample. RECOMMENDED. files File (List) 0..* list of files related to the biosample, e.g. VCF or other high-throughput sequencing files material_sample OntologyClass 0..1 Status of specimen (tumor tissue, normal control, etc.). RECOMMENDED. sample_processing OntologyClass 0..1 how the specimen was processed sample_storage OntologyClass 0..1 how the specimen was stored
Example¶
The staging system most often used for bladder cancer is the American Joint Committee on Cancer (AJCC) TNM system. The overall stage is assigned based on the T, N, and M categories (Cancer stage grouping). For instance, stage II (pathological staging) is defined as T2a or T2b, N0, and M0, meaning the cancer has spread into the wall of the bladder.
biosample:
id: "sample1"
individualId: "patient1"
description: "Additional information can go here"
sampledTissue:
id: "UBERON_0001256"
label: "wall of urinary bladder"
histologicalDiagnosis:
id: "NCIT:C39853"
label: "Infiltrating Urothelial Carcinoma"
tumorProgression:
id: "NCIT:C84509"
label: "Primary Malignant Neoplasm"
tumorGrade:
id: "NCIT:C36136"
label: "Grade 2 Lesion"
procedure:
code:
id: "NCIT:C5189"
label: "Radical Cystoprostatectomy"
files:
- uri: "file:///data/genomes/urothelial_ca_wgs.vcf.gz"
individualToFileIdentifiers:
patient1: "NA12345"
fileAttributes:
description: "Urothelial carcinoma sample"
htsFormat: "VCF"
genomeAssembly: "GRCh38"
materialSample:
id: "EFO:0009655"
label: "abnormal sample"
timeOfCollection:
age:
iso8601duration: "P52Y2M"
pathologicalStage:
id: "NCIT:C28054"
label: "Stage II"
pathologicalTnmFinding:
- id: "NCIT:C48726"
label: "T2b Stage Finding"
- id: "NCIT:C48705"
label: "N0 Stage Finding"
- id: "NCIT:C48699"
label: "M0 Stage Finding"
Explanations¶
id¶
The Biosample id. This is unique in the context of the server instance.
individual_id¶
The id of the Individual this biosample was derived from. It is recommended, but not necessary to provide this information here if the Biosample is being transmitted as a part of a Phenopacket.
derived_from_id¶
The id of the parent biosample this biosample was derived from.
description¶
The biosample’s description. This attribute contains human readable text. The “description” attributes should not contain any structured data.
sampled_tissue¶
On OntologyClass describing the tissue from which the specimen was collected.
We recommend the use of UBERON. The
PDX MI mapping is Specimen tumor tissue
.
sample_type¶
RNA, DNA, Cultured cells. We recommend use of EFO term to describe the sample, for instance, genomic DNA (EFO:0008479).
phenotypic_features¶
The phenotypic characteristics of the BioSample, for example histological findings of a biopsy. See PhenotypicFeature for further information.
measurements¶
Measurements (usually quantitative) performed on the sample. See Measurement for further information.
taxonomy¶
For resources where there may be more than one organism being studied it is advisable to indicate the taxonomic identifier of that organism, to its most specific level. We advise using the codes from the NCBI Taxonomy resource. For instance, NCBITaxon:9606 is human (homo sapiens sapiens) and or NCBITaxon:9615 is dog.
individual_age_at_collection¶
An age object describing the age of the individual this biosample was derived from at the time of collection. The Age object allows the encoding of the age either as ISO8601 duration or time interval (preferred), or as ontology term object. See TimeElement for further information.
histological_diagnosis¶
This is the pathologist’s diagnosis and may often represent a refinement of the clinical diagnosis (which could be reported in the Phenopacket that contains this Biosample). Normal samples would be tagged with the term “NCIT:C38757”, “Negative Finding”. See OntologyClass for further information.
tumor_progression¶
This field can be used to indicate if a specimen is from the primary tumor, a metastasis or a recurrence. There are multiple ways of representing this using ontology terms, and the terms chosen should have a specific meaning that is application specific.
For example a term from the following NCIT terms from the Neoplasm by Special Category can be chosen.
tumor_grade¶
This should be a child term of NCIT:C28076 (Disease Grade Qualifier) or equivalent. See the tumor grade fact sheet.
diagnostic_markers¶
Clinically relevant bio markers. Most of the assays such as immunohistochemistry (IHC) are covered by the NCIT under the sub-hierarchy NCIT:C25294 (Laboratory Procedure), e.g. NCIT:C68748 (HER2/Neu Positive), NCIT:C131711 (Human Papillomavirus-18 Positive).
procedure¶
The clinical procedure performed on the subject in order to extract the biosample. See Procedure for further information.
files¶
This element contains a list of pointers to relevant file(s) for the biosample. For example, the results of a high-throughput sequencing experiment. See File for further information.
material_sample¶
This element can be used to specify the status of the sample. For instance, a status may be used as a normal control, often in combination with another sample that is thought to contain a pathological finding. We recommend use of ontology terms such as
sample_processing¶
The technique used to process the sample.
sample_storage¶
How the sample was stored.