Biosample

A Biosample refers to a unit of biological material from which the substrate molecules (e.g. genomic DNA, RNA, proteins) for molecular analyses (e.g. sequencing, array hybridisation, mass spectrometry) are extracted. Examples would be a tissue biopsy, a single cell from a culture for single-cell genome sequencing, or a protein fraction from a gradient centrifugation. Several instances (e.g. technical replicates) or types of experiments (e.g. genomic array as well as RNA-seq experiments) may refer to the same Biosample.

Data model

Field                     Type                      Multiplicity  Description
id                        string                    1..1          Arbitrary identifier. REQUIRED.
individual_id             string                    0..1          Arbitrary identifier. RECOMMENDED.
derived_from_id           string                    0..1          id of the biosample from which the current biosample was derived (if applicable)
description               string                    0..1          Arbitrary text
sampled_tissue            OntologyClass             0..1          Tissue from which the sample was taken
sample_type               OntologyClass             0..1          Type of material, e.g. RNA, DNA, cultured cells
phenotypic_features       PhenotypicFeature (List)  0..*          List of phenotypic abnormalities of the sample. RECOMMENDED.
measurements              Measurement (List)        0..*          List of measurements of the sample
taxonomy                  OntologyClass             0..1          Species of the sampled individual
time_of_collection        TimeElement               0..1          Age of the proband at the time the sample was taken. RECOMMENDED.
histological_diagnosis    OntologyClass             0..1          Disease diagnosis that was inferred from the histological examination. RECOMMENDED.
tumor_progression         OntologyClass             0..1          Indicates primary, metastatic, recurrent. RECOMMENDED.
tumor_grade               OntologyClass             0..1          Term representing the tumor grade
pathological_stage        OntologyClass             0..1          Pathological stage, if applicable. RECOMMENDED.
pathological_tnm_finding  OntologyClass (List)      0..*          Pathological TNM findings, if applicable. RECOMMENDED.
diagnostic_markers        OntologyClass (List)      0..*          Clinically relevant biomarkers. RECOMMENDED.
procedure                 Procedure                 0..1          The procedure used to extract the biosample. RECOMMENDED.
files                     File (List)               0..*          List of files related to the biosample, e.g. VCF or other high-throughput sequencing files
material_sample           OntologyClass             0..1          Status of specimen (tumor tissue, normal control, etc.). RECOMMENDED.
sample_processing         OntologyClass             0..1          How the specimen was processed
sample_storage            OntologyClass             0..1          How the specimen was stored
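
The REQUIRED and RECOMMENDED markers in the table can be turned into a simple completeness check. The following is a minimal sketch, assuming the biosample is held as a plain Python dict keyed by the snake_case field names from the table; `check_biosample` is a hypothetical helper and not part of any Phenopackets library.

```python
# Field names and their REQUIRED/RECOMMENDED status are copied from the table.
REQUIRED_FIELDS = ("id",)
RECOMMENDED_FIELDS = (
    "individual_id",
    "phenotypic_features",
    "time_of_collection",
    "histological_diagnosis",
    "tumor_progression",
    "pathological_stage",
    "pathological_tnm_finding",
    "diagnostic_markers",
    "procedure",
    "material_sample",
)


def check_biosample(biosample: dict) -> dict:
    """Report which REQUIRED and RECOMMENDED fields are not provided.

    Hypothetical helper: absent keys, None, empty strings, empty lists
    and empty dicts all count as "not provided".
    """
    def is_missing(name: str) -> bool:
        return biosample.get(name) in (None, "", [], {})

    return {
        "missing_required": [f for f in REQUIRED_FIELDS if is_missing(f)],
        "missing_recommended": [f for f in RECOMMENDED_FIELDS if is_missing(f)],
    }


report = check_biosample({"id": "sample1", "individual_id": "patient1"})
print(report["missing_required"])
print(report["missing_recommended"])
```

A missing RECOMMENDED field is only reported here, not rejected, since the table marks those fields as optional (multiplicity 0..1 or 0..*).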

Example

The staging system most often used for bladder cancer is the American Joint Committee on Cancer (AJCC) TNM system. The overall stage is assigned based on the T, N, and M categories (Cancer stage grouping). For instance, stage II (pathological staging) is defined as T2a or T2b, N0, and M0, meaning the cancer has spread into the wall of the bladder.

biosample:
  id: "sample1"
  individualId: "patient1"
  description: "Additional information can go here"
  sampledTissue:
      id: "UBERON:0001256"
      label: "wall of urinary bladder"
  histologicalDiagnosis:
      id: "NCIT:C39853"
      label: "Infiltrating Urothelial Carcinoma"
  tumorProgression:
      id: "NCIT:C84509"
      label: "Primary Malignant Neoplasm"
  tumorGrade:
      id: "NCIT:C36136"
      label: "Grade 2 Lesion"
  procedure:
      code:
          id: "NCIT:C5189"
          label: "Radical Cystoprostatectomy"
  files:
      - uri: "file:///data/genomes/urothelial_ca_wgs.vcf.gz"
        individualToFileIdentifiers:
            patient1: "NA12345"
        fileAttributes:
            description: "Urothelial carcinoma sample"
            htsFormat: "VCF"
            genomeAssembly: "GRCh38"
  materialSample:
      id: "EFO:0009655"
      label: "abnormal sample"
  timeOfCollection:
      age:
          iso8601duration: "P52Y2M"
  pathologicalStage:
      id: "NCIT:C28054"
      label: "Stage II"
  pathologicalTnmFinding:
      - id: "NCIT:C48726"
        label: "T2b Stage Finding"
      - id: "NCIT:C48705"
        label: "N0 Stage Finding"
      - id: "NCIT:C48699"
        label: "M0 Stage Finding"
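
The stage grouping rule quoted before the example (stage II is T2a or T2b, N0, and M0) can be checked against the `pathologicalTnmFinding` entries. This is a minimal sketch, assuming the biosample has been loaded into a plain Python dict with the same camelCase keys as the example; `is_stage_ii` is a hypothetical helper, and it reads the category from the label text purely for illustration. A production check would more likely match on the NCIT ids.

```python
# Subset of the example biosample, written by hand as a plain dict.
biosample = {
    "id": "sample1",
    "pathologicalStage": {"id": "NCIT:C28054", "label": "Stage II"},
    "pathologicalTnmFinding": [
        {"id": "NCIT:C48726", "label": "T2b Stage Finding"},
        {"id": "NCIT:C48705", "label": "N0 Stage Finding"},
        {"id": "NCIT:C48699", "label": "M0 Stage Finding"},
    ],
}


def is_stage_ii(tnm_findings: list) -> bool:
    """Return True if the findings contain T2a or T2b, plus N0 and M0.

    Hypothetical helper: the category is taken from the first word of
    each label, e.g. "T2b Stage Finding" -> "T2b".
    """
    categories = set()
    for finding in tnm_findings:
        words = finding.get("label", "").split()
        if words:
            categories.add(words[0])
    has_t2 = bool(categories & {"T2a", "T2b"})
    return has_t2 and "N0" in categories and "M0" in categories


print(is_stage_ii(biosample["pathologicalTnmFinding"]))
```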

Explanations

id

The Biosample id. This is unique in the context of the server instance.

individual_id

The id of the Individual this biosample was derived from. Providing it is recommended, but it is not required if the Biosample is being transmitted as part of a Phenopacket.

derived_from_id

The id of the parent biosample this biosample was derived from.
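
Because each biosample points to at most one parent, following `derived_from_id` repeatedly leads back to the originally collected sample. The sketch below assumes the biosamples are stored in a dict keyed by `id` and that an empty or absent `derived_from_id` means "no parent"; `derivation_chain` and the sample ids are hypothetical.

```python
def derivation_chain(biosample_id: str, biosamples: dict) -> list:
    """Return the ids from biosample_id up to its root ancestor.

    Hypothetical helper. Raises KeyError for an unknown id and
    ValueError if the derived_from_id links form a cycle.
    """
    chain = []
    seen = set()
    current = biosample_id
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle in derived_from_id at {current!r}")
        if current not in biosamples:
            raise KeyError(f"unknown biosample id {current!r}")
        seen.add(current)
        chain.append(current)
        # An empty string or a missing key both mean "no parent".
        current = biosamples[current].get("derived_from_id") or None
    return chain


samples = {
    "biopsy1": {"id": "biopsy1"},
    "dna1": {"id": "dna1", "derived_from_id": "biopsy1"},
    "library1": {"id": "library1", "derived_from_id": "dna1"},
}
print(derivation_chain("library1", samples))
```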

description

The biosample’s description. This attribute contains human-readable text. The “description” attribute should not contain any structured data.

sampled_tissue

An OntologyClass describing the tissue from which the specimen was collected. We recommend the use of UBERON. The PDX MI mapping is Specimen tumor tissue.

sample_type

The type of material, e.g. RNA, DNA, or cultured cells. We recommend using an EFO term to describe the sample, for instance, genomic DNA (EFO:0008479).

phenotypic_features

The phenotypic characteristics of the Biosample, for example histological findings of a biopsy. See PhenotypicFeature for further information.

measurements

Measurements (usually quantitative) performed on the sample. See Measurement for further information.

taxonomy

For resources where more than one organism may be studied, it is advisable to indicate the taxonomic identifier of the organism at its most specific level. We advise using the codes from the NCBI Taxonomy resource. For instance, NCBITaxon:9606 is human (Homo sapiens) and NCBITaxon:9615 is dog.
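
Identifiers such as NCBITaxon:9606 follow a prefix:local-id pattern, so a client can do a basic sanity check on the `taxonomy` field before accepting it. This is a minimal sketch, assuming the OntologyClass is held as a dict with an `id` key; `split_curie` and `is_ncbi_taxon` are hypothetical helpers, and the check does not confirm that the taxon actually exists in NCBI Taxonomy.

```python
def split_curie(curie: str) -> tuple:
    """Split "PREFIX:local_id" into (prefix, local_id).

    Hypothetical helper; raises ValueError if either part is missing.
    """
    prefix, separator, local_id = curie.partition(":")
    if not separator or not prefix or not local_id:
        raise ValueError(f"not a prefix:id identifier: {curie!r}")
    return prefix, local_id


def is_ncbi_taxon(ontology_class: dict) -> bool:
    """Return True if the id looks like an NCBI Taxonomy identifier."""
    prefix, local_id = split_curie(ontology_class["id"])
    return prefix == "NCBITaxon" and local_id.isdigit()


print(is_ncbi_taxon({"id": "NCBITaxon:9606", "label": "Homo sapiens"}))
```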

time_of_collection

A TimeElement describing the age of the individual this biosample was derived from at the time of collection. The age can be encoded either as an ISO8601 duration or time interval (preferred), or as an ontology term. See TimeElement for further information.
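
An age given as an ISO8601 duration, such as the `iso8601duration: "P52Y2M"` value in the example, can be broken into its components with a small parser. This is a simplified sketch that handles only the year, month, and day designators; it ignores weeks and the time part of ISO8601 durations, and `parse_age` is a hypothetical helper.

```python
import re

# Simplified pattern: optional years, months and days, in that order.
_AGE_PATTERN = re.compile(r"P(?:(\d+)Y)?(?:(\d+)M)?(?:(\d+)D)?")


def parse_age(duration: str) -> dict:
    """Parse a date-only ISO8601 duration such as "P52Y2M".

    Hypothetical helper; raises ValueError for anything outside the
    simplified PnYnMnD form, including a bare "P".
    """
    match = _AGE_PATTERN.fullmatch(duration)
    if match is None or not any(match.groups()):
        raise ValueError(f"unsupported duration: {duration!r}")
    years, months, days = (int(g) if g else 0 for g in match.groups())
    return {"years": years, "months": months, "days": days}


print(parse_age("P52Y2M"))
```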

histological_diagnosis

This is the pathologist’s diagnosis and may often represent a refinement of the clinical diagnosis (which could be reported in the Phenopacket that contains this Biosample). Normal samples would be tagged with the term “NCIT:C38757”, “Negative Finding”. See OntologyClass for further information.

tumor_progression

This field can be used to indicate whether a specimen is from the primary tumor, a metastasis, or a recurrence. There are multiple ways of representing this using ontology terms, and the terms chosen should have a well-defined meaning within the application.

For example, a term from the NCIT Neoplasm by Special Category hierarchy can be chosen, such as Primary Malignant Neoplasm (NCIT:C84509), which is used in the example above.

tumor_grade

This should be a child term of NCIT:C28076 (Disease Grade Qualifier) or equivalent. See the tumor grade fact sheet.

diagnostic_markers

Clinically relevant biomarkers. Most of the assays such as immunohistochemistry (IHC) are covered by the NCIT under the sub-hierarchy NCIT:C25294 (Laboratory Procedure), e.g. NCIT:C68748 (HER2/Neu Positive), NCIT:C131711 (Human Papillomavirus-18 Positive).

procedure

The clinical procedure performed on the subject in order to extract the biosample. See Procedure for further information.

files

This element contains a list of pointers to relevant file(s) for the biosample. For example, the results of a high-throughput sequencing experiment. See File for further information.

material_sample

This element can be used to specify the status of the sample. For instance, a sample may serve as a normal control, often in combination with another sample that is thought to contain a pathological finding. We recommend the use of ontology terms such as EFO:0009655 (abnormal sample), which is used in the example above.

sample_processing

The technique used to process the sample.

sample_storage

How the sample was stored.