Requirement Levels

The schema is formally defined using protobuf3. In protobuf3, all elements are optional, so protobuf itself offers no mechanism to declare that a field is required. The Phenopacket schema nevertheless requires some fields to be present, and in some cases additionally requires that these fields have a certain format (syntax) or intended meaning (semantics). Software that uses Phenopackets should therefore check the validity of the data by other means. We provide a Java implementation called Phenopacket Validator that tests Phenopackets (and related messages, including Family, Cohort, and Biosample messages) for validity. Application code may additionally check application-specific criteria.

Hierarchical requirements

The requirement levels shown for the various elements of the Phenopacket apply only if the element is used. For instance, the description of Quantity shows that the unit and value fields are required (the multiplicity is exactly 1 and the word REQUIRED appears in the description). In contrast, the field reference_range is optional (the multiplicity may be 0 or 1 and neither REQUIRED nor RECOMMENDED appears in the description). These requirements apply only if a Quantity is used in a Phenopacket. Phenopackets that contain no Measurement or Treatment elements contain no Quantity elements, and so the requirements for the fields of Quantity do not apply to them.
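The hierarchical rule can be illustrated with a minimal sketch. It assumes a phenopacket that has been loaded into plain Python dictionaries with a simplified layout in which a measurement holds its quantity directly; the rule table and the function names are hypothetical and are not part of the Phenopacket Validator.

```python
from typing import Any, Dict, List

# Hypothetical rule table: required field names per element type.
REQUIRED_FIELDS = {"quantity": ["unit", "value"]}


def check_quantity(quantity: Dict[str, Any]) -> List[str]:
    """Return one error per required field that is missing from a Quantity."""
    return [
        f"quantity.{name} is REQUIRED"
        for name in REQUIRED_FIELDS["quantity"]
        if name not in quantity
    ]


def check_measurement(measurement: Dict[str, Any]) -> List[str]:
    """Apply the Quantity rules only if the measurement carries a quantity."""
    quantity = measurement.get("quantity")
    if quantity is None:
        return []  # no Quantity is used, so its requirements do not apply
    return check_quantity(quantity)


def check_phenopacket(packet: Dict[str, Any]) -> List[str]:
    """Collect Quantity errors from every measurement that is present."""
    errors: List[str] = []
    for measurement in packet.get("measurements", []):
        errors.extend(check_measurement(measurement))
    return errors
```

A packet without measurements yields no errors from this check, whereas a quantity that lacks its unit yields exactly one. The optional reference_range is never inspected.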

Multiplicity

The explanations for the various elements of the Phenopacket show the required multiplicities.

  • 0..1: The element may be absent (0) or present (1), i.e., the element is optional. Elements with multiplicity 0..1 may be marked RECOMMENDED, otherwise they are OPTIONAL.
  • 1..1: The element must be present (1), i.e., the element is REQUIRED.
  • 0..*: There may be from zero to an arbitrary number of elements, i.e., the list may be empty.
  • 1..*: There may be from one to an arbitrary number of elements, i.e., the list must not be empty.
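The four multiplicities above can be read as lower and upper bounds on the number of occurrences. The following is a minimal sketch under the assumption that an absent element is represented as None and a repeated element as a Python list; the table and the helper names are illustrative only.

```python
from typing import Any, Dict, Optional, Tuple

# (lower bound, upper bound); an upper bound of None stands for "*".
MULTIPLICITIES: Dict[str, Tuple[int, Optional[int]]] = {
    "0..1": (0, 1),
    "1..1": (1, 1),
    "0..*": (0, None),
    "1..*": (1, None),
}


def count_occurrences(value: Any) -> int:
    """Count how many times an element occurs: None is 0, a list is its length."""
    if value is None:
        return 0
    if isinstance(value, (list, tuple)):
        return len(value)
    return 1


def satisfies(value: Any, multiplicity: str) -> bool:
    """Check whether the occurrence count lies within the declared bounds."""
    lower, upper = MULTIPLICITIES[multiplicity]
    n = count_occurrences(value)
    return n >= lower and (upper is None or n <= upper)
```

For example, `satisfies([], "0..*")` holds while `satisfies([], "1..*")` does not, which mirrors the difference between a potentially empty list and a list that must not be empty.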

Levels

The Phenopacket schema uses three requirement levels: required, recommended, and optional. These designations are phenopacket-specific extensions that are used only in the schema (not in code) and are not supported by protobuf.

Required

If a field is required, its presence is an absolute requirement of the specification; if it is missing, the entire phenopacket is regarded as malformed. This corresponds to the key words MUST, REQUIRED, and SHALL in RFC 2119.

Validation software must emit an error if a required field is missing. Note that reading a field of a protobuf message never returns a null pointer: a missing field yields an empty string, a zero, or a default instance, depending on the datatype. In practice, therefore, validation software does not need to check for null pointers and instead checks for these default values.
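A check for a missing required field therefore compares against the default value rather than against null. The sketch below uses plain dataclasses as stand-ins that mimic this default behaviour; the class names, the field layout, and the example identifier are assumptions made for illustration and are not the generated protobuf classes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class UnitStandIn:
    """Stand-in for a message type; unset strings read back as ""."""
    id: str = ""
    label: str = ""


@dataclass
class QuantityStandIn:
    """Stand-in for Quantity; an unset unit reads back as a default instance."""
    unit: UnitStandIn = field(default_factory=UnitStandIn)
    value: float = 0.0


def check_required_quantity(quantity: QuantityStandIn) -> List[str]:
    """Report required fields that still hold their default value."""
    errors: List[str] = []
    # Dataclass equality compares field by field, so this detects a
    # unit that was never filled in.
    if quantity.unit == UnitStandIn():
        errors.append("unit is REQUIRED but unset")
    # A value of 0.0 cannot be told apart from an unset value in this
    # stand-in, so it is deliberately not checked here.
    return errors
```

With these stand-ins, `check_required_quantity(QuantityStandIn())` reports the unset unit, and no null check appears anywhere in the code.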

Optional

An optional field may be omitted without affecting validity. This category can be applied to fields that are only useful for a certain type of data. For instance, the Biosample field of the Phenopacket message is only used for Phenopackets that have one or more associated biosamples.

The general-purpose validator must not emit a warning about these fields whether or not they are present. It may be appropriate for application-specific validators to emit a warning or even an error if a certain optional field is not present.
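The split between a general-purpose validator and an application-specific one might look as follows. This is a sketch only: it assumes a phenopacket loaded as a plain dictionary, treats a top-level "id" as the sole required field for brevity, and uses a hypothetical biobank application that insists on biosamples.

```python
from typing import Any, Dict, List, Tuple

Finding = Tuple[str, str]  # (severity, message)


def general_validate(packet: Dict[str, Any]) -> List[Finding]:
    """General-purpose check: report required fields, ignore optional ones."""
    findings: List[Finding] = []
    if not packet.get("id"):
        findings.append(("ERROR", "id is REQUIRED"))
    # Optional fields such as "biosamples" are deliberately not inspected,
    # so their presence or absence never produces a warning here.
    return findings


def biobank_validate(packet: Dict[str, Any]) -> List[Finding]:
    """Hypothetical application-specific check layered on the general one."""
    findings = general_validate(packet)
    if not packet.get("biosamples"):
        findings.append(("ERROR", "biosamples expected by this application"))
    return findings
```

Layering the application check on top of the general one keeps the general validator silent about optional fields while still letting an application enforce its own stricter rules.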