Working with Phenopackets in C++
Here we provide some guidance on how to work with Phenopackets in C++.
Generating the C++ files
The Maven build generates Java, C++, and Python code that can be used directly in other projects. If you have Maven set up on your machine, the easiest way to generate the C++ files is to run the following commands.
$ mvn compile
$ mvn package
This will generate four files in the following location.
$ ls target/generated-sources/protobuf/cpp/
base.pb.cc phenopackets.pb.cc
base.pb.h phenopackets.pb.h
The other option is to use Google’s protoc tool to generate the C++ files. (The tool can be obtained from the Protobuf website; install it using commands appropriate to your system.) The following commands will generate identical files in a new directory called gen.
$ mkdir gen
$ protoc \
--proto_path=src/main/proto/ \
--cpp_out=gen/ \
src/main/proto/phenopackets.proto src/main/proto/base.proto
The protoc command specifies the directory where the protobuf files are located (--proto_path) and the directory to which the corresponding C++ files are to be written (--cpp_out), and then passes the two protobuf files.
Compiling and building Phenopackets
The phenopacket code can be compiled and built using standard tools. Here we present a small example of a C++ program that reads a phenopacket JSON file whose path is given on the command line and prints out some of the information contained in it to the shell. The classes generated from the phenopacket schema are located within namespaces that mirror the Java package names (for example, org::phenopackets::schema::v1), and are thus extremely unlikely to collide with other C++ identifiers.
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>

#include <google/protobuf/message.h>
#include <google/protobuf/util/json_util.h>

#include "phenopackets.pb.h"

using namespace std;

int main(int argc, char ** argv) {
  // Check that the user has passed exactly one file.
  if (argc != 2) {
    cerr << "usage: ./phenopacket_demo phenopacket-file.json\n";
    return EXIT_FAILURE;
  }
  string fileName = argv[1];

  // Verify that the linked protobuf library matches the generated headers.
  GOOGLE_PROTOBUF_VERIFY_VERSION;

  ifstream inFile(fileName);
  if (! inFile.good()) {
    cerr << "Could not open Phenopacket file at " << fileName << "\n";
    return EXIT_FAILURE;
  }
  // Read the whole file into a string.
  stringstream sstr;
  sstr << inFile.rdbuf();
  string JSONstring = sstr.str();

  // Parse the JSON into a Phenopacket message and stop if parsing fails.
  ::google::protobuf::util::JsonParseOptions options;
  ::org::phenopackets::schema::v1::Phenopacket phenopacket;
  auto status = ::google::protobuf::util::JsonStringToMessage(JSONstring, &phenopacket, options);
  if (! status.ok()) {
    cerr << "Could not parse Phenopacket file at " << fileName << ": " << status.ToString() << "\n";
    return EXIT_FAILURE;
  }

  cout << "\n::: Reading Phenopacket at: " << fileName << " ::::\n\n";
  cout << "\tsubject.id: " << phenopacket.subject().id() << "\n";

  // Print the age if available.
  if (phenopacket.subject().has_age_at_collection()) {
    const ::org::phenopackets::schema::v1::core::Age& age = phenopacket.subject().age_at_collection();
    if (! age.age().empty()) {
      cout << "\tsubject.age: " << age.age() << "\n";
    }
  }

  // Print the sex (independently of whether an age was given).
  cout << "\tsubject.sex: ";
  ::org::phenopackets::schema::v1::core::Sex sex = phenopacket.subject().sex();
  switch (sex) {
    case ::org::phenopackets::schema::v1::core::UNKNOWN_SEX: cout << "unknown"; break;
    case ::org::phenopackets::schema::v1::core::FEMALE: cout << "female"; break;
    case ::org::phenopackets::schema::v1::core::MALE: cout << "male"; break;
    case ::org::phenopackets::schema::v1::core::OTHER_SEX:
    default:
      cout << "other"; break;
  }
  cout << "\n";

  // Print the id and label of each phenotypic feature.
  cout << "\n\tPhenotypes:\n";
  for (int i = 0; i < phenopacket.phenotypes_size(); i++) {
    const ::org::phenopackets::schema::v1::core::PhenotypicFeature& phenotype = phenopacket.phenotypes(i);
    const ::org::phenopackets::schema::v1::core::OntologyClass& type = phenotype.type();
    cout << "\tid: " << type.id() << ": " << type.label() << "\n";
  }
  cout << "\n";
  return EXIT_SUCCESS;
}
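The fully qualified names in the program above are long. As a minimal sketch, a namespace alias can shorten them, and the switch over the sex enumeration can be moved into a small helper. The alias name ppcore and the helper sex_label are hypothetical and not part of the Phenopackets code. The small enum at the top is only a stand-in for the declarations in phenopackets.pb.h so that the sketch compiles by itself; it lists only the four values used in the program above.

```cpp
#include <string>

// Stand-in for the generated code, present only so this sketch compiles
// on its own. In a real program these declarations come from
// "phenopackets.pb.h" and this block is removed.
namespace org { namespace phenopackets { namespace schema { namespace v1 {
namespace core {
enum Sex { UNKNOWN_SEX, FEMALE, MALE, OTHER_SEX };
}  // namespace core
}}}}  // namespace org::phenopackets::schema::v1

// Namespace alias (hypothetical name) that shortens the qualified names.
namespace ppcore = ::org::phenopackets::schema::v1::core;

// Hypothetical helper: map a Sex value to a printable label.
std::string sex_label(ppcore::Sex sex) {
  switch (sex) {
    case ppcore::UNKNOWN_SEX: return "unknown";
    case ppcore::FEMALE:      return "female";
    case ppcore::MALE:        return "male";
    case ppcore::OTHER_SEX:   return "other";
  }
  // Fallback for any value not listed above.
  return "other";
}
```

With the generated header included instead of the stand-in, the helper could be called as sex_label(phenopacket.subject().sex()).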
The Makefile for this small program is as follows. It assumes that main.cpp and the four generated files are in the same directory as the Makefile.
CXX=g++
CXXFLAGS=-Wall -g -O0 -std=c++17 -pthread
LIBS=-lprotobuf
TARGET=phenopacket_demo
OBJS=phenopackets.pb.o base.pb.o

all: $(TARGET)

$(TARGET): main.cpp $(OBJS)
	$(CXX) $(CXXFLAGS) $< $(OBJS) $(LIBS) -o $@

# protoc writes the generated sources with a .cc extension
%.pb.o: %.pb.cc
	$(CXX) $(CXXFLAGS) -o $@ -c $<

.PHONY: all clean
clean:
	rm -f $(OBJS) $(TARGET)
The executable can be generated by calling make.
Running it on a simple phenopacket produces the following output.
$ ./phenopacket_demo Gebbia-1997-ZIC3.json
::: Reading Phenopacket at: Gebbia-1997-ZIC3.json ::::
subject.id: III-1
subject.age: 7W
subject.sex: male
Phenotypes:
id: HP:0002139: Arrhinencephaly
id: HP:0001750: Single ventricle
id: HP:0001643: Patent ductus arteriosus
id: HP:0001746: Asplenia
id: HP:0004971: Pulmonary artery hypoplasia
id: HP:0001674: Complete atrioventricular canal defect
id: HP:0001669: Transposition of the great arteries
id: HP:0012890: Posteriorly placed anus
id: HP:0001629: Ventricular septal defect
id: HP:0012262: Abnormal ciliary motility
id: HP:0004935: Pulmonary artery atresia
id: HP:0003363: Abdominal situs inversus
More information about using C++ with Protobuf is available at the Protobuf website.
phenotools
A more complete C++ implementation that performs quality control (Q/C) is being developed as phenotools.