Convert-Pheno

Convert-Pheno is a toolkit for interconverting standard clinical and phenotypic data models.

Supported formats include BFF, PXF, OMOP CDM, REDCap, CDISC-ODM, CSV, and experimental openEHR canonical input.

Quick Start

Typical CLI usage:

convert-pheno -ipxf pxf.json -obff individuals.json
convert-pheno -ipxf pxf.json -obff --entities individuals biosamples datasets cohorts --out-dir out/
convert-pheno -ibff individuals.json -opxf phenopackets.json
convert-pheno -iomop dump.sql.gz -obff individuals.json.gz --stream --ohdsi-db

For backward compatibility, the -iomop ... -obff form still keeps the individuals-only BFF output behavior.

Note: openEHR support is experimental and currently limited to canonical composition input with BFF or PXF output. See the CLI documentation for usage details.

Internally, most conversions first normalize the input to BFF and then, when needed, convert from BFF to the requested output format.

Multi-Entity Output

BFF output can now be entity-aware through --entities.

Current support:

individuals as the default BFF output entity
biosamples as first-class BFF output for -ipxf when the input contains biosample data
datasets and cohorts synthesized from the normalized individuals collection

Example:

convert-pheno -ipxf pxf.json -obff --entities individuals biosamples datasets cohorts --out-dir out/

This can write:

out/individuals.json
out/biosamples.json
out/datasets.json
out/cohorts.json
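
Because some entity files are only written when the input supports them (for example, biosamples requires biosample data in the PXF input), it can help to check which files a run actually produced. The following is a minimal sketch, not part of Convert-Pheno: `check_entities` is a hypothetical helper, and it assumes the default file names listed above.

```shell
# Hypothetical helper (not part of Convert-Pheno): report which of the
# default entity files exist in a given output directory.
check_entities() {
  dir="$1"
  for entity in individuals biosamples datasets cohorts; do
    if [ -f "$dir/$entity.json" ]; then
      echo "written: $entity.json"
    else
      echo "absent: $entity.json"
    fi
  done
}

# Usage after a conversion that wrote to out/:
#   check_entities out
```

If filenames were customized with --out-name, the names in the loop would need to be adjusted accordingly.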

For mapping-file workflows such as csv2bff, redcap2bff, and cdisc2bff, synthesized datasets and cohorts can be customized through the top-level beacon section of the mapping file.

Mapping Files

Mapping-file based tabular conversions now use an entity-aware layout:

project keeps project-level metadata
beacon.individuals contains the semantic mapping rules for Beacon individuals
beacon.datasets, beacon.cohorts, and beacon.biosamples can provide metadata or defaults for emitted Beacon entities

This makes the mapping structure consistent with multi-entity BFF output while keeping individuals as the central normalized model.
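
To visualize the layout, the sketch below writes a placeholder skeleton with only the section names mentioned above and lists its top-level keys. The empty mappings are stand-ins, since the fields inside each section are not described here; `top_level_keys` is a hypothetical helper, and the skeleton is not a complete or validated mapping file.

```shell
# Hypothetical helper (not part of Convert-Pheno): print the top-level keys
# of a block-style YAML file, i.e. keys that start in column 1.
top_level_keys() {
  grep -E '^[A-Za-z_][A-Za-z0-9_-]*:' "$1" | cut -d: -f1
}

# Placeholder skeleton mirroring the layout described above.
# The empty mappings stand in for real content, which is not shown here.
skeleton="$(mktemp)"
cat > "$skeleton" <<'EOF'
project: {}
beacon:
  individuals: {}
  datasets: {}
  cohorts: {}
  biosamples: {}
EOF

top_level_keys "$skeleton"
```

Run as written, this prints project and beacon, the two top-level sections; the entity sections sit one level below beacon. See the sample mapping files under t/ (for example t/csv2bff/in/csv_mapping.yaml) for real content.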

Selected CLI Features

Useful recent options include:

--default-vital-status to control the fallback subject.vitalStatus.status in bff2pxf
--search-audit-tsv to write a TSV report of ontology lookup results for mapping-file conversions
generic -i/-o syntax in addition to the format-specific shortcuts
--out-name key=file to customize filenames in multi-file BFF or OMOP output

Installation

Detailed installation instructions live in dedicated Markdown docs.

Repository installs that run cpanm --installdeps . may also need system libraries such as libssl-dev for the SSL/JSONLD dependency chain.

Published documentation:

https://cnag-biomedical-informatics.github.io/convert-pheno

CLI Documentation

The CLI now keeps concise built-in help in bin/convert-pheno.

Long-form CLI documentation lives in Markdown.

Examples

Repository fixtures under t/ double as runnable examples.

Useful examples:

bin/convert-pheno -ipxf t/pxf2bff/in/pxf.json -obff individuals.json
bin/convert-pheno -ipxf t/pxf2bff/in/pxf_biosamples.json -obff --entities individuals biosamples datasets cohorts --out-dir out/
bin/convert-pheno -icsv t/csv2bff/in/csv_data.csv --mapping-file t/csv2bff/in/csv_mapping.yaml --search-audit-tsv search-audit.tsv -obff individuals.json
bin/convert-pheno -ibff t/bff2pxf/in/individuals.json -opxf phenopackets.json --default-vital-status UNKNOWN_STATUS
bin/convert-pheno -iomop t/omop2bff/in/omop_cdm_eunomia.sql -opxf phenopackets.json
bin/convert-pheno -iomop t/omop2bff/in/gz/omop_cdm_eunomia.sql.gz -obff individuals.json.gz --stream --omop-tables DRUG_EXPOSURE

Citation

If you use Convert-Pheno in published work, please cite:

Rueda, M et al. (2024). Convert-Pheno: A software toolkit for the interconversion of standard data models for phenotypic data. Journal of Biomedical Informatics. https://doi.org/10.1016/j.jbi.2023.104558

Author

Manuel Rueda, PhD. CNAG: https://www.cnag.eu
