As a scientific software developer and postdoctoral researcher in bioinformatics at the University of Oslo, I design and build high-performance computational tools to solve complex problems in genomics and to model 3D genome organization.
I develop and maintain several open-source tools, primarily written in modern C++ and Python, for:
- Hi-C and multi-omics data processing
- Modeling of the 3D genome structure
- Bioinformatics workflows for reproducible analysis
- CLI tools and libraries for large-scale genomics
When it comes to bioinformatics tools, I deeply care about:
- Performance: through efficient algorithms, parallelism, and modern systems programming
- Correctness: via rigorous testing and well-thought-out code architecture
- Usability: achieved with clear CLI design, intuitive APIs, and comprehensive documentation
A CLI and C++/Python/R library to efficiently handle data from Hi-C experiments
- Designed to enable seamless operations on files in .hic and cooler formats, streamlining the analysis of 3D genomic data
- Convert .hic files to .mcool up to 36x faster than existing tools
- Zero-copy in-memory data sharing between C++ and Python/R using Arrow and Eigen
- Published in OUP Bioinformatics (2024): doi.org/10.1093/bioinformatics/btae408
- Languages: C++17 (core), Python, R
- Tech stack: Arrow, Conan, CMake, Docker, GitHub Actions, nanobind, OpenTelemetry, Rcpp, Sphinx
- Key Algorithms & Techniques: test-driven development, fuzz testing, streaming algorithms, visitor and iterator patterns, multi-threading
- Role: Lead developer
High-performance stochastic modeling of DNA loop extrusion interactions
- Developed to simulate genome-wide loop extrusion in vertebrates, providing insights into chromatin dynamics relevant to gene regulation and disease mechanisms
- Orders of magnitude faster than traditional MD-based models: simulate loop extrusion on the human genome using consumer hardware in a few minutes
- Published in Genome Biology (2022): doi.org/10.1186/s13059-022-02815-7
- Tech stack: C++17, Conan, CMake, Docker, GitHub Actions
- Key Algorithms & Techniques: unit testing, producer-consumer architecture, multi-threading, concurrent data structures
- Role: Lead developer
3D Genome Analysis of Breast Cancer Progression in MCF10A and related cell lines
- Full pipeline and analysis scripts for integrative Hi-C, ChIP-Seq, and RNA-Seq data
- Preprint available on bioRxiv (2023): doi.org/10.1101/2023.11.26.568711
- Tools: Bash, Dash, Docker, Jupyter, Nextflow, Python, R
- Role: Lead developer. Responsible for all data processing and most of the data analysis
Detection of architectural stripes in Hi-C contact maps
- Identifies structural features linked to active transcription and regulatory regions
- Format-agnostic and easy to use. Augments stripes with several descriptive statistics
- Preprint available on bioRxiv (2024): doi.org/10.1101/2024.12.20.629789
- Language: Python
- Key Algorithms & Techniques: unit testing, multiprocessing, shared memory, asynchronous programming, structured logging
- Role: Primary code contributor. Provided input on the algorithm design
Additional Tools & Workflows
The projects presented in this section aim to simplify and automate common bioinformatics analysis workflows using Nextflow and containers:
- compress-nfcore-hic-output: Automates post-processing of nf-core Hi-C output files
- chrom3d-nf: Reproducible Chrom3D modeling pipeline using Nextflow
- call_tad_cliques: Graph-theoretic approach to detect nested TAD structures
- generate_higlass_gene_track: Prepares gene annotations for HiGlass visualization
All of the above workflows are implemented using Nextflow (mainly DSL2).
Each repository is structured to leverage GitHub Actions and the GitHub Container Registry (GHCR.io) to build and host custom Docker images, enabling reproducible data analysis without relying on third-party images.
The code used in, and called by, Nextflow processes is written in Bash, Python, and R.
In addition to major projects, I contribute to the broader bioinformatics software ecosystem:
- Package maintenance:
- Bug fixes and patches:
- Based in Oslo, Norway
- Work email
- ORCID | LinkedIn | Twitter
Do not hesitate to connect with me on LinkedIn or reach out via email for collaborations or opportunities.