robomics/README.md

Roberto Rossini – Scientific Software Developer & Bioinformatics Researcher

As a scientific software developer and postdoctoral researcher in bioinformatics at the University of Oslo, I design and build high-performance computational tools to solve complex problems in genomics and to model 3D genome organization.

I develop and maintain several open-source tools, primarily written in modern C++ and Python, for:

  • Hi-C and multi-omics data processing
  • Modeling of the 3D genome structure
  • Bioinformatics workflows for reproducible analysis
  • CLI tools and libraries for large-scale genomics

When it comes to bioinformatics tools, I deeply care about:

  • Performance: through efficient algorithms, parallelism, and modern systems programming
  • Correctness: via rigorous testing and well-thought-out code architecture
  • Usability: achieved with clear CLI design, intuitive APIs, and comprehensive documentation

🔬 Selected Projects

hictk • hictkpy • hictkR

A CLI and C++/Python/R library to efficiently handle data from Hi-C experiments

  • Designed to enable seamless operations on files in .hic and cooler formats, streamlining the analysis of 3D genomic data
  • Convert .hic files to .mcool up to 36x faster than existing tools
  • Zero-copy in-memory data sharing between C++ and Python/R using Arrow and Eigen
  • Published in OUP Bioinformatics (2024): doi.org/10.1093/bioinformatics/btae408
  • Languages: C++17 (core), Python, R
  • Tech stack: Arrow, Conan, CMake, Docker, GitHub Actions, nanobind, OpenTelemetry, Rcpp, Sphinx
  • Key Algorithms & Techniques: test-driven development, fuzz testing, streaming algorithms, visitor and iterator patterns, multi-threading
  • Role: Lead developer
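
To illustrate what "streaming algorithms" and "iterator patterns" mean in this context, here is a minimal Python sketch. It is not hictk's implementation (the core is C++17), and the `Pixel` type and `coarsen` function are hypothetical names introduced only for this example. It assumes the input is a stream of `(bin1, bin2, count)` pixels sorted by `bin1`, and merges them to a coarser resolution while holding only one coarse row in memory at a time.

```python
from collections import defaultdict
from typing import Iterable, Iterator, Tuple

# Hypothetical pixel representation: (row bin, column bin, interaction count)
Pixel = Tuple[int, int, int]


def coarsen(pixels: Iterable[Pixel], factor: int) -> Iterator[Pixel]:
    """Lazily merge pixels into bins that are `factor` times larger.

    Assumes `pixels` is sorted by bin1. Only the counts of the coarse
    row currently being built are kept in memory.
    """
    current_row = None
    row_counts = defaultdict(int)

    for bin1, bin2, count in pixels:
        row = bin1 // factor
        # A new coarse row starts: flush the previous one in column order
        if current_row is not None and row != current_row:
            for col in sorted(row_counts):
                yield current_row, col, row_counts[col]
            row_counts.clear()
        current_row = row
        row_counts[bin2 // factor] += count

    # Flush the last coarse row, if any input was seen
    if current_row is not None:
        for col in sorted(row_counts):
            yield current_row, col, row_counts[col]


if __name__ == "__main__":
    fine = [(0, 0, 1), (0, 1, 2), (1, 1, 3), (2, 3, 4), (3, 3, 5)]
    for pixel in coarsen(fine, factor=2):
        print(pixel)
```

Because `coarsen` is a generator, it can be chained with other lazy steps (filtering, normalization, writing to disk) without materializing the full matrix.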

High-performance stochastic modeling of DNA loop extrusion interactions

  • Developed to simulate genome-wide loop extrusion in vertebrates, providing insights into chromatin dynamics relevant to gene regulation and disease mechanisms
  • Orders of magnitude faster than traditional MD-based models: simulate loop extrusion on the human genome using consumer hardware in a few minutes
  • Published in Genome Biology (2022): doi.org/10.1186/s13059-022-02815-7
  • Tech stack: C++17, Conan, CMake, Docker, GitHub Actions
  • Key Algorithms & Techniques: unit testing, producer-consumer architecture, multi-threading, concurrent data structures
  • Role: Lead developer
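
The producer-consumer architecture mentioned above can be sketched in a few lines of Python. This is only an illustration of the general pattern, not the project's actual C++17 code: the task layout `(chrom, cell)`, the function names, and the placeholder computation are all assumptions made for the example.

```python
import queue
import threading


def producer(tasks, num_consumers, chroms, cells_per_chrom):
    """Enqueue one hypothetical task per (chromosome, cell) pair."""
    for chrom in chroms:
        for cell in range(cells_per_chrom):
            tasks.put((chrom, cell))  # blocks while the bounded queue is full
    # One sentinel per consumer signals that no more work is coming
    for _ in range(num_consumers):
        tasks.put(None)


def consumer(tasks, results):
    """Process tasks until a sentinel is received."""
    while True:
        task = tasks.get()
        if task is None:
            break
        chrom, cell = task
        # Placeholder computation standing in for a real simulation step
        results.put((chrom, cell, len(chrom) + cell))


def run(chroms, cells_per_chrom=4, num_consumers=3):
    tasks = queue.Queue(maxsize=8)  # bounded: applies back-pressure
    results = queue.Queue()
    workers = [
        threading.Thread(target=consumer, args=(tasks, results))
        for _ in range(num_consumers)
    ]
    for worker in workers:
        worker.start()
    producer(tasks, num_consumers, chroms, cells_per_chrom)
    for worker in workers:
        worker.join()
    # All workers have finished, so the results queue can be drained safely
    collected = []
    while not results.empty():
        collected.append(results.get())
    return sorted(collected)


if __name__ == "__main__":
    print(run(["chr1", "chr2"], cells_per_chrom=2, num_consumers=2))
```

The bounded task queue keeps the producer from running far ahead of the workers, which limits memory use when tasks are generated faster than they are processed.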

3D Genome Analysis of Breast Cancer Progression in MCF10A and related cell lines

  • Full pipeline and analysis scripts for integrative Hi-C, ChIP-Seq, and RNA-Seq data
  • Preprint available on bioRxiv (2023): doi.org/10.1101/2023.11.26.568711
  • Tools: Bash, Dash, Docker, Jupyter, Nextflow, Python, R
  • Role: Lead developer. Responsible for all data processing and most of the data analysis

Detection of architectural stripes in Hi-C contact maps

  • Identifies structural features linked to active transcription and regulatory regions
  • Format-agnostic and easy to use. Augments stripes with several descriptive statistics
  • Preprint available on bioRxiv (2024): doi.org/10.1101/2024.12.20.629789
  • Language: Python
  • Key Algorithms & Techniques: unit testing, multiprocessing, shared memory, asynchronous programming, structured logging
  • Role: Primary code contributor. Provided input on the algorithm design

Additional Tools & Workflows

Reproducible computational pipelines

The projects presented in this section aim to simplify and automate common bioinformatics analysis workflows using Nextflow and containers:

All the above workflows are implemented using Nextflow (mainly using DSL2).
Each repository is structured to leverage GitHub Actions and the GitHub Container Registry (GHCR.io) to build and host custom Docker images to enable reproducible data analysis without relying on third-party images.
The code used in, and called by, Nextflow processes is written in Bash, Python, and R.


🔧 Community & Ecosystem Contributions

In addition to major projects, I contribute to the broader bioinformatics software ecosystem:


📫 Contact & Links

Do not hesitate to connect with me on LinkedIn or reach out via email for collaborations or opportunities.

Pinned

  1. paulsengroup/hictk

    Blazing fast toolkit to work with .hic and .cool files

    C++ 29 1

  2. paulsengroup/StripePy

    StripePy recognizes architectural stripes in 3C and Hi-C contact maps using geometric reasoning

    Python 6 3

  3. paulsengroup/hictkpy

    Python bindings for hictk: read and write .cool and .hic files directly from Python

    C++ 15

  4. paulsengroup/modle

    High-performance stochastic modeling of DNA loop extrusion interactions

    C++ 17 2