
Share genomic and exomic datasets and request analyses. Deployed for Care4Rare.


ccmbioinfo/stager

 
 


Stager


Stager is a web application for information management in large-scale genomics and other omics projects. It allows users to enter genomic metadata, edit annotations, link them to the relevant datasets, request specific analyses, and track their status.

Stager was initially developed for the needs of Care4Rare, a pan-Canadian collaborative team of clinicians, bioinformaticians, scientists, and researchers focused on improving the care of rare disease patients in Canada and around the world. Led out of the Children’s Hospital of Eastern Ontario (CHEO) Research Institute in Ottawa, Canada, Care4Rare includes 21 academic sites across Canada and is recognized internationally as a pioneer in the field of genomics and personalized medicine. Genomics4RD is a research initiative by Care4Rare with the mission to create the first data lake for rare diseases in Canada. It is a bioinformatics ecosystem that consists of multiple projects, with Stager providing its data maintenance, storage, and analysis capabilities.

The Centre for Computational Medicine at SickKids supports this collaboration by developing bioinformatics pipelines and analyzing genomic datasets. Stager monitors the analyses, and users can retrieve aggregated reports on detected genomic variants. The integrated system supports both structured and unstructured data through the combined use of MinIO and Stager. MinIO offers a number of security features for user identity and access management. In particular, datasets from separate sources or subprojects are deposited into dedicated data folders, called buckets, and access to each bucket is granted only to the relevant project participants. Once the datasets are transferred, they can be analyzed with a suite of bioinformatics pipelines for the corresponding data types.
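The per-bucket access described above can be illustrated with a policy document. MinIO accepts access policies written in the IAM-compatible JSON syntax used by S3, and the following minimal sketch builds one in Python. The helper name `bucket_policy`, the bucket name `example-project`, and the selection of actions are assumptions made for illustration; they are not taken from Stager's actual configuration.

```python
import json


def bucket_policy(bucket: str, read_only: bool = True) -> dict:
    """Build an S3-style policy document scoped to a single bucket.

    Hypothetical helper for illustration; not part of Stager.
    """
    # Object-level actions; uploads are allowed only when read_only is False.
    object_actions = ["s3:GetObject"]
    if not read_only:
        object_actions.append("s3:PutObject")

    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
            {
                # Reads and writes apply to the objects inside the bucket.
                "Effect": "Allow",
                "Action": object_actions,
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            },
        ],
    }


if __name__ == "__main__":
    # Print a read-only policy for a placeholder bucket name.
    print(json.dumps(bucket_policy("example-project"), indent=2))
```

A policy like this would be attached only to the users or groups that belong to the corresponding project, so participants in one subproject cannot list or read another subproject's bucket.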

Access to Stager is restricted to team members who provide data and metadata or perform data analysis. Key features of Stager are:

  • Viewing and editing metadata for participants and datasets
  • Uploading new metadata for participants and datasets
  • Uploading dataset files and linking them to the metadata
  • Requesting analyses for datasets
  • Viewing aggregated analysis reports in the variant viewer
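To make the relationship between these features concrete, here is a rough sketch of how datasets, linked files, and analysis requests might relate to one another. It uses plain Python dataclasses rather than Stager's real SQLAlchemy models, and every class, field, and state name below is hypothetical. The rule that a dataset needs at least one linked file before an analysis can be requested is likewise an assumption for the example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class AnalysisState(Enum):
    # Hypothetical states; Stager's real status values may differ.
    REQUESTED = "requested"
    RUNNING = "running"
    DONE = "done"


@dataclass
class Dataset:
    dataset_id: int
    participant_codename: str
    # Paths of uploaded files linked to this dataset's metadata.
    linked_files: List[str] = field(default_factory=list)


@dataclass
class Analysis:
    analysis_id: int
    datasets: List[Dataset]
    state: AnalysisState = AnalysisState.REQUESTED


def request_analysis(analysis_id: int, datasets: List[Dataset]) -> Analysis:
    """Create an analysis request, refusing datasets with no linked files."""
    missing = [d.dataset_id for d in datasets if not d.linked_files]
    if missing:
        raise ValueError(f"datasets without linked files: {missing}")
    return Analysis(analysis_id=analysis_id, datasets=datasets)
```

In this sketch a new request starts in the `REQUESTED` state, and a separate process would be expected to advance it as the pipeline runs.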

Tech stack

The browser single-page application frontend is written in TypeScript with the React library, bootstrapped via Create React App, and uses Material-UI for theming.

The backend is containerized with Docker and written in Python 3.9 with the Flask microframework and SQLAlchemy object-relational mapper, presenting a RESTful API to the frontend.

A MySQL 8.0 database stores the dataset and analysis metadata.

MinIO, an S3-compatible object storage server, stores the uploaded dataset files and the results of their analyses.

The first-party frontend and backend components are built automatically on the GitHub Actions continuous integration system. The frontend is transpiled for the web and deployed as a set of static files. The backend is deployed with Docker Compose.

For more developer documentation, see the docs/ directory.

Required tools and editors