Serka-M/mmcomp

mmcomp

Snakemake workflow for yield-normalized comparative genome-centric metagenomics

⚠️ Note: At the moment, the workflow is only suitable for reproducing results from the MFD-LR project. A more generic and distributable version of the workflow is planned for a future release.

Workflow description

  • The workflow takes metagenomic sequencing datasets, subsamples the reads to specified depths, and performs microbial genome recovery with mmlong2
  • Additional analyses then investigate possible reasons for differences in genome recovery efficiency: micro-diversity analysis, assessment of non-prokaryotic DNA, and community composition analysis
  • Multiple bioinformatics tools are used in the analysis to correct for possible tool-related biases
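
The yield-normalization idea behind the subsampling step can be illustrated with a minimal sketch. This is not the workflow's own implementation, which may rely on a dedicated subsampling tool. The function name `subsample_to_yield`, the in-memory read representation, and the example depths are all assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple

Read = Tuple[str, str]  # (read name, nucleotide sequence); hypothetical representation


def subsample_to_yield(
    reads: Sequence[Read], target_bases: int, seed: int = 42
) -> Tuple[List[Read], int]:
    """Randomly pick reads until their combined length reaches target_bases.

    Returns the picked reads and the total number of bases picked.
    If the dataset holds fewer bases than requested, all reads are returned.
    """
    rng = random.Random(seed)  # fixed seed keeps the subsample reproducible
    order = list(range(len(reads)))
    rng.shuffle(order)

    picked: List[Read] = []
    total = 0
    for i in order:
        if total >= target_bases:
            break
        picked.append(reads[i])
        total += len(reads[i][1])
    return picked, total


# Toy dataset: ten reads of 100 bases each (illustrative only).
toy_reads = [(f"read_{i}", "ACGT" * 25) for i in range(10)]

# Hypothetical sequencing depths, expressed in bases.
for depth in (350, 800):
    subset, bases = subsample_to_yield(toy_reads, depth)
    print(f"target={depth} picked_reads={len(subset)} picked_bases={bases}")
```

Normalizing by yield (total bases) rather than by read count matters for long-read data, where read lengths vary widely between datasets and equal read counts would not give equal sequencing effort.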

Tentative usage

  • Download the repo and update the config/config.yaml file to point to the correct Conda environments, read datasets, and databases
  • The workflow can use a local Conda executable and environments, bypassing some Snakemake compatibility issues
  • Update the mmcomp.sh script to match the server and job scheduler setup, as needed
  • Run the workflow with multiple retries enabled, as each retry is submitted with an increased resource allocation
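
The retry behaviour in the last point can be sketched as follows. Snakemake passes the current `attempt` number (starting at 1) to callable resource definitions, so a request can grow with each retry. The helper `scaled_resource`, the linear scaling, and the base and cap values are assumptions for illustration; the scaling actually used by the workflow may differ.

```python
from typing import Optional


def scaled_resource(base: int, attempt: int, cap: Optional[int] = None) -> int:
    """Return a resource request that grows linearly with the attempt number.

    base    -- request for the first attempt (e.g. memory in MB)
    attempt -- 1 for the first run, 2 for the first retry, and so on
    cap     -- optional upper limit, e.g. the largest node on the cluster
    """
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    request = base * attempt
    return min(request, cap) if cap is not None else request


# Hypothetical use inside a Snakefile rule (rule name and values are made up):
#
#   rule example_step:
#       resources:
#           mem_mb=lambda wildcards, attempt: scaled_resource(8000, attempt, cap=64000)
#
# Retries would then be enabled at invocation time, for example with
# Snakemake's --retries option (named --restart-times in older versions).

for attempt in (1, 2, 3):
    print(attempt, scaled_resource(8000, attempt, cap=20000))
```

With this pattern, a job that fails because it ran out of memory or time is resubmitted with a larger request instead of failing the whole run.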
