Repository for scripts and resources used for the recovery and analysis of metagenome-assembled genomes (MAGs) from the Microflora Danica deep, long-read sequencing project (MFD-LR).
- MAGs from Nanopore long-read sequencing data were recovered using mmlong2
- Yield-normalized comparisons between different soil habitats were performed with mmcomp
- Automated MAG phylogeny workflow used in the project is available here
- Dereplicated MAGs, sequenced reads and raw Nanopore data can be downloaded from ENA
- For convenience, the genome catalogs are also available for download from Zenodo (dereplicated and all MAGs)
- The main project datasets and their documentation are available here
- Files too large to be hosted on GitHub are available at Zenodo
The repo is structured so that each folder and subfolder stores its content as unambiguously as possible.
Folder | Content |
---|---|
scripts/ | The code used to analyse the data and plot the figures. |
analysis/ | The results produced by the scripts (processed datasets, figures, etc.). |
├ datasets/ | Main datasets used in the project and their documentation |
└ figures/ | Figures used in the manuscript |
data/ | The input for the project. |
├ MFD-LR/ | Data related to analysing MAGs from this study |
├ MFD-SR/ | Relevant data from the Microflora Danica 10,000 metagenome study |
├ GTDB/ | Data for comparing MAGs from this study to GTDB |
├ catalogs/ | Data for analysis and comparisons of different genome catalogs |
└ mmcomp/ | Data for yield-normalized metagenomics comparisons |
envs/ | Description of R software environments used to analyse the data |
README.md | The explanation of the project, workflow and results, written in GitHub Flavored Markdown |
LICENSE | The license for the repo. |
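
The layout in the table above can be checked against a local clone with a short script. The following is a minimal sketch, not part of the repository: the function name `missing_entries` is hypothetical, and it assumes that the indented entries (`datasets/`, `figures/`, `MFD-LR/`, etc.) sit directly under the folder listed above them. Because Git does not track empty folders and large files are hosted at Zenodo, a missing entry does not necessarily indicate a broken clone.

```python
from pathlib import Path

# Paths taken from the folder table; the nesting of subfolders under
# analysis/ and data/ is an assumption based on the tree markers.
EXPECTED_LAYOUT = [
    "scripts",
    "analysis/datasets",
    "analysis/figures",
    "data/MFD-LR",
    "data/MFD-SR",
    "data/GTDB",
    "data/catalogs",
    "data/mmcomp",
    "envs",
    "README.md",
    "LICENSE",
]


def missing_entries(repo_root):
    """Return the expected paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [entry for entry in EXPECTED_LAYOUT if not (root / entry).exists()]


if __name__ == "__main__":
    import sys

    # Use the first command-line argument as the clone location,
    # falling back to the current directory.
    repo_root = sys.argv[1] if len(sys.argv) > 1 else "."
    missing = missing_entries(repo_root)
    if missing:
        print("Missing entries:")
        for entry in missing:
            print(f"  {entry}")
    else:
        print("All expected folders and files are present.")
```

Run it from the root of the clone, or pass the path to the clone as the first argument.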