Repository for scripts and resources used for the recovery and analysis of metagenome-assembled genomes (MAGs) from the Microflora Danica deep, long-read sequencing project (MFD-LR).
- MAGs from Nanopore long-read sequencing data were recovered using mmlong2
- Yield-normalized comparisons between different soil habitats were performed with mmcomp
- Automated MAG phylogeny workflow used in the project is available here
- De-replicated MAGs, sequenced reads and raw Nanopore data can be downloaded from ENA
- For convenience, the genome catalogs are also available for download from Zenodo (dereplicated and all MAGs)
- The main project datasets and their documentation are available here
- Files too large to be hosted on GitHub are available at Zenodo

The repo is structured so that each folder and subfolder stores its content as unambiguously as possible.

| Folder | Content |
|---|---|
| scripts/ | The code used to analyse the data and plot the figures. |
| analysis/ | The results produced by the scripts (processed datasets, figures, etc.). |
| ├ datasets/ | Main datasets used in the project and their documentation |
| ├ figures/ | Figures used in the manuscript |
| └ source/ | Source data for the main figures |
| data/ | The input for the project. |
| ├ MFD-LR/ | Data related to analysing MAGs from this study |
| ├ MFD-SR/ | Relevant data from the Microflora Danica 10,000 metagenome study |
| ├ GTDB/ | Data for comparing MAGs from this study to GTDB |
| ├ catalogs/ | Data for analysis and comparisons of different genome catalogs |
| └ mmcomp/ | Data for yield-normalized metagenomics comparisons |
| envs/ | Description of R software environments used to analyse the data |
| README.md | The explanation of the project, workflow and results, written in GitHub-flavored Markdown. |
| LICENSE | The license for the repo. |
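
The layout in the table above can be checked on a local clone with a short script. This is a minimal sketch and not part of the repository: the function name `missing_entries` and the constants are hypothetical, and the nesting of subfolders under `analysis/` and `data/` is assumed from the tree markers in the table. Because large files are hosted on Zenodo, some folders may legitimately be absent from a fresh clone.

```python
from pathlib import Path

# Layout assumed from the folder table; paths are relative to the repository root.
EXPECTED_DIRS = [
    "scripts",
    "analysis/datasets",
    "analysis/figures",
    "analysis/source",
    "data/MFD-LR",
    "data/MFD-SR",
    "data/GTDB",
    "data/catalogs",
    "data/mmcomp",
    "envs",
]
EXPECTED_FILES = ["README.md", "LICENSE"]


def missing_entries(repo_root):
    """Return the expected folders and files that are absent under repo_root."""
    root = Path(repo_root)
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing


if __name__ == "__main__":
    # Assumes the script is run from the repository root.
    for entry in missing_entries("."):
        print(f"missing: {entry}")
```

Running the script from the repository root prints one line per missing folder or file and prints nothing when the layout matches the table.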