Adding documentation for static site
skchronicles committed Mar 16, 2022
1 parent 3df6a75 commit acf7d30
Showing 11 changed files with 425 additions and 0 deletions.
32 changes: 32 additions & 0 deletions docs/README.md
@@ -0,0 +1,32 @@
# Build documentation

> **Please Note:** When a commit is pushed to the `docs/` directory, it triggers a [GitHub Actions workflow](https://github.com/OpenOmics/metavirs/actions) to build the static site and push it to the gh-pages branch.
### Installation
```bash
# Clone the Repository
git clone https://github.com/OpenOmics/metavirs.git
# Create a virtual environment
python3 -m venv .venv
# Activate the virtual environment
. .venv/bin/activate
# Update pip
pip install --upgrade pip
# Download Dependencies
pip install -r docs/requirements.txt
```
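After activating the environment, an optional sanity check (not part of the original instructions) can confirm that `python` now resolves inside `.venv`:

```shell
# Optional check: once activated, "python" should resolve to the
# interpreter inside the .venv directory created above
python3 -m venv .venv
. .venv/bin/activate
command -v python
```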

### Preview while editing
MkDocs includes a previewing server, so you can view your updates live as you write your documentation. The server automatically rebuilds the site whenever you edit and save a file.
```bash
# Activate the virtual environment
. .venv/bin/activate
# Start serving your documentation
mkdocs serve
```

### Build static site
Once you are satisfied with your changes, you can build the static site. By default, MkDocs writes the generated site to the `site/` directory:
```bash
mkdocs build
```
Binary file added docs/assets/favicon/favicon.ico
9 changes: 9 additions & 0 deletions docs/assets/icons/doc-book.svg
12 changes: 12 additions & 0 deletions docs/css/extra.css
@@ -0,0 +1,12 @@
@keyframes heart {
  0%, 40%, 80%, 100% {
    transform: scale(1);
  }
  20%, 60% {
    transform: scale(1.15);
  }
}

.heart {
  animation: heart 1500ms infinite;
}
4 changes: 4 additions & 0 deletions docs/faq/questions.md
@@ -0,0 +1,4 @@
# Frequently Asked Questions

This page is still under construction. If you need immediate help, please [open an issue](https://github.com/OpenOmics/metavirs/issues) on GitHub!

30 changes: 30 additions & 0 deletions docs/index.md
@@ -0,0 +1,30 @@
# metavirs 🔬 [![docs](https://github.com/OpenOmics/metavirs/workflows/docs/badge.svg)](https://github.com/OpenOmics/metavirs/actions) [![GitHub issues](https://img.shields.io/github/issues/OpenOmics/metavirs?color=brightgreen)](https://github.com/OpenOmics/metavirs/issues) [![GitHub license](https://img.shields.io/github/license/OpenOmics/metavirs)](https://github.com/OpenOmics/metavirs/blob/main/LICENSE)

> **_Metagenomics Viral Sequencing Pipeline_**. This is the home of the pipeline, metavirs. Its long-term goals: to assemble, annotate, and classify environmental samples like no pipeline before!
---
## Overview
Welcome to metavirs' documentation! This guide is the main source of documentation for users that are getting started with the [Viral metagenomics pipeline](https://github.com/OpenOmics/metavirs/).

The **`./metavirs`** pipeline is composed of several inter-related sub commands to set up and run the pipeline across different systems. Each of the available sub commands performs a different function:

* [<code>metavirs <b>run</b></code>](usage/run.md): Run the viral metagenomics pipeline with your input files.
* [<code>metavirs <b>unlock</b></code>](usage/unlock.md): Unlocks a previous run's output directory.
* [<code>metavirs <b>cache</b></code>](usage/cache.md): Cache remote resources locally, coming soon!

metavirs is a comprehensive viral metagenomics pipeline to assemble, annotate, and classify environmental samples. It relies on technologies like [Singularity<sup>1</sup>](https://singularity.lbl.gov/) to maintain the highest level of reproducibility. The pipeline consists of a series of data processing and quality-control steps orchestrated by [Snakemake<sup>2</sup>](https://snakemake.readthedocs.io/en/stable/), a flexible and scalable workflow management system, to submit jobs to a cluster.

The pipeline is compatible with data generated from Illumina short-read sequencing technologies. As input, it accepts a set of FastQ files and can be run locally on a compute instance, on-premise using a cluster, or on the cloud (feature coming soon!). A user can define the method or mode of execution. The pipeline can submit jobs to a cluster using a job scheduler like SLURM, or run on AWS using Tibanna (feature coming soon!). A hybrid approach ensures the pipeline is accessible to all users.

Before getting started, we highly recommend reading through the [usage](usage/run.md) section of each available sub command.

For more information about issues or troubleshooting a problem, please check out our [FAQ](faq/questions.md) prior to [opening an issue on GitHub](https://github.com/OpenOmics/metavirs/issues).

## Contribute

This site is a living document, created for and by members like you. metavirs is maintained by the members of NCBR and is improved by continuous feedback! We encourage you to contribute new content and make improvements to existing content via a pull request to our [GitHub repository :octicons-heart-fill-24:{ .heart }](https://github.com/OpenOmics/metavirs).


## References
<sup>**1.** Kurtzer GM, Sochat V, Bauer MW (2017). Singularity: Scientific containers for mobility of compute. PLoS ONE 12(5): e0177459.</sup>
<sup>**2.** Koster, J. and S. Rahmann (2018). "Snakemake-a scalable bioinformatics workflow engine." Bioinformatics 34(20): 3600.</sup>
21 changes: 21 additions & 0 deletions docs/license.md
@@ -0,0 +1,21 @@
# MIT License

*Copyright (c) 2022 OpenOmics*

<sub>Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:</sub>

<sub>The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.</sub>

<sub>THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.</sub>
34 changes: 34 additions & 0 deletions docs/requirements.txt
@@ -0,0 +1,34 @@
Babel==2.8.0
click==7.1.2
future==0.18.2
gitdb==4.0.5
GitPython==3.1.7
htmlmin==0.1.12
importlib-metadata==1.7.0
Jinja2==2.11.2
joblib==0.16.0
jsmin==3.0.0
livereload==2.6.1
lunr==0.5.8
Markdown==3.2.2
MarkupSafe==1.1.1
mkdocs==1.1.2
mkdocs-awesome-pages-plugin==2.2.1
mkdocs-git-revision-date-localized-plugin==0.7
mkdocs-material
mkdocs-material-extensions
mkdocs-minify-plugin==0.3.0
mkdocs-redirects==1.0.1
nltk==3.5
Pygments==2.6.1
pymdown-extensions==7.1
pytz==2020.1
PyYAML==5.3.1
regex==2020.7.14
six==1.15.0
smmap==3.0.4
tornado==6.0.4
tqdm==4.48.2
zipp==3.1.0
mkdocs-git-revision-date-plugin
mike
72 changes: 72 additions & 0 deletions docs/usage/cache.md
@@ -0,0 +1,72 @@
# <code>metavirs <b>cache</b></code>

## 1. About
The `metavirs` executable is composed of several inter-related sub commands. Please see `metavirs -h` for all available options.

This part of the documentation describes options and concepts for the <code>metavirs <b>cache</b></code> sub command in more detail. With minimal configuration, the **`cache`** sub command enables you to cache remote resources for the metavirs pipeline. Caching remote resources allows the pipeline to run in an offline mode. The cache sub command can also be used to pull our pre-built reference bundles onto a new cluster or target system.

The cache sub command creates a local cache on the filesystem for resources hosted on DockerHub or AWS S3. These resources are normally pulled onto the filesystem when the pipeline runs; however, due to network issues or DockerHub pull rate limits, it may make sense to pull the resources once so a shared cache can be created and re-used. It is worth noting that a Singularity cache cannot normally be shared across users. Singularity strictly enforces that its cache is owned by the user. To get around this issue, the cache sub command can be used to create local SIFs on the filesystem from images on DockerHub.
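The cache-lookup idea can be sketched in plain shell; the cache path and image name below are invented for illustration, and the real sub command handles this for you:

```shell
# Hypothetical sketch of a shared SIF cache lookup
# (the cache path and image name are made up for this example)
SIF_CACHE=/tmp/sif-cache-demo
mkdir -p "$SIF_CACHE"
image="snakemake_v6.8.sif"
if [ -f "$SIF_CACHE/$image" ]; then
    echo "cache hit: reusing $SIF_CACHE/$image"
else
    echo "cache miss: would pull $image from DockerHub into $SIF_CACHE"
fi
```

Because the cache is plain files on a shared filesystem, any user with read permission can reuse it — unlike Singularity's per-user layer cache.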

## 2. Synopsis

Coming Soon!

<!-- ```text
$ ./metavirs cache [-h] --sif-cache SIF_CACHE \
[--resource-bundle RESOURCE_BUNDLE] \
[--dry-run]
```
The synopsis for each command shows its parameters and their usage. Optional parameters are shown in square brackets.
A user **must** provide a directory to cache remote Docker images via the `--sif-cache` argument. Once the cache pipeline has completed, the local SIF cache can be passed to the `--sif-cache` option of the <code>metavirs <b>build</b></code> and <code>metavirs <b>run</b></code> sub commands. This enables the build and run pipelines to run in an offline mode.
You can always use the `-h` option for information on a specific command.
### 2.1 Required Arguments
`--sif-cache SIF_CACHE`
> **Path where a local cache of SIFs will be stored.**
> *type: string*
>
> Any images defined in *config/containers/images.json* will be pulled into the local filesystem. The path provided to this option can be passed to the `--sif-cache` option of the <code>metavirs <b>build</b></code> and <code>metavirs <b>run</b></code> sub commands. This allows the build and run pipelines to run in an offline mode where no requests are made to external sources. This is useful for avoiding network issues or DockerHub pull rate limits. Please see metavirs build and run for more information.
>
> ***Example:*** `--sif-cache /data/$USER/cache`
### 2.2 Options
Each of the following arguments is optional and does not need to be provided.
`-h, --help`
> **Display Help.**
> *type: boolean*
>
> Shows the command's synopsis, help message, and an example command.
>
> ***Example:*** `--help`
---
`--dry-run`
> **Dry run the pipeline.**
> *type: boolean*
>
> Displays what steps in the pipeline remain or will be run. Does not execute anything!
>
> ***Example:*** `--dry-run`
## 3. Example
```bash
# Step 0.) Grab an interactive node (do not run on head node)
srun -N 1 -n 1 --time=12:00:00 -p interactive --mem=8gb --cpus-per-task=4 --pty bash
module purge
module load singularity snakemake
# Step 1.) Dry run cache to see what will be pulled
./metavirs cache --sif-cache /scratch/$USER/cache \
--dry-run
# Step 2.) Cache remote resources locally
./metavirs cache --sif-cache /scratch/$USER/cache
```
-->
155 changes: 155 additions & 0 deletions docs/usage/run.md
@@ -0,0 +1,155 @@
# <code>metavirs <b>run</b></code>

## 1. About
The `metavirs` executable is composed of several inter-related sub commands. Please see `metavirs -h` for all available options.

This part of the documentation describes options and concepts for the <code>metavirs <b>run</b></code> sub command in more detail. With minimal configuration, the **`run`** sub command enables you to start running the metavirs pipeline.

Setting up the metavirs pipeline is fast and easy! In its most basic form, <code>metavirs <b>run</b></code> only has *two required inputs*.

## 2. Synopsis
```text
$ metavirs run [--help] [--mode <slurm,local>] \
[--job-name JOB_NAME] [--dry-run] [--silent] \
[--singularity-cache SINGULARITY_CACHE] \
[--sif-cache SIF_CACHE] \
[--tmpdir TMP_DIR] \
[--threads THREADS] \
--input INPUT [INPUT ...] \
--output OUTPUT
```

The synopsis for each command shows its arguments and their usage. Optional arguments are shown in square brackets.

A user **must** provide a list of FastQ files (globbing is supported) to analyze via the `--input` argument and an output directory to store results via the `--output` argument.
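As a quick illustration of globbing (the file names below are hypothetical), the shell expands the pattern before metavirs ever sees it, so each matching file arrives as a separate argument:

```shell
# Create some dummy paired-end FastQ files (names are hypothetical)
mkdir -p /tmp/fastq-demo
touch /tmp/fastq-demo/WT_S1.R1.fastq.gz /tmp/fastq-demo/WT_S1.R2.fastq.gz \
      /tmp/fastq-demo/KO_S2.R1.fastq.gz /tmp/fastq-demo/KO_S2.R2.fastq.gz
# The pattern matches both mates of both samples, so four
# separate file arguments would reach the program
ls /tmp/fastq-demo/*.R?.fastq.gz
```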

You can always use the `-h` option for information on a specific command.

### 2.1 Required Arguments

Each of the following arguments is required. Failure to provide a required argument will result in a non-zero exit code.

`--input INPUT [INPUT ...]`
> **Input FastQ file(s).**
> *type: file(s)*
>
> One or more FastQ files can be provided. The pipeline does NOT support single-end data. From the command-line, each input file should be separated by a space. Globbing is supported! This makes selecting FastQ files easy. Input FastQ files should always be gzipped.
>
> ***Example:*** `--input .tests/*.R?.fastq.gz`
---
`--output OUTPUT`
> **Path to an output directory.**
> *type: path*
>
> This location is where the pipeline will create all of its output files, also known as the pipeline's working directory. If the provided output directory does not exist, it will be created automatically.
>
> ***Example:*** `--output /data/$USER/metavirs_out`
### 2.2 Options

Each of the following arguments is optional and does not need to be provided.

`-h, --help`
> **Display Help.**
> *type: boolean flag*
>
> Shows the command's synopsis, help message, and an example command.
>
> ***Example:*** `--help`
---
`--dry-run`
> **Dry run the pipeline.**
> *type: boolean flag*
>
> Displays what steps in the pipeline remain or will be run. Does not execute anything!
>
> ***Example:*** `--dry-run`
---
`--silent`
> **Silence standard output.**
> *type: boolean flag*
>
> Reduces the amount of information directed to standard output when submitting the master job to the job scheduler. Only the job id of the master job is returned.
>
> ***Example:*** `--silent`
---
`--mode {slurm,local}`
> **Execution Method.**
> *type: string*
> *default: slurm*
>
> Defines the mode or method of execution. Valid mode options include: slurm or local.
>
> ***slurm***
> The slurm execution method will submit jobs to the [SLURM workload manager](https://slurm.schedmd.com/). It is recommended to run metavirs in this mode, as execution will be significantly faster in a distributed environment. This is the default mode of execution.
>
> ***local***
> Local executions will run serially on the compute instance. This is useful for testing, debugging, or when a user does not have access to a high-performance computing environment. If this option is not provided, the pipeline will default to the slurm execution mode.
>
> ***Example:*** `--mode slurm`
---
`--job-name JOB_NAME`
> **Set the name of the pipeline's master job.**
> *type: string*
> *default: pl:metavirs*
>
> When submitting the pipeline to a job scheduler, like SLURM, this option allows you to set the name of the pipeline's master job. By default, the name of the pipeline's master job is set to "pl:metavirs".
>
> ***Example:*** `--job-name pl_id-42`
---
`--singularity-cache SINGULARITY_CACHE`
> **Overrides the $SINGULARITY_CACHEDIR environment variable.**
> *type: path*
> *default: `--output OUTPUT/.singularity`*
>
> Singularity will cache image layers pulled from remote registries. This ultimately speeds up the process of pulling an image from DockerHub if an image layer already exists in the Singularity cache directory. By default, the cache is set to the value provided to the `--output` argument. Please note that this cache cannot be shared across users. Singularity strictly enforces that you own the cache directory and will return a non-zero exit code if you do not! See the `--sif-cache` option to create a shareable resource.
>
> ***Example:*** `--singularity-cache /data/$USER/.singularity`
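For reference, the effect of this option can also be reproduced by exporting Singularity's cache variable yourself before running the pipeline (the path below is illustrative, not a project default):

```shell
# Setting the Singularity layer cache by hand (illustrative path);
# the --singularity-cache flag does the equivalent for you
export SINGULARITY_CACHEDIR="/tmp/demo-singularity-cache"
mkdir -p "$SINGULARITY_CACHEDIR"
echo "$SINGULARITY_CACHEDIR"
```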
---
`--sif-cache SIF_CACHE`
> **Path where a local cache of SIFs are stored.**
> *type: path*
>
> Uses a local cache of SIFs on the filesystem. This SIF cache can be shared across users if permissions are set correctly. If a SIF does not exist in the SIF cache, the image will be pulled from DockerHub and a warning message will be displayed. The `metavirs cache` sub command can be used to create a local SIF cache. Please see `metavirs cache` for more information. This option is extremely useful for avoiding DockerHub pull rate limits. It also removes any potential errors that could occur due to network issues or DockerHub being temporarily unavailable. We recommend running metavirs with this option whenever possible.
>
> ***Example:*** `--sif-cache /data/$USER/SIFs`
---
`--threads THREADS`
> **Max number of threads for each process.**
> *type: int*
> *default: 2*
>
> Max number of threads for each process. This option is more applicable when running the pipeline with `--mode local`. It is recommended to set this value to the maximum number of CPUs available on the host machine.
>
> ***Example:*** `--threads 12`
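When running with `--mode local`, one way to pick a sensible value is to query the host's CPU count (assuming GNU coreutils is available, as on most Linux clusters):

```shell
# nproc reports the number of processing units available to this shell;
# its output is a reasonable upper bound for --threads
nproc
```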
## 3. Example
```bash
# Step 1.) Grab an interactive node
# Do not run on head node!
sinteractive --mem=8g --cpus-per-task=4
module purge
/data/CCBR_Pipeliner/db/PipeDB/Conda/bin/conda activate base
module load singularity snakemake

# Step 2A.) Dry-run the pipeline
./metavirs run --input .tests/*.gz \
--output /data/$USER/metavirs_out \
--mode slurm \
--dry-run

# Step 2B.) Run the viral metagenomics pipeline
# The slurm mode will submit jobs to the cluster.
# It is recommended running metavirs in this mode.
./metavirs run --input .tests/*.gz \
--output /data/$USER/metavirs_out \
--mode slurm
```