6 changes: 6 additions & 0 deletions .circleci/config.yml
@@ -38,6 +38,12 @@ jobs:
    parallelism: 8
    steps:
      - checkout
      - run:
          name: Skip tests if last commit is a chore commit
          command: |
            cd ~/repo
            last_commit="$(git log -1 --pretty=%B | grep chore || true)"
            if [ ${#last_commit} -gt 0 ]; then circleci-agent step halt; fi
      - run:
          name: Create AWS credentials manually
          command: |
2 changes: 1 addition & 1 deletion README.md
@@ -29,7 +29,7 @@ toolchest.kraken2(
)
```

For a list of available tools, see the [documentation](https://docs.trytoolchest.com/docs#-tools).
For a list of available tools, see the [documentation](https://docs.trytoolchest.com/tool-reference/about/).

## Configuration

2 changes: 1 addition & 1 deletion docs/docs/tool-reference/aligners/bowtie-2.md
@@ -21,7 +21,7 @@ Function Arguments
| :----------------- | :------------------ | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `inputs` | `-U` | Path to one or more files to use as input. The files can be a local or remote, see [Using Files](../../getting-started/using-files.md). |
| `output_path` | `-S` | (optional) Path (directory) to where the output files will be downloaded. If omitted, skips download. The files can be a local or remote, see [Using Files](../../getting-started/using-files.md). |
| `tool_args` | all other arguments | (optional) Additional arguments to be passed to Bowtie 2. This should be a string of arguments like the command line. See [Supported Additional Arguments](https://docs.trytoolchest.com/docs/bowtie-2#supported-additional-arguments) for more details. |
| `tool_args` | all other arguments | (optional) Additional arguments to be passed to Bowtie 2. This should be a string of arguments like the command line. See [Supported Additional Arguments](#supported-additional-arguments) for more details. |
| `database_name` | `-x`\* | (optional) Name of database to use for Bowtie 2 alignment. Defaults to `"GRCh38_noalt_as"` (human genome). |
| `database_version` | `-x`\* | (optional) Version of database to use for Bowtie 2 alignment. Defaults to `"1"`. |
| `is_async` | | Whether to run a job asynchronously. See [Async Runs](../../feature-reference/async-runs.md) for more. |
2 changes: 1 addition & 1 deletion docs/docs/tool-reference/aligners/clustal-omega.md
@@ -22,7 +22,7 @@ See the Notes section below for more details.
| :------------ | :------------------ | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `inputs` | `-i` | Path to one or more files to use as input. The files can be a local or remote, see [Using Files](../../getting-started/using-files.md). |
| `output_path` | `-o` | (optional) Path (directory) to where the output files will be downloaded. If omitted, skips download. The files can be a local or remote, see [Using Files](../../getting-started/using-files.md). |
| `tool_args` | all other arguments | (optional) Additional arguments to be passed to Clustal Omega. This should be a string of arguments like the command line. See [Supported Additional Arguments](https://docs.trytoolchest.com/docs/clustal-omega#supported-additional-arguments) for more details. |
| `tool_args` | all other arguments | (optional) Additional arguments to be passed to Clustal Omega. This should be a string of arguments like the command line. See [Supported Additional Arguments](#supported-additional-arguments) for more details. |
| `is_async` | | Whether to run a job asynchronously. See [Async Runs](../../feature-reference/async-runs.md) for more. |

Tool Versions
16 changes: 8 additions & 8 deletions docs/docs/tool-reference/aligners/rapsearch2.md
@@ -21,14 +21,14 @@ Function Arguments

See the Notes section below for more details.

| Argument | Use in place of: | Description |
| :----------------- | :------------------ | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `inputs` | `-q` | Path to one or more files to use as input. The files can be a local or remote, see [Using Files](../../getting-started/using-files.md). |
| `output_path` | `-o` | (optional) Path (directory) to where the output files will be downloaded. If omitted, skips download. The files can be a local or remote, see [Using Files](../../getting-started/using-files.md). |
| `database_name` | `-d`\* | (optional) Name of database to use for RAPSearch2 alignment. Defaults to `"GRCh38"` (human genome). |
| `database_version` | `-d`\* | (optional) Version of database to use for RAPSearch2 alignment. Defaults to `"1"`. |
| `tool_args` | all other arguments | (optional) Additional arguments to be passed to RAPSearch2. This should be a string of arguments like the command line. See [Supported Additional Arguments](https://docs.trytoolchest.com/docs/rapsearch-2#supported-additional-arguments) for more details. |
| `is_async` | | Whether to run a job asynchronously. See [Async Runs](../../feature-reference/async-runs.md) for more. |
| Argument | Use in place of: | Description |
| :----------------- | :------------------ |:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `inputs` | `-q` | Path to one or more files to use as input. The files can be a local or remote, see [Using Files](../../getting-started/using-files.md). |
| `output_path` | `-o` | (optional) Path (directory) to where the output files will be downloaded. If omitted, skips download. The files can be a local or remote, see [Using Files](../../getting-started/using-files.md). |
| `database_name` | `-d`\* | (optional) Name of database to use for RAPSearch2 alignment. Defaults to `"GRCh38"` (human genome). |
| `database_version` | `-d`\* | (optional) Version of database to use for RAPSearch2 alignment. Defaults to `"1"`. |
| `tool_args` | all other arguments | (optional) Additional arguments to be passed to RAPSearch2. This should be a string of arguments like the command line. See [Supported Additional Arguments](#supported-additional-arguments) for more details. |
| `is_async` | | Whether to run a job asynchronously. See [Async Runs](../../feature-reference/async-runs.md) for more. |

\*See the [Databases](#databases) section for more details.

2 changes: 1 addition & 1 deletion docs/docs/tool-reference/assemblers/unicycler.md
@@ -24,7 +24,7 @@ Function Arguments
| `read_two` | `-2` | (optional) Path to R2 of paired-end short read input files. The file can be a local or remote, see [Using Files](../../getting-started/using-files.md). |
| `long_reads` | `-l` | (optional) Path to the file containing long reads. The file can be a local or remote, see [Using Files](../../getting-started/using-files.md). |
| `output_path` | `-o` | (optional) Path (directory) to where the output files will be downloaded. If omitted, skips download. The files can be a local or remote, see [Using Files](../../getting-started/using-files.md). |
| `tool_args` | all other arguments | (optional) Additional arguments to be passed to Unicycler. This should be a string of arguments like the command line. See [Supported Additional Arguments](https://docs.trytoolchest.com/docs/unicycler#supported-additional-arguments) for more details. |
| `tool_args` | all other arguments | (optional) Additional arguments to be passed to Unicycler. This should be a string of arguments like the command line. See [Supported Additional Arguments](#supported-additional-arguments) for more details. |
| `is_async` | | Whether to run a job asynchronously. See [Async Runs](../../feature-reference/async-runs.md) for more. |

Notes
2 changes: 1 addition & 1 deletion docs/docs/tool-reference/post-processing/bracken.md
@@ -23,7 +23,7 @@ tc.bracken(
| `kraken2_report` | Kraken 2 report file input | Path to Kraken 2 report file. The files can be a local or remote, see [Using Files](../../getting-started/using-files.md). |
| `output_path` | `-o` directory name | (optional) Path (directory) to where the output files will be downloaded. If omitted, skips download. The files can be a local or remote, see [Using Files](../../getting-started/using-files.md). |
| `output_primary_name` | `-o` file name | (Optional) Name of Bracken output file. Defaults to `"output.bracken"`. |
| `tool_args` | all other arguments | (optional) Additional arguments to be passed to Bracken. This should be a string of arguments like the command line. See [Supported Additional Arguments](https://docs.trytoolchest.com/docs/kraken-2#supported-additional-arguments) for more details. |
| `tool_args` | all other arguments | (optional) Additional arguments to be passed to Bracken. This should be a string of arguments like the command line. See [Supported Additional Arguments](#supported-additional-arguments) for more details. |
| `database_name` | `-d` | (optional) Name of database that was used for Kraken 2 alignment. Defaults to `"standard"`. |
| `database_version` | `-d` | (optional) Version of database that was used for Kraken 2 alignment. Defaults to `"1"`. |
| `remote_database_path` | `-d` | (optional) AWS S3 URI to a directory with your custom database that was used with Kraken 2 alignment. |
2 changes: 1 addition & 1 deletion docs/docs/tool-reference/structure-prediction/alphafold.md
@@ -52,4 +52,4 @@ Toolchest supports the following arguments for AlphaFold:
- `--max_template_date`
- `--model_preset`

However, these should be specified via specific argument values in the function call, rather than through a generic `tool_args` argument (like other Toolchest tools). See [Function Arguments](https://docs.trytoolchest.com/docs/alphafold#function-arguments) for more details.
However, these should be specified via specific argument values in the function call, rather than through a generic `tool_args` argument (like other Toolchest tools). See [Function Arguments](#function-arguments) for more details.
110 changes: 110 additions & 0 deletions docs/docs/tool-reference/taxonomic-classifiers/centrifuge.md
@@ -0,0 +1,110 @@
**Centrifuge** is a rapid and memory-efficient classifier of DNA sequences from microbial samples.
Centrifuge requires a relatively small genome index (e.g., 4.3 GB for ~4,100 bacterial genomes) and can process a
typical DNA sequencing run within an hour. For more information,
see the tool's [website](https://ccb.jhu.edu/software/centrifuge/) and
[GitHub repo](https://github.com/DaehwanKimLab/centrifuge).

Function Call
=============

```python
tc.centrifuge(
    output_path=None,
    tool_args="",
    database_name="centrifuge_refseq_bacteria_archaea_viral_human",
    database_version="1",
    read_one=None,
    read_two=None,
    unpaired=None,
    is_async=False,
)
```

Function Arguments
------------------

| Argument | Use in place of: | Description |
|:-------------------|:------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `read_one` | `-1` | (optional) Path(s) to R1 of paired-end read input files. The files can be local or remote; see [Using Files](../../getting-started/using-files.md). |
| `read_two` | `-2` | (optional) Path(s) to R2 of paired-end read input files. The files can be local or remote; see [Using Files](../../getting-started/using-files.md). |
| `unpaired` | `-U` | (optional) Path(s) to unpaired input files. The files can be local or remote; see [Using Files](../../getting-started/using-files.md). |
| `output_path` | output arguments (`-S`, `--report`) | (optional) Path (directory) where the output files will be downloaded. If omitted, skips download. The path can be local or remote; see [Using Files](../../getting-started/using-files.md). |
| `tool_args` | all other arguments | (optional) Additional arguments to be passed to Centrifuge. This should be a string of arguments like the command line. See [Supported Additional Arguments](#supported-additional-arguments) for more details. |
| `database_name` | `-x`\* | (optional) Name of database to use for Centrifuge classification. Defaults to `"centrifuge_refseq_bacteria_archaea_viral_human"` (RefSeq bacteria / archaea / viral / human). |
| `database_version` | `-x`\* | (optional) Version of database to use for Centrifuge classification. Defaults to `"1"`. |
| `is_async` | | Whether to run a job asynchronously. See [Async Runs](../../feature-reference/async-runs.md) for more. |

\*See the [Databases](#databases) section for more details.

Output Files
------------

A Centrifuge run will output these files into `output_path`:

- `centrifuge_output.txt`: Centrifuge output (captured from `stdout`), from the `-S` argument.
- `centrifuge_report.tsv`: Centrifuge report file, from the `--report` argument.
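
As a rough illustration of working with the report file, the sketch below reads a tab-separated report and lists the taxa with the most reads. The column names (`name`, `numReads`, and so on) are an assumption based on Centrifuge's usual report header, the helper `top_taxa` is hypothetical, and the sample rows are made-up values standing in for a real `centrifuge_report.tsv`. Check the header of your own file before relying on it.

```python
import csv
import io


def top_taxa(report_file, n=5):
    """Return (name, numReads) for the n taxa with the most reads.

    Assumes a tab-separated report with a header row that includes
    `name` and `numReads` columns (an assumption, not confirmed here).
    """
    reader = csv.DictReader(report_file, delimiter="\t")
    rows = sorted(reader, key=lambda row: int(row["numReads"]), reverse=True)
    return [(row["name"], int(row["numReads"])) for row in rows[:n]]


# In-memory stand-in for centrifuge_report.tsv; the values are illustrative only.
sample = io.StringIO(
    "name\ttaxID\ttaxRank\tgenomeSize\tnumReads\tnumUniqueReads\tabundance\n"
    "Escherichia coli\t562\tspecies\t5000000\t120\t100\t0.6\n"
    "Homo sapiens\t9606\tspecies\t3000000000\t30\t30\t0.4\n"
)
print(top_taxa(sample, n=1))  # [('Escherichia coli', 120)]
```

For a real run, open the downloaded file instead, for example `open("centrifuge_report.tsv", newline="")` inside your `output_path` directory.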

Notes
-----

### Paired-end reads

For each paired-end input, make sure the corresponding read is in the same position in the input list. For example, two
pairs of paired-end files – `one_R1.fastq`, `one_R2.fastq`, `two_R1.fastq`, `two_R2.fastq` – should be passed to
Toolchest as:

```python
tc.centrifuge(
    read_one=["one_R1.fastq", "two_R1.fastq"],
    read_two=["one_R2.fastq", "two_R2.fastq"],
    ...
)
```
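
Because the pairing is positional, a quick local check before submitting a run can catch swapped or missing files. The helper below is a hypothetical sketch, not part of Toolchest, and it assumes the `_R1`/`_R2` file naming used in the example above; adjust the check if your files follow another convention.

```python
def check_paired_inputs(read_one, read_two):
    """Raise ValueError if the R1/R2 lists cannot be paired by position.

    Assumes mates differ only by `_R1` vs `_R2` in the file name
    (an illustrative convention, not a Toolchest requirement).
    """
    if len(read_one) != len(read_two):
        raise ValueError(
            f"got {len(read_one)} R1 files but {len(read_two)} R2 files"
        )
    for position, (r1, r2) in enumerate(zip(read_one, read_two)):
        if r1.replace("_R1", "_R2") != r2:
            raise ValueError(f"position {position}: {r1} does not pair with {r2}")


read_one = ["one_R1.fastq", "two_R1.fastq"]
read_two = ["one_R2.fastq", "two_R2.fastq"]
check_paired_inputs(read_one, read_two)  # passes silently when the order matches
```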

Tool Versions
=============

Toolchest currently supports version **1.0.4** of Centrifuge.

Databases
=========

Toolchest currently supports the following databases for Centrifuge:

| `database_name` | `database_version` | Description |
|:-------------------------------------------------------| :----------------- |:-------------------------------------------------------------------|
| `centrifuge_refseq_bacteria_archaea_viral_human` | `1` | RefSeq, bacteria / archaea / viral / human, JHU source<sup>1</sup> |

<sup>1</sup>These database indexes were generated by [the Langmead Lab at Johns Hopkins](https://langmead-lab.org/) and can be found on [the lab's database index page](https://benlangmead.github.io/aws-indexes/centrifuge).

Supported Additional Arguments
==============================

Most additional arguments not related to input, output, or multithreading are supported:

- \-q
- \--qseq
- \-f
- \-r
- \-c
- \-s, \--skip
- \-u, \--upto
- \-5, \--trim5
- \-3, \--trim3
- \--phred33
- \--phred64
- \--int-quals
- \--ignore-quals
- \--nofw
- \--norc
- \--min-hitlen
- \-k
- \--host-taxids
- \--exclude-taxids
- \--out-fmt
- \--tab-fmt-cols
- \-t, \--time
- \--qc-filter
- \--seed
- \--non-deterministic

Additional arguments can be specified under the `tool_args` argument.
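
As a minimal sketch of assembling that string, the hypothetical helper below (`build_tool_args` is not part of Toolchest) checks each flag against the list above and joins the result into a single command-line-style string. The flag values are illustrative only.

```python
import shlex

# Flags copied from the "Supported Additional Arguments" list above.
SUPPORTED_FLAGS = {
    "-q", "--qseq", "-f", "-r", "-c", "-s", "--skip", "-u", "--upto",
    "-5", "--trim5", "-3", "--trim3", "--phred33", "--phred64",
    "--int-quals", "--ignore-quals", "--nofw", "--norc", "--min-hitlen",
    "-k", "--host-taxids", "--exclude-taxids", "--out-fmt",
    "--tab-fmt-cols", "-t", "--time", "--qc-filter", "--seed",
    "--non-deterministic",
}


def build_tool_args(options):
    """Turn {flag: value or None} into a single argument string.

    A value of None marks a switch that takes no argument.
    """
    tokens = []
    for flag, value in options.items():
        if flag not in SUPPORTED_FLAGS:
            raise ValueError(f"unsupported Centrifuge argument: {flag}")
        tokens.append(flag)
        if value is not None:
            tokens.append(str(value))
    return shlex.join(tokens)


tool_args = build_tool_args({"--min-hitlen": 30, "-k": 1, "--qc-filter": None})
print(tool_args)  # --min-hitlen 30 -k 1 --qc-filter

# The string would then be passed along with the inputs, e.g.:
# tc.centrifuge(read_one=..., read_two=..., tool_args=tool_args)
```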