# Datasets

Most bioinformatics pipelines require an input of some sort, typically a samplesheet in which each row describes a sample: the sample name, the location of that sample's files (such as FASTQ files), and other sample details.

Datasets in Seqera Platform are CSV (comma-separated values) and TSV (tab-separated values) files stored in a workspace. They are used as inputs to pipelines to simplify data management, minimize user data-input errors, and facilitate reproducible workflows.

When running pipelines in the cloud, the samplesheet must be available in cloud storage or another remote location. Instead of staging it ourselves, we can upload a local samplesheet to the Platform as a Dataset and specify it as the input to our pipeline.

## 1. Download the nf-core/rnaseq test samplesheet

The [nf-core/rnaseq](https://github.com/nf-core/rnaseq) pipeline works with input datasets (samplesheets) containing sample names, FASTQ file locations, and strandedness information. The Seqera Community Showcase sample dataset for _nf-core/rnaseq_ looks like this:

**Example rnaseq dataset**

| sample | fastq_1 | fastq_2 | strandedness |
| ------------------- | ------------------------------------ | ------------------------------------ | ------------ |
| WT_REP1 | s3://nf-core-awsmegatests/rnaseq/... | s3://nf-core-awsmegatests/rnaseq/... | reverse |
| WT_REP1 | s3://nf-core-awsmegatests/rnaseq/... | s3://nf-core-awsmegatests/rnaseq/... | reverse |
| WT_REP2 | s3://nf-core-awsmegatests/rnaseq/... | s3://nf-core-awsmegatests/rnaseq/... | reverse |
| RAP1_UNINDUCED_REP1 | s3://nf-core-awsmegatests/rnaseq/... | | reverse |
| RAP1_UNINDUCED_REP2 | s3://nf-core-awsmegatests/rnaseq/... | | reverse |
| RAP1_UNINDUCED_REP2 | s3://nf-core-awsmegatests/rnaseq/... | | reverse |
| RAP1_IAA_30M_REP1 | s3://nf-core-awsmegatests/rnaseq/... | s3://nf-core-awsmegatests/rnaseq/... | reverse |

Rows that share a sample name (such as `WT_REP1`) are technical replicates whose FASTQ files the pipeline merges, and single-end samples simply leave `fastq_2` empty.

Download the nf-core/rnaseq [samplesheet_test.csv](samplesheet_test.csv) provided in this repository to your computer.
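The samplesheet layout can be sketched and sanity-checked locally before uploading it; a minimal example using only standard shell tools — the FASTQ paths below are placeholders, not the real showcase locations:

```shell
# Sketch: build a tiny samplesheet locally and sanity-check its shape.
# The s3:// paths are placeholders for illustration only.
cat > samplesheet_demo.csv <<'EOF'
sample,fastq_1,fastq_2,strandedness
WT_REP1,s3://example-bucket/WT_REP1_R1.fastq.gz,s3://example-bucket/WT_REP1_R2.fastq.gz,reverse
RAP1_UNINDUCED_REP1,s3://example-bucket/RAP1_UNINDUCED_REP1_R1.fastq.gz,,reverse
EOF

# Every row, including single-end samples with an empty fastq_2 field,
# must have exactly 4 comma-separated fields.
awk -F',' 'NF != 4 { print "bad row: " $0; bad = 1 } END { exit bad }' samplesheet_demo.csv \
  && echo "samplesheet OK"
```

A check like this catches the most common samplesheet mistake (a missing or extra comma) before the pipeline rejects the file at runtime.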
## 2. Add a Dataset

Go to the 'Datasets' tab and click 'Add Dataset'.

![Adding a Dataset](docs/images/sp-cloud-add-a-dataset.gif)

Specify a name for the dataset such as 'nf-core-rnaseq-test-dataset', add a description, select the option to include the first row as a header, and upload the CSV file provided in this repository. This CSV file specifies the paths to 7 small FASTQ files for a sub-sampled Yeast RNAseq dataset.
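If you prefer the command line, the Seqera Platform CLI (`tw`) can also upload datasets; a sketch, assuming the CLI is installed and authenticated (e.g. via `TOWER_ACCESS_TOKEN`) — the exact flags shown are illustrative:

```shell
# Sketch: upload the samplesheet as a dataset with the tw CLI (if available).
# Dataset name and flags are illustrative; check `tw datasets add --help`.
if command -v tw >/dev/null 2>&1; then
  tw datasets add ./samplesheet_test.csv \
    --name "nf-core-rnaseq-test-dataset" \
    --header \
    || echo "upload failed (check TOWER_ACCESS_TOKEN and workspace)"
else
  echo "tw CLI not installed; upload the CSV through the web UI instead"
fi
```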
# Add a Pipeline to the Launchpad

The Launchpad allows you to launch and manage Nextflow pipelines and the associated compute environments they will be executed on. Using the Launchpad, you can create a curated set of pipelines (including variations of the same pipeline) that are ready to run on the associated compute environments, while still allowing users to customize pipeline-level parameters if needed.

## 1. Add a Pipeline

To add a pipeline, click the **'Add Pipeline'** button. As an example, we will add the [nf-core/rnaseq](https://github.com/nf-core/rnaseq) pipeline to the Launchpad.

![Adding nf-core/rnaseq pipeline](docs/images/sp-cloud-add-rnaseq.gif)

Specify a name and description, and select a pre-existing AWS compute environment to execute on.

## 2. Specify a repository URL and revision

In the repository URL field, specify the nf-core/rnaseq repository:

```bash
https://github.com/nf-core/rnaseq
```

Additionally, specify a version of the pipeline as the 'Revision number'. You can use `3.12.0`.
## 3. Parameters and Nextflow Configuration

Pipeline parameters and Nextflow configuration settings can also be specified as you add the pipeline to the Launchpad.

For example, a pipeline can be pre-populated to run with specific parameters on the Launchpad.

![Adding pipeline parameters](docs/images/sp-cloud-pipeline-params.gif)
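As a sketch, a minimal parameter set for nf-core/rnaseq might look like the following YAML; the bucket paths are placeholders, and the `input` would normally point at the dataset uploaded earlier:

```yaml
# Illustrative nf-core/rnaseq parameters; the s3:// paths are placeholders.
input: s3://example-bucket/samplesheet_test.csv   # or select a Platform dataset
outdir: s3://example-bucket/results
genome: R64-1-1        # iGenomes key for the yeast reference used by the test data
pseudo_aligner: salmon
```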
## 4. Pre-run script and additional options

You can run custom code either before or after the execution of the Nextflow script. These text fields allow you to enter shell commands.

Pre-run scripts are executed in the nf-launch script prior to invoking Nextflow processes. Pre-run scripts are useful for executor setup (e.g., pinning a specific version of Nextflow) and for troubleshooting.

![Specify NF version in pre-run script](docs/images/sp-cloud-pre-run-options.gif)
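For instance, a pre-run script that pins the Nextflow version might look like the following sketch (the version number is illustrative):

```shell
# Example pre-run script: pin the Nextflow version used for this launch.
# NXF_VER is read by the Nextflow launcher; the version below is illustrative.
export NXF_VER=23.10.1
echo "Launching with Nextflow version ${NXF_VER}"
```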
# Data Explorer

With Data Explorer, you can browse and interact with remote data repositories from organization workspaces in Seqera Platform. It supports AWS S3, Azure Blob Storage, and Google Cloud Storage repositories.

## 1. Data Explorer features

- **View bucket details**: To view bucket details such as the cloud provider, bucket address, and credentials, select the information icon next to a bucket in the Data Explorer list.
- **Search and filter buckets**: Search for buckets by name and region (e.g., `region:eu-west-2`) in the search field, and filter by provider.
- **Hide buckets from list view**: Workspace maintainers can hide buckets from the Data Explorer list view. Select multiple buckets, then select 'Hide' in the Data Explorer toolbar; to hide buckets individually, select 'Hide' from the options menu of a bucket in the list. The list filter defaults to 'Only visible' — select 'Only hidden' or 'All' from the filtering menu to view hidden buckets, and 'Unhide' a bucket from its options menu.
- **View bucket contents**: Select a bucket name from the Data Explorer list to view the contents of that bucket. From the 'View cloud bucket' page, you can browse directories and search for objects by name in a particular directory. The file type, size, and path of objects are displayed in columns to the right of the object name.
- **Preview and download files**: From the 'View cloud bucket' page, you can preview and download files. Select the download icon in the 'Actions' column to download a file directly from the list view. Select a file to open a preview window that includes a 'Download' button.

## 2. View Run outputs in Data Explorer

Data Explorer can be used to view the outputs of your pipelines.

From the 'View cloud bucket' page, you can:

1. Preview and download files: Select the download icon in the 'Actions' column to download a file directly from the list view. Select a file to open a preview window that includes a 'Download' button.
2. Copy bucket/object paths: Select the 'Path' of an object on the cloud bucket page to copy its absolute path to the clipboard. Use these object paths to specify input data locations during pipeline launch, or add them to a dataset for pipeline input.

![Data Explorer bucket](docs/images/sp-cloud-data-explorer.gif)
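Outside the UI, the same copied object paths work with the cloud provider's own tooling; a sketch using the AWS CLI, assuming it is installed and credentialed — the bucket path is a placeholder for a path copied from Data Explorer:

```shell
# Sketch: inspect pipeline outputs at a copied object path with the AWS CLI.
# The bucket path is a placeholder; substitute the path copied from Data Explorer.
RESULTS="s3://example-bucket/results/multiqc/"
if command -v aws >/dev/null 2>&1; then
  aws s3 ls "$RESULTS" || echo "listing failed (check credentials and path)"
  # aws s3 cp "${RESULTS}multiqc_report.html" .   # download a report locally
else
  echo "aws CLI not installed; browse the bucket in Data Explorer instead"
fi
```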
# seqeralabs/showcase Demo

This guide provides a walkthrough of a standard Seqera Platform demonstration. It describes how to add a pipeline to the Launchpad, launch a workflow with pipeline parameters, monitor a run, and examine the run and task details.

More specifically, this demonstration focuses on the [nf-core/rnaseq](https://github.com/nf-core/rnaseq) pipeline as an example, executing the workflow on AWS Batch.

## Requirements

- An [AWS Batch compute environment created on the Platform](https://docs.seqera.io/platform/23.3.0/compute-envs/aws-batch)
- The [nf-core/rnaseq](https://github.com/nf-core/rnaseq) pipeline repository
- A samplesheet to create a Dataset on the Platform, used to run the minimal RNAseq test data (see the [samplesheet_test.csv](./samplesheet_test.csv) file in this repository)

## Sections

1. [Overview of the Platform](#1-login-to-seqeraio)
2. [Add a Pipeline to the Launchpad](add_a_pipeline.md)
3. [Add a Dataset to Seqera Platform](add_a_dataset.md)
4. [Launch your Pipeline](launch_pipeline.md)
5. [Runs and Monitoring your workflow](monitor_run.md)
6. [Examine the run and task details](run_details.md)
7. [Resume a pipeline run](resume_pipeline.md)
8. [Data Explorer](data_explorer.md)
9. [Optimize your Pipeline](pipeline_optimization.md)

## Walkthrough of demonstration

### 1. Login to seqera.io

Log in to Seqera Platform with a GitHub account, a Google account, or an email address.

If an email address is provided, Seqera Cloud will send an authentication link to that address to log in with.

![Seqera Platform Cloud login](docs/images/sp-cloud-signin.gif)

### 2. Navigate into the seqeralabs/showcase Workspace

All resources in Seqera Platform live inside a Workspace, which in turn belongs to an Organisation. Typically, teams of colleagues or collaborators share one or more workspaces. All resources in a Workspace (i.e. pipelines, compute environments, datasets) are shared by members of that workspace.

Navigate into the `seqeralabs/showcase` Workspace.

![Seqera Labs Showcase Workspace](docs/images/go-to-workspace.gif)

### 3. User settings

TODO