docs: split up walkthrough doc
ejseqera committed Jan 26, 2024
1 parent bad3428 commit 910cc66
Showing 10 changed files with 313 additions and 237 deletions.
33 changes: 33 additions & 0 deletions demo/add_a_dataset.md
@@ -0,0 +1,33 @@
# Datasets

Most bioinformatics pipelines will require an input of some sort, typically a samplesheet where each row consists of a sample, the location of files for that sample (such as fastq files), and other sample details.

Datasets in Seqera Platform are CSV (comma-separated values) and TSV (tab-separated values) files stored in a workspace. They are used as inputs to pipelines to simplify data management, minimize user data-input errors, and facilitate reproducible workflows.

When running pipelines in the cloud, this samplesheet has to be available in cloud storage or another remote location. Instead, we can upload a local samplesheet to the Platform as a Dataset and specify it as the input to our pipeline.
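
Because a Dataset is a plain CSV or TSV file, its header row can be inspected locally before upload. The following is a minimal Python sketch, not part of the Platform: the `read_header` helper is hypothetical, and it assumes the delimiter can be inferred from the file extension alone.

```python
import csv
import io
from pathlib import Path

# Assumption: the delimiter is inferred from the file extension only.
DELIMITERS = {".csv": ",", ".tsv": "\t"}


def read_header(filename: str, text: str) -> list[str]:
    """Return the first row of a CSV or TSV dataset as a list of column names."""
    suffix = Path(filename).suffix.lower()
    if suffix not in DELIMITERS:
        raise ValueError(f"unsupported dataset extension: {suffix!r}")
    reader = csv.reader(io.StringIO(text), delimiter=DELIMITERS[suffix])
    # An empty file yields an empty header instead of raising StopIteration.
    return next(reader, [])


print(read_header("samplesheet_test.csv", "sample,fastq_1,fastq_2,strandedness\n"))
```

Passing the text separately from the filename keeps the sketch independent of where the file lives; in practice the text would come from reading the local file.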

## 1. Download the nf-core/rnaseq test samplesheet

The [nf-core/rnaseq](https://github.com/nf-core/rnaseq) pipeline works with input datasets (samplesheets) containing sample names, fastq file locations, and indications of strandedness. The Seqera Community Showcase sample dataset for _nf-core/rnaseq_ looks like this:

**Example rnaseq dataset**

| sample | fastq_1 | fastq_2 | strandedness |
| ------------------- | ------------------------------------ | ------------------------------------ | ------------ |
| WT_REP1 | s3://nf-core-awsmegatests/rnaseq/... | s3://nf-core-awsmegatests/rnaseq/... | reverse |
| WT_REP1 | s3://nf-core-awsmegatests/rnaseq/... | s3://nf-core-awsmegatests/rnaseq/... | reverse |
| WT_REP2 | s3://nf-core-awsmegatests/rnaseq/... | s3://nf-core-awsmegatests/rnaseq/... | reverse |
| RAP1_UNINDUCED_REP1 | s3://nf-core-awsmegatests/rnaseq/... | | reverse |
| RAP1_UNINDUCED_REP2 | s3://nf-core-awsmegatests/rnaseq/... | | reverse |
| RAP1_UNINDUCED_REP2 | s3://nf-core-awsmegatests/rnaseq/... | | reverse |
| RAP1_IAA_30M_REP1 | s3://nf-core-awsmegatests/rnaseq/... | s3://nf-core-awsmegatests/rnaseq/... | reverse |

Download the nf-core/rnaseq [samplesheet_test.csv](samplesheet_test.csv) provided in this repository to your computer.
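
Before uploading, it can help to sanity-check the downloaded samplesheet locally. The sketch below is illustrative only: it assumes the four columns shown in the example table, treats an empty `fastq_2` as acceptable (as in the single-end rows of that table), and the `check_samplesheet` helper is hypothetical rather than part of nf-core or the Platform. The truncated `s3://` paths are kept exactly as they appear in the table.

```python
import csv
import io

# Columns taken from the example table above.
REQUIRED_COLUMNS = ["sample", "fastq_1", "fastq_2", "strandedness"]


def check_samplesheet(text: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means none were found."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        return [f"missing column(s): {', '.join(missing)}"]

    problems = []
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        if not (row["sample"] or "").strip():
            problems.append(f"line {line_no}: empty sample name")
        if not (row["fastq_1"] or "").strip():
            problems.append(f"line {line_no}: empty fastq_1")
        # fastq_2 is allowed to be empty, as in the single-end rows of the table.
    return problems


# Illustrative rows only; the truncated paths are copied as shown in the table.
example = """sample,fastq_1,fastq_2,strandedness
WT_REP1,s3://nf-core-awsmegatests/rnaseq/...,s3://nf-core-awsmegatests/rnaseq/...,reverse
RAP1_UNINDUCED_REP1,s3://nf-core-awsmegatests/rnaseq/...,,reverse
"""

print(check_samplesheet(example))  # prints []
```

To check the real file, read `samplesheet_test.csv` into a string and pass it to the same function.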

## 2. Add a Dataset

Go to the 'Datasets' tab and click 'Add Dataset'.

![Adding a Dataset](docs/images/sp-cloud-add-a-dataset.gif)

Specify a name for the dataset (such as 'nf-core-rnaseq-test-dataset') and a description, select the option to treat the first row as a header, and upload the CSV file provided in this repository. This CSV file specifies the paths to 7 small FASTQ files for a sub-sampled yeast RNAseq dataset.
35 changes: 35 additions & 0 deletions demo/add_a_pipeline.md
@@ -0,0 +1,35 @@
# Add a Pipeline to the Launchpad

The Launchpad allows you to launch and manage Nextflow pipelines and the compute environments they are executed on. Using the Launchpad, you can create a curated set of pipelines (including variations of the same pipeline) that are ready to be executed on the associated compute environments, while still allowing users to customize pipeline-level parameters if needed.

## 1. Add a Pipeline

To add a pipeline, click on the **'Add Pipeline'** button. As an example, we will add the [nf-core/rnaseq](https://github.com/nf-core/rnaseq) pipeline to the Launchpad.

![Adding nf-core/rnaseq pipeline](docs/images/sp-cloud-add-rnaseq.gif)

Specify a name and description, then select a pre-existing AWS compute environment to execute on.

## 2. Specify a repository URL and revision

In the repository URL, specify the nf-core/rnaseq repository:

```bash
https://github.com/nf-core/rnaseq
```

Additionally, specify a version of the pipeline as the 'Revision number'. You can use `3.12.0`.

## 3. Parameters and Nextflow Configuration

Pipeline parameters and Nextflow configuration settings can also be specified as you add the pipeline to the Launchpad.

For example, a pipeline can be pre-populated to run with specific parameters on the Launchpad.
![Adding pipeline parameters](docs/images/sp-cloud-pipeline-params.gif)

## 4. Pre-run script and additional options

You can run custom code either before or after the execution of the Nextflow script. These text fields allow you to enter shell commands.

Pre-run scripts are executed in the nf-launch script prior to invoking Nextflow processes. Pre-run scripts are useful for executor setup (e.g., to use a specific version of Nextflow) and troubleshooting.
![Specify NF version in pre-run script](docs/images/sp-cloud-pre-run-options.gif)
33 changes: 33 additions & 0 deletions demo/data_explorer.md
@@ -0,0 +1,33 @@
# Data Explorer

With Data Explorer, you can browse and interact with remote data repositories from organization workspaces in Seqera Platform. It supports AWS S3, Azure Blob Storage, and Google Cloud Storage repositories.

## 1. Data Explorer features

- View bucket details
To view bucket details such as the cloud provider, bucket address, and credentials, select the information icon next to a bucket in the Data Explorer list.

- Search and filter buckets
Search for buckets by name and region (e.g., region:eu-west-2) in the search field, and filter by provider.

- Hide buckets from list view
Workspace maintainers can hide buckets from the Data Explorer list view. Select multiple buckets, then select Hide in the Data Explorer toolbar. To hide buckets individually, select Hide from the options menu of a bucket in the list.

The Data Explorer list filter defaults to Only visible. Select Only hidden or All from the filtering menu to view hidden buckets in the list. You can Unhide a bucket from its options menu in the list view.

- View bucket contents
Select a bucket name from the Data Explorer list to view the contents of that bucket. From the View cloud bucket page, you can browse directories and search for objects by name in a particular directory. The file type, size, and path of objects are displayed in columns to the right of the object name. To view bucket details such as the cloud provider, bucket address, and credentials, select the information icon.

- Preview and download files
From the View cloud bucket page, you can preview and download files. Select the download icon in the Actions column to download a file directly from the list view. Select a file to open a preview window that includes a Download button.

## 2. View Run outputs in Data Explorer

Data Explorer can be used to view the outputs of your pipelines.

From the View cloud bucket page, you can:

1. Preview and download files: Select the download icon in the 'Actions' column to download a file directly from the list view. Select a file to open a preview window that includes a Download button.
2. Copy bucket/object paths: Select the Path of an object on the cloud bucket page to copy its absolute path to the clipboard. Use these object paths to specify input data locations during pipeline launch, or add them to a dataset for pipeline input.

![Data Explorer bucket](docs/images/sp-cloud-data-explorer.gif)
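
As a rough illustration of adding copied object paths to a dataset, the Python sketch below renders a few rows as CSV text that could then be saved and uploaded as a Dataset. The column layout is assumed to match the nf-core/rnaseq samplesheet described in `add_a_dataset.md`; the bucket name, object paths, sample names, and the `build_samplesheet` helper are all hypothetical.

```python
import csv
import io


def build_samplesheet(rows: list[dict[str, str]]) -> str:
    """Render rows as CSV text using the assumed nf-core/rnaseq column order."""
    columns = ["sample", "fastq_1", "fastq_2", "strandedness"]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        # Missing keys (e.g. fastq_2 for single-end data) become empty cells.
        writer.writerow({col: row.get(col, "") for col in columns})
    return buffer.getvalue()


# Hypothetical object paths, standing in for paths copied from Data Explorer.
rows = [
    {
        "sample": "SAMPLE_A",
        "fastq_1": "s3://example-bucket/reads/sample_a_R1.fastq.gz",
        "fastq_2": "s3://example-bucket/reads/sample_a_R2.fastq.gz",
        "strandedness": "reverse",
    },
    {
        "sample": "SAMPLE_B",
        "fastq_1": "s3://example-bucket/reads/sample_b_R1.fastq.gz",
        "strandedness": "reverse",
    },
]

print(build_samplesheet(rows))
```

Writing the returned text to a `.csv` file gives a samplesheet whose first row is the header, which matches the header option used when adding a Dataset.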
45 changes: 45 additions & 0 deletions demo/demo_overview.md
@@ -0,0 +1,45 @@
# seqeralabs/showcase Demo

This guide walks through a standard Seqera Platform demonstration, split into several parts. It describes how to add a pipeline to the Launchpad, launch a workflow with pipeline parameters, monitor a run, and examine the run details.

More specifically, this demonstration will focus on using the [nf-core/rnaseq](https://github.com/nf-core/rnaseq) pipeline as an example and executing the workflow on AWS Batch.

## Requirements

- An [AWS Batch Compute Environment created on the Platform](https://docs.seqera.io/platform/23.3.0/compute-envs/aws-batch)
- The [nf-core/rnaseq](https://github.com/nf-core/rnaseq) pipeline repository
- A samplesheet for creating a Dataset on the Platform, used to run minimal test RNAseq data (see the [samplesheet_test.csv](./samplesheet_test.csv) file in this repository)

## Sections

1. [Overview of the Platform](#1-login-to-seqeraio)
2. [Add a Pipeline to the Launchpad](add_a_pipeline.md)
3. [Add a Dataset to Seqera Platform](add_a_dataset.md)
4. [Launch your Pipeline](launch_pipeline.md)
5. [Runs and Monitoring your workflow](monitor_run.md)
6. [Examine the run and task details](run_details.md)
7. [Resume a pipeline run](resume_pipeline.md)
8. [Data Explorer](data_explorer.md)
9. [Optimize your Pipeline](pipeline_optimization.md)

## Walkthrough of demonstration

### 1. Login to seqera.io

Log in to Seqera Platform using a GitHub account, a Google account, or an email address.

If you provide an email address, Seqera Cloud will send an authentication link to that address for you to log in with.

![Seqera Platform Cloud login](docs/images/sp-cloud-signin.gif)

### 2. Navigate into the seqeralabs/showcase Workspace

All resources in Seqera Platform live inside a Workspace, which in turn belongs to an Organisation. Typically, teams of colleagues or collaborators will share one or more workspaces. All resources in a Workspace (e.g., pipelines, compute environments, datasets) are shared by members of that workspace.

Navigate into the `seqeralabs/showcase` Workspace.

![Seqera Labs Showcase Workspace](docs/images/go-to-workspace.gif)

### 3. User settings

# TODO