Merge pull request ejseqera#2 from ejseqera/add_demo_docs
feat: add missing sections
ejseqera authored May 14, 2024
2 parents 674aafc + 52e4344 commit d2c48e9
Showing 37 changed files with 437 additions and 61 deletions.
1 change: 1 addition & 0 deletions demo/docs/add_a_dataset.md
@@ -11,6 +11,7 @@ When running pipelines on the Cloud, this samplesheet has to be made available i
The [nf-core/rnaseq](https://github.com/nf-core/rnaseq) pipeline works with input datasets (samplesheets) containing sample names, fastq file locations, and indications of strandedness. The Seqera Community Showcase sample dataset for _nf-core/rnaseq_ looks like this:

**Example rnaseq dataset**

<center>

| sample | fastq_1 | fastq_2 | strandedness |
Binary file added demo/docs/assets/all_runs_view.gif
Binary file added demo/docs/assets/connect-to-studio.png
Binary file added demo/docs/assets/create-a-data-link.png
Binary file added demo/docs/assets/create-data-studio.gif
Binary file added demo/docs/assets/dashboard_view.gif
Binary file added demo/docs/assets/data-explorer-add-bucket.gif
Binary file added demo/docs/assets/data-explorer-preview-files.gif
Binary file added demo/docs/assets/data-explorer-view-details.gif
Binary file added demo/docs/assets/data-studio-create-jupyter.gif
Binary file added demo/docs/assets/generate-access-token.gif
17 changes: 17 additions & 0 deletions demo/docs/assets/logo.svg
Binary file added demo/docs/assets/mount-data-into-studio.gif
Binary file added demo/docs/assets/platform-cli.png
Binary file added demo/docs/assets/send-studio-link.png
Binary file added demo/docs/assets/seqera-biotech-stack.png
Binary file added demo/docs/assets/seqera-one-platform.png
Binary file added demo/docs/assets/start-studio.gif
Binary file added demo/docs/assets/stop-a-studio.png
Binary file added demo/docs/assets/user-settings.png
67 changes: 67 additions & 0 deletions demo/docs/automation.md
@@ -0,0 +1,67 @@
Seqera Platform provides multiple methods of programmatic interaction allowing you to automate execution of pipelines, chain pipelines together, and integrate the Platform into third-party services of your choosing.

### 1. Seqera Platform API

The Seqera Platform public API provides endpoints for performing all actions available on the interface, programmatically. The API can be accessed from `https://api.cloud.seqera.io`.

The full list of endpoints is available in Seqera's OpenAPI schema found [here](https://cloud.seqera.io/openapi/index.html). The API requires an authentication token to be specified in every API request. This can be created in your user menu under **Your tokens**.

![Platform access token](./assets/generate-access-token.gif)

The token is only displayed once. Store your token in a safe place. Use this token to authenticate requests to the API via cURL, Postman, or within your code.

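For example, to launch a pipeline with the `/workflow/launch` endpoint (the workspace ID, compute environment ID, pipeline name, and work directory below are specific to the example workspace; substitute your own values):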

```bash
curl -X POST "https://api.cloud.seqera.io/workflow/launch?workspaceId=38659136604200" \
    -H "Accept: application/json" \
    -H "Authorization: Bearer <your_access_token>" \
    -H "Content-Type: application/json" \
    -H "Accept-Version:1" \
    -d '{
        "launch": {
            "computeEnvId": "hjE97A8TvD9PklUb0hwEJ",
            "runName": "first-time-pipeline-api-byname",
            "pipeline": "first-time-pipeline",
            "workDir": "s3://nf-ireland",
            "revision": "master"
        }
    }'
```
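
The same request can be assembled from code. The following is a minimal sketch using only the Python standard library; the helper name `build_launch_request` is hypothetical, and the IDs are the example values from the cURL call above. It builds the request without sending it, so the URL, headers, and payload can be inspected first.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.cloud.seqera.io"


def build_launch_request(token, workspace_id, launch):
    """Build (but do not send) a POST request for the /workflow/launch endpoint.

    `build_launch_request` is a hypothetical helper, not part of any Seqera library.
    """
    query = urllib.parse.urlencode({"workspaceId": workspace_id})
    body = json.dumps({"launch": launch}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/workflow/launch?{query}",
        data=body,
        method="POST",
        headers={
            "Accept": "application/json",
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Accept-Version": "1",
        },
    )


# Example values copied from the cURL call above; replace them with your own.
request = build_launch_request(
    token="<your_access_token>",
    workspace_id="38659136604200",
    launch={
        "computeEnvId": "hjE97A8TvD9PklUb0hwEJ",
        "runName": "first-time-pipeline-api-byname",
        "pipeline": "first-time-pipeline",
        "workDir": "s3://nf-ireland",
        "revision": "master",
    },
)
print(request.get_method(), request.full_url)
```

To submit the request, pass it to `urllib.request.urlopen(request)` once a real token is in place. Reading the token from an environment variable rather than hard-coding it keeps it out of version control.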

### 2. Seqera Platform CLI

The command-line utility used to manage resources on Seqera Platform, such as pipelines, runs, and compute environments, is called `tw`.

The CLI provides an interface to launch pipelines, manage compute environments, retrieve run metadata, and monitor runs on the Platform. It provides a Nextflow-like experience for bioinformaticians who prefer the CLI, allows you to store Seqera resource configuration (i.e. pipelines, compute environments) as infrastructure as code, and is built on top of the [Seqera Platform API](#1-seqera-platform-api). The CLI offers more flexibility and easier interaction with the Platform than the API.

![Seqera Platform CLI](./assets/platform-cli.png)

For example, to launch the hello pipeline using the CLI:

```bash
tw launch hello --workspace seqeralabs/showcase
```
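
When the CLI is called from a script, for example to chain launches, the command can be built as an argument list. This is a minimal sketch: the helper names are hypothetical, only the flags shown in the command above are used, and it defaults to a dry run so nothing is launched by accident. It assumes `tw` is installed and already authenticated before a real run.

```python
import shlex
import shutil
import subprocess


def build_tw_launch_command(pipeline, workspace):
    """Return the argument list for `tw launch <pipeline> --workspace <workspace>`."""
    return ["tw", "launch", pipeline, "--workspace", workspace]


def run_command(command, dry_run=True):
    """Print the command on a dry run (or if `tw` is missing); otherwise execute it."""
    if dry_run or shutil.which(command[0]) is None:
        print("Would run:", shlex.join(command))
        return None
    return subprocess.run(command, check=True)


command = build_tw_launch_command("hello", "seqeralabs/showcase")
run_command(command)  # pass dry_run=False to actually launch
```

Passing the arguments as a list avoids shell quoting problems if a pipeline or workspace name contains spaces.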

Installation and usage details for the `tw` CLI are available in [this](https://github.com/seqeralabs/tower-cli/) GitHub repository.

### 3. seqerakit

`seqerakit` is a Python wrapper for the Seqera Platform CLI which can be leveraged to automate the creation of all of the entities in Seqera Platform via a simple configuration file in YAML format.

The key features are:

- **Simple configuration**: All of the command-line options available when using the Seqera Platform CLI can be defined in simple YAML format.
- **Infrastructure as Code**: Enables users to manage and provision their infrastructure specifications.
- **Automation**: End-to-end creation of entities within Seqera Platform, all the way from adding an Organization to launching pipeline(s) within that Organization.

For example, to launch the hello pipeline using seqerakit, you can create a YAML file as follows:

```yaml
launch:
  - name: "hello-world"
    url: "https://github.com/nextflow-io/hello"
    workspace: "seqeralabs/showcase"
```

Installation and usage details for `seqerakit` are available in [this](https://github.com/seqeralabs/seqera-kit/) GitHub repository.
45 changes: 27 additions & 18 deletions demo/docs/data_explorer.md
@@ -2,32 +2,41 @@

With Data Explorer, you can browse and interact with remote data repositories from organization workspaces in Seqera Platform. It supports AWS S3, Azure Blob Storage, and Google Cloud Storage repositories.

## 1. Data Explorer features
## 1. View pipeline outputs in Data Explorer

- View bucket details
To view bucket details such as the cloud provider, bucket address, and credentials, select the information icon next to a bucket in the Data Explorer list.
In Data Explorer, you are able to:

- Search and filter buckets
Search for buckets by name and region (e.g., `region:eu-west-2`) in the search field, and filter by provider.
- **View bucket details**:
View the cloud provider, bucket address, and credentials by selecting the information icon next to a bucket in the Data Explorer list.

- Hide buckets from list view
Workspace maintainers can hide buckets from the Data Explorer list view. Select multiple buckets, then select Hide in the Data Explorer toolbar. To hide buckets individually, select Hide from the options menu of a bucket in the list.
![Bucket details](assets/data-explorer-view-details.gif)

The Data Explorer list filter defaults to Only visible. Select Only hidden or All from the filtering menu to view hidden buckets in the list. You can Unhide a bucket from its options menu in the list view.
- **View bucket contents**
Select a bucket name from the Data Explorer list to view the contents of that bucket.

The file type, size, and path of objects are displayed in columns to the right of the object name. For example, we can take a look at the outputs of our nf-core/rnaseq run.

- View bucket contents
Select a bucket name from the Data Explorer list to view the contents of that bucket. From the View cloud bucket page, you can browse directories and search for objects by name in a particular directory. The file type, size, and path of objects are displayed in columns to the right of the object name. To view bucket details such as the cloud provider, bucket address, and credentials, select the information icon.
![Data Explorer bucket](assets/sp-cloud-data-explorer.gif)

- Preview and download files
From the View cloud bucket page, you can preview and download files. Select the download icon in the Actions column to download a file directly from the list view. Select a file to open a preview window that includes a Download button.
- **Preview files**:
Select a file to open a preview window that includes a Download button. For example, we can use Data Explorer to view the results of the nf-core/rnaseq pipeline that we executed. Specifically, we can take a look at the resultant gene counts of the salmon quantification step:

## 2. View Run outputs in Data Explorer
![Preview pipeline results](assets/data-explorer-preview-files.gif)

Data Explorer can be used to view the outputs of your pipelines.
## 2. Configure a bucket to browse in Data Explorer
Data Explorer also enables you to add public cloud storage buckets to view and use data from resources such as:

From the View cloud bucket page, you can:
- [The Cancer Genome Atlas (TCGA)](https://registry.opendata.aws/tcga/)
- [1000 Genomes Project](https://registry.opendata.aws/1000-genomes/)
- [NCBI SRA](https://registry.opendata.aws/ncbi-sra/)
- [Genome in a Bottle Consortium](https://docs.opendata.aws/giab/readme.html)
- [MSSNG Database](https://cloud.google.com/life-sciences/docs/resources/public-datasets/mssng)
- [Genome Aggregation Database (gnomAD)](https://cloud.google.com/life-sciences/docs/resources/public-datasets/gnomad)

1. Preview and download files: Select the download icon in the 'Actions' column to download a file directly from the list view. Select a file to open a preview window that includes a Download button.
2. Copy bucket/object paths: Select the Path of an object on the cloud bucket page to copy its absolute path to the clipboard. Use these object paths to specify input data locations during pipeline launch, or add them to a dataset for pipeline input.
Select 'Add cloud bucket' from the Data Explorer tab to add individual buckets (or directory paths within buckets).

![Data Explorer bucket](assets/sp-cloud-data-explorer.gif)
Specify the Provider, Bucket path, Name, Credentials, and Description, then select Add. For public cloud buckets, select Public from the Credentials drop-down menu.

![Add public bucket](assets/data-explorer-add-bucket.gif)

You are now able to use this data in your analysis without having to interact with Cloud consoles or CLI tools.
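
The bucket path entered when adding a cloud bucket can point to a whole bucket or to a directory inside it. The sketch below splits such a path into provider, bucket name, and directory prefix. The scheme-to-provider mapping (`s3`, `az`, `gs`) and the helper name are assumptions for illustration, and `example-bucket` is a made-up name.

```python
from urllib.parse import urlparse

# Assumed mapping from URI scheme to the providers Data Explorer supports.
PROVIDERS = {
    "s3": "AWS S3",
    "az": "Azure Blob Storage",
    "gs": "Google Cloud Storage",
}


def split_bucket_path(path):
    """Split e.g. 's3://bucket/some/dir' into (provider, bucket, prefix)."""
    parsed = urlparse(path)
    if parsed.scheme not in PROVIDERS or not parsed.netloc:
        raise ValueError(f"Unrecognised bucket path: {path!r}")
    return PROVIDERS[parsed.scheme], parsed.netloc, parsed.path.strip("/")


print(split_bucket_path("s3://nf-ireland"))
print(split_bucket_path("s3://example-bucket/results/rnaseq/"))
```

A check like this is useful before pasting a path into the form: a missing scheme or bucket name is caught early instead of surfacing as a failed bucket listing.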
96 changes: 96 additions & 0 deletions demo/docs/data_studios.md
@@ -0,0 +1,96 @@
Data Studios is a unified platform where you can perform analysis of your pipeline results after successful execution.

It allows you to host a combination of images and compute environments for interactive analysis using your preferred tools, like Jupyter notebooks, RStudio, and Visual Studio Code IDEs.

Each data studio session is an individual interactive environment that encapsulates the live environment for dynamic data analysis.

## Data Studio Setup

### Create a Data Studio

#### 1. Add a Data Studio

To create a Data Studio, click on the 'Add data studio' button and select from any one of the three currently available templates.

![Add a data studio](assets/create-data-studio.gif)

#### 2. Select a compute environment

Currently, only AWS Batch is supported.

#### 3. Mount data using Data Explorer

Select data to mount into your data studios environment using the Fusion file system in Data Explorer. This data will be available at `/workspace/data/<dataset>`.

For example, to take a look at the results of your nf-core/rnaseq pipeline run, you can mount the value of the `outdir` parameter specified in the [earlier step when launching the pipeline](./launch_pipeline.md).

![Mount data into studio](assets/mount-data-into-studio.gif)
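
Once the studio is running, the mounted data can be checked from inside the session. This is a minimal sketch, assuming the mount root is `/workspace/data` as described above; the helper name and the dataset name `rnaseq-results` are hypothetical placeholders.

```python
from pathlib import Path


def list_mounted_files(dataset, root="/workspace/data", pattern="*"):
    """Return sorted relative paths of files under <root>/<dataset> matching `pattern`.

    Returns an empty list if the dataset directory does not exist.
    """
    base = Path(root) / dataset
    if not base.is_dir():
        return []
    return sorted(
        p.relative_to(base).as_posix() for p in base.rglob(pattern) if p.is_file()
    )


# "rnaseq-results" is a placeholder; use the name of the data you mounted.
print(list_mounted_files("rnaseq-results", pattern="*.tsv"))
```

An empty result usually means the data was not mounted under the expected name, which is quicker to spot here than later in an analysis.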

#### 4. Resources for environment

Enter a CPU or memory allocation for your data studios environment (optional). The default is 2 CPUs and 8192 MB of memory.

Then, click Add!

The data studio environment will be available in the Data Studios landing page with the status 'stopped'. Click on the three dots and **Start** to begin running the studio.

![Start a studio](assets/start-studio.gif)

![Connect to a studio](assets/connect-to-studio.png){ .right .image}

### Connect to a Data Studio

To connect to a running data studio session, select the three dots next to the status message and choose **Connect**. A new browser tab will open, displaying the status of the data studio session. Select **Connect**.
<br>
<div style="clear: both;"></div>

### Collaborate in Data Studio

Collaborators can also join a data studios session in your workspace. For example, to share the results of the nf-core/rnaseq pipeline, select the three dots next to the status message for the data studio you want to share, then select **Copy data studio URL**. Other authenticated users with at least the "Connect" role can use this link to access the session directly.
<div style="clear: both;"></div>

![Stop a studio session](assets/stop-a-studio.png){ .right .image}
### Stop a Data Studio

To stop a running session, click on the three dots next to the status and select **Stop**. Any unsaved analyses or results will be lost.<br>
<div style="clear: both;"></div>

<br>
## Analyse RNAseq data in a Data Studio

Data Studio can be used to perform tertiary analysis of data generated by Nextflow pipeline executions on Seqera Platform. For example, we can take a look at our nf-core/rnaseq pipeline results in a Jupyter notebook to perform additional interactive analyses.

### 1. Create a Data Link
To enable access to our RNAseq analysis data in a Studio, we can create a custom data link pointing to the directory in our AWS S3 bucket where the results are saved.

This can be achieved by using the 'Add cloud bucket' button in Data Explorer and specifying the path to our output directory:

![Create a data link](assets/create-a-data-link.png){ .center }


### 2. Create a Jupyter notebook session
When creating our Data Studio, we can mount our newly created Data Link to isolate read/write access to this directory within the studio session.

![Jupyter notebook studio](assets/data-studio-create-jupyter.gif)

### 3. Data exploration in Jupyter
Once created, we can Connect to our Data Studio to open a Jupyter notebook session where we can take a look at the results of our RNAseq analysis.

For example, in the notebook, you may first want to import Python libraries:

```python
import pandas as pd
```

Next, we can load the data from our analysis. As a start, let's take a look at the gene counts across the samples by loading them into a pandas DataFrame:

```python
data = pd.read_csv('data/seqeralabs-showcase-rnaseq-results/star_salmon/salmon.merged.gene_counts.tsv', sep='\t', index_col=0)
print(data.head())
```

![Jupyter notebook](assets/data-studio-jupyter-notebook-example.png)
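
A common first check on a count table is the total count per sample. The sketch below uses only the standard library so that it also runs outside the notebook. It assumes a tab-separated layout of `gene_id`, `gene_name`, then one column per sample; the helper name and the tiny inline table are made up for illustration.

```python
import csv
import io


def sample_totals(tsv_file, annotation_columns=("gene_id", "gene_name")):
    """Sum the counts in every sample column of a tab-separated count table."""
    reader = csv.DictReader(tsv_file, delimiter="\t")
    samples = [c for c in reader.fieldnames if c not in annotation_columns]
    totals = dict.fromkeys(samples, 0.0)
    for row in reader:
        for sample in samples:
            totals[sample] += float(row[sample])
    return totals


# Made-up data with the assumed column layout.
example = io.StringIO(
    "gene_id\tgene_name\tSAMPLE_A\tSAMPLE_B\n"
    "gene1\tG1\t10\t0\n"
    "gene2\tG2\t5.5\t4.5\n"
)
print(sample_totals(example))
```

For the real file, pass an open file handle, for example `open(path, newline="")`, in place of the `io.StringIO` object. Large differences in totals between samples are worth noting before any downstream comparison.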


Through Data Studios, you are now able to continue into the next step of your tertiary analyses, using data generated from pipelines executed on Seqera Platform but stored in the Cloud - without having to ever leave the Platform.

15 changes: 12 additions & 3 deletions demo/docs/demo_overview.md
@@ -2,17 +2,26 @@

Log into Seqera Platform, either through a GitHub account, Google account, or an email address.

If an email address is provided, Seqera Cloud will send an authentication link to the email address to login with.
Upon providing an email address, Seqera Cloud will send an authentication link, enabling login.

![Seqera Platform Cloud login](assets/sp-cloud-signin.gif)

### 2. Navigate into the seqeralabs/showcase Workspace

All resources in Seqera Platform live inside a Workspace, which in turn belong to an Organisation. Typically, teams of colleagues or collaborators will share one or more workspaces. All resources in a Workspace (i.e. pipelines, compute environments, datasets) are shared by members of that workspace.
All resources in Seqera Platform live inside a Workspace, which in turn belongs to an Organization. Typically, teams of colleagues or collaborators will share one or more workspaces. All resources in a Workspace (i.e. pipelines, compute environments, datasets) are shared by members of that workspace.

Navigate into the `seqeralabs/showcase` Workspace.

![Seqera Labs Showcase Workspace](assets/go-to-workspace.gif)

### 3. TODO User settings
### 3. User settings

To access or modify user settings such as your username or name, click the avatar icon in the top right corner. You can modify these settings in 'Your profile'.

![User settings](./assets/user-settings.png){ .right .image}

You can specify user specific settings such as:

- **User tokens**: Your personal access token for authentication on the Platform, used in [automation](./automation.md).
- **User credentials**: Credentials for your own personal workspace which can include cloud access keys, repository credentials, Docker credentials.
- **User secrets**: Secrets used in any Nextflow workflows launched in your user workspace.
50 changes: 38 additions & 12 deletions demo/docs/index.md
@@ -1,38 +1,64 @@
# Seqera Platform: Demonstration Walkthrough

Walkthrough documentation of [Seqera Platform](https://seqera.io/)
---
![](assets/landing_page.png){ .right .image}

[:fontawesome-solid-user: Login to Seqera Platform](https://tower.nf/login){ .md-button }
---
<div style="display: flex; align-items: center; margin-bottom: 20px;">
<div style="margin-right: 10px;">
<a href="https://cloud.seqera.io/login" class="md-button" style="display: block; margin-bottom: 10px;">
<i class="fas fa-user"></i> Login to Seqera Platform
</a>
<a href="https://seqera.io" class="md-button" style="display: block;">
Visit Seqera Main Site
</a>
</div>
<div style="flex: 1; margin-left: 200px;">
<img src="assets/seqera-one-platform.png" alt="Seqera Biotech Stack" style="width: 100%; max-width: 750px;">
</div>
</div>


---


---
## Overview

<!-- ![Seqera biotech stack](assets/seqera-biotech-stack.png){ .right .image} -->
<img src="assets/seqera-biotech-stack.png" alt="Seqera biotech stack" style="float: right; width: 50%; margin-left: 30px; margin-bottom: 20px;">

This guide provides a walkthrough of a standard Seqera Platform demonstration. The demonstration describes how to add a pipeline to the Launchpad, launch a workflow with pipeline parameters, monitor a run, and examine the run details. It also highlights key features such as Pipeline Optimization, Data Explorer, and Compute Environment creation.

More specifically, this demonstration will focus on using the [nf-core/rnaseq](https://github.com/nf-core/rnaseq) pipeline as an example and executing the workflow on AWS Batch.

<div style="clear: both;"></div>

---

## Requirements

- A [Seqera Platform Cloud](https://seqera.io/login) account
- Access to a Workspace in Seqera Platform
- :fontawesome-brands-aws: An [AWS Batch Compute Environment created in that Workspace](https://docs.seqera.io/platform/23.3.0/compute-envs/aws-batch)
- The [nf-core/rnaseq](https://github.com/nf-core/rnaseq) pipeline repository
- Samplesheet to create a Dataset on the Platform used to run minimal test RNAseq data (see [samplesheet_test.csv](./samplesheet_test.csv) file in this repository)
:octicons-checkbox-16: A [Seqera Platform Cloud](https://cloud.seqera.io/login) account

:octicons-checkbox-16: Access to a Workspace in Seqera Platform

:octicons-checkbox-16: :fontawesome-brands-aws: An [AWS Batch Compute Environment created in that Workspace](https://docs.seqera.io/platform/23.3.0/compute-envs/aws-batch)

:octicons-checkbox-16: The [nf-core/rnaseq](https://github.com/nf-core/rnaseq) pipeline repository

:octicons-checkbox-16: Samplesheet to create a Dataset on the Platform used to run minimal test RNAseq data (see [samplesheet_test.csv](./samplesheet_test.csv) file in this repository)

---

## Sections
[:material-check-circle:]() [Overview of the Platform](./demo_overview.md) <br/>

[:material-check-circle:]() [Why use Seqera Platform?](./intro.md) <br/>
[:material-check-circle:]() [Overview of the Platform](./demo_overview.md) <br/>
[:material-check-circle:]() [Add a Pipeline to the Launchpad](./add_a_pipeline.md) <br/>
[:material-check-circle:]() [Add a Dataset to Seqera Platform](./add_a_dataset.md) <br/>
[:material-check-circle:]() [Launch a Pipeline](./launch_pipeline.md) <br/>
[:material-check-circle:]() [Runs and Monitoring your workflow](./monitor_run.md) <br/>
[:material-check-circle:]() [Examine the run and task details](./run_details.md) <br/>
[:material-check-circle:]() [Resume a Pipeline](./resume_pipeline.md) <br/>
[:material-check-circle:]() [Data Explorer](./data_explorer.md) <br/>
[:material-check-circle:]() [Data Studios](./data_studios.md) <br/>
[:material-check-circle:]() [Optimize your Pipeline](./pipeline_optimization.md) <br/>
[:material-check-circle:]() [Automation](./automation.md) <br/>
[:material-check-circle:]() [Scaling Science on Seqera Platform](./summary.md) <br/>
