feat: add missing sections
ejseqera committed Apr 22, 2024
1 parent d0a25d3 commit 65feb28
Showing 24 changed files with 272 additions and 31 deletions.
1 change: 1 addition & 0 deletions demo/docs/add_a_dataset.md
@@ -11,6 +11,7 @@ When running pipelines on the Cloud, this samplesheet has to be made available i
The [nf-core/rnaseq](https://github.com/nf-core/rnaseq) pipeline works with input datasets (samplesheets) containing sample names, fastq file locations, and indications of strandedness. The Seqera Community Showcase sample dataset for _nf-core/rnaseq_ looks like this:

**Example rnaseq dataset**

<center>

| sample | fastq_1 | fastq_2 | strandedness |
Binary file added demo/docs/assets/all_runs_view.gif
Binary file added demo/docs/assets/connect-to-studio.png
Binary file added demo/docs/assets/create-data-studio.gif
Binary file added demo/docs/assets/dashboard_view.gif
Binary file added demo/docs/assets/generate-access-token.gif
Binary file added demo/docs/assets/mount-data-into-studio.gif
Binary file added demo/docs/assets/platform-cli.png
Binary file added demo/docs/assets/send-studio-link.png
Binary file added demo/docs/assets/start-studio.gif
Binary file added demo/docs/assets/stop-a-studio.png
Binary file added demo/docs/assets/user-settings.png
67 changes: 67 additions & 0 deletions demo/docs/automation.md
@@ -0,0 +1,67 @@
Seqera Platform provides multiple methods of programmatic interaction allowing you to automate execution of pipelines, chain pipelines together, and integrate the Platform into third-party services of your choosing.

### 1. Seqera Platform API

The Seqera Platform public API provides endpoints for programmatically performing all actions available in the user interface. The API can be accessed from `https://api.cloud.seqera.io`.

The full list of endpoints is available in Seqera's OpenAPI schema found [here](https://cloud.seqera.io/openapi/seqera-api-latest.yml). The API requires an authentication token to be specified in every API request. This can be created in your user menu under **Your tokens**.

![Platform access token](./assets/generate-access-token.gif)

The token is only displayed once. Store your token in a safe place. Use this token to authenticate requests to the API via cURL, Postman, or within your code.

For example, to launch a pipeline in the `seqeralabs/showcase` workspace using the `/workflow/launch` endpoint:

```bash
curl -X POST "https://api.cloud.seqera.io/workflow/launch?workspaceId=38659136604200" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer <your_access_token>" \
  -H "Content-Type: application/json" \
  -H "Accept-Version:1" \
  -d '{
    "launch": {
      "computeEnvId": "hjE97A8TvD9PklUb0hwEJ",
      "runName": "first-time-pipeline-api-byname",
      "pipeline": "first-time-pipeline",
      "workDir": "s3://nf-ireland",
      "revision": "master"
    }
  }'
```
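
The request body above can also be assembled from shell variables, which keeps IDs and the token out of the command itself. The following is a minimal sketch rather than an official client: the function names `build_launch_payload` and `launch_workflow` are hypothetical, reading the token from a `TOWER_ACCESS_TOKEN` environment variable is an assumption, and no JSON escaping is performed, so values must not contain quotes or backslashes.

```bash
#!/usr/bin/env bash

# Print the JSON body for the /workflow/launch endpoint.
# Arguments: compute env ID, run name, pipeline, work directory, revision.
# Assumption: values contain no characters that need JSON escaping.
build_launch_payload() {
  printf '{"launch":{"computeEnvId":"%s","runName":"%s","pipeline":"%s","workDir":"%s","revision":"%s"}}\n' \
    "$1" "$2" "$3" "$4" "$5"
}

# POST a payload to the launch endpoint for a given workspace.
# Arguments: workspace ID, JSON payload.
# Assumption: the access token is exported as TOWER_ACCESS_TOKEN.
launch_workflow() {
  if [ -z "${TOWER_ACCESS_TOKEN:-}" ]; then
    echo "TOWER_ACCESS_TOKEN is not set" >&2
    return 1
  fi
  curl -X POST "https://api.cloud.seqera.io/workflow/launch?workspaceId=$1" \
    -H "Accept: application/json" \
    -H "Authorization: Bearer ${TOWER_ACCESS_TOKEN}" \
    -H "Content-Type: application/json" \
    -H "Accept-Version:1" \
    -d "$2"
}
```

To use it, capture the output of `build_launch_payload` in a variable and pass it to `launch_workflow` together with your workspace ID. Keeping the payload builder separate makes it easy to print and inspect the body before sending anything.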

### 2. Seqera Platform CLI

The Seqera Platform CLI, `tw`, is a command-line utility for managing Platform resources such as pipelines, runs, and compute environments.

The CLI provides an interface to launch pipelines, manage compute environments, retrieve run metadata, and monitor runs on the Platform. It provides a Nextflow-like experience for bioinformaticians who prefer the command line, allows you to store Seqera resource configuration (e.g. pipelines and compute environments) as infrastructure as code, and is built on top of the [Seqera Platform API](#1-seqera-platform-api). The CLI offers a more convenient way to interact with the Platform than calling the API directly.

![Seqera Platform CLI](./assets/platform-cli.png)

For example, to launch the hello pipeline using the CLI:

```bash
tw launch hello --workspace seqeralabs/showcase
```

Installation and usage details for the `tw` CLI are available in the [tower-cli GitHub repository](https://github.com/seqeralabs/tower-cli/).
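
In scripts, it can help to confirm that a token is available before calling `tw`. The sketch below wraps the launch command from the example above. The function name `launch_pipeline` is hypothetical, and it assumes the access token is exported as `TOWER_ACCESS_TOKEN`.

```bash
#!/usr/bin/env bash

# Launch a pipeline with tw after checking that a token is available.
# Arguments: pipeline name, workspace (organization/workspace).
# Assumption: the access token is exported as TOWER_ACCESS_TOKEN.
launch_pipeline() {
  local pipeline="$1"
  local workspace="$2"
  if [ -z "${TOWER_ACCESS_TOKEN:-}" ]; then
    echo "TOWER_ACCESS_TOKEN is not set" >&2
    return 1
  fi
  tw launch "$pipeline" --workspace "$workspace"
}
```

Calling `launch_pipeline hello seqeralabs/showcase` runs the same command as the example above, but fails early with a readable message when the token is missing.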

### 3. seqerakit

`seqerakit` is a Python wrapper for the Seqera Platform CLI which can be leveraged to automate the creation of all of the entities in Seqera Platform via a simple configuration file in YAML format.

The key features are:

- **Simple configuration**: All of the command-line options available when using the Seqera Platform CLI can be defined in simple YAML format.
- **Infrastructure as Code**: Enables users to manage and provision their infrastructure specifications.
- **Automation**: End-to-end creation of entities within Seqera Platform, all the way from adding an Organization to launching pipeline(s) within that Organization.

For example, to launch the hello pipeline using seqerakit, you can create a YAML file as follows:

```yaml
launch:
  - name: "hello-world"
    url: "https://github.com/nextflow-io/hello"
    workspace: "seqeralabs/showcase"
```

Installation and usage details for `seqerakit` are available in the [seqerakit GitHub repository](https://github.com/seqeralabs/seqerakit/).
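
The launch configuration above can be saved to a file and previewed before anything is created on the Platform. This is a minimal sketch: the file name and location are arbitrary choices, and it relies on the `--dryrun` option described in the seqerakit README, which prints the underlying `tw` commands instead of running them. The preview step is skipped if `seqerakit` is not installed.

```bash
#!/usr/bin/env bash

# Write the launch configuration to a file.
# Assumption: the file name and location are arbitrary choices for this sketch.
config_file="${TMPDIR:-/tmp}/hello-launch.yml"

cat > "$config_file" <<'EOF'
launch:
  - name: "hello-world"
    url: "https://github.com/nextflow-io/hello"
    workspace: "seqeralabs/showcase"
EOF

# Preview the tw commands without running them, if seqerakit is installed.
if command -v seqerakit >/dev/null 2>&1; then
  seqerakit "$config_file" --dryrun
else
  echo "seqerakit not found; wrote $config_file only" >&2
fi
```

Once the preview looks right, running `seqerakit` on the same file without `--dryrun` performs the launch.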
54 changes: 54 additions & 0 deletions demo/docs/data_studios.md
@@ -0,0 +1,54 @@
Data Studios lets you analyze your pipeline results interactively after a successful run. It allows you to host a combination of images and compute environments for interactive analysis using your preferred tools, such as Jupyter notebooks, RStudio, and Visual Studio Code. Each data studio session is an isolated, live environment for interactive data analysis.

<!--
TODO: update gifs with showcase data studios eventually
TODO: add custom datalink for outdir from nf-core/rnaseq results to mount here
TODO: show example of using jupyter or rstudio with nf-core/rnaseq results
-->

### Create a Data Studio

#### 1. Add a data studio

To create a data studio, select **Add data studio** and choose one of the three currently available templates.

![Add a data studio](./assets/create-data-studio.gif)

#### 2. Select a compute environment

Select the compute environment that will run the data studio. Currently, only AWS Batch compute environments are supported.

#### 3. Mount data using Data Explorer

Select data to mount into your data studios environment using the Fusion file system in Data Explorer. This data will be available at `/workspace/data/<dataset>`.

For example, to take a look at the results of your nf-core/rnaseq pipeline run, you can mount the value of the `outdir` parameter specified in the [earlier step when launching the pipeline](./launch_pipeline.md).
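
Once the session is running, you can confirm from a terminal inside the studio that the data was mounted where you expect. The helper below is a small sketch: the function name `list_mounted_data` is hypothetical, and the optional second argument that overrides the `/workspace/data` root is an assumption added so the function can be tried outside a studio.

```bash
#!/usr/bin/env bash

# List the top-level contents of a mounted dataset.
# Arguments: dataset name, optional mount root (defaults to /workspace/data).
list_mounted_data() {
  local dataset="$1"
  local root="${2:-/workspace/data}"
  if [ ! -d "$root/$dataset" ]; then
    echo "No mounted data found at $root/$dataset" >&2
    return 1
  fi
  ls -1 "$root/$dataset"
}
```

For example, `list_mounted_data <dataset>` lists what is available under `/workspace/data/<dataset>`, and returns a non-zero status if nothing is mounted there.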

![Mount data into studio](./assets/mount-data-into-studio.gif)

#### 4. Resources for environment

Optionally, enter a CPU and memory allocation for your data studio environment. The default is 2 CPUs and 8192 MB of memory.

Then, select **Add**.

The data studio will be listed on the Data Studios landing page with the status 'stopped'. Select the three dots next to the status and choose **Start** to begin running the studio.

![Start a studio](./assets/start-studio.gif)

![Connect to a studio](./assets/connect-to-studio.png){ .right .image}

### Connect to a Data Studio

To connect to a running data studio session, select the three dots next to the status message and choose **Connect**. A new browser tab will open, displaying the status of the data studio session. Select **Connect**.
<br>

### Collaborate in Data Studio

Collaborators can also join a data studio session in your workspace. For example, to share the results of the nf-core/rnaseq pipeline, select the three dots next to the status message for the data studio you want to share, then select **Copy data studio URL**. Other authenticated users can use this link to access the session directly.

![Stop a studio session](./assets/stop-a-studio.png){ .right .image}

### Stop a Data Studio

To stop a running session, click on the three dots next to the status and select **Stop**. Any unsaved analyses or results will be lost.
11 changes: 10 additions & 1 deletion demo/docs/demo_overview.md
@@ -14,5 +14,14 @@ Navigate into the `seqeralabs/showcase` Workspace.

![Seqera Labs Showcase Workspace](assets/go-to-workspace.gif)

### 3. TODO User settings
### 3. User settings

To view or modify your user settings, such as your username or name, select the avatar icon in the top right corner and choose **Your profile**.

![User settings](./assets/user-settings.png){ .right .image}

You can also manage user-specific settings such as:

- **User tokens**: Your personal access token for authentication on the Platform, used in [automation](./automation.md).
- **User credentials**: Credentials for your personal workspace, which can include cloud access keys, repository credentials, and Docker credentials.
- **User secrets**: Secrets used in any Nextflow workflows launched in your user workspace.
14 changes: 9 additions & 5 deletions demo/docs/index.md
@@ -1,15 +1,15 @@
# Seqera Platform: Demonstration Walkthrough

Walkthrough documentation of [Seqera Platform](https://seqera.io/)
---
## Walkthrough of [Seqera Platform](https://seqera.io/)

![](assets/landing_page.png){ .right .image}

[:fontawesome-solid-user: Login to Seqera Platform](https://tower.nf/login){ .md-button }
---
## [:fontawesome-solid-user: Login to Seqera Platform](https://tower.nf/login){ .md-button }

---

## Overview

This guide provides a walkthrough of a standard Seqera Platform demonstration. The demonstration describes how to add a pipeline to the Launchpad, launch a workflow with pipeline parameters, monitor a run, and examine the run details. It also highlights key features such as Pipeline Optimization, Data Explorer, and Compute Environment creation.

More specifically, this demonstration will focus on using the [nf-core/rnaseq](https://github.com/nf-core/rnaseq) pipeline as an example and executing the workflow on AWS Batch.
@@ -27,12 +27,16 @@ More specifically, this demonstration will focus on using the [nf-core/rnaseq](h
---

## Sections
[:material-check-circle:]() [Overview of the Platform](./demo_overview.md) <br/>

[:material-check-circle:]() [Why use Seqera Platform?](./intro.md) <br/>
[:material-check-circle:]() [Overview of the Platform](./demo_overview.md) <br/>
[:material-check-circle:]() [Add a Pipeline to the Launchpad](./add_a_pipeline.md) <br/>
[:material-check-circle:]() [Add a Dataset to Seqera Platform](./add_a_dataset.md) <br/>
[:material-check-circle:]() [Launch a Pipeline](./launch_pipeline.md) <br/>
[:material-check-circle:]() [Runs and Monitoring your workflow](./monitor_run.md) <br/>
[:material-check-circle:]() [Examine the run and task details](./run_details.md) <br/>
[:material-check-circle:]() [Resume a Pipeline](./resume_pipeline.md) <br/>
[:material-check-circle:]() [Data Explorer](./data_explorer.md) <br/>
[:material-check-circle:]() [Data Studios](./data_studios.md) <br/>
[:material-check-circle:]() [Optimize your Pipeline](./pipeline_optimization.md) <br/>
[:material-check-circle:]() [Automation](./automation.md) <br/>
53 changes: 53 additions & 0 deletions demo/docs/intro.md
@@ -0,0 +1,53 @@
### Introduction to Nextflow and Seqera Platform

#### What is Nextflow?

Nextflow is a workflow system for creating scalable, portable, and reproducible workflows.

Nextflow is both a workflow language and an execution runtime that supports a wide range of execution platforms, including popular traditional grid scheduling systems such as Slurm and IBM LSF, and cloud services such as AWS Batch, Google Cloud Batch, Azure Batch and Kubernetes.

#### Running a Nextflow pipeline

Nextflow provides a simple command line interface for managing and executing pipelines.

Let's take a look at how to run a basic Nextflow pipeline using a simple [Hello World script](https://github.com/nextflow-io/hello).

1. First, ensure Nextflow is installed:

```bash
curl -s https://get.nextflow.io | bash
```

2. With Nextflow installed, run the following on the command line to start the hello pipeline:

```bash
nextflow run https://github.com/nextflow-io/hello
```
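
The installer in step 1 writes a `nextflow` launcher into the current directory, so step 2 only works as written once that launcher is executable and on your `PATH`. A minimal sketch of that intermediate step follows. The function name `install_launcher` is hypothetical, and `$HOME/.local/bin` is an assumed destination that you may need to add to your `PATH`.

```bash
#!/usr/bin/env bash

# Make the downloaded launcher executable and move it to a directory on PATH.
# Arguments: launcher path (default ./nextflow), destination directory
# (default $HOME/.local/bin, an assumption for this sketch).
install_launcher() {
  local launcher="${1:-./nextflow}"
  local dest_dir="${2:-$HOME/.local/bin}"
  if [ ! -f "$launcher" ]; then
    echo "Launcher not found: $launcher" >&2
    return 1
  fi
  chmod +x "$launcher"
  mkdir -p "$dest_dir"
  mv "$launcher" "$dest_dir/nextflow"
}
```

Alternatively, you can leave the launcher where it is and run it with an explicit path, for example `./nextflow run https://github.com/nextflow-io/hello`.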

#### Monitoring and Finding logs

When you run a Nextflow pipeline via the CLI with `nextflow run`, it generates logs that can be used to monitor the execution of the pipeline. Progress is printed to the console, and detailed per-task information can be found in the `work` directory created by Nextflow. Each task execution gets its own subdirectory under `work`, where its logs and output files are stored.

- **Execution Log**: The main log file (`.nextflow.log`) is created in the directory where Nextflow is run. This file captures detailed information about the pipeline execution, including system errors and warnings.
- **Command Log**: Within the work directory, each process execution generates a `.command.log` file, which contains the standard output and error streams of the executed command.
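
Because each task writes its own `.command.log`, locating these files is usually the first step when inspecting a run by hand. The sketch below lists them for a given work directory. The function name `list_command_logs` is hypothetical, and the default directory name `work` assumes the run used Nextflow's default work directory.

```bash
#!/usr/bin/env bash

# List every per-task command log under a Nextflow work directory.
# Argument: work directory (defaults to ./work).
list_command_logs() {
  find "${1:-work}" -type f -name '.command.log' | sort
}
```

Each path in the output points into one task's directory, so you can pass it to `cat` or `tail` to read that task's output.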

#### Limitations of CLI

Launching and monitoring pipelines via the CLI is direct, but it poses challenges for complex or large-scale pipelines that go well beyond Hello World:

- **Scalability**: As the number of tasks increases, manually checking individual log files becomes impractical.
- **Real-Time Tracking**: The CLI does not offer an easy way to visualize real-time progress across multiple parallel tasks.
- **Aggregation**: Collecting and interpreting logs from various processes requires additional tools or scripts, complicating the workflow management.
- **Flexibility**: Switching between environments (e.g. from your local computer to an HPC cluster or the cloud) requires setting up account keys and credentials for each environment on the command line, and then applying the appropriate Nextflow configuration settings.
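
To illustrate the aggregation point, the sketch below shows the kind of helper script you end up writing to find failed tasks from the command line. It relies on the `.exitcode` file that Nextflow writes into each task directory. The function name `failed_tasks` is hypothetical, and the default directory name `work` assumes the default work directory.

```bash
#!/usr/bin/env bash

# Print "<exit code><TAB><task directory>" for every task that exited non-zero.
# Argument: work directory (defaults to ./work).
failed_tasks() {
  local work_dir="${1:-work}"
  local code_file code
  find "$work_dir" -type f -name '.exitcode' | sort | while IFS= read -r code_file; do
    # Strip whitespace so the comparison works with or without a trailing newline.
    code=$(tr -d '[:space:]' < "$code_file")
    if [ "$code" != "0" ]; then
      printf '%s\t%s\n' "$code" "$(dirname "$code_file")"
    fi
  done
}
```

Even this small script only covers one question about one run on one machine, which is the gap the Platform's monitoring views are meant to close.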

#### Enhancing management of Pipelines in Seqera Platform

Seqera Platform extends the capabilities of Nextflow by providing advanced tools for monitoring, pipeline management, and data management:

- **Centralized Monitoring Dashboard**: A user-friendly interface displays all critical information, including real-time progress of each pipeline.
- **Easily Accessible Run Details**: Seqera Platform captures every detail about a pipeline run, including the exact parameters and configurations used, ensuring full reproducibility.
- **Resource Usage Metrics**: It provides comprehensive metrics on resource usage for each task, crucial for optimizing cloud executions and managing costs effectively. These metrics are presented in an accessible format, contrasting with the complexity of extracting and interpreting them from CLI logs.
- **Explore and Manage Data**: The Platform makes it easier to manage data across disparate sources for your pipeline executions without having to use Cloud consoles or CLI utilities.
- **Analyze your Data**: Interactive notebooks, RStudio environments, and VS Code streamline the analysis of data generated by your pipeline executions.

This guide demonstrates the features of Seqera Platform that make it easier to build, launch, and manage scalable data pipelines.