GH-1525 Document TerraWorkspaceSink identifier changes #535

2 changes: 1 addition & 1 deletion docs/md/executor.md
@@ -43,7 +43,7 @@ And a real-life example for a known method configuration:
}
```

-![](./assets/terra-method-configuration.png)
+![](assets/executor/terra-method-configuration.png)

The table below summarises the purpose of each attribute in the above request.

48 changes: 32 additions & 16 deletions docs/md/sink.md
@@ -19,7 +19,7 @@ looks like:
"name": "Terra Workspace",
"workspace": "{workspace-namespace}/{workspace-name}",
"entityType": "{entity-type-name}",
-"identifier": "{output-name}",
+"identifier": "{workflow-identifier}",
"fromOutputs": {
"attribute0": "output0",
"attribute1": ["output1", "output2"],
@@ -30,13 +30,13 @@ looks like:
```
The table below summarises the purpose of each attribute in the above request.

-| Attribute | Description |
-|---------------|--------------------------------------------------------------|
-| `name` | Selects the `Terra Workspace` sink implementation. |
-| `workspace` | The Terra Workspace to write pipeline outputs to. |
-| `entityType` | The entity type in the `workspace` to write outputs to. |
-| `identifier` | Selects the output that will be used as the entity name. |
-| `fromOutputs` | Mapping from outputs to attribute names in the `entityType`. |
+| Attribute | Description |
+|---------------|-----------------------------------------------------------------------------|
+| `name` | Selects the `Terra Workspace` sink implementation. |
+| `workspace` | The Terra Workspace to write pipeline outputs to. |
+| `entityType` | The entity type in the `workspace` to write outputs to. |
+| `identifier` | Selects the workflow attribute (output or input) to use as the entity name. |
+| `fromOutputs` | Mapping from outputs to attribute names in the `entityType`. |

#### `workspace`
The workspace is a `"{workspace-namespace}/{workspace-name}"` string as it
@@ -51,18 +51,34 @@ must be a table in the workspace.

#### `identifier`

-The `identifier` is the name of a pipeline output
-that should be used as the name of each newly created entity.
+WFL tries to find a workflow output whose name matches `identifier`,
+checking workflow input names as a fallback.
+The matching value will be the name of the newly created entity.

-Example - Let's say the pipeline you're running has an output called
-"sample_name" that uniquely identifies the inputs and outputs to that pipeline.
-By setting `"identifier": "sample_name"` in the sink configuration, entities
-will be created using the "sample_name" as the entity name.
+Both workflow outputs and inputs are checked for matches since
+depending on use case, the logical unique identifier may be either.

-!!! note
-When two sets of pipeline outputs share the same "identifier" value,
+!!! warnings
+- If an `identifier` has no matching workflow output or input,
+WFL will not be able to resolve a workflow to an entity name
+and will fail to write its outputs to the workspace data table.
+- When two workflows share the same `identifier` value,
the first set of outputs will be overwritten by the second in the workspace.

+**Example:**
+
+An eMerge Arrays workflow has an output called "chip_well_barcode_output"
+that uniquely identifies its inputs and outputs.
+
+By setting `"identifier": "chip_well_barcode_output"`
+in the sink configuration, entities will be created
+using the "chip_well_barcode_output" as the entity name.
+
+Below, the outputs for a successful workflow with a "chip_well_barcode_output"
+of "204126290052_R01C01" have been written to the destination data table.
+
+![](assets/sink/terra-identifier-in-data-table.png)
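
Reviewer aid, not part of the change: the lookup order described in the new `identifier` text (outputs first, inputs as a fallback, failure when neither matches) can be sketched in a few lines of Python. This is only an illustration under stated assumptions. `resolve_entity_name` is a hypothetical helper, not WFL's API. It assumes outputs and inputs are plain dictionaries keyed by the exact attribute name, and the `sample_alias` input is invented for the example.

```python
from typing import Any, Mapping


def resolve_entity_name(identifier: str,
                        outputs: Mapping[str, Any],
                        inputs: Mapping[str, Any]) -> str:
    """Hypothetical helper: pick the entity name for one workflow."""
    # Workflow outputs take precedence over inputs.
    if identifier in outputs:
        return str(outputs[identifier])
    # Inputs are consulted only as a fallback.
    if identifier in inputs:
        return str(inputs[identifier])
    # No match: the workflow cannot be resolved to an entity name.
    raise KeyError(f"no workflow output or input named {identifier!r}")


# Values taken from the documented example; "sample_alias" is made up.
outputs = {"chip_well_barcode_output": "204126290052_R01C01"}
inputs = {"sample_alias": "example-sample"}

entity_name = resolve_entity_name("chip_well_barcode_output", outputs, inputs)
fallback_name = resolve_entity_name("sample_alias", outputs, inputs)

# A data table keyed by entity name: a second workflow that resolves to
# the same name replaces the attributes written by the first.
table: dict = {}
table[entity_name] = {"run": 1}
table[entity_name] = {"run": 2}
```

Keying the table by entity name is what makes the second warning bullet follow naturally: two workflows that resolve to the same name occupy the same row, so the later write wins.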

#### `fromOutputs`

`fromOutputs` configures how to create new entities from pipeline outputs