Commit a29159c

[DOC] Add ingest processors documentation (#4299)
Created new documentation to close content gaps

Signed-off-by: Melissa Vagi <vagimeli@amazon.com>
1 parent a100d92 commit a29159c

17 files changed: +1629 -268 lines changed

Lines changed: 100 additions & 0 deletions
@@ -0,0 +1,100 @@
+---
+layout: default
+title: Create pipeline
+parent: Ingest pipelines
+grand_parent: Ingest APIs
+nav_order: 10
+redirect_from:
+  - /opensearch/rest-api/ingest-apis/create-update-ingest/
+---
+
+# Create pipeline
+
+Use the create pipeline API operation to create or update pipelines in OpenSearch. Note that the pipeline requires you to define at least one processor that specifies how to change the documents.
+
+## Path and HTTP method
+
+Replace `<pipeline-id>` with your pipeline ID:
+
+```json
+PUT _ingest/pipeline/<pipeline-id>
+```
+#### Example request
+
+Here is an example in JSON format that creates an ingest pipeline with two `set` processors and an `uppercase` processor. The first `set` processor sets the `grad_year` to `2023`, and the second `set` processor sets `graduated` to `true`. The `uppercase` processor converts the `name` field to uppercase.
+
+```json
+PUT _ingest/pipeline/my-pipeline
+{
+  "description": "This pipeline processes student data",
+  "processors": [
+    {
+      "set": {
+        "description": "Sets the graduation year to 2023",
+        "field": "grad_year",
+        "value": 2023
+      }
+    },
+    {
+      "set": {
+        "description": "Sets graduated to true",
+        "field": "graduated",
+        "value": true
+      }
+    },
+    {
+      "uppercase": {
+        "field": "name"
+      }
+    }
+  ]
+}
+```
+{% include copy-curl.html %}
+
+To learn more about error handling, see [Handling pipeline failures]({{site.url}}{{site.baseurl}}/api-reference/ingest-apis/pipeline-failures/).
+
+## Request body fields
+
+The following table lists the request body fields used to create or update a pipeline.
+
+Parameter | Required | Type | Description
+:--- | :--- | :--- | :---
+`processors` | Required | Array of processor objects | An array of processors, each of which transforms documents. Processors are run sequentially in the order specified.
+`description` | Optional | String | A description of your ingest pipeline.
+
+## Path parameters
+
+Parameter | Required | Type | Description
+:--- | :--- | :--- | :---
+`pipeline-id` | Required | String | The unique identifier, or pipeline ID, assigned to the ingest pipeline.
+
+## Query parameters
+
+Parameter | Required | Type | Description
+:--- | :--- | :--- | :---
+`cluster_manager_timeout` | Optional | Time | Period to wait for a connection to the cluster manager node. Defaults to 30 seconds.
+`timeout` | Optional | Time | Period to wait for a response. Defaults to 30 seconds.
+
+## Template snippets
+
+Some processor parameters support [Mustache](https://mustache.github.io/) template snippets. To get the value of a field, surround the field name in three curly braces, for example, `{% raw %}{{{field-name}}}{% endraw %}`.
+
+#### Example: `set` ingest processor using Mustache template snippet
+
+The following example sets the field `{% raw %}{{{role}}}{% endraw %}` with a value `{% raw %}{{{tenure}}}{% endraw %}`:
+
+```json
+PUT _ingest/pipeline/my-pipeline
+{
+  "processors": [
+    {
+      "set": {
+        "field": "{% raw %}{{{role}}}{% endraw %}",
+        "value": "{% raw %}{{{tenure}}}{% endraw %}"
+      }
+    }
+  ]
+}
+```
+{% include copy-curl.html %}
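
The new page says that processors run sequentially in the order specified and that some parameters accept triple-brace template snippets. The following Python sketch is a local model of those two ideas only. It is not how OpenSearch implements ingest: `render` and `run_pipeline` are hypothetical helpers, the regular expression is a stand-in for a real Mustache engine, and rendering a missing field as an empty string is an assumption.

```python
import re
from typing import Any

# Matches a triple-brace snippet such as {{{role}}} (stand-in for Mustache).
_SNIPPET = re.compile(r"\{\{\{\s*([^{}]+?)\s*\}\}\}")


def render(template: Any, doc: dict) -> Any:
    """Replace each {{{field}}} snippet with the document's value for that field."""
    if not isinstance(template, str):
        return template  # numbers and booleans pass through unchanged
    # Assumption: a field missing from the document renders as an empty string.
    return _SNIPPET.sub(lambda m: str(doc.get(m.group(1), "")), template)


def run_pipeline(pipeline: dict, doc: dict) -> dict:
    """Apply each processor to a copy of the document, in the order listed."""
    processors = pipeline.get("processors") or []
    if not processors:
        raise ValueError("a pipeline needs at least one processor")
    result = dict(doc)
    for processor in processors:
        kind, config = next(iter(processor.items()))
        if kind == "set":
            result[render(config["field"], result)] = render(config["value"], result)
        elif kind == "uppercase":
            result[config["field"]] = str(result[config["field"]]).upper()
        else:
            raise NotImplementedError(f"processor not modeled here: {kind}")
    return result


# The student pipeline from the example request in the diff.
student_pipeline = {
    "description": "This pipeline processes student data",
    "processors": [
        {"set": {"field": "grad_year", "value": 2023}},
        {"set": {"field": "graduated", "value": True}},
        {"uppercase": {"field": "name"}},
    ],
}

# The template pipeline from the Mustache example in the diff.
template_pipeline = {
    "processors": [{"set": {"field": "{{{role}}}", "value": "{{{tenure}}}"}}]
}

print(run_pipeline(student_pipeline, {"name": "john doe"}))
print(run_pipeline(template_pipeline, {"role": "teacher", "tenure": "10 years"}))
```

In this model, the template pipeline creates a new field named after the value of `role`, which is what setting the field `{{{role}}}` to the value `{{{tenure}}}` means. To check real behavior, use the simulate pipeline API that the index page links to.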

_api-reference/ingest-apis/create-update-ingest.md

Lines changed: 0 additions & 79 deletions
This file was deleted.
Lines changed: 13 additions & 30 deletions
@@ -1,44 +1,27 @@
 ---
 layout: default
-title: Delete a pipeline
-parent: Ingest APIs
-nav_order: 14
+title: Delete pipeline
+parent: Ingest pipelines
+grand_parent: Ingest APIs
+nav_order: 13
 redirect_from:
   - /opensearch/rest-api/ingest-apis/delete-ingest/
 ---
 
-# Delete a pipeline
+# Delete pipeline
 
-If you no longer want to use an ingest pipeline, use the delete ingest pipeline API operation.
+Use the following request to delete a pipeline.
 
-## Example
+To delete a specific pipeline, pass the pipeline ID as a parameter:
 
-```
-DELETE _ingest/pipeline/12345
+```json
+DELETE /_ingest/pipeline/<pipeline-id>
 ```
 {% include copy-curl.html %}
 
-## Path and HTTP methods
-
-Delete an ingest pipeline based on that pipeline's ID.
-
-```
-DELETE _ingest/pipeline/
-```
-
-## URL parameters
-
-All URL parameters are optional.
-
-Parameter | Type | Description
-:--- | :--- | :---
-master_timeout | time | How long to wait for a connection to the master node.
-timeout | time | How long to wait for the request to return.
-
-## Response
+To delete all pipelines in a cluster, use the wildcard character (`*`):
 
 ```json
-{
-  "acknowledged" : true
-}
-```
+DELETE /_ingest/pipeline/*
+```
+{% include copy-curl.html %}
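
The revised page gives two forms of the delete request: one with a pipeline ID and one with the `*` wildcard. As a minimal sketch, both can be built from Python with the standard library alone. The base URL `http://localhost:9200` and the helper name `delete_pipeline_request` are assumptions for illustration; the request is only constructed here, not sent.

```python
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "http://localhost:9200"  # assumption: a local, unsecured test cluster


def delete_pipeline_request(pipeline_id: str, base_url: str = BASE_URL) -> Request:
    """Build (but do not send) a DELETE request for one pipeline ID or for '*'."""
    if not pipeline_id:
        raise ValueError("pipeline_id must not be empty")
    # Keep '*' unescaped so the wildcard form from the page is preserved.
    path = "/_ingest/pipeline/" + quote(pipeline_id, safe="*")
    return Request(base_url + path, method="DELETE")


single = delete_pipeline_request("my-pipeline")
everything = delete_pipeline_request("*")
print(single.get_method(), single.full_url)
print(everything.get_method(), everything.full_url)
```

Passing either object to `urllib.request.urlopen` would send it. Because the wildcard form deletes every pipeline in the cluster, building the request separately from sending it leaves room for a confirmation step.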
Lines changed: 37 additions & 34 deletions
@@ -1,59 +1,62 @@
 ---
 layout: default
-title: Get ingest pipeline
-parent: Ingest APIs
-nav_order: 10
+title: Get pipeline
+parent: Ingest pipelines
+grand_parent: Ingest APIs
+nav_order: 12
 redirect_from:
   - /opensearch/rest-api/ingest-apis/get-ingest/
 ---
 
-## Get ingest pipeline
+# Get pipeline
 
-After you create a pipeline, use the get ingest pipeline API operation to return all the information about a specific ingest pipeline.
+Use the get ingest pipeline API operation to retrieve all the information about the pipeline.
 
-## Example
+## Retrieving information about all pipelines
 
-```
-GET _ingest/pipeline/12345
+The following example request returns information about all ingest pipelines:
+
+```json
+GET _ingest/pipeline/
 ```
 {% include copy-curl.html %}
 
-## Path and HTTP methods
+## Retrieving information about a specific pipeline
 
-Return all ingest pipelines.
+The following example request returns information about a specific pipeline, which for this example is `my-pipeline`:
 
+```json
+GET _ingest/pipeline/my-pipeline
 ```
-GET _ingest/pipeline
-```
-
-Returns a single ingest pipeline based on the pipeline's ID.
-
-```
-GET _ingest/pipeline/{id}
-```
-
-## URL parameters
-
-All parameters are optional.
-
-Parameter | Type | Description
-:--- | :--- | :---
-master_timeout | time | How long to wait for a connection to the master node.
+{% include copy-curl.html %}
 
-## Response
+The response contains the pipeline information:
 
 ```json
 {
-  "pipeline-id" : {
-    "description" : "A description for your pipeline",
-    "processors" : [
+  "my-pipeline": {
+    "description": "This pipeline processes student data",
+    "processors": [
       {
-        "set" : {
-          "field" : "field-name",
-          "value" : "value"
+        "set": {
+          "description": "Sets the graduation year to 2023",
+          "field": "grad_year",
+          "value": 2023
+        }
+      },
+      {
+        "set": {
+          "description": "Sets graduated to true",
+          "field": "graduated",
+          "value": true
+        }
+      },
+      {
+        "uppercase": {
+          "field": "name"
         }
       }
     ]
   }
 }
-```
+```
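
The example response is a JSON object keyed by pipeline ID, and each value holds a `description` and an ordered `processors` array. A minimal Python sketch of reading that shape follows. The helper name `processor_types` is hypothetical, and the response text is copied from the example in the diff rather than fetched from a cluster.

```python
import json

# Response body copied from the "Get pipeline" example in the diff.
RESPONSE = """
{
  "my-pipeline": {
    "description": "This pipeline processes student data",
    "processors": [
      {"set": {"description": "Sets the graduation year to 2023",
               "field": "grad_year", "value": 2023}},
      {"set": {"description": "Sets graduated to true",
               "field": "graduated", "value": true}},
      {"uppercase": {"field": "name"}}
    ]
  }
}
"""


def processor_types(response_text: str) -> dict[str, list[str]]:
    """Map each pipeline ID to the processor types it runs, in order."""
    pipelines = json.loads(response_text)
    return {
        pipeline_id: [next(iter(processor)) for processor in body.get("processors", [])]
        for pipeline_id, body in pipelines.items()
    }


print(processor_types(RESPONSE))
```

Because the top-level keys are pipeline IDs, the same function should also work on the response for all pipelines, assuming that response uses the same shape with one key per pipeline.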

_api-reference/ingest-apis/index.md

Lines changed: 9 additions & 2 deletions
@@ -9,6 +9,13 @@ redirect_from:
 
 # Ingest APIs
 
-Before you index your data, OpenSearch's ingest APIs help transform your data by creating and managing ingest pipelines. Pipelines consist of **processors**, customizable tasks that run in the order they appear in the request body. The transformed data appears in your index after each of the processor completes.
+Ingest APIs are a valuable tool for loading data into a system. Ingest APIs work together with [ingest pipelines]({{site.url}}{{site.baseurl}}/api-reference/ingest-apis/ingest-pipelines/) and [ingest processors]({{site.url}}{{site.baseurl}}/api-reference/ingest-apis/ingest-processors/) to process or transform data from a variety of sources and in a variety of formats.
 
-Ingest pipelines in OpenSearch can only be managed using ingest API operations. When using ingest in production environments, your cluster should contain at least one node with the node roles permission set to `ingest`. For more information on setting up node roles within a cluster, see [Cluster Formation]({{site.url}}{{site.baseurl}}/opensearch/cluster/).
+## Ingest pipeline APIs
+
+Simplify, secure, and scale your OpenSearch data ingestion with the following APIs:
+
+- [Create pipeline]({{site.url}}{{site.baseurl}}/api-reference/ingest-apis/create-ingest/): Use this API to create or update a pipeline configuration.
+- [Get pipeline]({{site.url}}{{site.baseurl}}/api-reference/ingest-apis/get-ingest/): Use this API to retrieve a pipeline configuration.
+- [Simulate pipeline]({{site.url}}{{site.baseurl}}/api-reference/ingest-apis/simulate-ingest/): Use this API to test a pipeline configuration.
+- [Delete pipeline]({{site.url}}{{site.baseurl}}/api-reference/ingest-apis/delete-ingest/): Use this API to delete a pipeline configuration.
