Commit 86d1814

[DOCS] Clarify ingest attachment example (#65143) (#65159)
1 parent 652f1de commit 86d1814

1 file changed: +38 -22 lines changed

docs/plugins/ingest-attachment.asciidoc

Lines changed: 38 additions & 22 deletions
@@ -30,10 +30,27 @@ include::install_remove.asciidoc[]
 | `ignore_missing` | no | `false` | If `true` and `field` does not exist, the processor quietly exits without modifying the document
 |======
 
-For example, this:
+[discrete]
+[[ingest-attachment-json-ex]]
+==== Example
+
+If attaching files to JSON documents, you must first encode the file as a base64
+string. On Unix-like systems, you can do this using a `base64` command:
+
+[source,shell]
+----
+base64 -in myfile.rtf
+----
+
+The command returns the base64-encoded string for the file. The following base64
+string is for an `.rtf` file containing the text `Lorem ipsum dolor sit amet`:
+`e1xydGYxXGFuc2kNCkxvcmVtIGlwc3VtIGRvbG9yIHNpdCBhbWV0DQpccGFyIH0=`.
+
+Use an attachment processor to decode the string and extract the file's
+properties:
 
 [source,console]
---------------------------------------------------
+----
 PUT _ingest/pipeline/attachment
 {
 "description" : "Extract attachment information",
@@ -45,20 +62,20 @@ PUT _ingest/pipeline/attachment
 }
 ]
 }
-PUT my-index-00001/_doc/my_id?pipeline=attachment
+PUT my-index-000001/_doc/my_id?pipeline=attachment
 {
 "data": "e1xydGYxXGFuc2kNCkxvcmVtIGlwc3VtIGRvbG9yIHNpdCBhbWV0DQpccGFyIH0="
 }
-GET my-index-00001/_doc/my_id
---------------------------------------------------
+GET my-index-000001/_doc/my_id
+----
 
-Returns this:
+The document's `attachment` object contains extracted properties for the file:
 
 [source,console-result]
---------------------------------------------------
+----
 {
 "found": true,
-"_index": "my-index-00001",
+"_index": "my-index-000001",
 "_type": "_doc",
 "_id": "my_id",
 "_version": 1,
@@ -74,14 +91,13 @@ Returns this:
 }
 }
 }
---------------------------------------------------
+----
 // TESTRESPONSE[s/"_seq_no": \d+/"_seq_no" : $body._seq_no/ s/"_primary_term" : 1/"_primary_term" : $body._primary_term/]
 
-
-To specify only some fields to be extracted:
+To extract only certain `attachment` fields, specify the `properties` array:
 
 [source,console]
---------------------------------------------------
+----
 PUT _ingest/pipeline/attachment
 {
 "description" : "Extract attachment information",
@@ -94,7 +110,7 @@ PUT _ingest/pipeline/attachment
 }
 ]
 }
---------------------------------------------------
+----
 
 NOTE: Extracting contents from binary data is a resource intensive operation and
 consumes a lot of resources. It is highly recommended to run pipelines
@@ -175,11 +191,11 @@ PUT _ingest/pipeline/attachment
 }
 ]
 }
-PUT my-index-00001/_doc/my_id?pipeline=attachment
+PUT my-index-000001/_doc/my_id?pipeline=attachment
 {
 "data": "e1xydGYxXGFuc2kNCkxvcmVtIGlwc3VtIGRvbG9yIHNpdCBhbWV0DQpccGFyIH0="
 }
-GET my-index-00001/_doc/my_id
+GET my-index-000001/_doc/my_id
 --------------------------------------------------
 
 Returns this:
@@ -188,7 +204,7 @@ Returns this:
 --------------------------------------------------
 {
 "found": true,
-"_index": "my-index-00001",
+"_index": "my-index-000001",
 "_type": "_doc",
 "_id": "my_id",
 "_version": 1,
@@ -223,12 +239,12 @@ PUT _ingest/pipeline/attachment
 }
 ]
 }
-PUT my-index-00001/_doc/my_id_2?pipeline=attachment
+PUT my-index-000001/_doc/my_id_2?pipeline=attachment
 {
 "data": "e1xydGYxXGFuc2kNCkxvcmVtIGlwc3VtIGRvbG9yIHNpdCBhbWV0DQpccGFyIH0=",
 "max_size": 5
 }
-GET my-index-00001/_doc/my_id_2
+GET my-index-000001/_doc/my_id_2
 --------------------------------------------------
 
 Returns this:
@@ -237,7 +253,7 @@ Returns this:
 --------------------------------------------------
 {
 "found": true,
-"_index": "my-index-00001",
+"_index": "my-index-000001",
 "_type": "_doc",
 "_id": "my_id_2",
 "_version": 1,
@@ -309,7 +325,7 @@ PUT _ingest/pipeline/attachment
 }
 ]
 }
-PUT my-index-00001/_doc/my_id?pipeline=attachment
+PUT my-index-000001/_doc/my_id?pipeline=attachment
 {
 "attachments" : [
 {
@@ -322,15 +338,15 @@ PUT my-index-00001/_doc/my_id?pipeline=attachment
 }
 ]
 }
-GET my-index-00001/_doc/my_id
+GET my-index-000001/_doc/my_id
 --------------------------------------------------
 
 Returns this:
 
 [source,console-result]
 --------------------------------------------------
 {
-"_index" : "my-index-00001",
+"_index" : "my-index-000001",
 "_type" : "_doc",
 "_id" : "my_id",
 "_version" : 1,

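The base64 step that the new Example section describes can also be sketched outside the shell. The following Python snippet is an illustration only and is not part of the commit: it uses the standard `base64` module, and the names `DOC_B64`, `encode_bytes`, and `encode_file` are hypothetical helpers introduced here. It decodes the string quoted in the diff to show that it holds a short RTF document containing `Lorem ipsum dolor sit amet`, and shows how a file's bytes could be turned into a value for the `data` field.

```python
import base64

# Base64 string quoted in the diff above (value of the "data" field).
DOC_B64 = "e1xydGYxXGFuc2kNCkxvcmVtIGlwc3VtIGRvbG9yIHNpdCBhbWV0DQpccGFyIH0="


def encode_bytes(raw: bytes) -> str:
    """Return the standard base64 encoding of `raw` as an ASCII string."""
    return base64.b64encode(raw).decode("ascii")


def encode_file(path: str) -> str:
    """Read a file in binary mode and return its base64 encoding.

    Hypothetical helper: the result is the kind of string the docs
    place in the document's "data" field.
    """
    with open(path, "rb") as fh:
        return encode_bytes(fh.read())


# Decode the documented string back into the raw RTF bytes.
decoded = base64.b64decode(DOC_B64)

# Re-encoding the decoded bytes reproduces the documented string exactly.
round_trip = encode_bytes(decoded)

print(decoded.decode("ascii"))
print(round_trip == DOC_B64)
```

Calling `encode_file("myfile.rtf")` on a file with those same bytes would yield the string used in the `PUT my-index-000001/_doc/my_id?pipeline=attachment` requests; the file name is taken from the shell example and is otherwise an assumption.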