Client helpers (#1107)
* Added client helpers

* Updated test

* The search helper should return only the documents

* Added code comments

* Fixed bug

* Updated test

* Removed bulkSize and added flushBytes

* Updated test

* Added concurrency

* Updated test

* Added support for 429 handling in the scroll search helper

* Updated test

* Updated stats count

* Updated test

* Fix test

* Use client maxRetries as default

* Updated type definitions

* Refactored bulk helper to be more consistent with the client API

* Updated test

* Improved error handling, added refreshOnCompletion option and forwarded additional options to the bulk API

* Updated type definitions

* Updated test

* Fixed test on Node v8

* Updated test

* Added TODO

* Updated docs

* Added Node v8 note

* Updated scripts

* Removed useless files

* Added helpers to integration test

* Fix cli argument position

* Moar fixes

* Test run elasticsearch in github actions

* Use master action version

* Add vm.max_map_count step

* Test new action setup

* Added Configure sysctl limits step

* Updated action to latest version

* Don't run helpers integration test in jenkins

* Run helpers integration test also with Node v10

* Updated docs

* Updated docs

* Updated helpers type definitions

* Added test for helpers type definitions

* Added license header
delvedor authored Mar 23, 2020
1 parent 6c82a49 commit d7836a1
Showing 24 changed files with 7,654 additions and 11 deletions.
4 changes: 2 additions & 2 deletions .ci/run-repository.sh
@@ -32,7 +32,7 @@ echo -e "\033[1m>>>>> NPM run test:integration >>>>>>>>>>>>>>>>>>>>>>>>>>>>>\033
repo=$(realpath $(dirname $(realpath -s $0))/../)
run_script_args=""
if [[ "$NODE_JS_VERSION" == "8" ]]; then
-  run_script_args="-- --node-arg=--harmony-async-iteration"
+  run_script_args="--harmony-async-iteration"
fi

docker run \
@@ -43,4 +43,4 @@ docker run \
  --name elasticsearch-js \
  --rm \
  elastic/elasticsearch-js \
-  npm run test:integration ${run_script_args}
+  node ${run_script_args} test/integration/index.js
36 changes: 36 additions & 0 deletions .github/workflows/nodejs.yml
@@ -65,6 +65,42 @@ jobs:
        run: |
          npm run test:unit -- --node-arg=--harmony-async-iteration

  helpers-integration-test:
    name: Helpers integration test
    runs-on: ubuntu-latest

    strategy:
      matrix:
        node-version: [10.x, 12.x]

    steps:
      - uses: actions/checkout@v2

      - name: Configure sysctl limits
        run: |
          sudo swapoff -a
          sudo sysctl -w vm.swappiness=1
          sudo sysctl -w fs.file-max=262144
          sudo sysctl -w vm.max_map_count=262144

      - name: Runs Elasticsearch
        uses: elastic/elastic-github-actions/elasticsearch@master
        with:
          stack-version: 8.0.0-SNAPSHOT

      - name: Use Node.js ${{ matrix.node-version }}
        uses: actions/setup-node@v1
        with:
          node-version: ${{ matrix.node-version }}

      - name: Install
        run: |
          npm install

      - name: Integration test
        run: |
          npm run test:integration:helpers

  code-coverage:
    name: Code coverage
    runs-on: ubuntu-latest
1 change: 1 addition & 0 deletions README.md
@@ -60,6 +60,7 @@ We recommend that you write a lightweight proxy that uses this client instead.
- [Observability](https://www.elastic.co/guide/en/elasticsearch/client/javascript-api/current/observability.html)
- [Creating a child client](https://www.elastic.co/guide/en/elasticsearch/client/javascript-api/current/child-client.html)
- [Extend the client](https://www.elastic.co/guide/en/elasticsearch/client/javascript-api/current/extend-client.html)
- [Client helpers](https://www.elastic.co/guide/en/elasticsearch/client/javascript-api/current/client-helpers.html)
- [Typescript support](https://www.elastic.co/guide/en/elasticsearch/client/javascript-api/current/typescript.html)
- [Examples](https://www.elastic.co/guide/en/elasticsearch/client/javascript-api/current/examples.html)

2 changes: 2 additions & 0 deletions docs/examples/bulk.asciidoc
@@ -4,6 +4,8 @@
The `bulk` API makes it possible to perform many index/delete operations in a
single API call. This can greatly increase the indexing speed.

NOTE: Did you know that we provide a helper for sending bulk requests? You can find it {jsclient}/client-helpers.html[here].

[source,js]
----
'use strict'
2 changes: 2 additions & 0 deletions docs/examples/scroll.asciidoc
@@ -19,6 +19,8 @@ In order to use scrolling, the initial search request should specify the scroll
parameter in the query string, which tells {es} how long it should keep the
“search context” alive.

NOTE: Did you know that we provide a helper for sending scroll requests? You can find it {jsclient}/client-helpers.html[here].

[source,js]
----
'use strict'
272 changes: 272 additions & 0 deletions docs/helpers.asciidoc
@@ -0,0 +1,272 @@
[[client-helpers]]
== Client Helpers

The client comes with a handy collection of helpers to give you a more comfortable experience with some APIs.

CAUTION: The client helpers are experimental, and the API may change in the next minor releases.
If you are using the client with Node.js v8, you should run your code with the `--harmony-async-iteration` argument. +
e.g.: `node --harmony-async-iteration index.js`

=== Bulk Helper
Running Bulk requests can be complex due to the shape of the API; this helper aims to provide a nicer developer experience around the Bulk API.

==== Usage
[source,js]
----
const { createReadStream } = require('fs')
const split = require('split2')
const { Client } = require('@elastic/elasticsearch')
const client = new Client({ node: 'http://localhost:9200' })
const result = await client.helpers.bulk({
  datasource: createReadStream('./dataset.ndjson').pipe(split()),
  onDocument (doc) {
    return {
      index: { _index: 'my-index' }
    }
  }
})
console.log(result)
// {
//   total: number,
//   failed: number,
//   retry: number,
//   successful: number,
//   time: number,
//   bytes: number,
//   aborted: boolean
// }
----
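
The helper resolves with the stats object outlined in the comment above. As a minimal sketch (reusing the `result` from the example above), you can inspect it to spot documents that could not be indexed:

[source,js]
----
if (result.failed > 0) {
  // `failed` reports the documents that could not be indexed
  // (see the `onDrop` and `retries` options below)
  console.error(`${result.failed} of ${result.total} documents were dropped`)
}
----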

To create a new instance of the Bulk helper, access it as shown in the example above. The configuration options are:
[cols=2*]
|===
|`datasource`
a|An array or a readable stream with the data you need to index/create/update/delete.
It can be an array of strings or objects, but also a stream of JSON strings or JavaScript objects. +
If it is a stream, we recommend using the https://www.npmjs.com/package/split2[`split2`] package, which splits the stream on new-line delimiters. +
This parameter is mandatory.
[source,js]
----
const { createReadStream } = require('fs')
const split = require('split2')
const b = client.helpers.bulk({
  // if you just use split(), the data will be used as an array of strings
  datasource: createReadStream('./dataset.ndjson').pipe(split())
  // if you need to manipulate the data, you can pass JSON.parse to split
  datasource: createReadStream('./dataset.ndjson').pipe(split(JSON.parse))
})
----

|`onDocument`
a|A function that will be called for each document of the datasource. Inside this function you can manipulate the document, and you must return the operation you want to execute with the document. Look at the link:{ref}/docs-bulk.html[Bulk API documentation] to see the supported operations (a slightly larger sketch showing document manipulation follows after this table). +
This parameter is mandatory.
[source,js]
----
const b = client.helpers.bulk({
  onDocument (doc) {
    return {
      index: { _index: 'my-index' }
    }
  }
})
----

|`onDrop`
a|A function that will be called every time a document can't be indexed after reaching the maximum number of retries.
[source,js]
----
const b = client.helpers.bulk({
  onDrop (doc) {
    console.log(doc)
  }
})
----

|`flushBytes`
a|The size of the bulk body, in bytes, to reach before sending it. Defaults to 5MB. +
_Default:_ `5000000`
[source,js]
----
const b = client.helpers.bulk({
  flushBytes: 1000000
})
----

|`concurrency`
a|How many requests will be executed at the same time. +
_Default:_ `5`
[source,js]
----
const b = client.helpers.bulk({
  concurrency: 10
})
----

|`retries`
a|How many times a document will be retried before calling the `onDrop` callback. +
_Default:_ Client max retries.
[source,js]
----
const b = client.helpers.bulk({
  retries: 3
})
----

|`wait`
a|How much time to wait before retrying, in milliseconds. +
_Default:_ 5000.
[source,js]
----
const b = client.helpers.bulk({
  wait: 3000
})
----

|`refreshOnCompletion`
a|If `true`, at the end of the bulk operation it will run a refresh on all indices or on the specified indices. +
_Default:_ false.
[source,js]
----
const b = client.helpers.bulk({
  refreshOnCompletion: true
  // or
  refreshOnCompletion: 'index-name'
})
----

|===
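
As mentioned in the `onDocument` option above, you can manipulate each document before returning its operation. The following is a minimal sketch, assuming an NDJSON datasource whose documents carry a `type` field; the `ingestedAt` enrichment and the per-document index routing are hypothetical and only illustrate the pattern:

[source,js]
----
const { createReadStream } = require('fs')
const split = require('split2')
const { Client } = require('@elastic/elasticsearch')
const client = new Client({ node: 'http://localhost:9200' })

const b = client.helpers.bulk({
  datasource: createReadStream('./dataset.ndjson').pipe(split(JSON.parse)),
  onDocument (doc) {
    // hypothetical enrichment: add an ingestion timestamp to every document
    doc.ingestedAt = new Date().toISOString()
    // hypothetical routing: choose the target index based on a field of the document
    return {
      index: { _index: doc.type === 'question' ? 'questions' : 'answers' }
    }
  },
  onDrop (doc) {
    console.error('Dropped document:', doc)
  }
})
----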

==== Abort a bulk operation
If needed, you can abort a bulk operation at any time. The bulk helper returns a https://promisesaplus.com/[thenable], which has an `abort` method.

NOTE: The abort method will stop the execution of the bulk operation, but if you are using a concurrency higher than one, the operations that are already running will not be stopped.

[source,js]
----
const { createReadStream } = require('fs')
const split = require('split2')
const { Client } = require('@elastic/elasticsearch')
const client = new Client({ node: 'http://localhost:9200' })
const b = client.helpers.bulk({
  datasource: createReadStream('./dataset.ndjson').pipe(split()),
  onDocument (doc) {
    return {
      index: { _index: 'my-index' }
    }
  },
  onDrop (doc) {
    b.abort()
  }
})
console.log(await b)
----

==== Passing custom options to the Bulk API
You can pass any option supported by the link:{ref}/docs-bulk.html#docs-bulk-api-query-params[Bulk API] to the helper, and the helper will use those options in conjunction with the Bulk API call.

[source,js]
----
const result = await client.helpers.bulk({
  datasource: [...],
  onDocument (doc) {
    return {
      index: { _index: 'my-index' }
    }
  },
  pipeline: 'my-pipeline'
})
----

=== Search Helper
A simple wrapper around the search API. Instead of returning the entire `result` object, it returns only the documents of the search result.

[source,js]
----
const documents = await client.helpers.search({
  index: 'stackoverflow',
  body: {
    query: {
      match: {
        title: 'javascript'
      }
    }
  }
})
for (const doc of documents) {
  console.log(doc)
}
----

=== Scroll Search Helper
This helper offers a simple and intuitive way to use the scroll search API. Once called, it returns an https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/for-await...of[async iterator], which can be used in conjunction with a `for await...of` loop. +
It automatically handles the `429` error and uses the client's `maxRetries` option.

[source,js]
----
const scrollSearch = await client.helpers.scrollSearch({
  index: 'stackoverflow',
  body: {
    query: {
      match: {
        title: 'javascript'
      }
    }
  }
})
for await (const result of scrollSearch) {
  console.log(result)
}
----
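
Since the `429` handling relies on the client's `maxRetries` option, you can tune how persistent the helper is by configuring that option on the client itself. A minimal sketch, assuming a local cluster:

[source,js]
----
const { Client } = require('@elastic/elasticsearch')

// maxRetries is a regular client option; the scroll search helper
// reuses it when retrying requests that failed with a 429
const client = new Client({
  node: 'http://localhost:9200',
  maxRetries: 10
})
----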

==== Clear a scroll search

If needed, you can clear a scroll search by calling `result.clear()`:

[source,js]
----
for await (const result of scrollSearch) {
  if (condition) {
    await result.clear()
  }
}
----

==== Quickly getting the documents

If you only need the documents from the result of a scroll search, you can access them via `result.documents`:

[source,js]
----
for await (const result of scrollSearch) {
  console.log(result.documents)
}
----

=== Scroll Documents Helper

It works in the same way as the scroll search helper, but it returns only the documents instead. Note that every loop cycle returns a single document, and you can't use the `clear` method.

[source,js]
----
const scrollSearch = await client.helpers.scrollDocuments({
  index: 'stackoverflow',
  body: {
    query: {
      match: {
        title: 'javascript'
      }
    }
  }
})
for await (const doc of scrollSearch) {
  console.log(doc)
}
----
1 change: 1 addition & 0 deletions docs/index.asciidoc
@@ -12,5 +12,6 @@ include::authentication.asciidoc[]
include::observability.asciidoc[]
include::child.asciidoc[]
include::extend.asciidoc[]
include::helpers.asciidoc[]
include::typescript.asciidoc[]
include::examples/index.asciidoc[]
2 changes: 2 additions & 0 deletions index.d.ts
@@ -30,6 +30,7 @@ import {
  ApiKeyAuth
} from './lib/pool';
import Serializer from './lib/Serializer';
import Helpers from './lib/Helpers';
import * as RequestParams from './api/requestParams';
import * as errors from './lib/errors';

@@ -107,6 +108,7 @@ declare class Client extends EventEmitter {
  transport: Transport;
  serializer: Serializer;
  extend: ClientExtends;
  helpers: Helpers;
  child(opts?: ClientOptions): Client;
  close(callback?: Function): Promise<void> | void;
  /* GENERATED */
3 changes: 3 additions & 0 deletions index.js
@@ -10,6 +10,7 @@ const debug = require('debug')('elasticsearch')
const Transport = require('./lib/Transport')
const Connection = require('./lib/Connection')
const { ConnectionPool, CloudConnectionPool } = require('./lib/pool')
const Helpers = require('./lib/Helpers')
const Serializer = require('./lib/Serializer')
const errors = require('./lib/errors')
const { ConfigurationError } = errors
@@ -126,6 +127,8 @@ class Client extends EventEmitter {
      opaqueIdPrefix: options.opaqueIdPrefix
    })

    this.helpers = new Helpers({ client: this, maxRetries: options.maxRetries })

    const apis = buildApi({
      makeRequest: this.transport.request.bind(this.transport),
      result: { body: null, statusCode: null, headers: null, warnings: null },