Commit c7e22b2

Add new Elasticsearch API library
1 parent e5d862a commit c7e22b2

3 files changed

+335
-2
lines changed

.gitignore

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
1+
2+
# vim undo files
3+
*un~

README.md

Lines changed: 129 additions & 2 deletions
@@ -1,2 +1,129 @@
1-
# kdb-elasticsearch
2-
kdb interface to Elasticsearch web API
1+
# Elasticsearch API for kdb
2+
3+
This repository provides a kdb library to access Elasticsearch via its Web API. It provides the following features:
4+
5+
* Document retrieval and search
6+
* Single and bulk document uploading
7+
8+
This library has been written for use with the [kdb-common](https://github.com/BuaBook/kdb-common) set of libraries.
9+
10+
## Usage
11+
12+
Once the library is loaded, configure the Elasticsearch Web API URL with `.es.setTargetInstance[esInstance]` before calling any other library function. Otherwise, the call fails with a `NoElasticsearchUrlException`.
13+
14+
### `.es.setTargetInstance[esInstance]`
15+
16+
Configures which Elasticsearch instance this library talks to. The URL must be a symbol that starts with `:http` or `:https`, as in the example below.
17+
18+
Example:
19+
20+
```
21+
q).es.setTargetInstance `:http://192.168.1.78:9200
22+
2019.07.07 12:24:15.832 INFO pid-229 jas 0 Elasticsearch instance set [ URL: :http://192.168.1.78:9200 ]
23+
```
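
As a rough illustration of the validation involved, and going by the implementation in `src/es.q` from this commit, a trailing `/` is stripped before the URL is stored, and a symbol without the `:http` prefix is rejected. The session below is a sketch that assumes the library and its kdb-common dependencies are already loaded. Log output is omitted, and `localhost:9200` is a placeholder host.

```q
/ A trailing slash is removed before the URL is stored in .es.target
q).es.setTargetInstance `$":http://localhost:9200/"
q).es.target
`:http://localhost:9200

/ A symbol that does not start with ":http" is rejected; trap the signal to inspect it
q)@[.es.setTargetInstance; `$"ftp://localhost:9200"; {x}]
"InvalidElasticSearchInstanceException"
```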
24+
25+
### `.es.getAllIndices[]`
26+
27+
Returns a table describing every index available in the current Elasticsearch instance.
28+
29+
Example:
30+
31+
```
32+
q).es.getAllIndices[]
33+
2019.07.07 12:26:25.279 INFO pid-322 jas 0 Querying for all available indices from Elasticsearch
34+
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
35+
--------------------------------------------------------------------------------------------------------------------------------
36+
"yellow" "open" "kdb-test-index-2019.07.07" "u0gG5DpCRVu56eBvPSMzOA" ,"1" ,"1" ,"2" ,"0" "3.9kb" "3.9kb"
37+
"yellow" "open" "kdb-test-index-2019.07.05" "2UBqNcBkQaSmGQR4PYJZEQ" ,"1" ,"1" ,"1" ,"0" "3.3kb" "3.3kb"
38+
```
39+
40+
### `.es.search[index; searchQuery]`
41+
42+
Allows a search to be performed on the specified index.
43+
44+
NOTE: This is a URI search only, so not all search options are available through this function. See the [URI search](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-uri-request.html) section of the Elasticsearch documentation for more information.
45+
46+
Example:
47+
48+
```
49+
/ Insert some data to a new index
50+
q) .es.http.post[`$"kdb-test-index2-2019.07.07"; `; `time`sym`price!(.z.p; `VOD.L; 1000f)];
51+
q) .es.http.post[`$"kdb-test-index2-2019.07.07"; `; `time`sym`price!(.z.p; `VOD.L; 2000f)];
52+
53+
/ Search for entries where price = 2000
54+
q) .es.search[`$"kdb-test-index2-2019.07.07"; "price:2000"]
55+
took | 0f
56+
timed_out| 0b
57+
_shards | `total`successful`skipped`failed!1 1 0 0f
58+
hits | `total`max_score`hits!(`value`relation!(1f;"eq");1f;+`_index`_type`_id`_score`_source!(,"kdb-test-index2-2019.07.07";,"doc";,"UhBszGsBBs67mAIUf-jR";,1f;,`time`sym`price!("2019-07-07T12:3..
59+
```
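
Since `.es.search` appends the query text directly to the request URL (see `src/es.q`), queries containing spaces or other reserved characters probably need to be URI-escaped by the caller. A minimal sketch, assuming the standard `.h.hu` URI-escape helper is suitable for this and that the response has the shape shown above. The compound query is only illustrative.

```q
/ Escape the query before passing it in (assumption: the library does not escape it itself)
q)res:.es.search[`$"kdb-test-index2-2019.07.07"; .h.hu "price:2000 AND sym:VOD.L"]

/ With at least one hit, the matching documents are in the _source column of the nested hits table
q)docs:res[`hits;`hits;`_source]
```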
60+
61+
### `.es.getDocument[index; id]`
62+
63+
Retrieves a specific document from a specific index.
64+
65+
Example:
66+
67+
```
68+
/ One of the documents added in the .es.search example above
69+
q) .es.getDocument[`$"kdb-test-index2-2019.07.07"; `URBszGsBBs67mAIUYeg1]
70+
_index | "kdb-test-index2-2019.07.07"
71+
_type | "doc"
72+
_id | "URBszGsBBs67mAIUYeg1"
73+
_version | 1f
74+
_seq_no | 0f
75+
_primary_term| 1f
76+
found | 1b
77+
_source | `time`sym`price!("2019-07-07T12:33:05.180098000";"VOD.L";1000f)
78+
79+
```
80+
81+
### `.es.http.get[relativeUrl]`
82+
83+
Queries any Elasticsearch URL and returns the result. The library expects the response to be JSON, which it converts into native kdb+ types.
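
For illustration, `.es.i.buildUrl` in `src/es.q` accepts the relative URL either as a string or as a symbol list, and both forms can be passed to `.es.http.get`. The endpoints used here (`_cluster/health`, `_count`) are standard Elasticsearch APIs chosen as examples, not something this library defines, and the calls assume a target instance has already been set.

```q
/ String form: a leading "/" is added if missing
q).es.http.get "_cluster/health"

/ Symbol list form: the elements are joined with "/" after the target URL
q).es.http.get `$("kdb-test-index2-2019.07.07";"_count")
```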
84+
85+
### `.es.http.post[index; id; content]`
86+
87+
Adds a new document to an index.
88+
89+
* `index`: The name of the index to add the document to. If the name has no date component, today's date (in yyyy.mm.dd format) is appended automatically
90+
* `id`: The ID of the entry to add. Generally this should be left as a null symbol to allow Elasticsearch to generate its own
91+
* `content`: The content to be uploaded. It must be either a dictionary or a JSON string
92+
93+
Example:
94+
95+
```
96+
q) .es.http.post[`$"kdb-test-index2"; `; `time`sym`price!(.z.p; `VOD.L; 2000f)]
97+
98+
2019.07.07 12:33:13.060 INFO pid-382 jas 0 No date found in index name. Automatically adding today's date [ Index: kdb-test-index2 ]
99+
100+
_index | "kdb-test-index2-2019.07.07"
101+
_type | "doc"
102+
_id | "UhBszGsBBs67mAIUf-jR"
103+
_version | 1f
104+
result | "created"
105+
_shards | `total`successful`failed!2 1 0f
106+
_seq_no | 1f
107+
_primary_term| 1f
108+
```
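
The date detection mentioned for `index` is, in `src/es.q`, a pattern match against `*????.??.??*` rather than a real date parse. The sketch below runs that same pattern on a few symbols to show the consequence: any name containing that dotted shape is treated as already dated. The last symbol is a made-up name for illustration.

```q
q)pattern:"*????.??.??*"

/ A real date component matches, so the index is left unchanged
q)(`$"kdb-test-index2-2019.07.07") like pattern
1b

/ No date component, so today's date would be appended
q)(`$"kdb-test-index2") like pattern
0b

/ Not a date, but it has the same shape, so no date would be appended
q)`metrics.v1.eu like pattern
1b
```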
109+
110+
### `.es.http.postBulk[index; contentTable]`
111+
112+
Allows multiple documents to be uploaded to an index in one operation via the [Bulk API](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html).
113+
114+
* `index`: The name of the index to add the documents to. If the name has no date component, today's date (in yyyy.mm.dd format) is appended automatically
115+
* `contentTable`: The data to upload where each row of the table is a single document
116+
117+
If you want to specify custom IDs for each document to be added, ensure that the table provided has an `id` column.
118+
119+
Example:
120+
121+
```
122+
q) tbl:flip `col1`col2`col3!2?/:(`2; 100f; 1b)
123+
124+
q) .es.http.postBulk[`$"kdb-test-index4"; tbl]
125+
2019.07.07 12:46:53.208 INFO pid-411 jas 0 No date found in index name. Automatically adding today's date [ Index: kdb-test-index4 ]
126+
took | 3f
127+
errors| 0b
128+
items | +(,`index)!,(`_index`_type`_id`_version`result`_shards`_seq_no`_primary_term`status!("kdb-test-index4-2019.07.07";"doc";"VhB5zGsBBs67mAIUA-h7";1f;"created";`total`successful`failed!2 1 0f;2..
129+
```
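
To make the Bulk API payload more concrete: going by `src/es.q`, each row is converted to JSON with `.j.j` and preceded by an `index` action line, with the pieces joined by newlines as the NDJSON format requires. The sketch below rebuilds that body outside the library so it can be inspected. `sep`, `tbl` and `body` are illustrative names, not part of the library.

```q
/ Action line placed before every document, mirroring .es.cfg.bulkRowSeparator
q)sep:"\n",.j.j[enlist[`index]!enlist ()!()],"\n"

/ Two example rows; each row becomes one JSON document
q)tbl:([] sym:`VOD.L`BARC.L; price:1000 2000f)

/ Same construction as .es.http.postBulk: action line, document, action line, document, ...
q)body:sep,(sep sv .j.j each tbl),"\n"

/ Print the payload to inspect it
q)-1 body;
```

After a real upload, the `errors` flag in the response (`0b` in the example above) is worth checking, since the Bulk API reports per-document failures there rather than failing the whole request.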

src/es.q

Lines changed: 203 additions & 0 deletions
@@ -0,0 +1,203 @@
1+
// Elasticsearch API
2+
// Copyright (c) 2019 Jaskirat Rajasansir
3+
4+
.require.lib each `type`time`convert;
5+
6+
7+
/ The default MIME type of data that is sent via HTTP POST
8+
.es.cfg.postMimeTypes:()!();
9+
.es.cfg.postMimeTypes[`default]: "application/json";
10+
.es.cfg.postMimeTypes[`bulk]: "application/x-ndjson";
11+
12+
/ The required separator JSON to allow upload of multiple objects in bulk
13+
.es.cfg.bulkRowSeparator:"\n",.j.j[enlist[`index]!enlist ()!()],"\n";
14+
15+
.es.cfg.requiredUrlPrefix:":http*";
16+
17+
/ The target Elasticsearch instance
18+
.es.target:`;
19+
20+
21+
.es.init:{};
22+
23+
24+
/ Configures the URL root of the Elasticsearch instance to use with this API
25+
/ @param esInstance (Symbol) The root endpoint of the Elasticsearch instance (e.g. `:http://localhost:9200)
26+
/ @throws InvalidElasticSearchInstanceException If the instance endpoint specified is not http or https
27+
/ @see .es.cfg.requiredUrlPrefix
28+
/ @see .es.target
29+
.es.setTargetInstance:{[esInstance]
30+
if[not .type.isSymbol esInstance;
31+
'"IllegalArgumentException";
32+
];
33+
34+
if[not esInstance like .es.cfg.requiredUrlPrefix;
35+
.log.error "Unsupported Elasticsearch API URL; must be HTTP or HTTPS [ URL: ",string[esInstance]," ]";
36+
'"InvalidElasticSearchInstanceException";
37+
];
38+
39+
if["/" = last string esInstance;
40+
esInstance:`$-1_ string esInstance;
41+
];
42+
43+
.es.target:esInstance;
44+
45+
.log.info "Elasticsearch instance set [ URL: ",string[.es.target]," ]";
46+
};
47+
48+
/ @returns (Table) All the indices available in the current Elasticsearch instance
49+
/ @see .es.http.get
50+
.es.getAllIndices:{
51+
.log.info "Querying for all available indices from Elasticsearch";
52+
:.es.http.get "_cat/indices?v&format=json";
53+
};
54+
55+
/ Searches a specific Elasticsearch index for data via URI search
56+
/ @param index (Symbol) The index to search within. Use `$"_all" to search all indices
57+
/ @param searchQuery (String) The search query as per the Elasticsearch documentation
58+
/ @returns (Dict) The search results
59+
/ @see .es.http.get
60+
.es.search:{[index; searchQuery]
61+
if[not .type.isString searchQuery;
62+
'"IllegalArgumentException";
63+
];
64+
65+
:.es.http.get string[index],"/_search?q=",searchQuery;
66+
};
67+
68+
/ @param index (Symbol) The name of the index to retrieve the document from
69+
/ @param id (Symbol) The ID of the document to retrieve
70+
/ @returns The document as specified by the id from the specified index
71+
/ @see .es.http.get
72+
.es.getDocument:{[index; id]
73+
if[(not .type.isSymbol index) | not .type.isSymbol id;
74+
'"IllegalArgumentException";
75+
];
76+
77+
:.es.http.get index,`doc,id;
78+
};
79+
80+
81+
/ HTTP GET interface function to Elasticsearch
82+
/ @param relativeUrl (String|SymbolList) The URL to query on the Elasticsearch web API
83+
/ @returns (Dict) The JSON response parsed into native kdb+ types
84+
/ @see .es.i.buildUrl
85+
/ @see .Q.hg
86+
.es.http.get:{[relativeUrl]
87+
url:.es.i.buildUrl relativeUrl;
88+
89+
.log.debug "Elasticsearch HTTP GET [ URL: ",string[url]," ]";
90+
91+
:.j.k .Q.hg url;
92+
};
93+
94+
/ HTTP POST interface function to Elasticsearch for an individual document
95+
/ NOTE: To upload multiple documents, use .es.http.postBulk
96+
/ @param index (Symbol) The target index to insert the new data. The index will be automatically appended with today's date if there is no date component specified in the index
97+
/ @param id (Symbol) The ID of the entry to add. Set this to null symbol to allow Elasticsearch to generate its own
98+
/ @param content (Dict|String) The content to be uploaded. If a dictionary is supplied, it will be converted to JSON. If a string is supplied, it's assumed to already be in JSON and will be uploaded directly
99+
/ @returns (Dict) The JSON response parsed into native kdb+ types
100+
/ @throws InvalidContentException If the content provided is not a string or dictionary type
101+
/ @see .es.i.buildUrl
102+
/ @see .es.i.normaliseIndex
103+
/ @see .es.cfg.postMimeTypes
104+
/ @see .Q.hp
105+
.es.http.post:{[index; id; content]
106+
if[(not .type.isSymbol index) | not .type.isSymbol id;
107+
'"IllegalArgumentException";
108+
];
109+
110+
if[not any (.type.isDict; .type.isString)@\: content;
111+
'"InvalidContentException";
112+
];
113+
114+
if[not .type.isString content;
115+
content:.j.j content;
116+
];
117+
118+
if[.util.isEmpty id;
119+
id:`;
120+
];
121+
122+
url:.es.i.buildUrl .es.i.normaliseIndex[index],`doc,id;
123+
124+
.log.debug "Elasticsearch HTTP POST [ URL: ",string[url]," ]";
125+
126+
:.j.k .Q.hp[url; .es.cfg.postMimeTypes`default; content];
127+
};
128+
129+
/ Bulk HTTP POST interface function to Elasticsearch for multiple documents
130+
/ @param index (Symbol) The target index to insert the new data. The index will be automatically appended with today's date if there is no date component specified in the index
131+
/ @param contentTable (Table) The data to upload in bulk to Elasticsearch. Each row in the table will be a document
132+
/ @returns (Dict) The JSON response parsed into native kdb+ types
133+
/ @see .es.i.buildUrl
134+
/ @see .es.i.normaliseIndex
135+
/ @see .es.cfg.postMimeTypes
136+
/ @see .es.i.http10post
137+
.es.http.postBulk:{[index; contentTable]
138+
if[(not .type.isSymbol index) | not .type.isTable contentTable;
139+
'"IllegalArgumentException";
140+
];
141+
142+
if[0 < count keys contentTable;
143+
'"InvalidContentTableException";
144+
];
145+
146+
url:.es.i.buildUrl .es.i.normaliseIndex[index],`doc,`$"_bulk";
147+
148+
content:.es.cfg.bulkRowSeparator,(.es.cfg.bulkRowSeparator sv .j.j each contentTable),"\n";
149+
150+
.log.debug "Elasticsearch HTTP Bulk POST [ URL: ",string[url]," ] [ Size: ",string[count content]," bytes ]";
151+
152+
:.j.k .es.i.http10post[url; .es.cfg.postMimeTypes`bulk; content];
153+
};
154+
155+
156+
/ Ensure that the specified index has a date component (in yyyy.mm.dd format) within it
157+
/ @param index (Symbol) The index to normalise
158+
/ @returns (Symbol) The index unmodified if it has a date component, otherwise the original index with today's date appended to it
159+
.es.i.normaliseIndex:{[index]
160+
if[not index like "*????.??.??*";
161+
.log.info "No date found in index name. Automatically adding today's date [ Index: ",string[index]," ]";
162+
index:`$"-" sv string (index; .time.today[]);
163+
];
164+
165+
:index;
166+
};
167+
168+
/ Elasticsearch URL builder
169+
/ @param relativeUrl (String|Symbol|SymbolList) The relative URL requested by the calling function
170+
/ @returns (Symbol) A complete URL that can be used with .Q.hg / .Q.hp
171+
/ @see .es.target
172+
/ @throws NoElasticsearchUrlException If the Elasticsearch API URL has not yet been set
173+
/ @throws InvalidElasticSearchUrlException If the URL specified is of an incorrect type
174+
.es.i.buildUrl:{[relativeUrl]
175+
if[null .es.target;
176+
.log.error "Cannot use Elasticsearch API, no instance specified yet [ Request URL: ",.Q.s1[relativeUrl]," ]";
177+
'"NoElasticsearchUrlException";
178+
];
179+
180+
if[.type.isString relativeUrl;
181+
if[not "/" = first relativeUrl;
182+
relativeUrl:"/",relativeUrl;
183+
];
184+
185+
:`$string[.es.target],relativeUrl;
186+
];
187+
188+
if[.type.isSymbol first relativeUrl;
189+
:` sv .es.target,relativeUrl;
190+
];
191+
192+
'"InvalidElasticSearchUrlException";
193+
};
194+
195+
/ HTTP POST downgraded to HTTP/1.0 (instead of HTTP/1.1) to disable "chunked" responses. The function interface matches .Q.hp
196+
/ @see .es.i.http10hmb
197+
.es.i.http10post:{[x;y;z]
198+
:.es.i.http10hmb[x; `POST; (y;z)];
199+
};
200+
201+
/ Modified version of .Q.hmb with the HTTP request downgraded to HTTP/1.0 to disable "chunked" responses due to the default .Q.hmb
202+
/ not correctly reading the header response from Elasticsearch when operating in "chunked" mode
203+
k).es.i.http10hmb:{x:$[10=@x;x;1_$x];p:{$[#y;y;x]}/getenv`$_:\("HTTP";"NO"),\:"_PROXY";u:.Q.hap@x;t:~(~#*p)||/(*":"\:u 2)like/:{(("."=*x)#"*"),x}'","\:p 1;a:$[t;p:.Q.hap@*p;u]1;(4+*r ss d)_r:(`$":",,/($[t;p;u]0 2))($y)," ",$[t;x;u 3]," HTTP/1.0",s,(s/:("Connection: close";"Host: ",u 2),((0<#a)#,$[t;"Proxy-";""],"Authorization: Basic ",.Q.btoa a),$[#z;("Content-type: ",z 0;"Content-length: ",$#z 1);()]),(d:s,s:"\r\n"),$[#z;z 1;""]};
