Fixed guide & added link in USER_GUIDE.md opensearch-project#4
Signed-off-by: Djcarrillo6 <djcarrillo6@yahoo.com>

Added a guide on making raw JSON REST requests. (opensearch-project#542)

Signed-off-by: dblock <dblock@amazon.com>

Added document lifecycle guide & sample code.

Signed-off-by: Djcarrillo6 <djcarrillo6@yahoo.com>

Updated CHANGELOG

Signed-off-by: Djcarrillo6 <djcarrillo6@yahoo.com>

Added support for AWS Sigv4 for UrlLib3. (opensearch-project#547)

* WIP: Added support for AWS Sigv4 for UrlLib3.

Signed-off-by: dblock <dblock@amazon.com>

* Refactored common implementation.

Signed-off-by: dblock <dblock@amazon.com>

* Added sigv4 samples.

Signed-off-by: dblock <dblock@amazon.com>

* Updated CHANGELOG.

Signed-off-by: dblock <dblock@amazon.com>

* Add documentation.

Signed-off-by: dblock <dblock@amazon.com>

* Use the correct class in tests.

Signed-off-by: dblock <dblock@amazon.com>

* Renamed samples.

Signed-off-by: dblock <dblock@amazon.com>

* Split up requests and urllib3 unit tests.

Signed-off-by: dblock <dblock@amazon.com>

* Rename AWSV4Signer.

Signed-off-by: dblock <dblock@amazon.com>

* Clarified documentation of when to use Urllib3AWSV4SignerAuth vs. RequestHttpConnection.

Signed-off-by: dblock <dblock@amazon.com>

* Move fetch_url inside the signer class.

Signed-off-by: dblock <dblock@amazon.com>

* Added unit test for Urllib3AWSV4SignerAuth adding headers.

Signed-off-by: dblock <dblock@amazon.com>

* Added unit test for signing to include query string.

Signed-off-by: dblock <dblock@amazon.com>

---------

Signed-off-by: dblock <dblock@amazon.com>

Remove support for Python 2.x. (opensearch-project#548)

Signed-off-by: dblock <dblock@amazon.com>

Fixed guide & added link in USER_GUIDE.md

Signed-off-by: Djcarrillo6 <djcarrillo6@yahoo.com>

Fixed guide & added link in USER_GUIDE.md opensearch-project#3

Signed-off-by: Djcarrillo6 <djcarrillo6@yahoo.com>
Djcarrillo6 committed Oct 25, 2023
1 parent d9a7050 commit 200545c
Showing 35 changed files with 2,182 additions and 1,898 deletions.
3 changes: 0 additions & 3 deletions .ci/test-matrix.yml
@@ -2,9 +2,6 @@ TEST_SUITE:
- oss

PYTHON_VERSION:
- "2.7"
- "3.4"
- "3.5"
- "3.6"
- "3.7"
- "3.8"
6 changes: 5 additions & 1 deletion CHANGELOG.md
@@ -5,9 +5,12 @@ Inspired from [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
### Added
- Added generating imports and headers to API generator ([#467](https://github.com/opensearch-project/opensearch-py/pull/467))
- Added point-in-time APIs (create_pit, delete_pit, delete_all_pits, get_all_pits) and Security Client APIs (health and update_audit_configuration) ([#502](https://github.com/opensearch-project/opensearch-py/pull/502))
- Added new guide for using index templates with the client ([#531](https://github.com/opensearch-project/opensearch-py/pull/531))
- Added guide on using index templates ([#531](https://github.com/opensearch-project/opensearch-py/pull/531))
- Added `pool_maxsize` for `Urllib3HttpConnection` ([#535](https://github.com/opensearch-project/opensearch-py/pull/535))
- Added benchmarks ([#537](https://github.com/opensearch-project/opensearch-py/pull/537))
- Added guide on making raw JSON REST requests ([#542](https://github.com/opensearch-project/opensearch-py/pull/542))
- Added guide on the document lifecycle API(s) ([#545](https://github.com/opensearch-project/opensearch-py/pull/545))
- Added support for AWS SigV4 for urllib3 ([#547](https://github.com/opensearch-project/opensearch-py/pull/547))
### Changed
- Generate `tasks` client from API specs ([#508](https://github.com/opensearch-project/opensearch-py/pull/508))
- Generate `ingest` client from API specs ([#513](https://github.com/opensearch-project/opensearch-py/pull/513))
@@ -18,6 +21,7 @@ Inspired from [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
### Deprecated
- Deprecated point-in-time APIs (list_all_point_in_time, create_point_in_time, delete_point_in_time) and Security Client APIs (health_check and update_audit_config) ([#502](https://github.com/opensearch-project/opensearch-py/pull/502))
### Removed
- Removed leftover support for Python 2.7 ([#548](https://github.com/opensearch-project/opensearch-py/pull/548))
### Fixed
### Security
### Dependencies
16 changes: 14 additions & 2 deletions DEVELOPER_GUIDE.md
@@ -45,7 +45,19 @@ docker run -d -p 9200:9200 -p 9600:9600 -e "discovery.type=single-node" opensear

Tests require a live instance of OpenSearch running in docker.

This will start a new instance and run tests against the latest version of OpenSearch.
If you already have one running, run the tests with:

```
python setup.py test
```

To run the tests in a specific test file:

```
python setup.py test -s test_opensearchpy/test_connection.py
```

If you want to auto-start one, the following will start a new instance and run tests against the latest version of OpenSearch.

```
./.ci/run-tests
@@ -76,7 +88,7 @@ You can also run individual tests matching a pattern (`pytest -k [pattern]`).
```
./.ci/run-tests true 1.3.0 test_no_http_compression
test_opensearchpy/test_connection.py::TestUrllib3Connection::test_no_http_compression PASSED [ 33%]
test_opensearchpy/test_connection.py::TestUrllib3HttpConnection::test_no_http_compression PASSED [ 33%]
test_opensearchpy/test_connection.py::TestRequestsConnection::test_no_http_compression PASSED [ 66%]
test_opensearchpy/test_async/test_connection.py::TestAIOHttpConnection::test_no_http_compression PASSED [100%]
```
3 changes: 3 additions & 0 deletions USER_GUIDE.md
@@ -35,6 +35,8 @@ In general, we recommend using a package manager, such as [poetry](https://pytho
In the example below, we create a client, create an index with non-default settings, insert a
document into the index, search for the document, delete the document, and finally delete the index.

You can find more information on the full document lifecycle in [guides/document_lifecycle.md](guides/document_lifecycle.md).

You can find working versions of the code below that can be run with a local instance of OpenSearch in [samples](samples).

### Creating a Client
@@ -155,6 +157,7 @@ print(response)
- [Using a Proxy](guides/proxy.md)
- [Index Templates](guides/index_template.md)
- [Advanced Index Actions](guides/advanced_index_actions.md)
- [Making Raw JSON REST Requests](guides/json.md)
- [Connection Classes](guides/connection_classes.md)

## Plugins
11 changes: 5 additions & 6 deletions dev-requirements.txt
@@ -12,14 +12,13 @@ pytz
numpy; python_version<"3.10"
pandas; python_version<"3.10"

pyyaml>=5.4; python_version>="3.6"
pyyaml==5.3.1; python_version<"3.6"
pyyaml>=5.4

isort
black; python_version>="3.6"
black
twine

# Requirements for testing [async] extra
aiohttp; python_version>="3.6"
pytest-asyncio<=0.21.1; python_version>="3.6"
unasync; python_version>="3.6"
aiohttp
pytest-asyncio<=0.21.1
unasync
13 changes: 9 additions & 4 deletions guides/auth.md
@@ -1,5 +1,6 @@
- [Authentication](#authentication)
- [IAM Authentication](#iam-authentication)
- [IAM Authentication with a Synchronous Client](#iam-authentication-with-a-synchronous-client)
- [IAM Authentication with an Async Client](#iam-authentication-with-an-async-client)
- [Kerberos](#kerberos)

@@ -9,24 +10,28 @@ OpenSearch allows you to use different methods for the authentication via `conne

## IAM Authentication

Opensearch-py supports IAM-based authentication via `AWSV4SignerAuth`, which uses `RequestHttpConnection` as the transport class for communicating with OpenSearch clusters running in Amazon Managed OpenSearch and OpenSearch Serverless, and works in conjunction with [botocore](https://pypi.org/project/botocore/).
This library supports IAM-based authentication when communicating with OpenSearch clusters running in Amazon Managed OpenSearch and OpenSearch Serverless.

## IAM Authentication with a Synchronous Client

For `Urllib3HttpConnection` use `Urllib3AWSV4SignerAuth`, and for `RequestsHttpConnection` use `RequestsAWSV4SignerAuth`.

```python
from opensearchpy import OpenSearch, RequestsHttpConnection, AWSV4SignerAuth
from opensearchpy import OpenSearch, Urllib3HttpConnection, Urllib3AWSV4SignerAuth
import boto3

host = '' # cluster endpoint, for example: my-test-domain.us-east-1.es.amazonaws.com
region = 'us-west-2'
service = 'es' # 'aoss' for OpenSearch Serverless
credentials = boto3.Session().get_credentials()
auth = AWSV4SignerAuth(credentials, region, service)
auth = Urllib3AWSV4SignerAuth(credentials, region, service)

client = OpenSearch(
hosts = [{'host': host, 'port': 443}],
http_auth = auth,
use_ssl = True,
verify_certs = True,
connection_class = RequestsHttpConnection,
connection_class = Urllib3HttpConnection,
pool_maxsize = 20
)

182 changes: 182 additions & 0 deletions guides/document_lifecycle.md
@@ -0,0 +1,182 @@
# Document Lifecycle Guide
- [Document Lifecycle](#document-lifecycle)
- [Setup](#setup)
- [Document API Actions](#document-api-actions)
- [Create a new document with specified ID](#create-a-new-document-with-specified-id)
- [Create a new document with auto-generated ID](#create-a-new-document-with-auto-generated-id)
- [Get a document](#get-a-document)
- [Get multiple documents](#get-multiple-documents)
- [Check if a document exists](#check-if-a-document-exists)
- [Update a document](#update-a-document)
- [Update multiple documents by query](#update-multiple-documents-by-query)
- [Delete a document](#delete-a-document)
- [Delete multiple documents by query](#delete-multiple-documents-by-query)
- [Cleanup](#cleanup)


# Document Lifecycle
This guide covers OpenSearch Python Client API actions for Document Lifecycle. You'll learn how to create, read, update, and delete documents in your OpenSearch cluster. Whether you're new to OpenSearch or an experienced user, this guide provides the information you need to manage your document lifecycle effectively.

## Setup
Assuming you have OpenSearch running locally on port 9200, you can create a client instance
with the following code:

```python
from opensearchpy import OpenSearch
client = OpenSearch(
    hosts=['https://localhost:9200'],
    use_ssl=True,
    verify_certs=False,
    http_auth=('admin', 'admin')
)
```

Next, create an index named `movies` with the default settings:

```python
index = 'movies'
if not client.indices.exists(index=index):
    client.indices.create(index=index)
```

## Document API Actions

### Create a new document with specified ID
To create a new document, use the `create` or `index` API action. The following code creates two new documents with IDs of `1` and `2`:

```python
client.create(index=index, id=1, body={'title': 'Beauty and the Beast', 'year': 1991})
client.create(index=index, id=2, body={'title': 'Beauty and the Beast - Live Action', 'year': 2017})
```

Note that the `create` action is NOT idempotent. If you try to create a document with an ID that already exists, the request will fail:

```python
try:
    client.create(index=index, id=1, body={'title': 'Just Another Movie'})
except Exception as e:
    print(e)
```
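
If a conflict on `create` is an expected outcome rather than an error, the pattern above can be wrapped in a small helper. The following is a minimal sketch, not part of the client: `create_if_absent` is a hypothetical name, and it assumes that a conflict surfaces as an exception whose `status_code` attribute is 409.

```python
def create_if_absent(client, index, doc_id, body):
    """Create a document; return False instead of raising if the ID is taken.

    Assumption: a conflict surfaces as an exception with `status_code == 409`.
    """
    try:
        client.create(index=index, id=doc_id, body=body)
        return True
    except Exception as e:
        # Swallow only "already exists" conflicts; re-raise everything else.
        if getattr(e, "status_code", None) == 409:
            return False
        raise
```

The helper takes the client as an argument, so it can be called as `create_if_absent(client, index, 1, {'title': 'Just Another Movie'})` with the client from the setup section.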

The `index` action, on the other hand, is idempotent. If you try to index a document with an existing ID, the request will succeed and overwrite the existing document. Note that no new document will be created in this case. You can think of the `index` action as an upsert:

```python
client.index(index=index, id=2, body={'title': 'Updated Title'})
client.index(index=index, id=2, body={'title': 'The Lion King', 'year': 1994})
```

### Create a new document with auto-generated ID
You can also create a new document with an auto-generated ID by omitting the `id` parameter. The following code creates a document with an auto-generated ID in the `movies` index:

```python
client.index(index=index, body={"title": "The Lion King 2", "year": 1998})
```

In this case, the ID of the created document is returned in the `_id` field of the response body:

```python
{
"_index": "movies",
"_type": "_doc",
"_id": "1",
"_version": 1,
"result": "created",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"_seq_no": 0,
"_primary_term": 1
}
```
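
Because a generated ID is only known once the request returns, it is usually read from the response. A minimal sketch, assuming the response is a dict shaped like the one above; `index_and_get_id` is a hypothetical helper name, not a client method.

```python
def index_and_get_id(client, index, body):
    """Index a document without an explicit ID and return the ID assigned to it."""
    response = client.index(index=index, body=body)
    # The assigned ID is carried in the `_id` field of the response body.
    return response["_id"]
```

The returned value can then be passed as `id` to `get`, `update`, or `delete`.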

### Get a document
To get a document, use the `get` API action. The following code gets the document with ID `1` from the `movies` index:

```python
client.get(index=index, id=1)['_source']
# OUTPUT: {"title": "Beauty and the Beast", "year": 1991}
```

You can also use `_source_includes` and `_source_excludes` parameters to specify which fields to include or exclude in the response:

```python
client.get(index=index, id=1, _source_includes=['title'])['_source']
# OUTPUT: {"title": "Beauty and the Beast"}

client.get(index=index, id=1, _source_excludes=['title'])['_source']
# OUTPUT: {"year": 1991}
```

### Get multiple documents
To get multiple documents, use the `mget` API action:

```python
client.mget(index=index, body={ 'docs': [{ '_id': 1 }, { '_id': 2 }] })['docs']
```
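
The `docs` list in the response has one entry per requested ID, including IDs that do not exist. The sketch below builds the request body from a list of IDs and keeps only the documents that were found. Both helper names are hypothetical, and each entry is assumed to carry a `found` flag and, when found, a `_source` field.

```python
def mget_body(ids):
    """Build an `mget` request body for the given document IDs."""
    return {"docs": [{"_id": doc_id} for doc_id in ids]}


def found_sources(response):
    """Map document ID to `_source`, skipping entries that were not found."""
    return {
        doc["_id"]: doc["_source"]
        for doc in response["docs"]
        if doc.get("found")
    }


# Usage, given `client` and `index` from the setup section:
# response = client.mget(index=index, body=mget_body([1, 2]))
# movies = found_sources(response)
```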

### Check if a document exists
To check if a document exists, use the `exists` API action. The following code checks if the document with ID `1` exists in the `movies` index:

```python
client.exists(index=index, id=1)
```

### Update a document
To update a document, use the `update` API action. The following code updates the `year` field of the document with ID `1` in the `movies` index:

```python
client.update(index=index, id=1, body={'doc': {'year': 1995}})
```

Alternatively, you can use the `script` parameter to update a document using a script. The following code increments the `year` field of the document with ID `1` by 5 using a Painless script (Painless is the default scripting language in OpenSearch):

```python
client.update(index=index, id=1, body={ 'script': { 'source': 'ctx._source.year += 5' } })
```

Note that while both `update` and `index` actions perform updates, they are not the same. The `update` action is a partial update, while the `index` action is a full update. The `update` action only updates the fields that are specified in the request body, while the `index` action overwrites the entire document with the new document.
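
The difference can be illustrated without a cluster. The sketch below models the two actions on plain Python dicts. It is an illustration of the semantics described above, not client code, and it covers flat documents only (nested objects are not modeled). The function names are hypothetical.

```python
def apply_index(stored, new_doc):
    """Model of `index`: the new document replaces the stored one entirely."""
    return dict(new_doc)


def apply_update(stored, partial_doc):
    """Model of `update` with a `doc` body: given fields overwrite, others are kept."""
    return {**stored, **partial_doc}


stored = {"title": "Beauty and the Beast", "year": 1991}

after_update = apply_update(stored, {"year": 1995})
after_index = apply_index(stored, {"year": 1995})

print(after_update)  # {'title': 'Beauty and the Beast', 'year': 1995}
print(after_index)   # {'year': 1995}
```

After the partial update the `title` field is still present, while the full overwrite drops it.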

### Update multiple documents by query

To update documents that match a query, use the `update_by_query` API action. The following code decrements the `year` field by 1 for all documents whose `year` is greater than 2023:

```python
client.update_by_query(index=index, body={
    'script': { 'source': 'ctx._source.year -= 1' },
    'query': { 'range': { 'year': { 'gt': 2023 } } }
})
```
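
Hard-coding the amount in the script source means writing a new script for every value. Scripts also accept a `params` map, so the amount can be passed separately from the source. The following is a sketch of building such a request body. `shift_year_body` is a hypothetical helper, and the `year` field comes from the example above.

```python
def shift_year_body(delta, newer_than):
    """Build an `update_by_query` body that adds `delta` to `year`
    for documents whose `year` is greater than `newer_than`."""
    return {
        "script": {
            "lang": "painless",
            # The amount is read from `params` rather than baked into the source.
            "source": "ctx._source.year += params.delta",
            "params": {"delta": delta},
        },
        "query": {"range": {"year": {"gt": newer_than}}},
    }


# Usage, given `client` and `index` from the setup section:
# client.update_by_query(index=index, body=shift_year_body(-1, 2023))
```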

### Delete a document
To delete a document, use the `delete` API action. The following code deletes the document with ID `1`:

```python
client.delete(index=index, id=1)
```

By default, the `delete` action is not idempotent. If you try to delete a document that does not exist, or delete the same document twice, you will get a Not Found (404) error. You can make the `delete` action idempotent by setting the `ignore` parameter to `404`:

```python
client.delete(index=index, id=1, ignore=404)
```
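
With `ignore=404` the call returns the response body instead of raising, so the caller can still tell whether anything was actually deleted. A minimal sketch, assuming the response body carries a `result` field whose value is `deleted` when a document was removed; `delete_if_present` is a hypothetical name.

```python
def delete_if_present(client, index, doc_id):
    """Delete a document; return True if it existed, False if it was already gone."""
    response = client.delete(index=index, id=doc_id, ignore=404)
    # Assumption: `result` is "deleted" on success and something else (e.g. "not_found") otherwise.
    return response.get("result") == "deleted"
```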

### Delete multiple documents by query
To delete documents that match a query, use the `delete_by_query` API action. The following code deletes all documents with `year` greater than 2023:

```python
client.delete_by_query(index=index, body={
    'query': { 'range': { 'year': { 'gt': 2023 } } }
})
```

## Cleanup
To clean up the resources created in this guide, delete the `movies` index:

```python
client.indices.delete(index=index)
```

# Sample Code
See [document_lifecycle_sample.py](/samples/document_lifecycle/document_lifecycle_sample.py) for a working sample of the concepts in this guide.
66 changes: 66 additions & 0 deletions guides/json.md
@@ -0,0 +1,66 @@
- [Making Raw JSON REST Requests](#making-raw-json-rest-requests)
- [GET](#get)
- [PUT](#put)
- [POST](#post)
- [DELETE](#delete)

# Making Raw JSON REST Requests

The OpenSearch client implements many high-level REST DSLs that invoke OpenSearch APIs. However, you may need to invoke an API that the client does not support. In that case, use `client.transport.perform_request`. See [samples/json](../samples/json) for a complete working sample.

## GET

The following example returns the server version information via `GET /`.

```python
info = client.transport.perform_request('GET', '/')
print(f"Welcome to {info['version']['distribution']} {info['version']['number']}!")
```

Note that the client will parse the response as JSON when appropriate.

## PUT

The following example creates an index.

```python
index_body = {
    'settings': {
        'index': {
            'number_of_shards': 4
        }
    }
}

client.transport.perform_request("PUT", "/movies", body=index_body)
```

Note that the client raises errors automatically. For example, if the index already exists, an `opensearchpy.exceptions.RequestError` with status 400 and error type `resource_already_exists_exception` is raised.
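
Where an already existing index is acceptable, that specific error can be caught and everything else re-raised. The following is a sketch, not part of the client: `put_index_if_missing` is a hypothetical helper, and it assumes the raised exception exposes the error type through an `error` attribute.

```python
def put_index_if_missing(client, name, body=None):
    """PUT an index via a raw request; return False if it already exists."""
    try:
        client.transport.perform_request("PUT", f"/{name}", body=body)
        return True
    except Exception as e:
        # Assumption: the error type string is available as `e.error`.
        if getattr(e, "error", None) == "resource_already_exists_exception":
            return False
        raise
```

For example, `put_index_if_missing(client, "movies", index_body)` can be called repeatedly without failing on the second call.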

## POST

The following example searches for a document.

```python
q = 'miller'

query = {
    'size': 5,
    'query': {
        'multi_match': {
            'query': q,
            'fields': ['title^2', 'director']
        }
    }
}

client.transport.perform_request("POST", "/movies/_search", body = query)
```
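
The raw request returns the same JSON as the search API, with the matching documents under `hits.hits`. The sketch below builds the query above for an arbitrary search term and extracts the titles from a response. Both helper names are hypothetical, and the documents are assumed to have a `title` field.

```python
def multi_match_query(text, size=5):
    """Build the search body used above for an arbitrary search term."""
    return {
        "size": size,
        "query": {
            "multi_match": {
                "query": text,
                "fields": ["title^2", "director"],
            }
        },
    }


def hit_titles(response):
    """Return the `title` of every hit in a search response."""
    return [hit["_source"]["title"] for hit in response["hits"]["hits"]]


# Usage, given a client as in the examples above:
# response = client.transport.perform_request(
#     "POST", "/movies/_search", body=multi_match_query("miller"))
# print(hit_titles(response))
```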

## DELETE

The following example deletes an index.

```python
client.transport.perform_request("DELETE", "/movies")
```
2 changes: 1 addition & 1 deletion noxfile.py
@@ -36,7 +36,7 @@
)


@nox.session(python=["2.7", "3.6", "3.7", "3.8", "3.9", "3.10", "3.11"])
@nox.session(python=["3.6", "3.7", "3.8", "3.9", "3.10", "3.11"])
def test(session):
    session.install(".")
    session.install("-r", "dev-requirements.txt")