Skip to content

Commit

Permalink
Add vector store integration of lindorm, including knn search, … (#14623
Browse files Browse the repository at this point in the history
)
  • Loading branch information
Rainy-GG authored Jul 24, 2024
1 parent 1d7f303 commit 94e137e
Show file tree
Hide file tree
Showing 13 changed files with 2,060 additions and 0 deletions.
667 changes: 667 additions & 0 deletions docs/docs/examples/vector_stores/LindormDemo.ipynb

Large diffs are not rendered by default.

Original file line number Diff line number Diff line change
@@ -0,0 +1,153 @@
llama_index/_static
.DS_Store
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
bin/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
etc/
include/
lib/
lib64/
parts/
sdist/
share/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
.ruff_cache

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints
notebooks/

# IPython
profile_default/
ipython_config.py

# pyenv
.python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# PEP 582; used by e.g. github.com/David-OConnor/pyflow
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/
pyvenv.cfg

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# Jetbrains
.idea
modules/
*.swp

# VsCode
.vscode

# pipenv
Pipfile
Pipfile.lock

# pyright
pyrightconfig.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
poetry_requirements(
name="poetry",
)
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
GIT_ROOT ?= $(shell git rev-parse --show-toplevel)

help: ## Show all Makefile targets.
@grep -E '^[a-zA-Z_-]+:.*?## .*$$' $(MAKEFILE_LIST) | awk 'BEGIN {FS = ":.*?## "}; {printf "\033[33m%-30s\033[0m %s\n", $$1, $$2}'

format: ## Run code autoformatters (black).
pre-commit install
git ls-files | xargs pre-commit run black --files

lint: ## Run linters: pre-commit (black, ruff, codespell) and mypy
pre-commit install && git ls-files | xargs pre-commit run --show-diff-on-failure --files

test: ## Run tests via pytest.
pytest tests

watch-docs: ## Build and watch documentation.
sphinx-autobuild docs/ docs/_build/html --open-browser --watch $(GIT_ROOT)/llama_index/
Original file line number Diff line number Diff line change
@@ -0,0 +1,60 @@
# LlamaIndex Vector_Stores Integration: Lindorm

- LindormVectorStore support pure vector search, search with metadata filtering, hybrid search, async, etc.
- Please refer to the [notebook](../../../docs/docs/examples/vector_stores/LindormDemo.ipynb) for usage of Lindorm as vector store in LlamaIndex.

# Example Usage

```sh
pip install llama-index
pip install opensearch-py
pip install llama-index-vector-stores-lindorm
```

```python
from llama_index.vector_stores.lindorm import (
LindormVectorStore,
LindormVectorClient,
)

# how to obtain an lindorm search instance:
# https://alibabacloud.com/help/en/lindorm/latest/create-an-instance

# how to access your lindorm search instance:
# https://www.alibabacloud.com/help/en/lindorm/latest/view-endpoints

# run curl commands to connect to and use LindormSearch:
# https://www.alibabacloud.com/help/en/lindorm/latest/connect-and-use-the-search-engine-with-the-curl-command

# lindorm instance info
host = "ld-bp******jm*******-proxy-search-pub.lindorm.aliyuncs.com"
port = 30070
username = "your_username"
password = "your_password"

# index to demonstrate the VectorStore impl
index_name = "lindorm_test_index"

# extension param of lindorm search, number of cluster units to query; between 1 and method.parameters.nlist.
nprobe = "a number(string type)"

# extension param of lindorm search, usually used to improve recall accuracy, but it increases performance overhead;
# between 1 and 200; default: 10.
reorder_factor = "a number(string type)"

# LindormVectorClient encapsulates logic for a single index with vector search enabled
client = LindormVectorClient(
host=host,
port=port,
username=username,
password=password,
index=index_name,
dimension=1536, # match with your embedding model
nprobe=nprobe,
reorder_factor=reorder_factor,
# filter_type="pre_filter/post_filter(default)"
)

# initialize vector store
vector_store = LindormVectorStore(client)
```
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
python_sources()
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
from llama_index.vector_stores.lindorm.base import (
LindormVectorStore,
LindormVectorClient,
)

__all__ = ["LindormVectorStore", "LindormVectorClient"]
Loading

0 comments on commit 94e137e

Please sign in to comment.