Support LEANN in llamaindex #217 #224

Open
AnirbansarkarS wants to merge 1 commit into yichuan-w:main from AnirbansarkarS:main

@AnirbansarkarS

Walkthrough: LlamaIndex Integration for LEANN
I have implemented a LlamaIndex vector store integration for LEANN. It lets users plug LEANN into RAG pipelines as a storage-efficient backend, achieving up to 97% storage reduction while remaining compatible with LlamaIndex features such as metadata filtering and hybrid search.

Changes Made

1. New Package: llama-index-vector-stores-leann
Created a new package structure in packages/llama-index-vector-stores-leann following the standard LlamaIndex integration patterns.

pyproject.toml: Defines dependencies and package metadata.
base.py: Core implementation of LeannVectorStore.
README.md: Documentation on installation and usage.

2. Implementation: LeannVectorStore
The LeannVectorStore class inherits from BasePydanticVectorStore and implements the required methods:

add(): Accumulates nodes and metadata into LeannBuilder.
delete(): Filters out documents and rebuilds the index (as per LEANN's current design).
query(): Executes semantic search via LeannSearcher, including support for metadata filters.
persist(): Finalizes the index build process.
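
The lifecycle described above (accumulate on add(), build on persist(), filter and rebuild on delete(), filtered search on query()) can be illustrated with a small stand-in. This is only a sketch under stated assumptions, not the code in base.py: `ToyNode` and `ToyRebuildStore` are hypothetical names, scoring is plain word overlap rather than embeddings, and the "index" is just a list snapshot.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ToyNode:
    """Hypothetical stand-in for a LlamaIndex node."""
    node_id: str
    ref_doc_id: str
    text: str
    metadata: Dict[str, str] = field(default_factory=dict)


class ToyRebuildStore:
    """Toy store mimicking the accumulate / rebuild-on-delete lifecycle.

    Assumption: this is NOT LeannVectorStore. Scoring is word overlap and
    the "index" is a list frozen at persist() time.
    """

    def __init__(self) -> None:
        self._pending: List[ToyNode] = []  # nodes accumulated by add()
        self._index: List[ToyNode] = []    # snapshot produced by persist()
        self.build_count = 0               # number of (re)builds so far

    def add(self, nodes: List[ToyNode]) -> List[str]:
        # Only accumulate; nothing becomes searchable until persist().
        self._pending.extend(nodes)
        return [n.node_id for n in nodes]

    def persist(self) -> None:
        # "Build" the index from everything accumulated so far.
        self._index = list(self._pending)
        self.build_count += 1

    def delete(self, ref_doc_id: str) -> None:
        # Drop every node of the document, then rebuild from scratch.
        self._pending = [n for n in self._pending if n.ref_doc_id != ref_doc_id]
        self.persist()

    def query(
        self,
        query_text: str,
        top_k: int = 2,
        filters: Optional[Dict[str, str]] = None,
    ) -> List[str]:
        query_words = set(query_text.lower().split())
        scored = []
        for node in self._index:
            # Exact-match metadata filter: skip nodes that fail any key.
            if filters and any(node.metadata.get(k) != v for k, v in filters.items()):
                continue
            score = len(query_words & set(node.text.lower().split()))
            if score > 0:
                scored.append((score, node.node_id))
        # Highest score first; ties broken by node id for determinism.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return [node_id for _, node_id in scored[:top_k]]
```

The point of the sketch is the cost model: a delete triggers a full rebuild, so batching deletions before rebuilding is likely cheaper than deleting documents one at a time.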
3. Testing and Validation
Created comprehensive unit tests in test_leann_vector_store.py.
Verified the integration via runtime import tests and mock setups.
Provided a full usage example in leann_example.py.

Verification Results
Import Test
Verified that the package and its dependencies are correctly configured for namespace-compliant imports:

Success core
Success leann
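
A runtime import check of this kind could look roughly like the sketch below. The helper `check_imports` is hypothetical, and mapping the labels "core" and "leann" to `llama_index.core` and `llama_index.vector_stores.leann` is an assumption based on the import paths in the usage example, not the actual test script from this PR.

```python
import importlib
from typing import Dict


def check_imports(modules: Dict[str, str]) -> Dict[str, bool]:
    """Try to import each module and report success per label."""
    results = {}
    for label, module_name in modules.items():
        try:
            importlib.import_module(module_name)
            results[label] = True
        except ImportError:
            # ModuleNotFoundError is a subclass of ImportError.
            results[label] = False
    return results


if __name__ == "__main__":
    # Assumed label-to-module mapping; adjust to the real package layout.
    targets = {
        "core": "llama_index.core",
        "leann": "llama_index.vector_stores.leann",
    }
    for label, ok in check_imports(targets).items():
        print("Success" if ok else "Failed", label)
```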
Unit Tests
The unit tests cover:

Basic insertion and retrieval.
Document deletion and index rebuild.
Metadata preservation.
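
As an illustration of the mock-based style mentioned above, here is a minimal sketch of checking metadata preservation without building a real index. `AccumulatingAdapter` and `run_mock_check` are hypothetical, and the `add_text` and `build_index` method names on the mocked builder are assumptions; the tests in test_leann_vector_store.py may be structured differently.

```python
from unittest.mock import MagicMock


class AccumulatingAdapter:
    """Hypothetical adapter: forwards texts to a builder, builds on persist()."""

    def __init__(self, builder, index_path: str) -> None:
        self._builder = builder
        self._index_path = index_path

    def add(self, texts_with_metadata) -> None:
        # Forward each text together with its metadata, unchanged.
        for text, metadata in texts_with_metadata:
            self._builder.add_text(text, metadata=metadata)

    def persist(self) -> None:
        # The index is only built when persist() is called.
        self._builder.build_index(self._index_path)


def run_mock_check() -> MagicMock:
    """Exercise the adapter against a mocked builder and return the mock."""
    builder = MagicMock()  # assumed method names: add_text, build_index
    adapter = AccumulatingAdapter(builder, "./my_index_dir")

    adapter.add([("LEANN is compact", {"source": "a.txt"})])
    builder.build_index.assert_not_called()  # nothing built before persist()

    adapter.persist()
    builder.add_text.assert_called_once_with(
        "LEANN is compact", metadata={"source": "a.txt"}
    )
    builder.build_index.assert_called_once_with("./my_index_dir")
    return builder
```

Mocking the builder keeps such a test fast and independent of any embedding model, at the cost of not exercising real search quality.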
Usage Example
from llama_index.vector_stores.leann import LeannVectorStore
from llama_index.core import VectorStoreIndex, StorageContext

# Initialize LEANN
vector_store = LeannVectorStore(
    index_path="./my_index_dir",
    embedding_model="facebook/contriever",
)

# Connect to LlamaIndex
# `documents` is a list of LlamaIndex Document objects loaded beforehand.
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)

# Query
query_engine = index.as_query_engine()
response = query_engine.query("What is LEANN?")

feat: Add LEANN vector store integration including implementation, tests, examples, and documentation.
@AnirbansarkarS
Author

@yichuan-w this implements the enhancement from issue #217

@ASuresh0524
Collaborator

will look into it, looks fine to me as long as CI is fixed

@ASuresh0524
Collaborator

@AnirbansarkarS bump on this
