Background
Scalability Limitations of Current Architecture
WeKnora's retrieval engine architecture works well for small-scale, single-backend environments, but structural limitations emerge when scaling to production.
Limitation 1: Registry is a Singleton per EngineType
The current RetrieveEngineRegistry is implemented as map[RetrieverEngineType]RetrieveEngineService. Attempting to register more than one engine of the same type results in a "repository type %s already registered" error.
// registry.go — current structure
type RetrieveEngineRegistry struct {
    repositories map[types.RetrieverEngineType]interfaces.RetrieveEngineService
}

func (r *RetrieveEngineRegistry) Register(repo interfaces.RetrieveEngineService) error {
    if _, exists := r.repositories[repo.EngineType()]; exists {
        return fmt.Errorf("repository type %s already registered", repo.EngineType())
    }
    // ...
}
This makes it impossible to run multiple instances of the same DB type. For example, you cannot operate two ES clusters for hot/warm tiers or regional separation. (Different DB types — e.g., one ES + one Qdrant — can coexist since they have different EngineTypes, but two instances of the same type cannot.)
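The collision can be shown with a minimal, self-contained sketch of the current behavior (names are simplified stand-ins for the real `registry.go` types):

```go
package main

import "fmt"

// Simplified stand-ins for the real types in registry.go.
type EngineType string

type EngineService struct {
	Type EngineType
	Name string
}

type Registry struct {
	repositories map[EngineType]*EngineService
}

func NewRegistry() *Registry {
	return &Registry{repositories: make(map[EngineType]*EngineService)}
}

// Register mirrors the current one-per-type constraint: a second engine of
// the same type is rejected, even if it points at a different cluster.
func (r *Registry) Register(s *EngineService) error {
	if _, exists := r.repositories[s.Type]; exists {
		return fmt.Errorf("repository type %s already registered", s.Type)
	}
	r.repositories[s.Type] = s
	return nil
}

func main() {
	reg := NewRegistry()
	fmt.Println(reg.Register(&EngineService{Type: "elasticsearch_v8", Name: "es-hot"}))  // <nil>
	fmt.Println(reg.Register(&EngineService{Type: "elasticsearch_v8", Name: "es-warm"})) // duplicate-type error
}
```

The second `Register` call fails regardless of which cluster the service actually points at, which is exactly what blocks the hot/warm scenario.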
Limitation 2: RETRIEVE_DRIVER is a Global Environment Variable
initRetrieveEngineRegistry() in container.go parses os.Getenv("RETRIEVE_DRIVER") once at startup, creating a single client. All Knowledge Bases are forced to share the same vector DB instance.
// container.go — current initialization flow
func initRetrieveEngineRegistry(db *gorm.DB, cfg *config.Config) {
    retrieveDriver := strings.Split(os.Getenv("RETRIEVE_DRIVER"), ",")
    // "elasticsearch_v8" → creates exactly one ES client
    // all KBs share this instance
}
Limitation 3: All KBs Share a Single Index/Collection
Even within the same vector DB instance, all KB data is stored in a single index (or collection). KB isolation relies solely on filtering by the knowledge_base_id field within documents.
// elasticsearch/v8/repository.go — all KB data in a single index
indexName := os.Getenv("ELASTICSEARCH_INDEX") // default: "xwrag_default"
res := &elasticsearchRepository{client: client, index: indexName}
// → all KB vectors go into this single index

// filtering by knowledge_base_id at query time
must = append(must, types.Query{Terms: &types.TermsQuery{
    TermsQuery: map[string]types.TermsQueryField{
        "knowledge_base_id.keyword": params.KnowledgeBaseIDs,
    },
}})
Milvus similarly uses a single MILVUS_COLLECTION env var for the entire collection name (weknora_embeddings_{dimension}).
Problems with this approach:
- No performance isolation: if KB-A holds 10M documents, even a search over a 100-document KB-B pays for the total index size (especially with brute-force search)
- No operational flexibility: Cannot tune shard count, replicas, or HNSW parameters per KB
- No security isolation: Index-level access control is impossible when all KB data resides in a single index
- Index management complexity: A bloated single index means reindexing, snapshots, and recovery operations affect all KBs
Limitation 4: No Vector Store Binding on Knowledge Base
Retrieval engine configuration exists only at the Tenant level (Tenant.RetrieverEngines). There is no way to choose which vector DB instance or which index to use when creating a KB. Limitations 2 (single instance) and 3 (single index) combine to make flexible per-KB placement fundamentally impossible.
Elasticsearch Vector Search Performance Limitations
The current ES driver performs vector search using script_score + cosineSimilarity:
// elasticsearch/v8/repository.go
scoreSource := "cosineSimilarity(params.query_vector, 'embedding')"
This is a brute-force linear scan (O(N)). Search latency increases linearly with document count:
| Documents | script_score (brute-force) | ANN (HNSW) |
|-----------|---------------------------|------------|
| 100K | hundreds of ms | < 10 ms |
| 1M | seconds (SLA risk) | ~10 ms |
| 10M | timeout | 20-50 ms |
Additionally, index creation does not set explicit dense_vector mappings or HNSW parameters, relying on ES auto-mapping:
// elasticsearch/v8/repository.go — index created without mapping
_, err = e.client.Indices.Create(e.index).Do(ctx)
// → no dense_vector type or similarity settings
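For contrast, an explicit mapping would pin down the vector field type, dimensionality, and similarity up front. The sketch below only builds the request body (the field name `embedding`, 768 dims, and cosine similarity are illustrative assumptions; the exact `Indices.Create` call depends on the go-elasticsearch client version in use):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// denseVectorMapping builds the index-creation body that the current driver
// omits. Field name ("embedding") and dims are illustrative; real values
// would come from the embedding model configuration.
func denseVectorMapping(dims int) map[string]any {
	return map[string]any{
		"mappings": map[string]any{
			"properties": map[string]any{
				"embedding": map[string]any{
					"type":       "dense_vector",
					"dims":       dims,
					"index":      true,     // enable ANN (HNSW) indexing
					"similarity": "cosine", // matches the cosineSimilarity scoring used today
				},
			},
		},
	}
}

func main() {
	body, _ := json.Marshal(denseVectorMapping(768))
	// This body would be passed on index creation instead of creating
	// the index with no mapping at all.
	fmt.Println(string(body))
}
```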
Proposal Overview
Core Principle: 100% Backward-Compatible, Opt-in Extension
This proposal does not change any existing behavior.
The current approach — single DB, single index, all KBs sharing — continues to work exactly as-is.
New features (multi-store, index isolation, OpenSearch) are activated only when the user explicitly opts in.
This extends the architecture so that, when desired, multiple vector DB instances can be bound at the KB level, and OpenSearch k-NN native vector search is officially supported.
Key Values
- Multi-store scalability: Register multiple vector DB instances — same type or different types — and select per KB. Or keep using a single instance shared by all KBs, just like today.
- Index isolation: Use independent indices/collections per KB, even within the same DB instance. Or keep using a single shared index, just like today.
- Performance: OpenSearch k-NN (HNSW) native vector search improves complexity from O(N) → O(log N)
- Gradual transition: Non-breaking soft handover — connect only new KBs to a new store while existing KBs remain untouched
Multi-store and index isolation are independent yet complementary values.
- Multi-store: place KBs on different DB instances (cluster-level separation)
- Index isolation: separate KB indices within the same DB instance (index-level separation)
- Both are needed for flexible production operations.
- Both are opt-in. If not configured, behavior is 100% identical to today.
Implementation Plan (5 Phases)
Phase Dependencies
Phase 1 (VectorStore + Registry) ─→ Phase 2 (KB binding) ─→ Phase 4 (Cross-store migration)
↘ Phase 5 (Backward compat + docs)
Phase 3 (OpenSearch driver) — can proceed independently of Phase 1
- Phase 1 is the core prerequisite. Phases 2, 4, 5 depend on Phase 1.
- Phase 3 (OpenSearch driver) can be added to the existing Registry structure, so it can proceed in parallel with Phase 1.
Phase 1: VectorStore Entity + Registry Refactoring
Goal: Manage vector DB instances as first-class entities and extend the Registry to per-instance management.
Changes:
- New `vector_stores` table:

  vector_stores
  ├── id (PK)
  ├── name — human-readable name (e.g., "es-hot", "opensearch-prod")
  ├── engine_type — RetrieverEngineType (elasticsearch, opensearch, qdrant, ...)
  ├── connection_config (JSON) — connection info (addr, username, password, ...)
  ├── index_config (JSON) — index settings (index_prefix, shards, HNSW params, ...)
  ├── index_strategy — index isolation strategy: "shared" (current behavior) | "per_kb" (per-KB index)
  ├── is_default — whether this is the default store
  ├── tenant_id (FK)
  └── created_at / updated_at

- Index isolation strategy (index_strategy):
  - "shared" (default): same as current behavior — all KB data in a single index, filtered by the knowledge_base_id field
  - "per_kb": automatically creates a separate index per KB (e.g., {index_prefix}_{kb_id})
    - Dedicated index created on KB creation, deleted on KB deletion
    - Independent mapping/HNSW parameters per KB
    - Smaller index size improves both brute-force search latency and ANN index build times
  - Existing deployments default to "shared", so no behavior change

- Change the Registry key from EngineType → StoreID:

  // Current: map[RetrieverEngineType]RetrieveEngineService — one per type
  // Proposed: map[StoreID]RetrieveEngineService — one per instance

- RETRIEVE_DRIVER environment variable backward compatibility:
  - If the env var is set, a "default VectorStore" record is created automatically to preserve existing behavior
  - VectorStore table records take precedence over env vars when present
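The key change can be sketched as follows (type and method names are illustrative, not the final API): keying the registry by StoreID lets two instances of the same engine type coexist.

```go
package main

import "fmt"

// Illustrative stand-ins for the proposed per-instance registry.
type StoreID string

type EngineService struct {
	StoreID StoreID
	Type    string // engine type, e.g. "elasticsearch_v8"
}

type StoreRegistry struct {
	stores map[StoreID]*EngineService
}

func NewStoreRegistry() *StoreRegistry {
	return &StoreRegistry{stores: make(map[StoreID]*EngineService)}
}

// Register rejects only duplicate StoreIDs — same-type engines pointing at
// different clusters are now allowed.
func (r *StoreRegistry) Register(s *EngineService) error {
	if _, exists := r.stores[s.StoreID]; exists {
		return fmt.Errorf("store %s already registered", s.StoreID)
	}
	r.stores[s.StoreID] = s
	return nil
}

func (r *StoreRegistry) Get(id StoreID) (*EngineService, bool) {
	s, ok := r.stores[id]
	return s, ok
}

func main() {
	reg := NewStoreRegistry()
	// Two instances of the same engine type are now fine:
	_ = reg.Register(&EngineService{StoreID: "es-hot", Type: "elasticsearch_v8"})
	_ = reg.Register(&EngineService{StoreID: "es-warm", Type: "elasticsearch_v8"})
	s, _ := reg.Get("es-warm")
	fmt.Println(s.Type) // elasticsearch_v8
}
```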
Affected files: registry.go, container.go, new types/vectorstore.go, migration
Phase 2: KB ↔ VectorStore Binding
Goal: Enable selecting which vector DB to use when creating a KB.
Changes:
- Add vector_store_id (FK, nullable) to the KnowledgeBase model
- Add a vector_store_id parameter to the KB create/update APIs
- When vector_store_id is not specified:
  - Use the Tenant's default VectorStore
  - Fall back to the RETRIEVE_DRIVER-based global default
- Modify CompositeRetrieveEngine creation to look up the engine from the KB's bound store
- Index name resolution logic:
  - If the store's index_strategy is "per_kb": use the {index_prefix}_{kb_id} format for a dedicated KB index
  - If the store's index_strategy is "shared": use the single {index_name} index + knowledge_base_id filter (current behavior)
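The index name resolution rule can be sketched as a single pure function (field names are illustrative):

```go
package main

import "fmt"

// Illustrative stand-in for the proposed vector_stores record fields.
type VectorStore struct {
	IndexStrategy string // "shared" | "per_kb"
	IndexPrefix   string // used when per_kb
	IndexName     string // used when shared
}

// resolveIndexName applies the per-store resolution rule described above.
func resolveIndexName(store VectorStore, kbID string) string {
	if store.IndexStrategy == "per_kb" {
		// Dedicated index per KB: {index_prefix}_{kb_id}
		return fmt.Sprintf("%s_%s", store.IndexPrefix, kbID)
	}
	// "shared" (and unset, for backward compatibility) keeps the single
	// index; isolation then relies on the knowledge_base_id query filter.
	return store.IndexName
}

func main() {
	shared := VectorStore{IndexStrategy: "shared", IndexName: "xwrag_default"}
	perKB := VectorStore{IndexStrategy: "per_kb", IndexPrefix: "weknora"}
	fmt.Println(resolveIndexName(shared, "kb-123")) // xwrag_default
	fmt.Println(resolveIndexName(perKB, "kb-123"))  // weknora_kb-123
}
```

Treating an unset strategy the same as "shared" is what keeps upgraded deployments on their current single index.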
Affected files: knowledgebase.go, composite.go, KB CRUD handler/service, index resolution logic in each repository, migration
Phase 3: OpenSearch Driver Implementation
Goal: Native vector search driver using the OpenSearch k-NN plugin.
Changes:
- Add "opensearch" to RetrieverEngineType
- Add an OpenSearch entry to retrieverEngineMapping
- New internal/application/repository/retriever/opensearch/ package:
  - Explicit knn_vector mapping + HNSW parameters on index creation
  - Engine selection: Lucene (< 10M docs), Faiss HNSW (≥ 10M docs)
  - k-NN native query (knn DSL) — ANN, not brute-force
  - Hybrid search: k-NN vector + BM25 keyword combination
- Implements the existing RetrieveEngineRepository interface — same contract as other drivers
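For reference, the shape of an OpenSearch k-NN query body looks roughly like the sketch below (the field name `embedding` is an assumption; only the body is built here, not the client call):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// knnQuery builds an OpenSearch k-NN query body. Unlike script_score,
// this runs against the HNSW graph (approximate nearest neighbors) rather
// than scoring every document. The field name "embedding" is illustrative.
func knnQuery(vector []float32, k int) map[string]any {
	return map[string]any{
		"size": k,
		"query": map[string]any{
			"knn": map[string]any{
				"embedding": map[string]any{
					"vector": vector,
					"k":      k, // neighbors returned per shard
				},
			},
		},
	}
}

func main() {
	body, _ := json.Marshal(knnQuery([]float32{0.1, 0.2, 0.3}, 10))
	fmt.Println(string(body))
}
```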
Expected performance:
- 100K docs: hundreds of ms → < 10 ms (10x+)
- 1M docs: seconds → ~10 ms (100x+)
Phase 4: Cross-store Migration API
Goal: Zero-downtime migration of existing KBs to a different vector DB.
Changes:
- Extend the existing CopyIndices to support cross-store operations
- Read vector data from source store → write to target store
- Direct vector copy without re-computing embeddings (cost savings)
- Migration progress tracking API
- Soft handover workflow:
- Create new VectorStore + new KB (bound to the new store)
- Migrate existing KB data to the new KB
- Deactivate the old KB after verification
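The core of the cross-store copy is a read-batch/write-batch loop over vectors, with no embedding recomputation. A minimal sketch (the interfaces, batch size, and in-memory stores are illustrative, not the final API):

```go
package main

import "fmt"

// VectorRecord is an illustrative stand-in for a stored chunk vector.
type VectorRecord struct {
	ID        string
	Embedding []float32
}

type VectorReader interface {
	ReadBatch(offset, limit int) ([]VectorRecord, error)
}

type VectorWriter interface {
	WriteBatch(records []VectorRecord) error
}

// copyVectors streams vectors from source to target in batches, returning
// the count copied so far — the number a progress-tracking API would expose.
func copyVectors(src VectorReader, dst VectorWriter, batchSize int) (int, error) {
	total := 0
	for offset := 0; ; offset += batchSize {
		batch, err := src.ReadBatch(offset, batchSize)
		if err != nil {
			return total, err
		}
		if len(batch) == 0 {
			return total, nil // source exhausted
		}
		if err := dst.WriteBatch(batch); err != nil {
			return total, err
		}
		total += len(batch)
	}
}

// In-memory stand-ins for two stores, to show the flow end to end.
type memStore struct{ records []VectorRecord }

func (m *memStore) ReadBatch(offset, limit int) ([]VectorRecord, error) {
	if offset >= len(m.records) {
		return nil, nil
	}
	end := offset + limit
	if end > len(m.records) {
		end = len(m.records)
	}
	return m.records[offset:end], nil
}

func (m *memStore) WriteBatch(records []VectorRecord) error {
	m.records = append(m.records, records...)
	return nil
}

func main() {
	src := &memStore{records: []VectorRecord{{ID: "a"}, {ID: "b"}, {ID: "c"}}}
	dst := &memStore{}
	n, _ := copyVectors(src, dst, 2)
	fmt.Println(n, len(dst.records)) // 3 3
}
```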
Phase 5: Backward Compatibility + Documentation
Goal: Guarantee zero-downtime upgrades for existing deployments. Zero impact on existing users who change nothing.
Changes:
- Automatic fallback chain on upgrade:
  - If the VectorStore table is empty → operate identically using the RETRIEVE_DRIVER env var
  - If VectorStore records exist but a KB has vector_store_id = NULL → fall back to the Tenant default store
  - If index_strategy is not set → "shared" (single index, current behavior)
- All existing env vars (RETRIEVE_DRIVER, ELASTICSEARCH_INDEX, MILVUS_COLLECTION, etc.) fully preserved
- No breaking changes: env-var-only deployments work as before; all new tables/fields are nullable or have defaults
- Migration guide: transitioning from current → multi-store, current → per_kb index
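The fallback chain for choosing a store reduces to a few ordered checks. A minimal sketch, with plain strings standing in for the KB binding, the tenant default, and the env var:

```go
package main

import "fmt"

// resolveStoreID applies the fallback chain above. Empty string means
// "not set"; the inputs are illustrative stand-ins for the real records.
func resolveStoreID(kbStoreID, tenantDefault, envDriver string) string {
	if kbStoreID != "" {
		return kbStoreID // 1. explicit KB binding wins
	}
	if tenantDefault != "" {
		return tenantDefault // 2. tenant default store
	}
	return "env:" + envDriver // 3. RETRIEVE_DRIVER-based global default
}

func main() {
	fmt.Println(resolveStoreID("es-hot", "tenant-default", "elasticsearch_v8")) // es-hot
	fmt.Println(resolveStoreID("", "tenant-default", "elasticsearch_v8"))       // tenant-default
	fmt.Println(resolveStoreID("", "", "elasticsearch_v8"))                     // env:elasticsearch_v8
}
```

Because every step degrades to the next only when the previous is unset, an untouched deployment always ends at step 3 — the current behavior.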
Backward Compatibility Strategy
The core principle of this proposal is not to break any existing deployment. All new features are opt-in. If nothing is configured, behavior is 100% identical to today.
DB Instance Level
| Scenario | Behavior |
|----------|----------|
| Only RETRIEVE_DRIVER set, VectorStore table empty | 100% identical to current. Env-var-based single client |
| VectorStore records exist, KB has no vector_store_id | Falls back to Tenant default store → global default store |
| KB has vector_store_id set | Uses that specific store for search/indexing |
Index/Collection Level
| Scenario | Behavior |
|----------|----------|
| index_strategy not set or "shared" | 100% identical to current. Single index for all KBs, knowledge_base_id field filtering |
| index_strategy = "per_kb" | Independent index auto-created per KB ({prefix}_{kb_id}). Linked to KB create/delete |
Deployment Scenarios
Scenario 1: Existing user (changes nothing)
RETRIEVE_DRIVER=elasticsearch_v8
→ VectorStore table is empty
→ Same single ES client, single index (xwrag_default)
→ Changes: none. Just upgrade the code and everything works identically.
Scenario 2: Same DB, only want index isolation
RETRIEVE_DRIVER=elasticsearch_v8
→ Create 1 VectorStore (index_strategy="per_kb")
→ Dedicated index auto-created per KB
→ DB instance stays the same, only indices are separated per KB
Scenario 3-a: Multiple instances of the same DB type
→ VectorStores: ES-hot (recent docs), ES-warm (archive)
→ Frequently searched KBs → ES-hot, archive KBs → ES-warm
→ Same ES type but separate clusters, separate hardware
Scenario 3-b: Mixed DB types
→ VectorStores: ES-legacy (existing), OpenSearch-prod (new)
→ Existing KBs stay on ES as-is, new KBs go to OpenSearch
→ Each store can have its own independent index_strategy
Scenario 4: Gradual transition
→ Existing KBs remain on existing store + shared index
→ Only new KBs created on new store + per_kb index
→ Migrate existing KBs using the migration API when ready
No breaking changes in any scenario. Unless the user explicitly creates a VectorStore or changes index_strategy, everything works identically to today.
Why This Is Valuable for Upstream
- Real production needs: Our team hit these limitations in a production environment managing millions of documents. Other production users are likely facing the same issues.
- Architecture improvement: This is not just a feature addition — it naturally extends the existing design, respecting the existing interfaces (RetrieveEngineService, RetrieveEngineRegistry) while extending them.
- OpenSearch ecosystem: Immediately valuable for users of managed services like AWS OpenSearch Service. k-NN native vector search offers dramatically better performance compared to ES script_score.
- Incremental adoption: Merging just Phase 1 already provides the foundation for the multi-store architecture, and the OpenSearch driver can be added independently. There is no need to merge all phases at once.