feat: add support for custom embedding model #56
Conversation
5b529a6 to 52c9984 — Compare
52c9984 to d7fd99e — Compare
@coderabbitai review
✅ Actions performed: Review triggered.
Walkthrough
Adds embedding support across config, models, and evaluation: new EmbeddingConfig, EmbeddingManager, and RagasEmbeddingManager; defaults added to constants. Refactors LLM env validation into module-level functions. Updates SystemConfig loader and usage. Modifies metrics (Ragas accepts embeddings, removes from_system_config). Adjusts evaluator/pipeline construction and dependencies. Adds dependencies.
Changes
Sequence Diagram(s)
sequenceDiagram
autonumber
participant User
participant ConfigLoader
participant SystemConfig
participant LLMManager
participant EmbeddingManager
participant RagasEmbeddingManager
participant MetricsEvaluator
User->>ConfigLoader: load()
ConfigLoader->>SystemConfig: construct (includes embedding)
User->>MetricsEvaluator: __init__(config_loader, metric_manager)
MetricsEvaluator->>SystemConfig: get()
MetricsEvaluator->>LLMManager: from_system_config(SystemConfig)
note right of LLMManager: Runs provider env validators
MetricsEvaluator->>EmbeddingManager: from_system_config(SystemConfig)
note right of EmbeddingManager: Validates embedding provider/env
MetricsEvaluator->>RagasEmbeddingManager: __init__(EmbeddingManager)
note right of RagasEmbeddingManager: Build embeddings backend
sequenceDiagram
autonumber
participant MetricsEvaluator
participant RagasMetrics
participant LLMManager
participant RagasEmbeddingManager
participant Metric
participant Dataset
MetricsEvaluator->>RagasMetrics: evaluate(metric_name, dataset)
RagasMetrics->>LLMManager: get_llm()
RagasMetrics->>RagasEmbeddingManager: embeddings
RagasMetrics->>Metric: construct(llm, embeddings?, metric_kwargs)
RagasMetrics->>Dataset: compute()
Metric-->>RagasMetrics: result
RagasMetrics-->>MetricsEvaluator: result
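For orientation, the construction order in the first diagram can be read as the sketch below. It is a self-contained stand-in built only from the names in the walkthrough, not the repository's code; in particular, the real MetricsEvaluator takes a config_loader and metric_manager rather than a SystemConfig.

```python
# Stand-in mirroring the construction order in the diagram above.
# Every class here is a simplified placeholder named after the walkthrough.
from dataclasses import dataclass, field


@dataclass
class LLMConfig:
    provider: str = "openai"
    model: str = "gpt-4o-mini"


@dataclass
class EmbeddingConfig:
    provider: str = "openai"
    model: str = "text-embedding-3-small"


@dataclass
class SystemConfig:
    llm: LLMConfig = field(default_factory=LLMConfig)
    embedding: EmbeddingConfig = field(default_factory=EmbeddingConfig)


class LLMManager:
    def __init__(self, config: LLMConfig) -> None:
        self.config = config  # the real manager validates provider env vars here

    @classmethod
    def from_system_config(cls, system_config: SystemConfig) -> "LLMManager":
        return cls(system_config.llm)


class EmbeddingManager:
    def __init__(self, config: EmbeddingConfig) -> None:
        self.config = config  # the real manager validates the embedding provider/env here

    @classmethod
    def from_system_config(cls, system_config: SystemConfig) -> "EmbeddingManager":
        return cls(system_config.embedding)


class RagasEmbeddingManager:
    def __init__(self, embedding_manager: EmbeddingManager) -> None:
        # the real class builds a LangChain embeddings backend for Ragas here
        self.embeddings = f"{embedding_manager.config.provider}/{embedding_manager.config.model}"


class MetricsEvaluator:
    def __init__(self, system_config: SystemConfig) -> None:
        self.llm_manager = LLMManager.from_system_config(system_config)
        self.embedding_manager = EmbeddingManager.from_system_config(system_config)
        self.ragas_embeddings = RagasEmbeddingManager(self.embedding_manager)


evaluator = MetricsEvaluator(SystemConfig())
print(evaluator.ragas_embeddings.embeddings)  # openai/text-embedding-3-small
```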
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
@coderabbitai review
✅ Actions performed: Review triggered.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/lightspeed_evaluation/core/metrics/custom.py (1)
58-64: Broaden exception handling in evaluate() to avoid crashing the pipeline.
_call_llm() raises RuntimeError, which isn’t caught here; evaluation will bubble up and abort runs. Apply:
-        except (ValueError, AttributeError, KeyError) as e:
+        except Exception as e:  # includes RuntimeError from _call_llm
             return None, f"Custom {metric_name} evaluation failed: {str(e)}"
🧹 Nitpick comments (14)
src/lightspeed_evaluation/core/metrics/custom.py (3)
29-29: OK to drop factory; minor style nit. Class-level pylint disable is fine. Consider documenting construction via LLMManager in module docstring for discoverability.
46-46: Use logging instead of print for library code. Apply:
+import logging
+logger = logging.getLogger(__name__)
 @@
-    print(f"✅ Custom Metrics initialized: {self.model_name}")
+    logger.info("Custom Metrics initialized: %s", self.model_name)
23-25: Tighten type hints for contexts.
Optional[list] is too generic. Apply:
-    contexts: Optional[list] = Field(
+    contexts: Optional[list[dict | str]] = Field(
         None, description="Context information if available"
     )
src/lightspeed_evaluation/core/metrics/deepeval.py (1)
70-76: Guard against unexpected DeepEval errors. Broaden the catch to prevent a single bad metric run from aborting evaluation.
Apply:
-        except (ValueError, AttributeError, KeyError) as e:
+        except Exception as e:
             return None, f"DeepEval {metric_name} evaluation failed: {str(e)}"
src/lightspeed_evaluation/core/constants.py (1)
14-15: Sensible defaults; confirm provider/model align with LiteLLM config. If OpenAI env isn’t set, ensure callers fail fast with clear errors or switch to HF defaults.
config/system.yaml (1)
20-33: Embedding config docs are clear; consider adding an OpenAI example. Example:
# embedding:
#   provider: "openai"
#   model: "text-embedding-3-small"
#   provider_kwargs: {}
tests/unit/core/llm/test_manager.py (1)
135-145: Add a couple more assertions for robustness.
- Case-insensitive provider for embeddings.
- Default embedding config behavior when embedding is absent.
If helpful, I can add tests for these.
src/lightspeed_evaluation/core/llm/ragas.py (1)
1-9: Optional: avoid heavy imports at module import time. LangChain providers are heavy; consider lazy-importing them inside RagasEmbeddingManager.__init__ to reduce startup cost when Ragas metrics aren’t used.
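As a sketch of the suggested pattern (provider names assumed from this PR; this is illustrative, not the actual RagasEmbeddingManager):

```python
# Lazy-import sketch: heavy LangChain providers are imported only when an
# embedding backend is actually constructed.
class LazyEmbeddingBackend:
    def __init__(self, provider: str, model: str) -> None:
        if provider == "huggingface":
            from langchain_huggingface import HuggingFaceEmbeddings  # deferred heavy import
            self.embeddings = HuggingFaceEmbeddings(model_name=model)
        elif provider == "openai":
            from langchain_openai import OpenAIEmbeddings  # deferred heavy import
            self.embeddings = OpenAIEmbeddings(model=model)
        else:
            raise ValueError(f"Unknown embedding provider: {provider}")
```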
src/lightspeed_evaluation/core/metrics/ragas.py (2)
69-89: Derive result column from metric instance to avoid version drift. Hard-coding result_key risks KeyError across ragas versions (e.g., context relevance). Use the metric instance’s name and guard missing columns.
-    def _evaluate_metric(  # pylint: disable=too-many-positional-arguments,too-many-arguments
+    def _evaluate_metric(  # pylint: disable=too-many-positional-arguments,too-many-arguments
         self,
         metric_class: Any,
         metric_kwargs: dict[str, Any] | None,
         dataset_dict: dict[str, Any],
-        result_key: str,
         metric_name: str,
     ) -> tuple[Optional[float], str]:
 @@
-        metric_instance = metric_class(llm=self.llm_manager.get_llm(), **metric_kwargs)
+        metric_instance = metric_class(llm=self.llm_manager.get_llm(), **metric_kwargs)
         result = evaluate(dataset, metrics=[metric_instance])
         df = result.to_pandas()
-        score = df[result_key].iloc[0]
+        col = getattr(metric_instance, "name", None)
+        if not col:
+            return None, f"Ragas {metric_name} missing metric name for result column"
+        try:
+            score = df[col].iloc[0]
+        except KeyError:
+            return None, f"Ragas {metric_name} result column '{col}' not found. Available: {list(df.columns)}"
         return score, f"Ragas {metric_name}: {score:.2f}"
Note: Update callers to drop the result_key argument.
279-284: Verify column name for ContextRelevance or rely on dynamic name
"nv_context_relevance"may not match ragas’ column in some versions. Safer to use the metric instance’snameper suggestion above.src/lightspeed_evaluation/core/llm/manager.py (4)
53-59: Prefer logger over print in validatorsUse logging for
_validate_ollama_env(info) instead of printing.+import logging @@ +logger = logging.getLogger(__name__) @@ - if not os.environ.get("OLLAMA_HOST"): - print("ℹ️ OLLAMA_HOST not set, using default localhost:11434") + if not os.environ.get("OLLAMA_HOST"): + logger.info("OLLAMA_HOST not set, using default localhost:11434")
74-76: Replace constructor print with logger. Emits to stdout; prefer logger.
-        print(
-            f"✅ LLM Manager: {self.config.provider}/{self.config.model} -> {self.model_name}"
-        )
+        logger.info(
+            "LLM Manager: %s/%s -> %s",
+            self.config.provider, self.config.model, self.model_name
+        )
167-169: Replace EmbeddingManager print with logger. Same logging consistency nit.
-        print(
-            f"""
-✅ Embedding Manager: {self.config.provider} -- {self.config.model} {self.config.provider_kwargs}"""
-        )
+        logger.info(
+            "Embedding Manager: %s -- %s %s",
+            self.config.provider, self.config.model, self.config.provider_kwargs
+        )
189-192: Fix docstring: EmbeddingManager factory. Minor wording nit.
 @classmethod
 def from_system_config(cls, system_config: SystemConfig) -> "EmbeddingManager":
-    """Create LLM Manager from system configuration."""
+    """Create Embedding Manager from system configuration."""
     return cls(system_config.embedding)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
uv.lock is excluded by !**/*.lock
📒 Files selected for processing (14)
- config/system.yaml (1 hunks)
- pyproject.toml (1 hunks)
- src/lightspeed_evaluation/core/constants.py (1 hunks)
- src/lightspeed_evaluation/core/llm/manager.py (3 hunks)
- src/lightspeed_evaluation/core/llm/ragas.py (2 hunks)
- src/lightspeed_evaluation/core/metrics/custom.py (1 hunks)
- src/lightspeed_evaluation/core/metrics/deepeval.py (1 hunks)
- src/lightspeed_evaluation/core/metrics/ragas.py (9 hunks)
- src/lightspeed_evaluation/core/models/__init__.py (2 hunks)
- src/lightspeed_evaluation/core/models/system.py (3 hunks)
- src/lightspeed_evaluation/core/system/loader.py (2 hunks)
- src/lightspeed_evaluation/pipeline/evaluation/evaluator.py (3 hunks)
- src/lightspeed_evaluation/pipeline/evaluation/pipeline.py (1 hunks)
- tests/unit/core/llm/test_manager.py (3 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-10T15:48:14.671Z
Learnt from: asamal4
PR: lightspeed-core/lightspeed-evaluation#47
File: src/lightspeed_evaluation/core/output/generator.py:43-49
Timestamp: 2025-09-10T15:48:14.671Z
Learning: In the lightspeed-evaluation framework, system configuration uses Pydantic data models (SystemConfig, OutputConfig, LoggingConfig, etc.) rather than plain dictionaries. Components like OutputHandler receive properly structured Pydantic models, so direct attribute access (e.g., system_config.output.enabled_outputs) is the correct approach.
Applied to files:
- src/lightspeed_evaluation/core/models/__init__.py
- src/lightspeed_evaluation/pipeline/evaluation/evaluator.py
🔇 Additional comments (12)
src/lightspeed_evaluation/core/metrics/deepeval.py (1)
18-18: OK to remove from_system_config; class-level pylint disable is fine.
src/lightspeed_evaluation/core/models/__init__.py (2)
13-13: Re-export of EmbeddingConfig looks good.
30-30: __all__ update is consistent.
config/system.yaml (1)
11-19: Helpful Gemini example. Comment-only change improves discoverability.
tests/unit/core/llm/test_manager.py (2)
7-13: Public API import path changes look correct.
106-126: SystemConfig-based construction test reads well. Covers nested provider_kwargs shape.
src/lightspeed_evaluation/pipeline/evaluation/pipeline.py (1)
55-55: Constructor change: ensure MetricsEvaluator signature updated everywhere.
Found call at src/lightspeed_evaluation/pipeline/evaluation/pipeline.py:55 — metrics_evaluator = MetricsEvaluator(self.config_loader). ast-grep did not locate the MetricsEvaluator class/constructor in the repo; confirm MetricsEvaluator.__init__ no longer requires llm_manager and update the constructor or any remaining call sites that pass llm_manager.
pyproject.toml (1)
30-31: Make embedding deps optional & verify sentence-transformers version compatibility
- pyproject.toml (lines 30–31): "langchain[huggingface]>=0.3.27" and "sentence-transformers>=5.1.0" — verification failed (SSL certificate error when querying PyPI). Confirm that sentence-transformers 5.1.0 exists and is compatible with pinned torch==2.7.0 (5.1.0 may be too new).
- Move heavy embedding deps to an optional extra to avoid pulling them when embeddings aren’t used. Suggested change (apply after verifying versions):
@@ - "tenacity>=9.1.2", - "langchain[huggingface]>=0.3.27", - "sentence-transformers>=5.1.0", + "tenacity>=9.1.2", ] @@ +[project.optional-dependencies] +embeddings = [ + "langchain[huggingface]>=0.3.27", + # pick a known-good floor; adjust after verification + "sentence-transformers>=3.0.0", +]src/lightspeed_evaluation/core/models/system.py (1)
225-227: Good addition: defaulted embedding config on SystemConfigWiring EmbeddingConfig with default_factory is correct and consistent with LLM. No issues.
Please confirm DEFAULT_EMBEDDING_PROVIDER/MODEL defaults point to actually installed providers/models in pyproject to avoid runtime ImportError.
src/lightspeed_evaluation/pipeline/evaluation/evaluator.py (2)
37-38: Correct dependency injection into RagasMetrics. Passing both LLM and Embedding managers is the right move.
26-35: Constructor call sites updated — no action required. Search found a single instantiation: MetricsEvaluator(self.config_loader) at src/lightspeed_evaluation/pipeline/evaluation/pipeline.py:55; no callers pass llm_manager.
src/lightspeed_evaluation/core/metrics/ragas.py (1)
147-153: Nice: embeddings passed for ResponseRelevancy. Correctly wiring embeddings into ResponseRelevancy.
"openai": _validate_openai_env,
# "google": _validate_gemini_env, # Google embeddings are not supported at the moment
Do we really need to support the openai embedding model?
For embedding models I would rather prefer supporting just the open source models.
It is very unlikely that someone would use the openai embedding model when open source models are equally capable.
Yes, because this is the default for Ragas. Also, Huggingface embeddings run locally, which might not be ideal for some scenarios.
because this is the default for Ragas
I don't think we have to keep same default or use same model.
runs locally which might not be ideal for some scenarios
Not sure about the scenario (unless system resources are unreasonably low).
OpenAI adds cost and limits options, while similar-quality models are available for free.
With huggingface, most models get downloaded easily if not present on the system. To avoid this, a team using the eval tool can provide a model path (the way it is done in OLS RAG).
For a local run, it is a one-time download.
But I guess, no harm in keeping it as an option.
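For illustration, a hedged sketch of pointing the huggingface provider at a pre-downloaded model; the path below is a placeholder (not a value from this PR), and model_kwargs mirrors the provider_kwargs example documented in config/system.yaml.

```python
# Sketch only: use a local model directory so nothing is fetched at runtime.
from langchain_huggingface import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(
    model_name="/models/all-MiniLM-L6-v2",  # hypothetical pre-downloaded model path
    model_kwargs={"device": "cpu"},
)
vector = embeddings.embed_query("What is the Lightspeed evaluation pipeline?")
print(len(vector))
```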
DEFAULT_EMBEDDING_PROVIDER = "openai"
DEFAULT_EMBEDDING_MODEL = "text-embedding-3-small"
This is just for reference; still, I would suggest keeping an open source model as the default.
Note that choosing some huggingface model requires downloading and running it locally, IMHO a big change from the current setup.
For the default, I would personally prefer a free model.
With huggingface, most models get downloaded easily if not present on the system.
For openai, the team needs to have access/a key.
I know we have openai as the default LLM, but mostly people use gemini, so they won't have an embedding model.
    return cls(llm_config)


class EmbeddingManager:  # pylint: disable=too-few-public-methods
Please keep the embedding model initialization code outside llm, in a folder named embedding.
Have to say that the current llm configuration is slightly over complicated. But I can create another module for sure :)
Thank you..
Have to say that the current llm configuration is slightly over complicated
it seems complicated, but we can support new providers very easily. Perhaps a separate discussion..
Please ping me or create a story with details, so that we can improve.
import litellm
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_openai import OpenAIEmbeddings
Same as earlier, I would rather prefer just the opensource embedding models.
config/system.yaml
Outdated
# Example Google Gemini LLM
# llm:
#   provider: "gemini"
#   model: "gemini-1.5-pro"
#   temperature: 0.0
#   max_tokens: 512
#   timeout: 120
#   num_retries: 3
One example is already in place for llm (openai).
Can we remove the gemini-specific example from this file? We are not going to add every llm provider example here.
Agreed, I'll move it to some docs.
Actionable comments posted: 0
🧹 Nitpick comments (4)
src/lightspeed_evaluation/core/embedding/ragas.py (1)
18-24: Add error handling for unsupported providers. The current implementation raises a generic RuntimeError for unknown providers, but this should align with the validation pattern established elsewhere in the codebase.
Consider creating a specific EmbeddingError and importing it from the manager module:
+from ..embedding.manager import EmbeddingError
+
 embedding_class = {
     "openai": OpenAIEmbeddings,
     "huggingface": HuggingFaceEmbeddings,
 }.get(config.provider)
 if not embedding_class:
-    raise RuntimeError(f"Unknown embedding provider {config.provider}")
+    raise EmbeddingError(f"Unknown embedding provider {config.provider}")
src/lightspeed_evaluation/core/models/system.py (1)
72-85: Consider validation of provider_kwargs structure. While the typing is correct, consider adding validation to ensure provider_kwargs contains expected keys for each provider to catch configuration errors early.
 provider_kwargs: Optional[dict[str, Any]] = Field(
     default=None,
     description="Embedding provider arguments, e.g. model_kwargs: device:cpu",
 )
+@field_validator("provider_kwargs")
+@classmethod
+def _validate_provider_kwargs(cls, v: Optional[dict[str, Any]], info) -> Optional[dict[str, Any]]:
+    if v is None:
+        return v
+
+    provider = info.data.get("provider", "")
+    if provider == "huggingface" and "model_kwargs" in v:
+        # Validate model_kwargs is a dict
+        if not isinstance(v["model_kwargs"], dict):
+            raise ValueError("model_kwargs must be a dictionary")
+    return v
src/lightspeed_evaluation/core/embedding/manager.py (2)
29-40: Potential validation redundancy with EmbeddingConfig. The manager validates providers again, but EmbeddingConfig already has field validation. This could create inconsistency if the allowed sets diverge.
Consider leveraging the EmbeddingConfig validation rather than duplicating it:
 def _validate_config(self) -> None:
     """Validate config and env variables."""
     def empty_check() -> None:
         pass
-    env_validator = {
-        "openai": validate_openai_env,
-        # "google": _validate_gemini_env,  # Google embeddings are not supported at the moment
-        "huggingface": empty_check,
-    }.get(self.config.provider)
-
-    if env_validator is None:
-        raise EmbeddingError(
-            f"Unsupported embedding provider {self.config.provider}"
-        )
-
-    env_validator()
+    # Note: Provider validation is already handled by EmbeddingConfig
+    env_validators = {
+        "openai": validate_openai_env,
+        "huggingface": empty_check,
+    }
+
+    validator = env_validators.get(self.config.provider, empty_check)
+    validator()
18-21: Consider more informative success message. The current print statement formatting could be cleaner and more consistent with other manager outputs.
-print(
-    f"""
-✅ Embedding Manager: {self.config.provider} -- {self.config.model} {self.config.provider_kwargs}"""
-)
+print(f"✅ Embedding Manager: {self.config.provider}/{self.config.model}")
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
- README.md (1 hunks)
- config/system.yaml (1 hunks)
- src/lightspeed_evaluation/core/embedding/manager.py (1 hunks)
- src/lightspeed_evaluation/core/embedding/ragas.py (1 hunks)
- src/lightspeed_evaluation/core/llm/__init__.py (2 hunks)
- src/lightspeed_evaluation/core/llm/manager.py (3 hunks)
- src/lightspeed_evaluation/core/metrics/ragas.py (9 hunks)
- src/lightspeed_evaluation/core/models/system.py (3 hunks)
- src/lightspeed_evaluation/core/system/loader.py (2 hunks)
- src/lightspeed_evaluation/pipeline/evaluation/evaluator.py (2 hunks)
- src/lightspeed_evaluation/pipeline/evaluation/pipeline.py (1 hunks)
- tests/unit/core/llm/test_manager.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- config/system.yaml
- src/lightspeed_evaluation/pipeline/evaluation/pipeline.py
- src/lightspeed_evaluation/pipeline/evaluation/evaluator.py
- src/lightspeed_evaluation/core/system/loader.py
🧰 Additional context used
🧬 Code graph analysis (6)
src/lightspeed_evaluation/core/llm/__init__.py (1)
  src/lightspeed_evaluation/core/llm/manager.py (4)
    LLMError (9-10), LLMManager (61-156), validate_gemini_env (43-50), validate_openai_env (13-18)
src/lightspeed_evaluation/core/embedding/ragas.py (1)
  src/lightspeed_evaluation/core/embedding/manager.py (1)
    EmbeddingManager (11-45)
tests/unit/core/llm/test_manager.py (2)
  src/lightspeed_evaluation/core/llm/manager.py (1)
    LLMManager (61-156)
  src/lightspeed_evaluation/core/models/system.py (2)
    LLMConfig (33-64), SystemConfig (228-255)
src/lightspeed_evaluation/core/embedding/manager.py (2)
  src/lightspeed_evaluation/core/llm/manager.py (1)
    validate_openai_env (13-18)
  src/lightspeed_evaluation/core/models/system.py (2)
    EmbeddingConfig (67-95), SystemConfig (228-255)
src/lightspeed_evaluation/core/metrics/ragas.py (5)
  src/lightspeed_evaluation/core/embedding/manager.py (1)
    EmbeddingManager (11-45)
  src/lightspeed_evaluation/core/embedding/ragas.py (1)
    RagasEmbeddingManager (10-31)
  src/lightspeed_evaluation/core/llm/manager.py (3)
    LLMManager (61-156), get_model_name (130-132), get_litellm_params (134-142)
  src/lightspeed_evaluation/core/llm/ragas.py (2)
    RagasLLMManager (84-111), get_llm (102-104)
  src/lightspeed_evaluation/core/models/data.py (2)
    EvaluationScope (211-222), TurnData (34-126)
src/lightspeed_evaluation/core/llm/manager.py (2)
  src/lightspeed_evaluation/core/models/system.py (2)
    LLMConfig (33-64), SystemConfig (228-255)
  src/lightspeed_evaluation/core/embedding/manager.py (1)
    from_system_config (43-45)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: mypy
- GitHub Check: tests (3.13)
- GitHub Check: tests (3.11)
🔇 Additional comments (14)
README.md (1)
159-184: LGTM: Well-documented embedding configuration example. The non-OpenAI configuration example provides clear documentation of the new embedding integration, including HuggingFace embeddings with provider_kwargs. The examples align with the implementation changes in the codebase.
tests/unit/core/llm/test_manager.py (2)
7-11: LGTM: Imports aligned with new architecture. The imports now correctly include SystemConfig and utilize the centralized import structure from the core models module.
105-114: LGTM: Test updated to use SystemConfig validation. The test correctly constructs a SystemConfig using model_validate, which ensures proper validation of the configuration structure including the new embedding configuration.
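As an aside on that construction style, here is a minimal stand-in (not the repo's actual SystemConfig or EmbeddingConfig) showing how an embedding block like the commented example in config/system.yaml can be validated with Pydantic's model_validate:

```python
# Stand-in sketch of validating an embedding config block with Pydantic v2.
from typing import Any, Optional

import yaml
from pydantic import BaseModel, Field, field_validator

EXAMPLE_YAML = """
embedding:
  provider: "openai"
  model: "text-embedding-3-small"
  provider_kwargs: {}
"""


class EmbeddingConfigSketch(BaseModel):
    provider: str = "openai"
    model: str = "text-embedding-3-small"
    provider_kwargs: Optional[dict[str, Any]] = Field(default=None)

    @field_validator("provider")
    @classmethod
    def check_provider(cls, v: str) -> str:
        if v.lower() not in {"openai", "huggingface"}:
            raise ValueError(f"Unsupported embedding provider: {v}")
        return v.lower()


config = EmbeddingConfigSketch.model_validate(yaml.safe_load(EXAMPLE_YAML)["embedding"])
print(config.provider, config.model)
```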
src/lightspeed_evaluation/core/llm/__init__.py (1)
5-5: LGTM: Environment validators exposed for reuse. Making the environment validators publicly available through the module's API allows them to be reused by the embedding manager and other components, promoting code reuse.
Also applies to: 14-16
src/lightspeed_evaluation/core/embedding/ragas.py (1)
25-31: LGTM: Proper kwargs handling and embedding wrapper construction. The implementation correctly handles None provider_kwargs and constructs the LangchainEmbeddingsWrapper as expected by Ragas, based on the web search results showing this is the proper integration pattern.
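A sketch of that integration pattern using the public LangChain and Ragas APIs (assumed to approximate, not reproduce, src/lightspeed_evaluation/core/embedding/ragas.py):

```python
# Build a LangChain embeddings object for the configured provider and hand it to
# Ragas through LangchainEmbeddingsWrapper.
from langchain_openai import OpenAIEmbeddings
from ragas.embeddings import LangchainEmbeddingsWrapper

provider_kwargs: dict = {}  # None in config maps to an empty kwargs dict
base_embeddings = OpenAIEmbeddings(model="text-embedding-3-small", **provider_kwargs)
# For the huggingface provider, the analogous call would be
# HuggingFaceEmbeddings(model_name=..., **provider_kwargs) from langchain_huggingface.
ragas_embeddings = LangchainEmbeddingsWrapper(base_embeddings)
```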
src/lightspeed_evaluation/core/models/system.py (2)
87-95: LGTM: Provider validation addresses previous feedback. The field validator properly restricts embedding providers to the supported set (openai, huggingface) and provides clear error messages, addressing the past review comment about documentation misalignment.
235-237: Address reviewer's question about API embedding requirement. Based on the past review comment questioning why embedding is required for API functionality, consider if the embedding field should be optional or have conditional validation based on API configuration.
src/lightspeed_evaluation/core/llm/manager.py (3)
13-19: LGTM: Proper environment validator extraction. Moving the environment validators to module level enables reuse by the embedding manager and maintains consistent validation patterns across the codebase.
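For illustration, a module-level validator along these lines can be imported by both managers; this is a hypothetical shape, not the exact body of validate_openai_env in core/llm/manager.py:

```python
# Hypothetical env validator sketch: fail fast when the provider's key is missing.
import os


class LLMError(Exception):
    """Raised when a provider's required environment variables are missing."""


def validate_openai_env() -> None:
    if not os.environ.get("OPENAI_API_KEY"):
        raise LLMError("OPENAI_API_KEY is required for the openai provider")
```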
149-151: LGTM: SystemConfig integration properly implemented. The updated from_system_config method correctly uses the structured SystemConfig object and accesses the llm field, aligning with the new configuration architecture.
53-59: Informational message for Ollama is appropriate. The informational message for missing OLLAMA_HOST provides helpful guidance without failing validation, which is appropriate for local development scenarios.
src/lightspeed_evaluation/core/metrics/ragas.py (4)
28-28: LGTM: Clean embedding manager integration. The constructor properly accepts both LLM and embedding managers, and the RagasEmbeddingManager initialization follows the established pattern.
Also applies to: 33-41
71-85: LGTM: Enhanced metric evaluation with embeddings support. The updated _evaluate_metric method signature and implementation properly supports metric-specific configuration, enabling embedding injection for response relevancy metrics.
176-178: LGTM: Consistent metric_kwargs parameter passing. All metric evaluation methods now consistently pass metric_kwargs, maintaining API consistency even when no specific kwargs are needed (empty dict for most metrics).
Also applies to: 200-205, 230-236, 261-263, 281-286
149-155: Approve — embedding integration aligns with Ragas. ResponseRelevancy accepts an "embeddings" kwarg expecting a LangChain-style embeddings object; RagasEmbeddingManager provides a LangchainEmbeddingsWrapper, so passing self.embedding_manager.embeddings is correct.
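For context, a sketch of that wiring with public Ragas and LangChain APIs; it assumes ragas >= 0.2 field names (user_input, response, retrieved_contexts) and uses placeholder models and data, not values from this PR:

```python
# Sketch: ResponseRelevancy consuming a LangChain-style embeddings object.
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from ragas import EvaluationDataset, evaluate
from ragas.embeddings import LangchainEmbeddingsWrapper
from ragas.llms import LangchainLLMWrapper
from ragas.metrics import ResponseRelevancy

llm = LangchainLLMWrapper(ChatOpenAI(model="gpt-4o-mini"))
embeddings = LangchainEmbeddingsWrapper(OpenAIEmbeddings(model="text-embedding-3-small"))

metric = ResponseRelevancy(llm=llm, embeddings=embeddings)
dataset = EvaluationDataset.from_list([{
    "user_input": "What does the evaluation pipeline do?",
    "response": "It scores model answers against the configured metrics.",
    "retrieved_contexts": ["The pipeline runs Ragas and DeepEval metrics."],
}])
result = evaluate(dataset, metrics=[metric])
print(result.to_pandas().iloc[0])
```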
LCORE-649
Summary by CodeRabbit
New Features
Documentation
Refactor
Chores
Tests