[LEADS-140] lightspeed-evaluation has dependency on torch and nvidia* packages that are not required for all use cases #123
Conversation
Walkthrough

Adds optional local HuggingFace embedding support: docs and install instructions, an optional dependency entry in pyproject, a pre-flight HuggingFace dependency check in the embedding manager, and explicit provider-dispatch plus optional disk-cache wiring in the Ragas embedding path.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Caller
    participant EmbeddingManager
    participant EmbeddingFactory
    participant DiskCache
    participant Provider
    Caller->>EmbeddingManager: request embeddings (config, model, cache?)
    EmbeddingManager->>EmbeddingManager: _validate_config()
    alt provider == "huggingface"
        EmbeddingManager->>EmbeddingManager: check_huggingface_available()
    end
    EmbeddingManager->>EmbeddingFactory: build provider instance (provider, model, kwargs)
    EmbeddingFactory->>Provider: initialize provider (OpenAI / HuggingFace / Gemini)
    alt cache enabled
        EmbeddingFactory->>DiskCache: create DiskCacheBackend(...)
        DiskCache-->>EmbeddingFactory: cache backend
        EmbeddingFactory->>Provider: wrap provider with DiskCacheBackend
    end
    EmbeddingFactory-->>EmbeddingManager: return embeddings wrapper
    EmbeddingManager-->>Caller: embeddings wrapper ready
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/lightspeed_evaluation/core/embedding/ragas.py (1)
12-14: Improve type hint for HuggingFaceEmbeddings.

Using `Any` as a type hint reduces type safety. Consider using a more specific type or a protocol.

🔎 Suggested improvement

```diff
-# HuggingFace embeddings are optional - only imported when needed
-# This avoids pulling in torch/nvidia packages (~6GB) when using OpenAI/Gemini
-HuggingFaceEmbeddings: Any = None
+from typing import Optional, Type
+
+# HuggingFace embeddings are optional - only imported when needed
+# This avoids pulling in torch/nvidia packages (~6GB) when using OpenAI/Gemini
+HuggingFaceEmbeddings: Optional[Type] = None
```

As per coding guidelines, provide type hints for all public functions and methods.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
- README.md (1 hunks)
- docs/configuration.md (1 hunks)
- pyproject.toml (2 hunks)
- src/lightspeed_evaluation/core/embedding/manager.py (2 hunks)
- src/lightspeed_evaluation/core/embedding/ragas.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
src/lightspeed_evaluation/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/lightspeed_evaluation/**/*.py: Provide type hints for all public functions and methods
Use Google-style docstrings for all public APIs
Use custom exceptions from core.system.exceptions for error handling
Use structured logging with appropriate levels in lightspeed_evaluation modules
Files:
src/lightspeed_evaluation/core/embedding/manager.py
src/lightspeed_evaluation/core/embedding/ragas.py
🧠 Learnings (2)
📚 Learning: 2025-09-22T07:24:57.366Z
Learnt from: VladimirKadlec
Repo: lightspeed-core/lightspeed-evaluation PR: 56
File: src/lightspeed_evaluation/core/llm/ragas.py:127-140
Timestamp: 2025-09-22T07:24:57.366Z
Learning: HuggingFaceEmbeddings from langchain_huggingface accepts both `model` and `model_name` parameters - `model` serves as an alias for `model_name` parameter in the constructor.
Applied to files:
docs/configuration.md
📚 Learning: 2025-09-22T07:24:57.366Z
Learnt from: VladimirKadlec
Repo: lightspeed-core/lightspeed-evaluation PR: 56
File: src/lightspeed_evaluation/core/llm/ragas.py:127-140
Timestamp: 2025-09-22T07:24:57.366Z
Learning: HuggingFaceEmbeddings from langchain_huggingface accepts both `model` and `model_name` parameters - both work as valid parameter names in the constructor due to internal parameter handling.
Applied to files:
docs/configuration.md
🧬 Code graph analysis (1)
src/lightspeed_evaluation/core/embedding/ragas.py (1)
src/lightspeed_evaluation/core/embedding/manager.py (1)
EmbeddingManager (33-61)
🪛 GitHub Actions: Python linter
src/lightspeed_evaluation/core/embedding/ragas.py
[error] 19-19: Pylint: W0603 - Using the global statement (global-statement).
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: tests (3.12)
- GitHub Check: tests (3.11)
- GitHub Check: mypy
🔇 Additional comments (9)
README.md (1)
34-47: LGTM! Clear documentation for optional dependencies.

The documentation clearly explains the distinction between remote and local embedding providers and provides installation instructions. The note about PyTorch size (~6GB) helps users make informed decisions.
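For quick reference, the install command cited throughout this thread, in pip and uv form; the `local-embeddings` extra name comes from this PR, and the uv variant is an assumed equivalent:

```bash
# Pull the optional local-embedding stack (downloads torch, roughly 6GB):
pip install 'lightspeed-evaluation[local-embeddings]'

# Assumed uv equivalent:
uv pip install 'lightspeed-evaluation[local-embeddings]'
```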
docs/configuration.md (2)
40-40: LGTM! Provider description updated to reflect optional dependencies.

The updated description clearly indicates that HuggingFace requires optional dependencies.
46-60: LGTM! Excellent documentation for embedding model options.

This section clearly explains the tradeoffs between remote and local embedding models and provides installation guidance. Well aligned with the PR's objectives.
pyproject.toml (3)
29-31: LGTM! Dependency restructuring supports optional local embeddings.

Removing the `[huggingface]` extra from langchain and making it a separate optional dependency group is the right approach for this use case.
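A minimal sketch of the optional-dependency layout this comment endorses; the extra name and version pins are the ones quoted elsewhere in this thread, while the comment text is illustrative:

```toml
[project.optional-dependencies]
# Heavy local-embedding stack (torch and friends, ~6GB); only needed when
# the "huggingface" embedding provider is configured.
local-embeddings = [
    "langchain-huggingface>=0.1.0",
    "sentence-transformers>=5.1.0",
]
```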
53-56: LGTM! Dev dependencies include local embeddings for comprehensive testing.

Including the local-embeddings packages in dev dependencies ensures full test coverage.
33-39: Update langchain-huggingface minimum version constraint to reflect current stability.

The current constraint `langchain-huggingface>=0.1.0` is over a year old and too permissive. Consider raising this to at least `>=0.3.0` or `>=1.0.0` to ensure better stability and access to recent bug fixes. The `sentence-transformers>=5.1.0` constraint is acceptable and compatible with current releases.

src/lightspeed_evaluation/core/embedding/ragas.py (1)
44-51: LGTM! Explicit provider branching improves clarity.

The change from dictionary-based lookup to explicit if-elif branches makes the code more readable and allows for the lazy loading pattern with HuggingFace.
src/lightspeed_evaluation/core/embedding/manager.py (2)
11-31: LGTM! Excellent pre-flight dependency check with helpful error message.

The `_check_huggingface_available()` function provides clear, actionable guidance when dependencies are missing. The error message includes multiple installation options and suggests alternatives.

Minor suggestion: Add an explicit `-> None` return type hint for consistency with coding guidelines.

🔎 Optional refinement

```python
def _check_huggingface_available() -> None:
    """Check if HuggingFace/sentence-transformers dependencies are available.

    Raises:
        EmbeddingError: If required packages are not installed.
    """
```

(Already present - this is just for reference)

As per coding guidelines, provide type hints for all public functions and methods.
52-52: LGTM! Validation integration is clean.

The call to `_check_huggingface_available()` properly validates dependencies before attempting to use the HuggingFace provider.
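To make the pattern under discussion concrete, here is a minimal sketch of such a pre-flight check; the exception's import path follows the coding guidelines quoted above and the message text is illustrative, so neither should be read as the exact source:

```python
from lightspeed_evaluation.core.system.exceptions import EmbeddingError  # assumed path per guidelines


def _check_huggingface_available() -> None:
    """Check that sentence-transformers is installed for the huggingface provider.

    Raises:
        EmbeddingError: If the optional dependency is missing.
    """
    try:
        import sentence_transformers  # noqa: F401  # heavy import, so it is probed lazily
    except ImportError as e:
        raise EmbeddingError(
            "HuggingFace embeddings need the optional extra: "
            "pip install 'lightspeed-evaluation[local-embeddings]'"
        ) from e
```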
Force-pushed 88f60ff to 6a6f61b
Actionable comments posted: 1
🧹 Nitpick comments (2)
src/lightspeed_evaluation/core/embedding/ragas.py (1)
17-32: Thread-safety concern persists; consider simplifying error handling.

The lazy import implementation has a race condition at line 19 where multiple threads can simultaneously pass the check and execute the import block. While the practical impact is minimal (duplicate imports are harmless), this was flagged in a previous review and marked as resolved but wasn't actually fixed.
Additionally, since `EmbeddingManager._validate_config()` already calls `_check_huggingface_available()` to validate dependencies before this code runs (see manager.py lines 51-52), the `ImportError` here at lines 27-31 is redundant defensive code. The different error messages between the two locations could confuse users.

🔎 Optional: Add thread lock for safety
If concurrent embedding initialization is a concern in your environment:
```diff
+import threading
 from typing import Any

 from langchain_google_genai import GoogleGenerativeAIEmbeddings
 from langchain_openai import OpenAIEmbeddings
 from ragas.cache import DiskCacheBackend
 from ragas.embeddings import LangchainEmbeddingsWrapper

 from lightspeed_evaluation.core.embedding.manager import EmbeddingManager

 # HuggingFace embeddings are optional - only imported when needed
 # This avoids pulling in torch/nvidia packages (~6GB) when using OpenAI/Gemini
 _CACHED_IMPORTS: dict[str, Any] = {}
+_import_lock = threading.Lock()


 def _get_huggingface_embeddings() -> Any:
     """Lazily import HuggingFaceEmbeddings to avoid loading torch unless needed."""
-    if "HuggingFaceEmbeddings" not in _CACHED_IMPORTS:
-        try:
-            from langchain_huggingface import (  # pylint: disable=import-outside-toplevel
-                HuggingFaceEmbeddings,
-            )
-
-            _CACHED_IMPORTS["HuggingFaceEmbeddings"] = HuggingFaceEmbeddings
-        except ImportError as e:
-            raise ImportError(
-                "HuggingFace embeddings require additional dependencies. "
-                "Install with: pip install 'lightspeed-evaluation[local-embeddings]' "
-                "or: pip install langchain-huggingface sentence-transformers"
-            ) from e
+    with _import_lock:
+        if "HuggingFaceEmbeddings" not in _CACHED_IMPORTS:
+            from langchain_huggingface import (  # pylint: disable=import-outside-toplevel
+                HuggingFaceEmbeddings,
+            )
+            _CACHED_IMPORTS["HuggingFaceEmbeddings"] = HuggingFaceEmbeddings
     return _CACHED_IMPORTS["HuggingFaceEmbeddings"]
```

Note: The try-except can be removed since manager.py already validates dependencies.
Based on past review comments and coding guidelines, consider using structured logging for import operations.
pyproject.toml (1)
33-39: Consider tightening the langchain-huggingface version constraint.

The specified versions are valid: langchain-huggingface 0.1.0 exists (released Oct 28, 2024, with current latest at 1.2.0 as of Dec 12, 2025), and sentence-transformers 5.1.0 exists (released Aug 6, 2025, with current latest at 5.2.0 as of Dec 11, 2025). However, `langchain-huggingface>=0.1.0` is overly permissive given the package's rapid development cycles. Consider specifying a more precise constraint like `langchain-huggingface>=0.1.0,<2.0.0` or `>=0.3.0` to ensure compatible behavior across patch versions.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
- README.md (1 hunks)
- docs/configuration.md (1 hunks)
- pyproject.toml (2 hunks)
- src/lightspeed_evaluation/core/embedding/manager.py (2 hunks)
- src/lightspeed_evaluation/core/embedding/ragas.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- docs/configuration.md
- README.md
🧰 Additional context used
📓 Path-based instructions (1)
src/lightspeed_evaluation/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/lightspeed_evaluation/**/*.py: Provide type hints for all public functions and methods
Use Google-style docstrings for all public APIs
Use custom exceptions from core.system.exceptions for error handling
Use structured logging with appropriate levels in lightspeed_evaluation modules
Files:
src/lightspeed_evaluation/core/embedding/ragas.py
src/lightspeed_evaluation/core/embedding/manager.py
🧬 Code graph analysis (1)
src/lightspeed_evaluation/core/embedding/ragas.py (1)
src/lightspeed_evaluation/core/embedding/manager.py (1)
EmbeddingManager (33-61)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: tests (3.12)
- GitHub Check: mypy
- GitHub Check: tests (3.13)
- GitHub Check: tests (3.11)
🔇 Additional comments (4)
pyproject.toml (1)
29-31: LGTM! Core dependency simplified correctly.

Removing the `[huggingface]` extra from the langchain dependency is the right approach to make torch and nvidia packages optional. This aligns perfectly with the PR objective.

src/lightspeed_evaluation/core/embedding/manager.py (1)
52-52: LGTM! Proper validation integration.

The call to `_check_huggingface_available()` is correctly placed in the validation flow, ensuring dependencies are checked before attempting to use HuggingFace embeddings.

src/lightspeed_evaluation/core/embedding/ragas.py (2)
43-51: LGTM! Provider selection correctly integrates lazy loading.

The conditional logic properly delegates to `_get_huggingface_embeddings()` for HuggingFace embeddings while keeping direct references for other providers. The `Any` type annotation for `embedding_class` is appropriate given the heterogeneous provider classes.
54-55: Good defensive programming.

The None-check for `kwargs` prevents potential TypeErrors and makes the initialization more robust.
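A two-line sketch of the None-check being praised; `config.provider_kwargs` and `embedding_class` mirror names quoted in this thread, while the surrounding context is assumed:

```python
# provider_kwargs may be None when no extra provider options are configured;
# fall back to an empty dict before unpacking.
kwargs = config.provider_kwargs if config.provider_kwargs is not None else {}
embeddings = embedding_class(model=config.model, **kwargs)
```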
VladimirKadlec left a comment:

Thanks for this! One comment below.
Actionable comments posted: 0
♻️ Duplicate comments (1)
src/lightspeed_evaluation/core/embedding/manager.py (1)
7-8: EmbeddingError should be moved to core.system.exceptions.

As per coding guidelines, custom exceptions should be defined in `core.system.exceptions` for consistency with other modules (llm, script, system, api).
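A hypothetical sketch of what that relocation could look like; only the `core.system.exceptions` module path comes from the guidelines, and the base-class name here is invented for illustration:

```python
# src/lightspeed_evaluation/core/system/exceptions.py (sketch, hypothetical base class)


class LightspeedEvaluationError(Exception):
    """Assumed shared base class for evaluation errors."""


class EmbeddingError(LightspeedEvaluationError):
    """Raised when embedding configuration or optional dependencies are invalid."""
```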
🧹 Nitpick comments (1)
src/lightspeed_evaluation/core/embedding/manager.py (1)
41-44: Consider using structured logging instead of print().

As per coding guidelines, use structured logging with appropriate levels in lightspeed_evaluation modules. The initialization confirmation could use `logger.info()` instead.

🔎 Proposed fix
```diff
+import logging
+
+logger = logging.getLogger(__name__)
+
 # In __init__:
-        print(
-            f"""
-✅ Embedding Manager: {self.config.provider}
-- {self.config.model} {self.config.provider_kwargs}"""
-        )
+        logger.info(
+            "Embedding Manager initialized: %s -- %s %s",
+            self.config.provider,
+            self.config.model,
+            self.config.provider_kwargs,
+        )
```
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
README.md(1 hunks)docs/configuration.md(1 hunks)pyproject.toml(2 hunks)src/lightspeed_evaluation/core/embedding/manager.py(2 hunks)src/lightspeed_evaluation/core/embedding/ragas.py(2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
src/lightspeed_evaluation/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/lightspeed_evaluation/**/*.py: Provide type hints for all public functions and methods
Use Google-style docstrings for all public APIs
Use custom exceptions from core.system.exceptions for error handling
Use structured logging with appropriate levels in lightspeed_evaluation modules
Files:
src/lightspeed_evaluation/core/embedding/manager.py
src/lightspeed_evaluation/core/embedding/ragas.py
🧠 Learnings (4)
📓 Common learnings
Learnt from: CR
Repo: lightspeed-core/lightspeed-evaluation PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-11T10:05:06.422Z
Learning: Do not add new features to lsc_agent_eval/ directory - add to src/lightspeed_evaluation/ instead
📚 Learning: 2025-09-22T07:24:57.366Z
Learnt from: VladimirKadlec
Repo: lightspeed-core/lightspeed-evaluation PR: 56
File: src/lightspeed_evaluation/core/llm/ragas.py:127-140
Timestamp: 2025-09-22T07:24:57.366Z
Learning: HuggingFaceEmbeddings from langchain_huggingface accepts both `model` and `model_name` parameters - `model` serves as an alias for `model_name` parameter in the constructor.
Applied to files:
docs/configuration.md
📚 Learning: 2025-09-22T07:24:57.366Z
Learnt from: VladimirKadlec
Repo: lightspeed-core/lightspeed-evaluation PR: 56
File: src/lightspeed_evaluation/core/llm/ragas.py:127-140
Timestamp: 2025-09-22T07:24:57.366Z
Learning: HuggingFaceEmbeddings from langchain_huggingface accepts both `model` and `model_name` parameters - both work as valid parameter names in the constructor due to internal parameter handling.
Applied to files:
docs/configuration.md
📚 Learning: 2025-12-11T10:05:06.422Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-evaluation PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-11T10:05:06.422Z
Learning: Applies to src/lightspeed_evaluation/**/*.py : Use custom exceptions from core.system.exceptions for error handling
Applied to files:
src/lightspeed_evaluation/core/embedding/manager.py
🧬 Code graph analysis (1)
src/lightspeed_evaluation/core/embedding/ragas.py (1)
src/lightspeed_evaluation/core/embedding/manager.py (2)
EmbeddingManager (34-62)
check_huggingface_available (11-31)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: black
- GitHub Check: mypy
- GitHub Check: tests (3.11)
- GitHub Check: tests (3.13)
- GitHub Check: tests (3.12)
🔇 Additional comments (8)
README.md (1)
34-47: LGTM! Clear documentation for optional local embeddings.

The documentation clearly explains when local embeddings are needed, provides both pip and uv installation commands, and includes a helpful note about the ~6GB PyTorch requirement. This aligns well with the pyproject.toml changes.
docs/configuration.md (1)
46-59: Good documentation of the remote vs local embedding trade-offs.

The explanation of why local embeddings are optional (PyTorch ~6GB) and the installation instructions are clear and consistent with README.md. This helps users understand the design decision.
pyproject.toml (2)
53-56: Dev dependencies correctly include local-embeddings packages for testing.

Including these in dev ensures full test coverage for the HuggingFace embedding path.
33-39: Clean separation of optional dependencies.

The optional-dependencies section properly isolates the heavy HuggingFace/PyTorch packages. langchain-huggingface is actively maintained with version 1.2.0 released in December 2025, making the >=0.1.0 constraint appropriate. sentence-transformers version 5.2.0 is available, confirming the >=5.1.0 constraint is reasonable. The comment explaining the ~6GB size and when these packages are needed is helpful for users browsing pyproject.toml.
src/lightspeed_evaluation/core/embedding/manager.py (1)
11-31: Well-implemented dependency check with comprehensive error messaging.

The `check_huggingface_available()` function provides clear guidance with multiple installation options (pip, uv, manual) and an alternative (remote providers). The docstring follows Google-style conventions and includes the `Raises` section.

src/lightspeed_evaluation/core/embedding/ragas.py (3)
10-13: Good refactoring to centralize the HuggingFace availability check.

Importing `check_huggingface_available` from `manager.py` addresses the previous review feedback about duplicate import checks and ensures consistent error messaging.
20-30: Lazy loading implementation is functional.

The lazy import pattern successfully defers torch loading until HuggingFace embeddings are requested. The implementation correctly uses `check_huggingface_available()` before attempting the import.

Regarding the thread-safety concern from previous reviews: the current implementation without a lock may result in duplicate imports if multiple threads race, but since this typically happens once at startup and duplicate imports are harmless (just wasted work), the trade-off is acceptable for this use case.
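As a self-contained illustration of the lazy-import-with-cache pattern this comment accepts, using the module-level cache dict quoted earlier in the thread:

```python
from typing import Any

_CACHED_IMPORTS: dict[str, Any] = {}


def _get_huggingface_embeddings() -> Any:
    """Import HuggingFaceEmbeddings on first use so torch loads only when needed."""
    if "HuggingFaceEmbeddings" not in _CACHED_IMPORTS:
        # Heavy import deferred until the huggingface provider is actually selected.
        from langchain_huggingface import HuggingFaceEmbeddings

        _CACHED_IMPORTS["HuggingFaceEmbeddings"] = HuggingFaceEmbeddings
    return _CACHED_IMPORTS["HuggingFaceEmbeddings"]
```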
41-49: Clean provider selection with lazy loading integration.

The provider selection logic is clear and the lazy loading for HuggingFace is properly integrated. Using `Any` for `embedding_class` is appropriate given the dynamic nature of the lazy loading.
Changes look good, however we need to make a few changes:

- Keep langchain[huggingface] as default; only keep sentence-transformers as optional. Check the package size; langchain-huggingface alone should not make a huge difference.
- Update and add the lock file.

The changes below can be a separate PR (the above will fix the current issue):

- Move embedding model object creation to manager.py, so that we only check once.
- Add the ability to install only the CPU version.
Actionable comments posted: 1
♻️ Duplicate comments (1)
src/lightspeed_evaluation/core/embedding/manager.py (1)
7-8: Consider moving EmbeddingError to core.system.exceptions.

A previous review comment flagged that `EmbeddingError` should be defined in `core.system.exceptions` to align with project exception handling standards. Other modules (llm, script, system, api) import their exceptions from the centralized module. This comment remains unaddressed.

Based on coding guidelines, custom exceptions should be imported from `core.system.exceptions`.
🧹 Nitpick comments (1)
src/lightspeed_evaluation/core/embedding/manager.py (1)
11-31: Add structured logging before raising EmbeddingError.

The function immediately raises an exception when the dependency is missing. Per coding guidelines, lightspeed_evaluation modules should use structured logging with appropriate levels. Consider logging a warning or error before raising to aid debugging and observability.
🔎 Proposed enhancement with structured logging
Add logging at the module level and log before raising:
"""Embedding Manager - Generic embedding configuration, validation, and parameter provider.""" +import logging from lightspeed_evaluation.core.models import EmbeddingConfig, SystemConfig from lightspeed_evaluation.core.system.env_validator import validate_provider_env +logger = logging.getLogger(__name__) def check_huggingface_available() -> None: """Check if sentence-transformers dependency is available. Raises: EmbeddingError: If required packages are not installed. """ try: import sentence_transformers # noqa: F401 # pylint: disable=import-outside-toplevel,unused-import except ImportError as e: + logger.error( + "HuggingFace embedding provider requires sentence-transformers package, " + "which is not installed. Install with: pip install 'lightspeed-evaluation[local-embeddings]'" + ) raise EmbeddingError(
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (5)
- README.md (1 hunks)
- docs/configuration.md (1 hunks)
- pyproject.toml (2 hunks)
- src/lightspeed_evaluation/core/embedding/manager.py (2 hunks)
- src/lightspeed_evaluation/core/embedding/ragas.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- README.md
- docs/configuration.md
- src/lightspeed_evaluation/core/embedding/ragas.py
🧰 Additional context used
📓 Path-based instructions (1)
src/lightspeed_evaluation/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/lightspeed_evaluation/**/*.py: Provide type hints for all public functions and methods
Use Google-style docstrings for all public APIs
Use custom exceptions from core.system.exceptions for error handling
Use structured logging with appropriate levels in lightspeed_evaluation modules
Files:
src/lightspeed_evaluation/core/embedding/manager.py
🧠 Learnings (1)
📚 Learning: 2025-12-11T10:05:06.422Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-evaluation PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-11T10:05:06.422Z
Learning: Applies to src/lightspeed_evaluation/**/*.py : Use custom exceptions from core.system.exceptions for error handling
Applied to files:
src/lightspeed_evaluation/core/embedding/manager.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: mypy
- GitHub Check: tests (3.13)
- GitHub Check: tests (3.12)
- GitHub Check: tests (3.11)
🔇 Additional comments (2)
pyproject.toml (1)
52-53: Inclusion of local embeddings in dev dependencies is appropriate.

Adding `sentence-transformers` to dev dependencies ensures full test coverage for the HuggingFace embedding provider. The comment clearly explains the purpose.

src/lightspeed_evaluation/core/embedding/manager.py (1)
53-53: Dependency check integration looks correct.

The call to `check_huggingface_available()` is appropriately placed within the provider validation logic and maintains consistency with the validation pattern for other providers.
asamal4 left a comment:

LGTM, one minor change.
pyproject.toml (outdated)

```toml
# Include local-embeddings for full testing
"sentence-transformers>=5.1.0",
```

We should not add this to dev dependencies. If someone needs to run the full tests, they can install the package accordingly.

Suggested change: remove both lines above from the dev dependencies.
Force-pushed f8714a5 to 2e50a52
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/lightspeed_evaluation/core/embedding/ragas.py (1)
37-42: Cache implementation looks good; consider architectural suggestion.

The disk cache integration and embedding wrapper initialization are well-implemented and correctly handle the optional caching scenario.

Note: There's an unresolved suggestion from a previous review (asamal4) to move embedding model object creation to `manager.py` to avoid managing it in multiple places. While the current implementation works, consolidating embedding instantiation logic might improve maintainability.
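A hypothetical sketch of that consolidation; the method name `create_embeddings` and the overall shape are assumptions layered on names that do appear in this thread, not the project's actual API:

```python
from typing import Any

from langchain_google_genai import GoogleGenerativeAIEmbeddings
from langchain_openai import OpenAIEmbeddings

# check_huggingface_available, _get_huggingface_embeddings and EmbeddingError
# are the helpers discussed elsewhere in this thread.


class EmbeddingManager:
    def create_embeddings(self) -> Any:  # Any: providers share no common base class
        """Hypothetical single construction point for embedding providers."""
        if self.config.provider == "huggingface":
            check_huggingface_available()  # pre-flight dependency check, done once
            embedding_class = _get_huggingface_embeddings()
        elif self.config.provider == "openai":
            embedding_class = OpenAIEmbeddings
        elif self.config.provider == "gemini":
            embedding_class = GoogleGenerativeAIEmbeddings
        else:
            raise EmbeddingError(f"Unknown embedding provider {self.config.provider}")
        kwargs = self.config.provider_kwargs or {}
        return embedding_class(model=self.config.model, **kwargs)
```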
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (5)
- README.md
- docs/configuration.md
- pyproject.toml
- src/lightspeed_evaluation/core/embedding/manager.py
- src/lightspeed_evaluation/core/embedding/ragas.py
🚧 Files skipped from review as they are similar to previous changes (4)
- README.md
- docs/configuration.md
- src/lightspeed_evaluation/core/embedding/manager.py
- pyproject.toml
🧰 Additional context used
📓 Path-based instructions (1)
src/lightspeed_evaluation/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/lightspeed_evaluation/**/*.py: Provide type hints for all public functions and methods
Use Google-style docstrings for all public APIs
Use custom exceptions from core.system.exceptions for error handling
Use structured logging with appropriate levels in lightspeed_evaluation modules
Files:
src/lightspeed_evaluation/core/embedding/ragas.py
🔇 Additional comments (2)
src/lightspeed_evaluation/core/embedding/ragas.py (2)
3-4: LGTM: Type import supports better type annotations.

The `Any` import is appropriately used to type the `embedding_class` variable on line 22, improving type safety for the dynamic provider selection logic.
33-35: LGTM: Good defensive programming.

The None-guard for `provider_kwargs` prevents potential errors when unpacking kwargs and follows defensive coding best practices.
… packages that are not required for all usecases
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/lightspeed_evaluation/core/embedding/ragas.py (1)
5-9: Critical: Top-level import of `HuggingFaceEmbeddings` defeats the optional dependency goal.

The PR objective is to make torch/nvidia packages optional, but importing `HuggingFaceEmbeddings` at module level (line 6) causes the import to fail immediately if `langchain_huggingface` or `sentence-transformers` is not installed. This renders the pre-flight check in `manager.py` ineffective since this module will crash on import before validation runs.

The import should be lazy-loaded only when the `huggingface` provider is selected:

🔎 Proposed fix with conditional import

```diff
 from typing import Any

 from langchain_google_genai import GoogleGenerativeAIEmbeddings
-from langchain_huggingface import HuggingFaceEmbeddings
 from langchain_openai import OpenAIEmbeddings
 from ragas.cache import DiskCacheBackend
 from ragas.embeddings import LangchainEmbeddingsWrapper

 from lightspeed_evaluation.core.embedding.manager import EmbeddingManager
```

Then inside the `__init__` method:

```diff
         embedding_class: Any
         if config.provider == "openai":
             embedding_class = OpenAIEmbeddings
         elif config.provider == "huggingface":
             # EmbeddingManager already validated sentence-transformers is available
+            from langchain_huggingface import HuggingFaceEmbeddings  # pylint: disable=import-outside-toplevel
             embedding_class = HuggingFaceEmbeddings
         elif config.provider == "gemini":
             embedding_class = GoogleGenerativeAIEmbeddings
```
♻️ Duplicate comments (2)
src/lightspeed_evaluation/core/embedding/manager.py (1)
11-32: Well-implemented pre-flight check with helpful error messaging.

The function correctly validates the optional dependency and provides clear installation guidance. The docstring follows Google-style conventions and the type hint is present.

Note: The past review comment about moving `EmbeddingError` to `core.system.exceptions` still applies to this module.

src/lightspeed_evaluation/core/embedding/ragas.py (1)
22-31: Use custom exception instead of `RuntimeError`.

Per coding guidelines, use custom exceptions from `core.system.exceptions`. This was previously flagged in past reviews.

🔎 Proposed fix
```diff
+from lightspeed_evaluation.core.embedding.manager import EmbeddingError
+# or from lightspeed_evaluation.core.system.exceptions import EmbeddingError (preferred location)

         else:
-            raise RuntimeError(f"Unknown embedding provider {config.provider}")
+            raise EmbeddingError(f"Unknown embedding provider {config.provider}")
```
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (5)
- README.md
- docs/configuration.md
- pyproject.toml
- src/lightspeed_evaluation/core/embedding/manager.py
- src/lightspeed_evaluation/core/embedding/ragas.py
🚧 Files skipped from review as they are similar to previous changes (2)
- README.md
- pyproject.toml
🧰 Additional context used
📓 Path-based instructions (1)
src/lightspeed_evaluation/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/lightspeed_evaluation/**/*.py: Provide type hints for all public functions and methods
Use Google-style docstrings for all public APIs
Use custom exceptions from core.system.exceptions for error handling
Use structured logging with appropriate levels in lightspeed_evaluation modules
Files:
src/lightspeed_evaluation/core/embedding/manager.py
src/lightspeed_evaluation/core/embedding/ragas.py
🧠 Learnings (5)
📓 Common learnings
Learnt from: CR
Repo: lightspeed-core/lightspeed-evaluation PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-11T10:05:06.422Z
Learning: Do not add new features to lsc_agent_eval/ directory - add to src/lightspeed_evaluation/ instead
📚 Learning: 2025-09-22T07:24:57.366Z
Learnt from: VladimirKadlec
Repo: lightspeed-core/lightspeed-evaluation PR: 56
File: src/lightspeed_evaluation/core/llm/ragas.py:127-140
Timestamp: 2025-09-22T07:24:57.366Z
Learning: HuggingFaceEmbeddings from langchain_huggingface accepts both `model` and `model_name` parameters - `model` serves as an alias for `model_name` parameter in the constructor.
Applied to files:
docs/configuration.md
📚 Learning: 2025-09-22T07:24:57.366Z
Learnt from: VladimirKadlec
Repo: lightspeed-core/lightspeed-evaluation PR: 56
File: src/lightspeed_evaluation/core/llm/ragas.py:127-140
Timestamp: 2025-09-22T07:24:57.366Z
Learning: HuggingFaceEmbeddings from langchain_huggingface accepts both `model` and `model_name` parameters - both work as valid parameter names in the constructor due to internal parameter handling.
Applied to files:
docs/configuration.md
📚 Learning: 2025-12-11T10:05:06.422Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-evaluation PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-11T10:05:06.422Z
Learning: Applies to src/lightspeed_evaluation/**/*.py : Use custom exceptions from core.system.exceptions for error handling
Applied to files:
src/lightspeed_evaluation/core/embedding/manager.py
src/lightspeed_evaluation/core/embedding/ragas.py
📚 Learning: 2025-12-11T10:05:06.422Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-evaluation PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-11T10:05:06.422Z
Learning: Applies to src/lightspeed_evaluation/**/*.py : Use structured logging with appropriate levels in lightspeed_evaluation modules
Applied to files:
src/lightspeed_evaluation/core/embedding/ragas.py
🔇 Additional comments (3)
docs/configuration.md (1)
40-59: LGTM! Clear documentation for optional local embeddings.

The documentation accurately describes the remote vs local embedding distinction and provides clear installation instructions for both pip and uv. This aligns well with the dependency check implementation in the codebase.
src/lightspeed_evaluation/core/embedding/manager.py (1)
52-53: LGTM!

Clean integration of the pre-flight check following the existing validation pattern for other providers.
src/lightspeed_evaluation/core/embedding/ragas.py (1)
33-42: LGTM!

The cache wiring logic is clean with proper None-guard for `provider_kwargs` and conditional disk cache setup.
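A minimal sketch of that wiring using the ragas imports quoted earlier in this thread; the `cache=` keyword and `cache_dir` argument follow ragas' caching docs but should be treated as assumptions here:

```python
from langchain_openai import OpenAIEmbeddings
from ragas.cache import DiskCacheBackend
from ragas.embeddings import LangchainEmbeddingsWrapper

cache_enabled = True  # assumption: in the real code this comes from the embedding config
cacher = DiskCacheBackend(cache_dir=".embedding_cache") if cache_enabled else None

# Wrap the provider; pass the disk cache only when caching is enabled.
embeddings = LangchainEmbeddingsWrapper(OpenAIEmbeddings(), cache=cacher)
```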
Description
Type of change
Tools used to create PR
Identify any AI code assistants used in this PR (for transparency and review context)
Related Tickets & Documents
Checklist before requesting a review
Testing
Summary by CodeRabbit
New Features
Bug Fixes / Reliability
Documentation
Chores