feat: implement configuration gui #304
Conversation
@CodeRabbit pause
328aad0 to ec32761
Actionable comments posted: 4
🧹 Outside diff range and nitpick comments (5)
src/rai/rai/utils/model_initialization.py (3)
157-159: Add docstring to document the new parameters. The function signature changes look good, but please add a docstring to document the purpose of the new override parameters and their interaction with the configuration.

```diff
 def get_tracing_callbacks(
     override_use_langfuse: bool = False, override_use_langsmith: bool = False
 ) -> List[BaseCallbackHandler]:
+    """Get tracing callbacks based on configuration and override flags.
+
+    Args:
+        override_use_langfuse (bool, optional): Force enable Langfuse tracing regardless of config. Defaults to False.
+        override_use_langsmith (bool, optional): Force enable Langsmith tracing regardless of config. Defaults to False.
+
+    Returns:
+        List[BaseCallbackHandler]: List of configured tracing callbacks.
+
+    Raises:
+        ValueError: If required environment variables are not set when tracing is enabled.
+    """
```
Line range hint 162-176: Consider using early returns to improve readability. The logic is correct, but the code structure could be improved by using early returns and extracting the environment variable validation.

```diff
 if config.tracing.langfuse.use_langfuse or override_use_langfuse:
     from langfuse.callback import CallbackHandler  # type: ignore

     public_key = os.getenv("LANGFUSE_PUBLIC_KEY", None)
     secret_key = os.getenv("LANGFUSE_SECRET_KEY", None)
-    if public_key is None or secret_key is None:
-        raise ValueError("LANGFUSE_PUBLIC_KEY or LANGFUSE_SECRET_KEY is not set")
-
-    callback = CallbackHandler(
-        public_key=public_key,
-        secret_key=secret_key,
-        host=config.tracing.langfuse.host,
-    )
-    callbacks.append(callback)
+    _validate_langfuse_env_vars(public_key, secret_key)
+    callbacks.append(
+        CallbackHandler(
+            public_key=public_key,
+            secret_key=secret_key,
+            host=config.tracing.langfuse.host,
+        )
+    )

+def _validate_langfuse_env_vars(public_key: str | None, secret_key: str | None) -> None:
+    if public_key is None or secret_key is None:
+        raise ValueError("LANGFUSE_PUBLIC_KEY or LANGFUSE_SECRET_KEY is not set")
```
Line range hint 177-183: Improve environment variable handling for Langsmith. The current implementation has several potential issues:
- Setting environment variables at runtime could affect other parts of the application
- No cleanup of environment variables
- Missing validation of the project name
Consider these improvements:
```diff
 if config.tracing.langsmith.use_langsmith or override_use_langsmith:
+    if not config.tracing.project:
+        raise ValueError("Tracing project name cannot be empty")
+
     os.environ["LANGCHAIN_TRACING_V2"] = "true"
     os.environ["LANGCHAIN_PROJECT"] = config.tracing.project
     api_key = os.getenv("LANGCHAIN_API_KEY", None)
     if api_key is None:
         raise ValueError("LANGCHAIN_API_KEY is not set")
+
+    from langchain.callbacks.tracers.langsmith import LangSmithTracer
+    callbacks.append(LangSmithTracer(
+        project_name=config.tracing.project,
+    ))
```

Also, consider using a context manager to handle environment variables:

```python
@contextmanager
def langsmith_env_vars(project: str):
    """Temporarily set Langsmith environment variables."""
    original_vars = {
        "LANGCHAIN_TRACING_V2": os.environ.get("LANGCHAIN_TRACING_V2"),
        "LANGCHAIN_PROJECT": os.environ.get("LANGCHAIN_PROJECT"),
    }
    try:
        os.environ["LANGCHAIN_TRACING_V2"] = "true"
        os.environ["LANGCHAIN_PROJECT"] = project
        yield
    finally:
        for key, value in original_vars.items():
            if value is None:
                os.environ.pop(key, None)
            else:
                os.environ[key] = value
```

README.md (1)
109-115: Enhance the configuration section with more details. While the instructions are clear, consider adding more context about:
- What settings can be configured (Models, Tracing, ASR, TTS)
- Prerequisites for running Streamlit
- Where the configuration will be saved
Here's a suggested expansion:
````diff
 #### 1.4 Configure RAI

-Run the configuration tool to set up your vendor and other settings:
+Run the configuration tool to set up your environment:
+
+Prerequisites:
+- Streamlit (installed automatically via poetry)
+
+The configurator allows you to:
+- Select and configure AI models
+- Set up tracing options
+- Configure ASR (Automatic Speech Recognition)
+- Configure TTS (Text-to-Speech)
+
+Run the configuration wizard:

 ```bash
 streamlit run src/rai/rai/utils/configurator.py
 ```
+
+The settings will be saved to your `config.toml` file.
````

src/rai/rai/utils/configurator.py (1)

699-701: Simplify boolean return statement

The current return logic can be simplified by directly returning the condition's result, improving code readability.

Apply this diff:

```diff
-if np.sum(np.abs(recording)) == 0:
-    return False
-return True
+return np.sum(np.abs(recording)) != 0
```

🧰 Tools
🪛 Ruff
699-701: Return the condition `not np.sum(np.abs(recording)) == 0` directly. Replace with `return not np.sum(np.abs(recording)) == 0` (SIM103)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
⛔ Files ignored due to path filters (1)
`poetry.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (5)
- `README.md` (1 hunks)
- `config.toml` (2 hunks)
- `pyproject.toml` (1 hunks)
- `src/rai/rai/utils/configurator.py` (1 hunks)
- `src/rai/rai/utils/model_initialization.py` (2 hunks)
🧰 Additional context used
🪛 Ruff
src/rai/rai/utils/configurator.py
699-701: Return the condition not np.sum(np.abs(recording)) == 0 directly
Replace with return not np.sum(np.abs(recording)) == 0
(SIM103)
740-740: Use key in dict instead of key in dict.keys()
Remove .keys()
(SIM118)
🔇 Additional comments (4)
config.toml (1)
2-4: Verify OpenAI model names and configurations
The default vendor is set to "openai", but the model names in the OpenAI section ("gpt-4o-mini", "gpt-4o-2024-08-06") don't appear to be standard OpenAI model names. This could cause runtime errors.
pyproject.toml (2)
55-55: LGTM: tomli-w dependency addition is appropriate.
The addition of tomli-w complements the existing tomli dependency by providing TOML writing capabilities needed for the configuration GUI.
55-55: Consider alternative TOML libraries.
While tomli-w works well for writing TOML files, consider using more actively maintained alternatives like tomlkit or rtoml that provide both reading and writing capabilities in a single package. This could help reduce the number of dependencies.
src/rai/rai/utils/model_initialization.py (1)
Line range hint 157-183: Verify integration with the configuration GUI.
Let's verify how these changes are used in the new configuration GUI and other parts of the codebase.
✅ Verification successful
The tracing configuration is properly integrated with the configuration GUI and codebase
The verification shows that:
- The tracing configuration is fully integrated into the configurator GUI (step 3 of 7) with proper settings for both Langfuse and LangSmith.
- The configuration is correctly stored in the config.toml file and loaded by get_tracing_callbacks().
- The tracing callbacks are properly used in the codebase:
- In src/rai/rai/node.py for the main RAI node
- In examples/taxi-demo.py for the demo application
- Well documented in docs/tracing.md
The changes to get_tracing_callbacks() are consistent with the GUI configuration and existing usage patterns.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for usage of get_tracing_callbacks
echo "Searching for get_tracing_callbacks usage..."
rg "get_tracing_callbacks\(" -A 3
# Search for the new configurator GUI integration
echo "Searching for configurator integration..."
rg "configurator\.py.*get_tracing_callbacks" -A 3
Length of output: 1939
Script:
#!/bin/bash
# Let's check for configuration related to tracing in the codebase
echo "Searching for tracing configuration..."
rg "tracing" --type toml -A 5
# Let's check for any GUI or configurator related files
echo "Searching for configurator files..."
fd -e py configurator
# Let's check the config structure
echo "Checking config structure..."
rg "class.*Config" -A 10
Length of output: 3763
Script:
#!/bin/bash
# Let's check the content of the configurator file
echo "Checking configurator.py content..."
cat src/rai/rai/utils/configurator.py
# Let's also check for any UI or GUI related imports of the tracing configuration
echo "Checking for UI/GUI imports of tracing config..."
rg "from.*tracing" -A 2
rg "import.*tracing" -A 2
Length of output: 28601
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@maciejmajek Great feature! I left some comments. Also:
- Would it be easy to make a drop-down list of available models? Or add a link to langchain/openai api website next to the text field?
- For langsmith/cloud langfuse I would add a link to create accounts in the configurator
- For langfuse I would use self-hosting (http://localhost:3000) as default and point the user to this doc: https://langfuse.com/docs/deployment/local
- I think saving doesn't work as expected:
```toml
[aws]
simple_model = "anthropic.claude-3-haiku-20240307-v1:0"
complex_model = "anthropic.claude-3-5-sonnet-20240620-v1:0"
embeddings_model = "amazon.titan-embed-text-v1"

[openai]
simple_model = "anthropic.claude-3-haiku-20240307-v1:0"
complex_model = "anthropic.claude-3-5-sonnet-20240620-v1:0"
embeddings_model = "amazon.titan-embed-text-v1"

[ollama]
simple_model = "anthropic.claude-3-haiku-20240307-v1:0"
complex_model = "anthropic.claude-3-5-sonnet-20240620-v1:0"
embeddings_model = "amazon.titan-embed-text-v1"
```
Walkthrough

This pull request introduces several enhancements to the RAI framework documentation and configuration files.

Changes
Actionable comments posted: 5
🧹 Outside diff range and nitpick comments (6)
config.toml (1)
Line range hint 1-45: Add configuration documentation and security considerations
As this configuration file is now manageable through a GUI:
- Consider adding inline comments explaining each parameter's purpose and valid values
- Document security implications of wake word and recording device settings
- Consider moving sensitive values (API keys, etc.) to environment variables
src/rai/rai/utils/model_initialization.py (1)
Line range hint 157-184: Consider a more robust configuration management system. Given that this is part of a larger GUI configuration system, consider:
- Implementing a dedicated configuration manager class that handles all environment variables and settings
- Using dependency injection for callbacks instead of global configuration
- Adding validation for all configuration values before they're used
This would improve maintainability and make the system more robust to configuration errors.
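For illustration only, a minimal sketch of what such a configuration manager could look like — the class and field names below are hypothetical and not part of the current RAI codebase; only the environment variable names come from this PR:

```python
import os
from dataclasses import dataclass


@dataclass
class TracingSettings:
    # Field names mirror the [tracing] section of config.toml; defaults are illustrative.
    use_langfuse: bool = False
    use_langsmith: bool = False
    langfuse_host: str = "http://localhost:3000"
    project: str = ""


class ConfigurationManager:
    """Hypothetical central place to load, validate, and hand out settings."""

    def __init__(self, tracing: TracingSettings):
        self.tracing = tracing

    def validate(self) -> None:
        # Fail fast on missing secrets instead of deep inside callback construction.
        if self.tracing.use_langfuse and not (
            os.getenv("LANGFUSE_PUBLIC_KEY") and os.getenv("LANGFUSE_SECRET_KEY")
        ):
            raise ValueError("Langfuse is enabled but LANGFUSE_PUBLIC_KEY/LANGFUSE_SECRET_KEY are not set")
        if self.tracing.use_langsmith and not os.getenv("LANGCHAIN_API_KEY"):
            raise ValueError("Langsmith is enabled but LANGCHAIN_API_KEY is not set")
```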
README.md (2)
111-112: Add more details about configurable settings. Consider expanding the description to mention the specific components that can be configured (Models, Tracing, ASR, TTS) to better inform users about the tool's capabilities.

```diff
-Run the configuration tool to set up your vendor and other settings:
+Run the configuration tool to set up various RAI components including:
+- AI Models and Vendors
+- Tracing Configuration
+- Automatic Speech Recognition (ASR)
+- Text-to-Speech (TTS)
+- Additional Features
```
113-117: Add a note about browser access. Add a note mentioning that Streamlit will automatically open the configurator in the default browser and display the access URLs in the terminal.

````diff
 ```bash
 poetry shell
 streamlit run src/rai/rai/utils/configurator.py
 ```
+
+The configurator will automatically open in your default web browser. The terminal will display local and network URLs for access.
````
src/rai/rai/utils/configurator.py (2)

699-701: Simplify the return statement for clarity

You can simplify the code by returning the condition directly, improving readability.

Apply this diff:

```diff
-if np.sum(np.abs(recording)) == 0:
-    return False
-return True
+return not np.sum(np.abs(recording)) == 0
```

🧰 Tools
🪛 Ruff
699-701: Return the condition `not np.sum(np.abs(recording)) == 0` directly. Replace with `return not np.sum(np.abs(recording)) == 0` (SIM103)
740-740: Simplify iteration over dictionary keys

You can iterate over the dictionary directly without calling `.keys()`.

Apply this diff:

```diff
-for component in test_results.keys():
+for component in test_results:
```

🧰 Tools
🪛 Ruff
740-740: Use `key in dict` instead of `key in dict.keys()`. Remove `.keys()` (SIM118)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
⛔ Files ignored due to path filters (1)
`poetry.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (5)
- `README.md` (1 hunks)
- `config.toml` (2 hunks)
- `pyproject.toml` (1 hunks)
- `src/rai/rai/utils/configurator.py` (1 hunks)
- `src/rai/rai/utils/model_initialization.py` (2 hunks)
🧰 Additional context used
🪛 Ruff
src/rai/rai/utils/configurator.py
699-701: Return the condition not np.sum(np.abs(recording)) == 0 directly
Replace with return not np.sum(np.abs(recording)) == 0
(SIM103)
740-740: Use key in dict instead of key in dict.keys()
Remove .keys()
(SIM118)
🔇 Additional comments (7)
config.toml (1)
2-4: Verify OpenAI model names and consider cost implications
The switch from Ollama to OpenAI for all models is a significant change that:
- Requires verification of model names (especially "gpt-4o-mini" and "gpt-4o-2024-08-06")
- May have cost implications as OpenAI is a paid service versus Ollama being free
pyproject.toml (1)
55-55: LGTM! Consider using the latest version.
The addition of tomli-w makes sense for the configuration GUI to write TOML files. The version constraint is appropriate, though you might want to verify if the latest version (1.2.0) could be used.
src/rai/rai/utils/model_initialization.py (3)
157-159: LGTM! Well-structured function signature change.
The addition of optional override parameters maintains backward compatibility while providing flexibility for the new GUI configuration feature.
Line range hint 177-184: Consider thread-safety improvements for environment variables.
Setting environment variables within a function can lead to race conditions in concurrent scenarios. Consider:
- Moving environment variable setup to application initialization
- Using LangChain's configuration methods instead of environment variables
#!/bin/bash
# Check for potential concurrent usage
rg -A 5 "get_tracing_callbacks.*async|concurrent|thread|multiprocess"
# Check if the configurator handles these environment variables
rg -A 5 "LANGCHAIN_(API_KEY|TRACING_V2|PROJECT)" "src/rai/rai/utils/configurator.py"Line range hint 162-176: Consider moving the Langfuse import to module level.
The conditional import of Langfuse callback handler could cause import-time issues if the module is not installed. Consider moving it to the module level with proper error handling.
```diff
+try:
+    from langfuse.callback import CallbackHandler  # type: ignore
+except ImportError:
+    CallbackHandler = None

 def get_tracing_callbacks(...):
     ...
     if config.tracing.langfuse.use_langfuse or override_use_langfuse:
-        from langfuse.callback import CallbackHandler  # type: ignore
+        if CallbackHandler is None:
+            raise ImportError("Langfuse package is not installed")
```

Verify environment variables in the configurator.
Since this is part of a GUI configuration system, ensure that the configurator validates these environment variables before enabling Langfuse.
✅ Verification successful
Environment variables LANGFUSE_PUBLIC_KEY and LANGFUSE_SECRET_KEY are properly validated in configurator.py and model_initialization.py.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check if the configurator validates environment variables
rg -A 5 "LANGFUSE_(PUBLIC|SECRET)_KEY" "src/rai/rai/utils/configurator.py"
Length of output: 13533
Script:
#!/bin/bash
# Search the entire codebase for validations of LANGFUSE_PUBLIC_KEY and LANGFUSE_SECRET_KEY
rg "LANGFUSE_(PUBLIC|SECRET)_KEY" --type py
Length of output: 718
README.md (1)
109-110: LGTM!
The section header follows the document structure and formatting conventions.
src/rai/rai/utils/configurator.py (1)
25-25: Verify the import statement for ElevenLabs
Ensure that you are importing the correct class or function from the elevenlabs library. The usage might require initializing with set_api_key rather than passing api_key to ElevenLabs.
Run the following script to check the available classes and functions in the elevenlabs module:
```toml
[asr]
recording_device_name = "default"
vendor = "whisper"
language = "en"
silence_grace_period = 0.3
use_wake_word = false
vad_threshold = 0.3
wake_word_model = ""
wake_word_threshold = 0.5
```
💡 Codebase verification
🛠️ Refactor suggestion
Parameter validation for ASR configuration is missing
- `vad_threshold` and `wake_word_threshold` are not validated to be between 0 and 1.
- `silence_grace_period` is not validated to be positive.
- `recording_device_name` is not verified against available devices.
🔗 Analysis chain
Add parameter validation for ASR configuration
The ASR configuration includes several parameters that should be validated:
- `vad_threshold` and `wake_word_threshold` should be between 0 and 1
- `silence_grace_period` should be positive
- `recording_device_name` should be verified against available devices
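As a rough illustration, validation along these lines could run when the ASR settings are loaded or saved; the helper name and messages are hypothetical, while `sounddevice` and the device `name` field are already used elsewhere in this PR:

```python
import sounddevice as sd


def validate_asr_config(asr: dict) -> list:
    """Return a list of human-readable problems found in the [asr] section."""
    problems = []
    # Thresholds are probabilities, so they must stay within [0, 1].
    for key in ("vad_threshold", "wake_word_threshold"):
        value = asr.get(key)
        if value is not None and not 0.0 <= float(value) <= 1.0:
            problems.append(f"{key} must be between 0 and 1, got {value}")
    # A non-positive grace period would end recordings immediately.
    if float(asr.get("silence_grace_period", 0.0)) <= 0:
        problems.append("silence_grace_period must be positive")
    # The configured device must exist on this machine.
    available = [device["name"] for device in sd.query_devices()]
    if asr.get("recording_device_name") not in available:
        problems.append("recording_device_name does not match any available device")
    return problems
```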
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for parameter validation in ASR configuration loading
# Search for ASR config validation
echo "Checking for ASR config validation..."
ast-grep --pattern 'def $_(config: $_) {
$$$
asr$_
$$$
}'
# Search for device name validation
echo "Checking for recording device validation..."
rg -l "recording_device.*validation|check.*recording.*device"
Length of output: 346
```toml
[tts]
vendor = "elevenlabs"
keep_speaker_busy = false
```
💡 Codebase verification
Missing API Key Configuration for ElevenLabs in TTS
The TTS configuration lacks API key setup for ElevenLabs integration:
- API key configuration not found in `src/rai/rai/utils/configurator.py`
🔗 Analysis chain
Consider additional TTS configuration parameters
The TTS configuration appears minimal and might be missing important parameters:
- API key configuration for ElevenLabs
- Voice selection options
- Speech parameters (speed, pitch, etc.)
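For example, a check along these lines could surface missing TTS settings before the test step; the `voice` key and helper name are hypothetical, while `ELEVENLABS_API_KEY` is the environment variable already read in `test_tts`:

```python
import os


def validate_tts_config(tts: dict) -> list:
    """Return a list of problems found in the [tts] section."""
    problems = []
    if tts.get("vendor") == "elevenlabs":
        # The API key lives in the environment, not in config.toml.
        if not os.getenv("ELEVENLABS_API_KEY"):
            problems.append("ELEVENLABS_API_KEY environment variable is not set")
        # Hypothetical voice setting; the current config has no such key.
        if not tts.get("voice"):
            problems.append("no voice selected for ElevenLabs")
    return problems
```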
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for TTS vendor integration and configuration handling
# Search for ElevenLabs integration
echo "Checking for ElevenLabs integration..."
rg -l "elevenlabs.*api|elevenlabs.*key"
# Search for TTS configuration handling
echo "Checking for TTS configuration handling..."
ast-grep --pattern 'class $TTS {
$$$
def $_($_, config: $_) {
$$$
}
$$$
}'
Length of output: 346
src/rai/rai/utils/configurator.py (Outdated)
```python
st.session_state.config["ollama"] = {
    "simple_model": simple_model,
    "complex_model": complex_model,
    "embeddings_model": embeddings_model,
}
```
Include base_url in Ollama configuration to prevent KeyError
The base_url is not stored in st.session_state.config["ollama"], which may lead to a KeyError when initializing the Ollama models.
Apply this diff to store base_url in the configuration:
st.session_state.config["ollama"] = {
"simple_model": simple_model,
"complex_model": complex_model,
"embeddings_model": embeddings_model,
+ "base_url": base_url,
}📝 Committable suggestion
```python
model = create_chat_model(model_type)
answer = model.invoke("Say hello!")
return answer.content is not None
```
Use correct method to invoke the chat model
The invoke method is not standard for LangChain chat models. You should use the __call__ method instead.
Apply this diff to correct the method call:
```diff
 model = create_chat_model(model_type)
-answer = model.invoke("Say hello!")
+answer = model("Say hello!")
 return answer.content is not None
```
```python
devices = sd.query_devices()
device_index = [device["name"] for device in devices].index(
    st.session_state.config["asr"]["recording_device_name"]
)
sample_rate = int(devices[device_index]["default_samplerate"])
```
Add error handling for missing recording device
If the selected recording device is not found, the code will raise a ValueError. Adding error handling will prevent the application from crashing.
Apply this diff to handle the exception:
```diff
 devices = sd.query_devices()
+try:
     device_index = [device["name"] for device in devices].index(
         st.session_state.config["asr"]["recording_device_name"]
     )
+except ValueError:
+    st.error("Selected recording device not found.")
+    return False
 sample_rate = int(devices[device_index]["default_samplerate"])
```
Actionable comments posted: 3
🧹 Outside diff range and nitpick comments (3)
src/rai/rai/utils/configurator.py (3)
713-714: Address TODO comments for ASR testing. The ASR testing functionality is missing and should be implemented for complete configuration validation.
Would you like me to help implement the ASR test functionality or create a GitHub issue to track this task?
706-708: Simplify recording test condition. The recording test condition can be simplified for better readability.
Apply this diff:
```diff
-if np.sum(np.abs(recording)) == 0:
-    return False
-return True
+return np.sum(np.abs(recording)) != 0
```

🧰 Tools
🪛 Ruff
706-708: Return the condition `not np.sum(np.abs(recording)) == 0` directly. Replace with `return not np.sum(np.abs(recording)) == 0` (SIM103)
670-694: Improve test_tts function structure. The function has inconsistent return paths and redundant statements.
Apply this diff:
```diff
 def test_tts():
     vendor = st.session_state.config["tts"]["vendor"]
     if vendor == "elevenlabs":
         try:
             client = ElevenLabs(api_key=os.getenv("ELEVENLABS_API_KEY"))
             output = client.generate(text="Hello, world!")
             output = list(output)
             return True
         except Exception as e:
             st.error(f"TTS error: {e}")
-            return False
+        return False
     elif vendor == "opentts":
         try:
             params = {
                 "voice": "glow-speak:en-us_mary_ann",
                 "text": "Hello, world!",
             }
             response = requests.get(
                 "http://localhost:5500/api/tts", params=params
             )
-            if response.status_code == 200:
-                return True
+            return response.status_code == 200
         except Exception as e:
             st.error(f"TTS error: {e}")
-            return False
+        return False
+    return False  # Unknown vendor
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
`src/rai/rai/utils/configurator.py` (1 hunks)
🧰 Additional context used
🪛 Ruff
src/rai/rai/utils/configurator.py
706-708: Return the condition not np.sum(np.abs(recording)) == 0 directly
Replace with return not np.sum(np.abs(recording)) == 0
(SIM103)
747-747: Use key in dict instead of key in dict.keys()
Remove .keys()
(SIM118)
🔇 Additional comments (1)
src/rai/rai/utils/configurator.py (1)
35-39: 🛠️ Refactor suggestion
Enhance config loading with validation and error handling.
The current implementation only handles FileNotFoundError. Consider adding comprehensive error handling and config validation.
Apply this diff:
```diff
 try:
     with open("config.toml", "rb") as f:
-        st.session_state.config = tomli.load(f)
+        config = tomli.load(f)
+        # Validate required sections
+        required_sections = ["vendor", "tracing", "asr", "tts"]
+        for section in required_sections:
+            config.setdefault(section, {})
+        st.session_state.config = config
 except FileNotFoundError:
     st.session_state.config = {}
+except tomli.TOMLDecodeError as e:
+    st.error(f"Error parsing config.toml: {e}")
+    st.session_state.config = {}
+except Exception as e:
+    st.error(f"Unexpected error loading config: {e}")
+    st.session_state.config = {}
```

Likely invalid or redundant comment.
```python
st.session_state.config["ollama"] = {
    "simple_model": simple_model,
    "complex_model": complex_model,
    "embeddings_model": embeddings_model,
}
```
Store Ollama base_url in configuration.
The base_url parameter is collected but not stored in the configuration, which could cause issues when initializing Ollama models.
Apply this diff:
st.session_state.config["ollama"] = {
"simple_model": simple_model,
"complex_model": complex_model,
"embeddings_model": embeddings_model,
+ "base_url": base_url,
}Committable suggestion skipped: line range outside the PR's diff.
```python
recording_devices = get_recording_devices()
currently_selected_device_name = st.session_state.config.get("asr", {}).get(
    "recording_device_name", ""
)
try:
    device_index = [device["name"] for device in recording_devices].index(
        currently_selected_device_name
    )
except ValueError:
    device_index = None
recording_device_name = st.selectbox(
```
🛠️ Refactor suggestion
Improve recording device handling.
The code assumes recording devices are available and doesn't handle the case when no devices are found.
Apply this diff:
```diff
 recording_devices = get_recording_devices()
+if not recording_devices:
+    st.error("No recording devices found. Please check your microphone connection.")
+    device_index = None
+    recording_device_name = None
+else:
     currently_selected_device_name = st.session_state.config.get("asr", {}).get(
         "recording_device_name", ""
     )
     try:
         device_index = [device["name"] for device in recording_devices].index(
             currently_selected_device_name
         )
     except ValueError:
         device_index = None
```
```python
def create_chat_model(model_type: str):
    vendor_name = st.session_state.config["vendor"][f"{model_type}_model"]
    model_name = st.session_state.config[vendor_name][f"{model_type}_model"]

    if vendor_name == "openai":
        return ChatOpenAI(model=model_name)
    elif vendor_name == "aws":
        return ChatBedrock(model_id=model_name)
    elif vendor_name == "ollama":
        return ChatOllama(
            model=model_name,
            base_url=st.session_state.config["ollama"]["base_url"],
        )
    raise ValueError(f"Unknown vendor: {vendor_name}")


def test_chat_model(model_type: str) -> bool:
    try:
        model = create_chat_model(model_type)
        answer = model.invoke("Say hello!")
        return answer.content is not None
    except Exception as e:
        st.error(f"{model_type.title()} model error: {e}")
        return False
```
🛠️ Refactor suggestion
Implement consistent error handling pattern.
Different functions handle errors differently. Consider implementing a consistent error handling pattern across all functions.
Example pattern for model-related functions:
```diff
 def create_chat_model(model_type: str):
     try:
         vendor_name = st.session_state.config["vendor"][f"{model_type}_model"]
         model_name = st.session_state.config[vendor_name][f"{model_type}_model"]
         if vendor_name == "openai":
+            if not os.getenv("OPENAI_API_KEY"):
+                raise ValueError("OPENAI_API_KEY not set")
             return ChatOpenAI(model=model_name)
         elif vendor_name == "aws":
+            if not os.getenv("AWS_ACCESS_KEY_ID"):
+                raise ValueError("AWS credentials not set")
             return ChatBedrock(model_id=model_name)
         elif vendor_name == "ollama":
+            base_url = st.session_state.config["ollama"].get("base_url")
+            if not base_url:
+                raise ValueError("Ollama base_url not configured")
             return ChatOllama(
                 model=model_name,
-                base_url=st.session_state.config["ollama"]["base_url"],
+                base_url=base_url,
             )
-        raise ValueError(f"Unknown vendor: {vendor_name}")
+        raise ValueError(f"Unknown vendor: {vendor_name}")
+    except KeyError as e:
+        raise ValueError(f"Missing configuration: {e}")
+    except Exception as e:
+        raise ValueError(f"Error creating {model_type} model: {e}")
```

Committable suggestion skipped: line range outside the PR's diff.
Actionable comments posted: 5
🧹 Outside diff range and nitpick comments (4)
config.toml (1)
43-45: Document TTS configuration parameters
The TTS configuration needs documentation for:

- Purpose and behavior of the `keep_speaker_busy` parameter
- Expected values and their impact
Consider adding comments in the config file or documentation explaining these parameters:
```diff
 [tts]
 vendor = "elevenlabs"
+# Controls whether the TTS engine should maintain an active connection
+# Set to true to reduce latency between speech segments
 keep_speaker_busy = false
```

src/rai/rai/utils/configurator.py (3)
355-357: Remove duplicate tracing config check
The tracing config existence check is duplicated from line 305.
Apply this diff:
- if "tracing" not in st.session_state.config: - st.session_state.config["tracing"] = {} st.session_state.config["tracing"]["langsmith"] = { "use_langsmith": langsmith_enabled }
823-825: Simplify recording test condition
The recording test condition can be simplified.
Apply this diff:
```diff
-if np.sum(np.abs(recording)) == 0:
-    return False
-return True
+return np.sum(np.abs(recording)) != 0
```

🧰 Tools
🪛 Ruff
823-825: Return the condition `not np.sum(np.abs(recording)) == 0` directly. Replace with `return not np.sum(np.abs(recording)) == 0` (SIM103)
830-831: Address TODO comments about missing tests
The code has TODO comments indicating missing ASR tests.
Would you like me to help implement the ASR tests or create GitHub issues to track these tasks?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (2)
- `config.toml` (2 hunks)
- `src/rai/rai/utils/configurator.py` (1 hunks)
🧰 Additional context used
🪛 Ruff
src/rai/rai/utils/configurator.py
39-39: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
823-825: Return the condition not np.sum(np.abs(recording)) == 0 directly
Replace with return not np.sum(np.abs(recording)) == 0
(SIM103)
864-864: Use key in dict instead of key in dict.keys()
Remove .keys()
(SIM118)
🔇 Additional comments (2)
config.toml (1)
28-28: Verify local Langfuse server setup
The Langfuse host has been changed to localhost. Please ensure:
- Documentation includes instructions for setting up a local Langfuse server
- This change is intentional and not accidentally committed
src/rai/rai/utils/configurator.py (1)
881-885:
Add error handling for configuration save operation
The save operation should handle potential file system errors.
Apply this diff:
if st.button("Save Configuration"):
- # Save configuration to TOML file
- with open("config.toml", "wb") as f:
- tomli_w.dump(st.session_state.config, f)
- st.success("Configuration saved successfully!")
+ try:
+ # Create backup of existing config
+ if os.path.exists("config.toml"):
+ import shutil
+ backup_path = "config.toml.backup"
+ shutil.copy2("config.toml", backup_path)
+
+ # Save configuration to TOML file
+ with open("config.toml", "wb") as f:
+ tomli_w.dump(st.session_state.config, f)
+ st.success("Configuration saved successfully!")
+ except (OSError, PermissionError) as e:
+ st.error(f"Failed to save configuration: {e}")Likely invalid or redundant comment.
…nd OpenTTS vendors
refactor(configurator.py): consolidate test execution into a loop for improved readability and maintainability
fix(configurator.py): ensure advanced config updates vendor settings correctly
chore(configurator.py): add TODOs for future improvements in ASR testing and test file organization
Co-authored-by: Bartek Boczek <22739059+boczekbartek@users.noreply.github.com>
fix: model saving
b4be760 to 8d16c76
Looks good!
Purpose
RAI's configuration is mostly dictated by the config.toml file.
Proposed Changes
This PR introduces a GUI for configuring the following parts of RAI:
Issues
Testing
Summary by CodeRabbit
New Features
Bug Fixes
Refactor
Documentation