feat(wren-ai-service): create Streamlit UI for configuring LLM models #1584
base: main
Conversation
Walkthrough

This update introduces a new Streamlit-based UI tool for configuring AI provider settings in the Wren AI service. It adds all supporting Python modules, a Dockerfile for containerization, and integrates the UI setup flow into the Go-based launcher, including Docker orchestration and configuration file management for custom LLM providers.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant Launcher (Go)
    participant Docker
    participant Streamlit UI (Python)
    participant Filesystem
    User->>Launcher (Go): Selects "custom" LLM provider and runs launch
    Launcher (Go)->>Filesystem: Checks ~/.wrenai exists
    alt Directory missing
        Launcher (Go)->>User: Warn and exit
    else Directory exists
        Launcher (Go)->>Docker: Remove old UI container if exists
        Launcher (Go)->>Docker: Build Streamlit UI image
        Launcher (Go)->>Filesystem: Ensure config.yaml, .env, config.done exist
        Launcher (Go)->>Docker: Start Streamlit UI container with files mounted
        User->>Streamlit UI (Python): Interacts with UI to configure providers
        Streamlit UI (Python)->>Filesystem: Writes config.yaml and sets config.done to "true" when finished
        Launcher (Go)->>Filesystem: Polls config.done
        alt config.done == "true"
            Launcher (Go)->>Docker: Remove UI container
            Launcher (Go)->>User: Proceed with Wren AI launch
        end
    end
```
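The handshake in the diagram rests on two files shared through the bind mount: `config.yaml` and the `config.done` sentinel. A minimal sketch of the Python side's final step (the function name and helper shape are illustrative assumptions, not the PR's actual code; only the file names and the literal `"true"` come from the diagram):

```python
from pathlib import Path


def finish_setup(config_text: str, data_dir: str = "/app/data") -> None:
    """Write the assembled config.yaml, then flip the config.done sentinel.

    The Go launcher polls config.done and proceeds once it reads "true".
    """
    data = Path(data_dir)
    data.mkdir(parents=True, exist_ok=True)
    # Write the full config first ...
    (data / "config.yaml").write_text(config_text)
    # ... and the sentinel last, so a poller never sees a half-written config.
    (data / "config.done").write_text("true")
```

Ordering matters here: because the sentinel is written after the config, the launcher's poll loop can treat `config.done == "true"` as proof that `config.yaml` is complete.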
Commits:

- …_state.py; combine init and override behaviors (import yaml files)
- Rename custom_llm_ui.py to app.py for a better naming convention; extract Embedder and Document Store configuration from app.py into ui_components.py to improve code organization; fix a data-type issue so the "timeout" value is properly converted to an integer; optimize alias handling so the alias field is only included in the dictionary when it has a value
- Implement pipeline configuration UI; set init_pipeline in session_state.py to manage pipeline state
- …and embedding models
- …and ensure API keys are saved to .env
- …s via Docker; integrate a 'custom' option in launch.go to trigger the Streamlit-based config UI; implement logic in docker.go to ensure required files (.env, config.yaml, config.done) are available
- Replace pip+requirements.txt with pyproject.toml and poetry.lock for dependency management
Actionable comments posted: 13
🔭 Outside diff range comments (1)
wren-ai-service/tools/providers-setup/dry_run_test.py (1)
66-73: 🛠️ Refactor suggestion: Missing existence check for API-key state

`st.session_state[ConfigState.API_KEY]` is dereferenced without verifying that the key exists. If `ConfigState.init_apikey` hasn't run (e.g., the user shortcuts directly to the dry-run page), this will raise.

```diff
-    for service, api_key in st.session_state[ConfigState.API_KEY].items():
+    for service, api_key in st.session_state.get(ConfigState.API_KEY, {}).items():
         os.environ[service] = api_key
```
🧹 Nitpick comments (11)
wren-ai-service/tools/providers-setup/pyproject.toml (1)

1-18: Overall configuration looks good, but a few improvements could make it more complete.

The Poetry configuration is well-structured with properly pinned dependencies and version constraints. Using Poetry for package management is a good choice for Python projects. Consider adding these missing elements:

- Add a proper description instead of leaving it empty
- Add a license field
- Consider adding development dependencies like pytest for testing
- Verify that the referenced README.md file exists

```diff
 [tool.poetry]
 name = "providers-setup"
 version = "0.1.0"
-description = ""
+description = "Streamlit UI for configuring custom LLM providers for WrenAI"
 authors = ["YI-CHIEH LU <steven.yichieh.lu@gmail.com>"]
 readme = "README.md"
 package-mode = false
+license = "MIT"  # or appropriate license
+
+[tool.poetry.group.dev.dependencies]
+pytest = "^7.4.0"
+black = "^23.7.0"
```

wren-ai-service/tools/providers-setup/app.py (1)
42-74: Consider adding type hints to improve code maintainability.

The code lacks type annotations, which would improve readability and enable better IDE support and static type checking:

```diff
 # Layout: two columns – left for inputs, right for preview/export
-col1, col2 = st.columns([1.5, 1])
+col1, col2 = st.columns([1.5, 1])  # type: tuple[st.delta_generator.DeltaGenerator, st.delta_generator.DeltaGenerator]
```

Also consider adding a docstring to the main file to explain its purpose and usage.
wren-ai-service/tools/providers-setup/Dockerfile (1)

45-49: Consider adding a health check for improved container management.

A HEALTHCHECK instruction would let Docker determine when the application is ready to accept requests:

```diff
 ENV STREAMLIT_SERVER_PORT=8501
 ENV STREAMLIT_SERVER_ENABLECORS=false
+
+# Add health check to detect if Streamlit is running
+HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
+  CMD curl -f http://localhost:8501/ || exit 1
```

wren-launcher/commands/launch.go (3)
251-252: Fix typo in comment: "chech" instead of "check".

```diff
-	// chech if ~/.wrenai is exist
+	// check if ~/.wrenai exists
```
265-270: Improve error handling for container startup.

The error handling when the container fails to start could provide more context and suggest remediation steps:

```diff
 // Build and start the Streamlit UI container
 err = utils.RunStreamlitUIContainer()
 if err != nil {
-	pterm.Error.Println("❌ Failed to start Streamlit UI:", err)
+	pterm.Error.Println("❌ Failed to start Streamlit UI container:")
+	pterm.Error.Println(err)
+	pterm.Info.Println("Try the following:")
+	pterm.Info.Println("1. Ensure Docker is running")
+	pterm.Info.Println("2. Check if port 8501 is available")
+	pterm.Info.Println("3. Check if ~/.wrenai directory has proper permissions")
 	return
 }
```
283-287: Handle errors more gracefully during container cleanup.

The error handling during container cleanup could provide more context and next steps:

```diff
 // Clean up container after config is complete
 if err := utils.RemoveContainerIfExists(containerName); err != nil {
-	pterm.Warning.Println("⚠️ Failed to remove existing container:", err)
+	pterm.Warning.Println("⚠️ Failed to remove container after completion:", err)
+	pterm.Info.Println("You can manually remove it with: docker rm -f", containerName)
+	// Continue anyway since the configuration is complete
 }
```

wren-ai-service/tools/providers-setup/config_loader.py (1)

13-20: Silent fallback obscures parsing errors

`load_config_yaml_blocks()` swallows YAML exceptions and returns `[]`, making the UI look as if "no config found" rather than "config invalid". Surface the error to the caller (or Streamlit) so the user can correct their YAML, e.g.:

```diff
-        print(f"❌ Failed to parse local config.yaml: {e}")
-        return []
+        st.error(f"❌ Failed to parse local config.yaml: {e}")
+        raise
```

If noisy exceptions aren't desirable, at least propagate a structured error object.
wren-ai-service/tools/providers-setup/constants.py (1)

65-67: `CONFIG_IN_PATH` decided at import time

`docker_path.exists()` is evaluated once, so if the file is created after module import (e.g., the Streamlit wizard writes it), later reads will still point at `~/.wrenai/config.yaml`. Compute this lazily or provide a helper:

```python
def get_config_in_path() -> Path:
    return docker_path if docker_path.exists() else local_path
```

and call it where needed.
wren-ai-service/tools/providers-setup/session_state.py (1)

165-172: Network request during module import blocks UI startup

`init_example_configs` performs an HTTP call on page load. If GitHub is slow or unavailable, the entire UI will freeze until timeout (10 s). Best practice: perform such I/O in a background thread or only when the user opens the "Examples" tab.

```diff
-        from config_loader import fetch_example_yaml_filenames
-        st.session_state[cls.EXAMPLE_CONFIG_NAMES_KEY] = fetch_example_yaml_filenames()
+        if cls.EXAMPLE_CONFIG_NAMES_KEY not in st.session_state:
+            from config_loader import fetch_example_yaml_filenames
+            st.session_state[cls.EXAMPLE_CONFIG_NAMES_KEY] = fetch_example_yaml_filenames()
```

wren-launcher/utils/docker.go (1)
639-646: Volume mounts break when paths contain spaces

`"-v", configPath+":/app/data/config.yaml",` passes the entire `src:dst` string as one CLI argument. If `configPath` contains spaces, the colon separator is treated as part of the source path, and Docker errors.

Safer pattern:

```diff
-"-v", configPath+":/app/data/config.yaml",
+"-v", fmt.Sprintf("%s:/app/data/config.yaml", configPath),
```

but more robust is to rely on the `--mount` flag, which does proper quoting:

```go
"--mount", fmt.Sprintf("type=bind,source=%s,target=/app/data/.env", envPath),
```

At minimum, document that the path must not contain spaces.
wren-ai-service/tools/providers-setup/ui_components.py (1)
221-232: Remove unnecessary f-strings (static strings)

Ruff flags lines 222, 282, 287: the `f` prefix allocates a format string even though no interpolation occurs. Just drop the `f`:

```diff
-st.markdown(f"**type:** `embedder`")
+st.markdown("**type:** `embedder`")
```

Same for the other two lines. Also applies to: 282-289

🧰 Tools
🪛 Ruff (0.8.2)

222-222: f-string without any placeholders. Remove extraneous `f` prefix (F541)
📜 Review details

- Configuration used: CodeRabbit UI
- Review profile: CHILL
- Plan: Pro

⛔ Files ignored due to path filters (1)

- `wren-ai-service/tools/providers-setup/poetry.lock` is excluded by `!**/*.lock`

📒 Files selected for processing (10)

- wren-ai-service/tools/providers-setup/Dockerfile (1 hunks)
- wren-ai-service/tools/providers-setup/app.py (1 hunks)
- wren-ai-service/tools/providers-setup/config_loader.py (1 hunks)
- wren-ai-service/tools/providers-setup/constants.py (1 hunks)
- wren-ai-service/tools/providers-setup/dry_run_test.py (1 hunks)
- wren-ai-service/tools/providers-setup/pyproject.toml (1 hunks)
- wren-ai-service/tools/providers-setup/session_state.py (1 hunks)
- wren-ai-service/tools/providers-setup/ui_components.py (1 hunks)
- wren-launcher/commands/launch.go (1 hunks)
- wren-launcher/utils/docker.go (4 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (6)

- wren-ai-service/tools/providers-setup/app.py (3)
  - wren-ai-service/tools/providers-setup/config_loader.py (2): load_config_yaml_blocks (8-20), group_blocks (74-91)
  - wren-ai-service/tools/providers-setup/session_state.py (2): ConfigState (5-173), init (17-27)
  - wren-ai-service/tools/providers-setup/ui_components.py (8): render_llm_config (147-211), render_embedder_config (215-272), render_document_store_config (276-322), render_import_yaml (9-32), render_pipeline_config (325-357), render_preview_and_generate (360-410), render_apikey (35-145), render_finished_setting (412-426)
- wren-launcher/commands/launch.go (1)
  - wren-launcher/utils/docker.go (4): TryGetWrenAIDir (552-564), RemoveContainerIfExists (657-674), RunStreamlitUIContainer (576-606), IsCustomConfigReady (678-693)
- wren-ai-service/tools/providers-setup/config_loader.py (1)
  - wren-ai-service/tools/providers-setup/session_state.py (2): ConfigState (5-173), init (17-27)
- wren-ai-service/tools/providers-setup/dry_run_test.py (1)
  - wren-ai-service/tools/providers-setup/session_state.py (1): ConfigState (5-173)
- wren-ai-service/tools/providers-setup/ui_components.py (4)
  - wren-ai-service/tools/providers-setup/session_state.py (1): ConfigState (5-173)
  - wren-ai-service/src/config.py (1): config_loader (86-96)
  - wren-ai-service/tools/providers-setup/config_loader.py (2): load_selected_example_yaml (22-33), apply_config_blocks (107-119)
  - wren-ai-service/tools/providers-setup/dry_run_test.py (2): llm_completion_test (6-31), llm_embedding_test (34-63)
- wren-ai-service/tools/providers-setup/session_state.py (2)
  - wren-ai-service/src/config.py (1): config_loader (86-96)
  - wren-ai-service/tools/providers-setup/config_loader.py (1): fetch_example_yaml_filenames (93-105)
🪛 Ruff (0.8.2)

wren-ai-service/tools/providers-setup/ui_components.py

- 222-222: f-string without any placeholders. Remove extraneous `f` prefix (F541)
- 282-282: f-string without any placeholders. Remove extraneous `f` prefix (F541)
- 287-287: f-string without any placeholders. Remove extraneous `f` prefix (F541)
⏰ Context from checks skipped due to timeout of 90000ms (2)
- GitHub Check: pytest
- GitHub Check: pytest
🔇 Additional comments (6)
wren-ai-service/tools/providers-setup/pyproject.toml (1)
9-14: ✅ Verification successful: Verify Python version constraint.

The Python version is pinned to 3.12, which is quite recent. This may limit compatibility with environments that haven't upgraded yet.
🌐 Web query:
What is the current Python version adoption and stability of Python 3.12?
💡 Result:
Python 3.12 Adoption and Stability (as of May 2025)
Adoption Trends
- Python 3.12 has been available since October 2, 2023, and is currently one of the actively supported versions of Python[10][5].
- According to recent ecosystem data, while adoption of the latest Python versions typically lags behind their release, uptake is strong. By April 2024, a significant proportion of developers were still using Python 3.8 or 3.9, but migration toward 3.12 is accelerating due to its performance and support advantages[6].
- Reports from early 2025 confirm that Python 3.12 has seen widespread adoption, especially among major library authors and in frameworks that seek to leverage its new features and performance enhancements[4][7].
- Major Linux distributions, such as Ubuntu 24.04 LTS, officially support Python 3.12 and will maintain it until at least August 2029, further boosting its adoption in both server and enterprise environments[8].
Stability and Maintenance
- Python 3.12 is classified as a stable release and is part of the regular maintenance schedule of the Python Software Foundation[10][5].
- As of February 2025, Python 3.12.9 was released, marking the ninth maintenance update for the series. This update included around 180 bug fixes, documentation improvements, and minor build enhancements since the previous release[2][7].
- The 3.12 branch continues to receive frequent security patches and bug fixes, ensuring high reliability for production use. This aligns with standard Python support practices, where each major version receives regular updates and long-term support timelines[8].
- The removal of deprecated modules and APIs (e.g., the `distutils` package) demonstrates a focus on maintaining a clean, robust language core[5].
- Performance improvements from the Faster CPython initiative are reflected in real-world benchmarks, with Python 3.12 estimated to deliver about a 5% performance boost over previous versions[9][10].
Ecosystem and Compatibility
- Most major Python libraries and frameworks have added or are actively maintaining support for Python 3.12. This includes releases of popular packages such as Pydantic, FastAPI, and others that leverage new language features and typing improvements[4].
- Some older versions of libraries or tools may not support Python 3.12, so users are encouraged to upgrade dependencies as part of the migration process to avoid compatibility issues[6].
- Community sentiment remains positive, with developers reporting stable experiences and incremental improvements in day-to-day development tasks[9].
Summary Table: Python 3.12 Snapshot (May 2025)

| Aspect | Status/Details |
|---|---|
| Release Date | October 2, 2023 |
| Current Version | 3.12.9 (as of February 2025) |
| Support Status | Actively maintained; security and bug fixes ongoing, supported until at least August 2029[8] |
| Adoption | Widespread and increasing, especially in professional and enterprise environments[6][7][8] |
| Stability | High; frequent bug fixes, well-supported by the community and library ecosystem[2][4][5] |
| Major Enhancements | Performance gains (~5%), expressive f-strings, isolated subinterpreters, improved error messages |
| Library Support | Most major libraries and frameworks support 3.12[4][5] |

Conclusion

Python 3.12 is now a stable, production-ready release with strong adoption and ongoing support from both the core development team and the wider Python community. Its adoption is rapidly increasing thanks to performance improvements, enhanced security features, and robust ecosystem compatibility. Regular maintenance updates ensure its reliability for current and future projects[2][7][8].
Citations:
- 1: https://w3techs.com/technologies/history_details/pl-python/3
- 2: https://realpython.com/python-news-march-2025/
- 3: https://engineering.fb.com/2025/05/05/developer-tools/enhancing-the-python-ecosystem-with-type-checking-and-free-threading/
- 4: https://realpython.com/python-news-april-2025/
- 5: https://docs.python.org/3/whatsnew/3.12.html
- 6: https://goatswitch.ai/python-3-12-migration-guide-master-the-upgrade-from-python-3-8/
- 7: https://techstockinsights.hashnode.dev/python-tech-news-latest-updates-and-trends-2025
- 8: https://getnikola.com/python-version-support-policy.html
- 9: https://hackernoon.com/python-312-overview-past-limitations-faster-cpythons-advent-and-the-excitement-of-new-versions
- 10: https://www.python.org/downloads/release/python-3120/
Python 3.12 constraint is production-ready
I’ve verified that Python 3.12 is a stable, widely adopted release (supported until at least 2029 by major Linux distributions, with active maintenance and broad library support). Locking your dependency to “>=3.12,<3.13” is safe and should not pose compatibility risks in modern environments.
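For reference, the constraint under discussion is a one-line pin in `pyproject.toml`; a sketch of how it would appear (the surrounding table name is standard Poetry layout, not copied from the PR):

```toml
[tool.poetry.dependencies]
# Allows any 3.12.x patch release, excludes 3.13 and later.
python = ">=3.12,<3.13"
```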
wren-ai-service/tools/providers-setup/app.py (1)

15-19: Good page configuration for a better user experience.

The Streamlit page configuration with wide layout and expanded sidebar provides a good user experience for a configuration tool.

wren-ai-service/tools/providers-setup/Dockerfile (2)

8-11: Good practice with apt cleanup.

Properly cleaning up after apt-get installs to keep the image size small is a best practice.

33-35: Excellent Docker layer optimization.

Copying the dependency files first, before the rest of the code, is a great technique for optimizing Docker layer caching.
wren-ai-service/tools/providers-setup/session_state.py (1)

30-37: ⚠️ Potential issue: `force` flag ignored for previously-loaded LLM forms

The early-exit check (`if not force and ...`) prevents re-loading when `force=True` is passed via `ConfigState.init(...)` (ll. 113-118 in `config_loader.py`) because it tests `not force` first. Flip the condition:

```diff
-        if not force and cls.LLM_FORMS_KEY in st.session_state and st.session_state[cls.LLM_FORMS_KEY]:
+        if not force:
+            if cls.LLM_FORMS_KEY in st.session_state and st.session_state[cls.LLM_FORMS_KEY]:
+                return
```

The same pattern applies to the other `init_*` methods.

Likely an incorrect or invalid review comment.
wren-launcher/utils/docker.go (1)

609-613: Relative build context is brittle; resolve to an absolute path

`exec.Command("docker", "build", "-t", "wrenai-providers-setup", "../wren-ai-service/tools/providers-setup")` assumes the current working directory of the running binary. When the launcher is installed system-wide (e.g. `/usr/local/bin`), the CWD is often not the repository root, so the build fails with `unable to prepare context: path "../wren-ai-service/..." not found`.

Consider:

```diff
-cmd := exec.Command("docker", "build", "-t", "wrenai-providers-setup", "../wren-ai-service/tools/providers-setup")
+absCtx, err := filepath.Abs(filepath.Join(filepath.Dir(os.Args[0]), "..", "wren-ai-service", "tools", "providers-setup"))
+if err != nil { return err }
+cmd := exec.Command("docker", "build", "-t", "wrenai-providers-setup", absCtx)
```

This decouples the build from whichever directory the user invokes the CLI in.
```dockerfile
# Use official slim Python 3.12 base image
FROM python:3.12.0-slim

# -------------------------------
# System Dependencies for Poetry
# -------------------------------
# Install minimal system packages: curl (for downloading), build tools (for native extensions)
RUN apt-get update && apt-get install -y --no-install-recommends \
    curl build-essential gcc \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

# -------------------------------
# Install Poetry (Python package manager)
# -------------------------------
ENV POETRY_VERSION=1.8.2
ENV POETRY_HOME="/opt/poetry"
ENV PATH="$POETRY_HOME/bin:$PATH"
# Don't use virtualenvs inside the container
ENV POETRY_VIRTUALENVS_CREATE=false

RUN curl -sSL https://install.python-poetry.org | python3 - \
    && ln -s $POETRY_HOME/bin/poetry /usr/local/bin/poetry

# -------------------------------
# Set working directory for app
# -------------------------------
WORKDIR /app

# -------------------------------
# Install Python dependencies via Poetry
# -------------------------------
# Copy only dependency files first to leverage Docker layer caching
COPY pyproject.toml poetry.lock ./
RUN poetry install --no-interaction --no-ansi

# -------------------------------
# Copy remaining app code
# -------------------------------
COPY . .

# -------------------------------
# Environment variables for Streamlit
# -------------------------------
ENV PYTHONUNBUFFERED=1
ENV STREAMLIT_SERVER_HEADLESS=true
ENV STREAMLIT_SERVER_PORT=8501
ENV STREAMLIT_SERVER_ENABLECORS=false

# -------------------------------
# Expose Streamlit port
# -------------------------------
EXPOSE 8501

# -------------------------------
# Default command to run the Streamlit app
# -------------------------------
CMD ["streamlit", "run", "app.py"]
```
🛠️ Refactor suggestion
Run container as non-root user for security.
The Dockerfile doesn't specify a user, which means the container will run as root by default. This is a security risk.
Add a non-root user and switch to it before running the application:
```diff
 COPY . .

+# -------------------------------
+# Create and switch to non-root user
+# -------------------------------
+RUN useradd -m streamlituser
+USER streamlituser
+
 # -------------------------------
 # Environment variables for Streamlit
 # -------------------------------
```
🤖 Prompt for AI Agents (early access)
In wren-ai-service/tools/providers-setup/Dockerfile lines 1 to 58, the container
runs as root by default, which is a security risk. To fix this, add commands to
create a non-root user and group, set appropriate ownership of the /app
directory, and switch to this user before the CMD instruction. This ensures the
application runs with limited privileges inside the container.
```python
url = "https://api.github.com/repos/Canner/WrenAI/releases/latest"
try:
    response = requests.get(url, timeout=10)
    if response.status_code == 200:
        data = response.json()
        return data["tag_name"]
    else:
        print(f"Failed to get latest release: {response.status_code}")
except Exception as e:
    print(f"Error fetching latest config version: {e}")

return "main"  # Fallback to 'main' branch if the request fails
```
🛠️ Refactor suggestion

Unauthenticated GitHub API call risks 403 rate-limit

Calling `/releases/latest` without a token is limited to 60 requests/hour per IP. Inside Docker-based CI, or with many users behind NAT, this can quickly return 403, causing the app to fall back to `main` and possibly fetch incompatible config schemas.

Consider:

- Reading a `GITHUB_TOKEN` from env and adding an `Authorization` header when present.
- Caching the result in a file or environment variable to avoid repeated calls during a single run.
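A sketch of both ideas together, using only the standard library (the function name, injectable `fetch` parameter, and module-level cache are illustrative assumptions; the PR's actual code uses `requests`):

```python
import json
import os
import urllib.request

_CACHE: dict = {}  # per-process cache: repo -> tag


def latest_release_tag(repo: str = "Canner/WrenAI", fetch=None) -> str:
    """Return the latest release tag for a repo, with auth and caching.

    When GITHUB_TOKEN is set, an Authorization header is sent, which lifts
    the unauthenticated 60 req/h limit. Results are cached per process so
    repeated calls during a single run hit the API at most once.
    """
    if repo in _CACHE:
        return _CACHE[repo]

    url = f"https://api.github.com/repos/{repo}/releases/latest"
    req = urllib.request.Request(url)
    token = os.environ.get("GITHUB_TOKEN")
    if token:
        req.add_header("Authorization", f"Bearer {token}")

    try:
        if fetch is None:
            with urllib.request.urlopen(req, timeout=10) as resp:
                data = json.load(resp)
        else:
            data = fetch(req)  # injectable for tests / offline use
        tag = data["tag_name"]
    except Exception as e:
        print(f"Error fetching latest release: {e}")
        tag = "main"  # fallback, mirroring the original behaviour

    _CACHE[repo] = tag
    return tag
```

The `fetch` hook also makes the function testable without network access, which sidesteps the rate-limit problem in CI entirely.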
🤖 Prompt for AI Agents (early access)
In wren-ai-service/tools/providers-setup/constants.py around lines 17 to 28, the
GitHub API call to fetch the latest release is unauthenticated, risking 403 rate
limits. Fix this by reading a GITHUB_TOKEN from the environment and, if present,
include it as an Authorization header in the requests.get call. Additionally,
implement caching of the fetched tag_name in a file or environment variable to
avoid repeated API calls during a single run.
```diff
 userUUID, err := prepareUserUUID(projectDir)
 if err != nil {
 	return err
 }

-// if config.yaml file does not exist, return error
-if _, err := os.Stat(getConfigFilePath(projectDir)); os.IsNotExist(err) {
-	return fmt.Errorf("config.yaml file does not exist, please download the config.yaml file from %s to ~/.wrenai, rename it to config.yaml and fill in the required information", AI_SERVICE_CONFIG_URL)
+// Ensure .env exists (download if missing)
+envFilePath := getEnvFilePath(projectDir)
+if _, err := os.Stat(envFilePath); os.IsNotExist(err) {
+	pterm.Println(".env file not found, downloading from default URL...")
+	err = downloadFile(envFilePath, DOCKER_COMPOSE_ENV_URL)
+	if err != nil {
+		return fmt.Errorf("failed to download .env: %v", err)
+	}
+
+	// Read downloaded .env content
+	contentBytes, err := os.ReadFile(envFilePath)
+	if err != nil {
+		return fmt.Errorf("failed to read .env: %v", err)
+	}
+	str := string(contentBytes)
+
+	// Replace or append USER_UUID
+	reg := regexp.MustCompile(`(?m)^USER_UUID=.*$`)
+	if reg.MatchString(str) {
+		str = reg.ReplaceAllString(str, "USER_UUID="+userUUID)
+	} else {
+		str = str + "\nUSER_UUID=" + userUUID + "\n"
+	}
+
+	// Save updated .env file
+	err = os.WriteFile(envFilePath, []byte(str), 0644)
+	if err != nil {
+		return fmt.Errorf("failed to write updated .env: %v", err)
+	}
+}
```
🛠️ Refactor suggestion
USER_UUID
may never be written when a pre-existing .env
file is present
The code only inserts USER_UUID
when the .env
file does not exist.
If the user already has a .env
, there is no guarantee that the key is
present (or up-to-date), causing later logic that relies on it to fail.
- if _, err := os.Stat(envFilePath); os.IsNotExist(err) {
- ... # write USER_UUID
- }
+// Always make sure USER_UUID exists / is refreshed
+contentBytes, _ := os.ReadFile(envFilePath) // ignore err when file does not exist
+str := string(contentBytes)
+reg := regexp.MustCompile(`(?m)^USER_UUID=.*$`)
+if reg.MatchString(str) {
+ str = reg.ReplaceAllString(str, "USER_UUID="+userUUID)
+} else {
+ str += "\nUSER_UUID=" + userUUID + "\n"
+}
+_ = os.WriteFile(envFilePath, []byte(str), 0644)
This keeps the original behaviour for first-time users while also patching
existing environments.
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents (early access)
In wren-launcher/utils/docker.go around lines 300 to 334, the code only inserts
or updates USER_UUID in the .env file if the file does not exist, but it should
also ensure USER_UUID is present and updated in pre-existing .env files. Modify
the logic to always read the .env file, check if USER_UUID is present, and
replace or append it accordingly, then write back the updated content regardless
of whether the file was newly downloaded or already existed.
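The always-ensure behaviour the suggestion describes is easiest to see in isolation. Below is a minimal Python sketch mirroring the Go regex logic; `ensure_user_uuid` is a hypothetical helper, not code from the PR.

```python
import re

def ensure_user_uuid(env_text: str, user_uuid: str) -> str:
    """Replace an existing USER_UUID line, or append one if none is present.

    Running this unconditionally patches pre-existing .env files as well as
    freshly downloaded ones, matching the reviewer's suggestion.
    """
    pattern = re.compile(r"^USER_UUID=.*$", re.MULTILINE)
    if pattern.search(env_text):
        return pattern.sub("USER_UUID=" + user_uuid, env_text)
    return env_text + "\nUSER_UUID=" + user_uuid + "\n"
```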
```python
with st.expander("Embedder Configuration", expanded=False):
    st.markdown(f"**type:** `embedder`")
    st.markdown(f"**provider:** `{st.session_state[ConfigState.EMBEDDER_KEY].get('provider')}`")

    embedding_models = st.session_state[ConfigState.EMBEDDER_KEY].get("models", [])
    embedding_api_base = embedding_models[0].get("api_base", "https://api.openai.com/v1") if embedding_models else ""

    embedding_model_name = st.text_input("Embedding Model Name", key="embedding_model_name", value="text-embedding-3-large")
    embedding_model_alias = st.text_input("Alias (optional, e.g. default)", key="embedding_model_alias", value="default")
    embedding_model_api_base = st.text_input("API Base URL", key="embedding_model_api_base", value=embedding_api_base)
    embedding_model_timeout = st.text_input("Timeout (default: 120)", key="embedding_model_timeout", value="120")

    custom_embedding_setting = [{
        "model": embedding_model_name,
        "alias": embedding_model_alias,
        "timeout": embedding_model_timeout,
        "api_base": embedding_model_api_base
    }]

    if st.button("💾 save", key="save_embedding_model"):
        errors = []
        if not embedding_model_name:
            errors.append("Embedding Model Name is required.")
        if not embedding_model_timeout:
            errors.append("Timeout is required.")
        else:
            try:
                int(embedding_model_timeout)
            except ValueError:
                errors.append("Timeout must be an integer.")

        if errors:
            for error in errors:
                st.error(error)
        else:
            st.session_state.embedding_model = {
                "type": "embedder",
                "provider": st.session_state[ConfigState.EMBEDDER_KEY].get("provider"),
                "models": custom_embedding_setting
            }
            st.success("Updated embedder model configuration")
```
🛠️ Refactor suggestion
**Embedder settings are saved to a new key, breaking later preview/export**

`render_embedder_config()` writes to `st.session_state.embedding_model` (lines 256-261), but other code (e.g. preview generation) expects the data at `st.session_state[ConfigState.EMBEDDER_KEY]`. As a result, user updates are silently ignored in the generated `config.yaml`.

```diff
-            st.session_state.embedding_model = {
+            st.session_state[ConfigState.EMBEDDER_KEY] = {
```

Apply the same adjustment in `render_document_store_config()`, where `st.session_state.document_store` is used instead of `ConfigState.DOC_STORE_KEY`.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
with st.expander("Embedder Configuration", expanded=False):
    st.markdown(f"**type:** `embedder`")
    st.markdown(f"**provider:** `{st.session_state[ConfigState.EMBEDDER_KEY].get('provider')}`")

    embedding_models = st.session_state[ConfigState.EMBEDDER_KEY].get("models", [])
    embedding_api_base = embedding_models[0].get("api_base", "https://api.openai.com/v1") if embedding_models else ""

    embedding_model_name = st.text_input("Embedding Model Name", key="embedding_model_name", value="text-embedding-3-large")
    embedding_model_alias = st.text_input("Alias (optional, e.g. default)", key="embedding_model_alias", value="default")
    embedding_model_api_base = st.text_input("API Base URL", key="embedding_model_api_base", value=embedding_api_base)
    embedding_model_timeout = st.text_input("Timeout (default: 120)", key="embedding_model_timeout", value="120")

    custom_embedding_setting = [{
        "model": embedding_model_name,
        "alias": embedding_model_alias,
        "timeout": embedding_model_timeout,
        "api_base": embedding_model_api_base
    }]

    if st.button("💾 save", key="save_embedding_model"):
        errors = []
        if not embedding_model_name:
            errors.append("Embedding Model Name is required.")
        if not embedding_model_timeout:
            errors.append("Timeout is required.")
        else:
            try:
                int(embedding_model_timeout)
            except ValueError:
                errors.append("Timeout must be an integer.")

        if errors:
            for error in errors:
                st.error(error)
        else:
            st.session_state[ConfigState.EMBEDDER_KEY] = {
                "type": "embedder",
                "provider": st.session_state[ConfigState.EMBEDDER_KEY].get("provider"),
                "models": custom_embedding_setting
            }
            st.success("Updated embedder model configuration")
```
🧰 Tools
🪛 Ruff (0.8.2)
222-222: f-string without any placeholders
Remove extraneous f
prefix
(F541)
🤖 Prompt for AI Agents (early access)
In wren-ai-service/tools/providers-setup/ui_components.py around lines 221 to
262, the embedder settings are saved to st.session_state.embedding_model instead
of updating st.session_state[ConfigState.EMBEDDER_KEY], causing later
preview/export to miss user changes. Fix this by saving the updated embedder
configuration back to st.session_state[ConfigState.EMBEDDER_KEY] instead of
st.session_state.embedding_model. Similarly, in the
render_document_store_config() function, ensure updates are saved to
st.session_state[ConfigState.DOC_STORE_KEY] rather than
st.session_state.document_store to maintain consistency.
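The underlying rule in both fixes is to write through the same session-state key that the preview/export code reads. A minimal sketch of that pattern follows; the names mirror the module under review, but the `save_embedder` helper itself is illustrative, not code from the PR.

```python
class ConfigState:
    # canonical keys shared by the save and preview/export paths
    EMBEDDER_KEY = "embedder"
    DOC_STORE_KEY = "document_store"

def save_embedder(session_state: dict, provider: str, models: list) -> None:
    """Write the embedder block back under the canonical key.

    Using ConfigState.EMBEDDER_KEY (instead of an ad-hoc attribute like
    `embedding_model`) guarantees the generated config.yaml sees the update.
    """
    session_state[ConfigState.EMBEDDER_KEY] = {
        "type": "embedder",
        "provider": provider,
        "models": models,
    }
```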
```python
kwargs_dict = {}
for p in form["kwargs"]:
    if p["key"]:
        try:
            kwargs_dict[p["key"]] = eval(p["value"])
        except Exception:
            kwargs_dict[p["key"]] = p["value"]
```
eval()
is a security hazard – replace with ast.literal_eval
Arbitrary Python expressions entered in the UI are executed server-side,
allowing code injection (e.g. __import__("os").system("rm -rf /")
).
Use ast.literal_eval
, which safely parses Python literals.
-import yaml
+import yaml
+import ast # ← new import
@@
- try:
- kwargs_dict[p["key"]] = eval(p["value"])
- except Exception:
- kwargs_dict[p["key"]] = p["value"]
+ try:
+ kwargs_dict[p["key"]] = ast.literal_eval(p["value"])
+ except (ValueError, SyntaxError):
+ kwargs_dict[p["key"]] = p["value"] # fallback to raw string
📝 Committable suggestion
```python
# at top of file, alongside the existing imports
import yaml
import ast  # ← new import for safe literal parsing

# …
kwargs_dict = {}
for p in form["kwargs"]:
    if p["key"]:
        try:
            # safely parse Python literals instead of full eval()
            kwargs_dict[p["key"]] = ast.literal_eval(p["value"])
        except (ValueError, SyntaxError):
            # fallback to the raw string if it's not a valid literal
            kwargs_dict[p["key"]] = p["value"]
# …
```
🤖 Prompt for AI Agents (early access)
In wren-ai-service/tools/providers-setup/ui_components.py around lines 443 to
450, the code uses eval() to parse user input, which is a security risk due to
arbitrary code execution. Replace eval() with ast.literal_eval() to safely parse
only Python literals. Import the ast module if not already imported, and update
the try block to use ast.literal_eval(p["value"]) instead of eval(p["value"]).
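The behavioural difference between `eval` and `ast.literal_eval` is worth seeing concretely. A small sketch of the safe parser (`parse_kwarg` is a hypothetical name for the per-value logic):

```python
import ast

def parse_kwarg(value: str):
    """Parse a user-supplied value as a Python literal, never as code.

    ast.literal_eval accepts only literals (numbers, strings, tuples, lists,
    dicts, sets, booleans, None); anything else raises, and we fall back to
    keeping the raw string.
    """
    try:
        return ast.literal_eval(value)
    except (ValueError, SyntaxError):
        return value  # not a literal – keep as plain string
```

Notably, an injection payload such as `__import__('os').system(...)` is a function call, not a literal, so `literal_eval` rejects it and the raw string is stored instead of being executed.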
…guration handling

- Removed the document store configuration section and its associated rendering logic.
- Updated the embedder configuration to include dimension input for embedding models.
- Simplified the rendering of pipeline configurations and improved the selection process for LLM models.
- Enhanced the preview and generation functionality for configuration files.
- Adjusted the layout and organization of UI components for better clarity and usability.
Actionable comments posted: 0
♻️ Duplicate comments (3)

wren-ai-service/tools/providers-setup/ui_components.py (3)

268-272: Embedder settings are saved to a new key, breaking later preview/export.

`render_embedder_config()` writes to `st.session_state.embedding_model`, but other code expects the data at `st.session_state[ConfigState.EMBEDDER_KEY]`. User updates are silently ignored in the generated config.

```diff
-            st.session_state.embedding_model = {
+            st.session_state[ConfigState.EMBEDDER_KEY] = {
                 "type": "embedder",
                 "provider": st.session_state[ConfigState.EMBEDDER_KEY].get("provider"),
                 "models": custom_embedding_setting
             }
```

430-437: `eval()` is a security hazard – replace with `ast.literal_eval`

Arbitrary Python expressions entered in the UI are executed server-side, allowing code injection (e.g. `__import__("os").system("rm -rf /")`). Use `ast.literal_eval`, which safely parses Python literals.

```diff
 import yaml
+import ast  # ← new import
@@
         try:
-            kwargs_dict[p["key"]] = eval(p["value"])
-        except Exception:
+            kwargs_dict[p["key"]] = ast.literal_eval(p["value"])
+        except (ValueError, SyntaxError):
             kwargs_dict[p["key"]] = p["value"]  # fallback to raw string
```

274-281: ⚠️ Potential issue — Document store settings are saved to a wrong session key.

Similar to the embedder settings issue, document store data is written to `st.session_state.document_store` instead of `st.session_state[ConfigState.DOC_STORE_KEY]`.

```diff
-            st.session_state.document_store = {
+            st.session_state[ConfigState.DOC_STORE_KEY] = {
                 "type": "document_store",
                 "provider": st.session_state[ConfigState.DOC_STORE_KEY].get("provider"),
                 "location": st.session_state[ConfigState.DOC_STORE_KEY].get("location"),
                 "embedding_model_dim": document_store_dim,
                 "timeout": st.session_state[ConfigState.DOC_STORE_KEY].get("timeout"),
                 "recreate_index": st.session_state[ConfigState.DOC_STORE_KEY].get("recreate_index")
             }
```
🧹 Nitpick comments (2)

wren-ai-service/tools/providers-setup/ui_components.py (2)

224-224: Remove unnecessary f-string prefix.

The string doesn't contain any interpolation placeholders, so the `f` prefix is unnecessary.

```diff
-        st.markdown(f"**type:** `embedder`")
+        st.markdown("**type:** `embedder`")
```

🧰 Tools
🪛 Ruff (0.8.2)
224-224: f-string without any placeholders — Remove extraneous `f` prefix (F541)

235-235: Fix typo in the comment.

There's a typo in "doucument_store".

```diff
-    document_store_dim = st.text_input("Embedding_model_dim (placed in doucument_store)", key="embedding_model_dim", value="3072")
+    document_store_dim = st.text_input("Embedding_model_dim (placed in document_store)", key="embedding_model_dim", value="3072")
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
wren-ai-service/tools/providers-setup/app.py
(1 hunks)wren-ai-service/tools/providers-setup/ui_components.py
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- wren-ai-service/tools/providers-setup/app.py
🧰 Additional context used
🪛 Ruff (0.8.2)
wren-ai-service/tools/providers-setup/ui_components.py
224-224: f-string without any placeholders
Remove extraneous f
prefix
(F541)
… component integration

- Refactored configuration path retrieval by introducing dedicated functions for fetching paths to config.yaml, config.done, and .env files.
- Updated UI components to utilize the new configuration path functions, enhancing maintainability and clarity.
- Simplified file operations in the API key management section to ensure consistent access to the environment file.
Actionable comments posted: 2
♻️ Duplicate comments (5)
wren-ai-service/tools/providers-setup/constants.py (1)
17-28: ⚠️ Potential issue — Unauthenticated GitHub API call risks 403 rate-limit

`/releases/latest` without a token is limited to 60 requests / hour per IP. Inside Docker-based CI, or with many users behind NAT, this can quickly return 403, causing the app to fall back to `main` and possibly fetch incompatible config schemas.

Apply this fix to authenticate with GitHub and cache the result:

```diff
 import os
 from pathlib import Path
 import requests

+# Cache for storing the fetched version to avoid repeated API calls
+_cached_version = None
+
 def get_latest_config_version():
     """
     Retrieve the latest release tag from the WrenAI GitHub repository.

     Returns:
         str: The latest version tag (e.g., "0.20.2") if successful,
              or "main" as a fallback if the request fails.
     """
+    global _cached_version
+
+    # Return cached version if available
+    if _cached_version:
+        return _cached_version
+
     url = "https://api.github.com/repos/Canner/WrenAI/releases/latest"
     try:
+        headers = {}
+        # Use GitHub token if available
+        if "GITHUB_TOKEN" in os.environ:
+            headers["Authorization"] = f"token {os.environ['GITHUB_TOKEN']}"
+
-        response = requests.get(url, timeout=10)
+        response = requests.get(url, headers=headers, timeout=10)
         if response.status_code == 200:
             data = response.json()
-            return data["tag_name"]
+            _cached_version = data["tag_name"]
+            return _cached_version
         else:
             print(f"Failed to get latest release: {response.status_code}")
     except Exception as e:
         print(f"Error fetching latest config version: {e}")
     return "main"  # Fallback to 'main' branch if the request fails
```
55-66
:⚠️ Potential issue
extract_config_blocks
may pass a list where a dict is expected
group_blocks
turns duplicate-type blocks into a list (lines 84–88).extract_config_blocks
then forwards that raw value (which could be a list) toConfigState.init
(lines 114-118).ConfigState.init_llm_forms
expects a dict and calls.get("models")
, which will raise anAttributeError
for lists.Consider coercing to a canonical shape to avoid AttributeError:
def extract_config_blocks(config_list: List[Dict[str, Any]]) -> Dict[str, Any]: """ Convert a flat list of config blocks into grouped dictionary format with keys like 'llm', 'embedder', 'document_store', and 'pipeline'. """ grouped = group_blocks(config_list) + + # Handle potentially nested lists from group_blocks + llm = grouped.get("llm", {}) + if isinstance(llm, list): + llm = llm[0] # Take first item if multiple LLM blocks + + embedder = grouped.get("embedder", {}) + if isinstance(embedder, list): + embedder = embedder[0] + + document_store = grouped.get("document_store", {}) + if isinstance(document_store, list): + document_store = document_store[0] + + pipeline = grouped.get("pipeline", {}) + if isinstance(pipeline, list): + pipeline = pipeline[0] + return { - "llm": grouped.get("llm", {}), - "embedder": grouped.get("embedder", {}), - "document_store": grouped.get("document_store", {}), - "pipeline": grouped.get("pipeline", {}) + "llm": llm, + "embedder": embedder, + "document_store": document_store, + "pipeline": pipeline }
75-92
:⚠️ Potential issueInefficient list–promotion logic in
group_blocks
When encountering the second block of a given type, the code wraps the first block in a list and appends the new one. However, for a third block, it hits the
else
path and replaces the list with[old_list, block]
, nesting lists unexpectedly instead of appending to the existing list.Fix this by properly handling the transition from single item to list:
def group_blocks(blocks: List[Dict[str, Any]]) -> Dict[str, Any]: """ Group YAML blocks by their 'type' field. If multiple blocks share the same type, they are stored as a list. """ - save_blocks = {} + save_blocks = {} for block in blocks: key = block.get("type") or ("settings" if "settings" in block else None) if not key: continue if key in save_blocks: if isinstance(save_blocks[key], list): save_blocks[key].append(block) else: - save_blocks[key] = [save_blocks[key], block] + save_blocks[key] = [save_blocks[key], block] else: save_blocks[key] = block return save_blocksBetter yet, refactor to use a defaultdict to simplify the logic:
from typing import Any, Dict, List +from collections import defaultdict def group_blocks(blocks: List[Dict[str, Any]]) -> Dict[str, Any]: """ Group YAML blocks by their 'type' field. If multiple blocks share the same type, they are stored as a list. """ - save_blocks = {} + # Track counts for each key type + key_counts = defaultdict(int) + save_blocks = {} for block in blocks: key = block.get("type") or ("settings" if "settings" in block else None) if not key: continue + key_counts[key] += 1 + count = key_counts[key] - if key in save_blocks: - if isinstance(save_blocks[key], list): - save_blocks[key].append(block) - else: - save_blocks[key] = [save_blocks[key], block] - else: + if count == 1: save_blocks[key] = block + elif count == 2: + # Convert to list when we see the second item + save_blocks[key] = [save_blocks[key], block] + else: + # Append to the list for third+ items + save_blocks[key].append(block) return save_blockswren-ai-service/tools/providers-setup/ui_components.py (2)
264-277: ⚠️ Potential issue — Embedder settings are saved to a new key, breaking later preview/export

`render_embedder_config()` writes to `st.session_state.embedding_model` (lines 264-268), but other code expects the data at `st.session_state[ConfigState.EMBEDDER_KEY]`. As a result, user updates are silently ignored in the generated `config.yaml`.

Apply this fix:

```diff
         else:
-            st.session_state.embedding_model = {
+            st.session_state[ConfigState.EMBEDDER_KEY] = {
                 "type": "embedder",
                 "provider": st.session_state[ConfigState.EMBEDDER_KEY].get("provider"),
                 "models": custom_embedding_setting
             }
-            st.session_state.document_store = {
+            st.session_state[ConfigState.DOC_STORE_KEY] = {
                 "type": "document_store",
                 "provider": st.session_state[ConfigState.DOC_STORE_KEY].get("provider"),
                 "location": st.session_state[ConfigState.DOC_STORE_KEY].get("location"),
                 "embedding_model_dim": document_store_dim,
                 "timeout": st.session_state[ConfigState.DOC_STORE_KEY].get("timeout"),
                 "recreate_index": st.session_state[ConfigState.DOC_STORE_KEY].get("recreate_index")
             }
```

428-435: ⚠️ Potential issue — `eval()` is a security hazard – replace with `ast.literal_eval`

Arbitrary Python expressions entered in the UI are executed server-side, allowing code injection (e.g. `__import__("os").system("rm -rf /")`). Use `ast.literal_eval` instead, which safely parses Python literals.

```diff
+import ast  # Add at the top of the file

 # Convert list of kwargs to dictionary
 kwargs_dict = {}
 for p in form["kwargs"]:
     if p["key"]:
         try:
-            kwargs_dict[p["key"]] = eval(p["value"])
+            kwargs_dict[p["key"]] = ast.literal_eval(p["value"])
         except Exception:
             kwargs_dict[p["key"]] = p["value"]
```
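The `group_blocks` list-promotion fix discussed in this review is easy to exercise standalone. The sketch below follows the reviewer's `defaultdict` version; the behaviour for three or more duplicate types is the point of the test.

```python
from collections import defaultdict
from typing import Any, Dict, List

def group_blocks(blocks: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Group YAML blocks by 'type'; duplicate types collect into a flat list."""
    key_counts: Dict[str, int] = defaultdict(int)
    grouped: Dict[str, Any] = {}
    for block in blocks:
        key = block.get("type") or ("settings" if "settings" in block else None)
        if not key:
            continue
        key_counts[key] += 1
        count = key_counts[key]
        if count == 1:
            grouped[key] = block                   # first block of this type
        elif count == 2:
            grouped[key] = [grouped[key], block]   # promote to a list
        else:
            grouped[key].append(block)             # append, never re-nest
    return grouped
```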
🧹 Nitpick comments (12)
wren-ai-service/tools/providers-setup/constants.py (2)
2-2: Remove unused import

The `os` module is imported but never used in this file.

```diff
 from pathlib import Path
-import os
 import requests
```
🧰 Tools
🪛 Ruff (0.8.2)
2-2:
os
imported but unusedRemove unused import:
os
(F401)
59-87: Refactor path resolution functions to reduce code duplication

The three path resolution functions (`get_config_done_path`, `get_config_path`, and `get_env_path`) share almost identical logic but operate on different filenames. Consider consolidating these into a single function:

```diff
+def get_file_path(filename, subdir=".wrenai"):
+    """
+    Get the path for a config file, preferring the Docker-mounted version if available.
+
+    Args:
+        filename (str): Name of the file to locate
+        subdir (str): Subdirectory name under the user's home directory
+
+    Returns:
+        Path: Path object for the requested file
+    """
+    docker_path = volume_app_data / filename
+    local_path = Path.home() / subdir / filename
+
+    return docker_path if docker_path.exists() else local_path
+
+def get_config_done_path():
+    return get_file_path("config.done")
+
+def get_config_path():
+    return get_file_path("config.yaml")
+
+def get_env_path():
+    return get_file_path(".env")
```
3-5
: Use explicit relative imports for better maintainabilityThe code uses implicit relative imports which can lead to import errors when the package structure changes.
-from session_state import ConfigState +from .session_state import ConfigState from pathlib import Path -import constants as cst +from . import constants as cst
18-19
: Replace print statements with proper loggingUsing print statements for error handling makes it difficult to control log levels, format logs consistently, or redirect logs to appropriate destinations.
+import logging + +logger = logging.getLogger(__name__) + def load_config_yaml_blocks() -> List[Dict[str, Any]]: """ Load the config.yaml from local disk if available; otherwise, fetch it from the GitHub URL without downloading it. """ CONFIG_IN_PATH = cst.get_config_path() if CONFIG_IN_PATH.exists(): try: return load_yaml_list(CONFIG_IN_PATH) except Exception as e: - print(f"❌ Failed to parse local config.yaml: {e}") + logger.error(f"Failed to parse local config.yaml: {e}") return []Apply similar changes to error handling in other functions.
Also applies to: 32-34, 52-53
3-4
: Add type annotations to improve code clarity and enable static type checkingType hints are inconsistently applied across the module. Adding more comprehensive type annotations would make the code more maintainable.
Example for imported modules:
-from session_state import ConfigState +from session_state import ConfigState from pathlib import Path import constants as cst -from typing import Any, Dict, List +from typing import Any, Dict, List, Optionalwren-ai-service/tools/providers-setup/ui_components.py (7)
220-220: Remove extraneous f-string prefix

This line uses an f-string prefix but doesn't contain any placeholders.

```diff
-        st.markdown(f"**type:** `embedder`")
+        st.markdown("**type:** `embedder`")
```

🧰 Tools
🪛 Ruff (0.8.2)
220-220: f-string without any placeholders — Remove extraneous `f` prefix (F541)

231-231: Fix typo in comment: "doucument_store"

There's a typo in the comment for the document store dimension field.

```diff
-    document_store_dim = st.text_input("Embedding_model_dim (placed in doucument_store)", key="embedding_model_dim", value="3072")
+    document_store_dim = st.text_input("Embedding_model_dim (placed in document_store)", key="embedding_model_dim", value="3072")
```
294-342: Remove commented-out code

The file contains a large block of commented-out code for `render_document_store_config()` that should be removed to improve readability and maintainability.

```diff
-# DELETE
-# def render_document_store_config():
-#     """
-#     Render the document store configuration section.
-#     Displays current settings and allows updating index location, dimensions, timeout, etc.
-#     """
-#     with st.expander("Document Store Configuration", expanded=False):
-#         st.markdown(f"**type:** `document_store`")
-#         st.markdown(f"**provider:** `{st.session_state[ConfigState.DOC_STORE_KEY].get('provider')}`")
-#         st.markdown(f"**location:** `{st.session_state[ConfigState.DOC_STORE_KEY].get('location')}`")
-
-#         document_store_timeout = st.text_input("Timeout (default: 120)", key="document_store_timeout", value="120")
-#         st.markdown(f"**timeout:** `120`")
-#         st.markdown(f"**recreate_index:** `{st.session_state[ConfigState.DOC_STORE_KEY].get('recreate_index')}`")
-
-#         document_store_dim = st.text_input("Embedding_model_dim", value="3072")
-
-#         if st.button("💾 save", key="save_document_store"):
-#             errors = []
-#             if not document_store_dim:
-#                 errors.append("Embedding model dim is required.")
-#             else:
-#                 try:
-#                     int(document_store_dim)
-#                 except ValueError:
-#                     errors.append("Embedding model dim must be an integer.")
-
-#             if not document_store_timeout:
-#                 errors.append("Timeout is required.")
-#             else:
-#                 try:
-#                     int(document_store_timeout)
-#                 except ValueError:
-#                     errors.append("Timeout must be an integer.")
-
-#             if errors:
-#                 for error in errors:
-#                     st.error(error)
-#             else:
-#                 st.session_state.document_store = {
-#                     "type": "document_store",
-#                     "provider": st.session_state[ConfigState.DOC_STORE_KEY].get("provider"),
-#                     "location": st.session_state[ConfigState.DOC_STORE_KEY].get("location"),
-#                     "embedding_model_dim": document_store_dim,
-#                     "timeout": document_store_timeout,
-#                     "recreate_index": st.session_state[ConfigState.DOC_STORE_KEY].get("recreate_index")
-#                 }
-#                 st.success("Updated document store configuration")
```
41-42: Move import statement to the top of the file

Importing modules inside functions is generally discouraged as it makes dependencies harder to track, can impact performance, and may cause unexpected behavior.

```diff
 import yaml
 import os
+from constants import get_env_path, get_config_path, get_config_done_path
@@
     with st.expander("API Key", expanded=False):
-        from constants import get_env_path
         CONFIG_ENV_PATH = get_env_path()
```

Similarly, move the import in `render_generate_button()` (line 398) to the top of the file.

398-399: Move import statement to the top of the file

Similar to the issue in `render_apikey()`, this function imports modules locally instead of at the top of the file.

```diff
     if st.button("Save configuration", key="generate_config_yaml"):
-        from constants import get_config_path, get_config_done_path
         CONFIG_OUT_PATH = get_config_path()
         CONFIG_DONE_PATH = get_config_done_path()
```
10-14: Add type hints to function definitions

Adding type hints to function parameters and return values would improve code readability and enable better static type checking. For example:

```diff
-def render_import_yaml():
+def render_import_yaml() -> None:
     """
     Render the configuration YAML import section.
     Supports loading example YAMLs from GitHub or uploading a custom YAML.
     """
```

Apply similar type hints to other function definitions in the file.

Also applies to: 36-39, 144-147, 210-215, 343-347, 383-387, 392-396

483-493: Convert timeout to int during validation

The current validation only checks if `timeout` can be converted to an integer but doesn't actually convert it. This could lead to type inconsistencies later.

```diff
 if not form.get("timeout"):
     errors.append("Timeout is required.")
 else:
     try:
-        int(form["timeout"])
+        form["timeout"] = int(form["timeout"])
     except ValueError:
         errors.append("Timeout must be an integer.")
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
wren-ai-service/tools/providers-setup/config_loader.py
(1 hunks)wren-ai-service/tools/providers-setup/constants.py
(1 hunks)wren-ai-service/tools/providers-setup/ui_components.py
(1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
wren-ai-service/tools/providers-setup/ui_components.py (5)
- wren-ai-service/tools/providers-setup/session_state.py (1): ConfigState (5-173)
- wren-ai-service/src/config.py (1): config_loader (86-96)
- wren-ai-service/tools/providers-setup/config_loader.py (2): load_selected_example_yaml (23-34), apply_config_blocks (108-120)
- wren-ai-service/tools/providers-setup/dry_run_test.py (2): llm_completion_test (6-31), llm_embedding_test (34-63)
- wren-ai-service/tools/providers-setup/constants.py (3): get_env_path (80-87), get_config_path (69-77), get_config_done_path (59-67)
🪛 Ruff (0.8.2)
wren-ai-service/tools/providers-setup/ui_components.py
- 220-220: f-string without any placeholders; remove extraneous f prefix (F541)

wren-ai-service/tools/providers-setup/constants.py
- 2-2: os imported but unused; remove unused import: os (F401)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: pytest
local_path = Path.home() / ".wrenAI" / ".env"
🛠️ Refactor suggestion
Inconsistent home directory naming can cause conflicts
The code uses two different directory names (.wrenai for config files vs .wrenAI for .env), which can cause issues on case-sensitive file systems and lead to confusion.

Standardize on a single directory name:

 def get_env_path():
     docker_path = volume_app_data / ".env"
-    local_path = Path.home() / ".wrenAI" / ".env"
+    local_path = Path.home() / ".wrenai" / ".env"
     if docker_path.exists():
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
def get_env_path():
    docker_path = volume_app_data / ".env"
    local_path = Path.home() / ".wrenai" / ".env"
    if docker_path.exists():
        return docker_path
    return local_path
🤖 Prompt for AI Agents (early access)
In wren-ai-service/tools/providers-setup/constants.py at lines 82 to 83, the
directory name used for the home path is ".wrenAI" which is inconsistent with
the ".wrenai" used elsewhere for config files. To fix this, standardize the
directory name by changing ".wrenAI" to ".wrenai" so that the home directory
path is consistent across the codebase and avoids conflicts on case-sensitive
file systems.
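One way to rule this class of bug out entirely is to derive every path from a single shared constant instead of spelling the directory name in each helper. A minimal sketch, with illustrative names (`WREN_HOME` and the default `volume_app_data` path are assumptions, not the PR's API):

```python
from pathlib import Path

# Single canonical directory name; every helper reuses it.
WREN_HOME = Path.home() / ".wrenai"


def get_env_path(volume_app_data: Path = Path("/app/data")) -> Path:
    """Prefer the Docker-mounted .env; fall back to the local one."""
    docker_path = volume_app_data / ".env"
    local_path = WREN_HOME / ".env"  # reuses the shared constant
    return docker_path if docker_path.exists() else local_path
```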
for model in st.session_state[ConfigState.LLM_MODELS_KEY]:
    if model.get("alias"):
        pipeline_llm_options.append("litellm_llm." + model["alias"])
    elif model.get("model"):
        pipeline_llm_options.append("litellm_llm." + model["model"])

# enumerate all pipelines, record original index
for original_idx, pipe in enumerate(all_pipelines):
    if not pipe.get("llm"):
        continue

    pipe_name = pipe.get("name", f"Unnamed Pipeline {original_idx}")
    with st.expander(f"🔧 Pipeline: {pipe_name}", expanded=False):
        for key, value in pipe.items():
            if key == "llm":
                selected_llm = st.selectbox(
                    "LLM Model",
                    options=pipeline_llm_options,
                    index=pipeline_llm_options.index(value) if value in pipeline_llm_options else 0,
                    key=f"llm_{original_idx}"
                )
            else:
                st.markdown(f"**{key}:** `{value}`")

        if st.button("💾 Save this llm", key=f"save_{pipe_name}"):
            # ✅ use original index to update llm
            st.session_state[ConfigState.PIPELINE_KEY]["pipes"][original_idx]["llm"] = selected_llm
            st.success(f"✅ Updated pipeline `{pipe_name}` LLM to `{selected_llm}`")
🛠️ Refactor suggestion
Add error handling for empty pipeline_llm_options list
The code assumes that pipeline_llm_options will not be empty when populating the select box. If no LLM models are defined, this could cause an error.

 # set all LLM models options
 for model in st.session_state[ConfigState.LLM_MODELS_KEY]:
     if model.get("alias"):
         pipeline_llm_options.append("litellm_llm." + model["alias"])
     elif model.get("model"):
         pipeline_llm_options.append("litellm_llm." + model["model"])
+# Ensure we have at least one option
+if not pipeline_llm_options:
+    pipeline_llm_options.append("No LLM models available")
 # enumerate all pipelines, record original index
 for original_idx, pipe in enumerate(all_pipelines):
     if not pipe.get("llm"):
         continue
     pipe_name = pipe.get("name", f"Unnamed Pipeline {original_idx}")
     with st.expander(f"🔧 Pipeline: {pipe_name}", expanded=False):
         for key, value in pipe.items():
             if key == "llm":
                 selected_llm = st.selectbox(
                     "LLM Model",
                     options=pipeline_llm_options,
                     index=pipeline_llm_options.index(value) if value in pipeline_llm_options else 0,
                     key=f"llm_{original_idx}"
                 )
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
for model in st.session_state[ConfigState.LLM_MODELS_KEY]:
    if model.get("alias"):
        pipeline_llm_options.append("litellm_llm." + model["alias"])
    elif model.get("model"):
        pipeline_llm_options.append("litellm_llm." + model["model"])

# Ensure we have at least one option
if not pipeline_llm_options:
    pipeline_llm_options.append("No LLM models available")

# enumerate all pipelines, record original index
for original_idx, pipe in enumerate(all_pipelines):
    if not pipe.get("llm"):
        continue

    pipe_name = pipe.get("name", f"Unnamed Pipeline {original_idx}")
    with st.expander(f"🔧 Pipeline: {pipe_name}", expanded=False):
        for key, value in pipe.items():
            if key == "llm":
                selected_llm = st.selectbox(
                    "LLM Model",
                    options=pipeline_llm_options,
                    index=(
                        pipeline_llm_options.index(value)
                        if value in pipeline_llm_options
                        else 0
                    ),
                    key=f"llm_{original_idx}"
                )
            else:
                st.markdown(f"**{key}:** `{value}`")

        if st.button("💾 Save this llm", key=f"save_{pipe_name}"):
            # ✅ use original index to update llm
            st.session_state[ConfigState.PIPELINE_KEY]["pipes"][original_idx]["llm"] = selected_llm
            st.success(f"✅ Updated pipeline `{pipe_name}` LLM to `{selected_llm}`")
🤖 Prompt for AI Agents (early access)
In wren-ai-service/tools/providers-setup/ui_components.py around lines 352 to
379, the code assumes pipeline_llm_options is never empty when setting the
selectbox options, which can cause an error if no LLM models are defined. Add a
check before the selectbox to verify pipeline_llm_options is not empty; if it is
empty, either disable the selectbox or display a message indicating no LLM
models are available to select. This prevents runtime errors and improves user
feedback.
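The guard logic is easiest to verify once it is factored out of the Streamlit layer. A minimal sketch of the fallback and safe default-index selection (function names here are illustrative, not the PR's):

```python
def build_llm_options(models: list[dict]) -> list[str]:
    """Build selectbox options; never return an empty list."""
    options = []
    for model in models:
        if model.get("alias"):
            options.append("litellm_llm." + model["alias"])
        elif model.get("model"):
            options.append("litellm_llm." + model["model"])
    if not options:  # guard: selectbox must never receive an empty list
        options.append("No LLM models available")
    return options


def default_index(options: list[str], current: str) -> int:
    """Index of the current value, or 0 when it is not among the options."""
    return options.index(current) if current in options else 0
```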
…prove error handling
- Implemented validation for required configuration blocks (LLM, Embedder, Document Store, Pipeline) to ensure they are present and not empty.
- Enhanced error handling by replacing print statements with Streamlit error messages for better user feedback during configuration loading failures.
- Improved user experience by warning users about missing configuration blocks and indicating that default values will be used.
…ontainer launch logging
- Updated the config extraction logic to return the first block of each type, improving clarity and efficiency.
- Modified the launch command to dynamically find and log the available port for the Streamlit UI, enhancing user feedback during startup.
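Finding an available port typically means binding to port 0 and reading back the port the OS assigned. The launcher itself is written in Go, so this Python sketch only illustrates the idea:

```python
import socket


def find_free_port() -> int:
    """Ask the OS for an unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))  # port 0 => OS picks an unused port
        return s.getsockname()[1]
```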
Create a UI to help users set up the model.
Summary by CodeRabbit
New Features
Chores