[Feature] Bayesian Auto-Tuning & MCP Agent Chat #6
- Add BayesianTuningService using Optuna TPE sampler
- Fix SGLang container startup command
- Add real-time logs to tuning jobs
- Simplify AutoTuning UI
- Add Agent Chat panel with streaming support
- Add MCP client for backend
- Add benchmark and web search tools
- Refactor chat panel components
Pull request overview
Adds Bayesian Optimization–based auto-tuning and an MCP-powered “agent chat” experience, plus new MCP server tools (web search, benchmarking) to support more autonomous platform interaction.
Changes:
- Introduces Optuna TPE-based Bayesian tuning service and wires auto-tuning runner to it.
- Adds MCP agent chat (SSE streaming) with conversation persistence and a new chat panel “Agent/Chat” mode.
- Extends MCP server with `web_search`, `run_benchmark`, and tuning job tools; updates deploy tool args (`name`, `gpu_indexes`, `backend`).
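The tuning service's core loop is a suggest → deploy → benchmark → report cycle. The sketch below is a minimal stand-in that uses plain random sampling instead of Optuna's TPE sampler, with a hypothetical search space and a stubbed benchmark function; the real service drives actual deployments and records trial results.

```python
import random

# Hypothetical search space for serving parameters (illustrative names only).
SEARCH_SPACE = {
    "max_num_seqs": [64, 128, 256],
    "gpu_memory_utilization": (0.7, 0.95),
}

def suggest_params(rng: random.Random) -> dict:
    """Draw one candidate configuration (random stand-in for TPE)."""
    return {
        "max_num_seqs": rng.choice(SEARCH_SPACE["max_num_seqs"]),
        "gpu_memory_utilization": round(
            rng.uniform(*SEARCH_SPACE["gpu_memory_utilization"]), 2
        ),
    }

def benchmark(params: dict) -> float:
    """Stand-in objective (higher is better); the real loop deploys and benchmarks."""
    return params["max_num_seqs"] * params["gpu_memory_utilization"]

def tune(n_trials: int = 10, seed: int = 42) -> tuple[dict, float]:
    """Run the suggest/evaluate loop and keep the best configuration seen."""
    rng = random.Random(seed)
    best_params, best_score = {}, float("-inf")
    for _ in range(n_trials):
        params = suggest_params(rng)
        score = benchmark(params)  # deploy + benchmark in the real service
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

best, score = tune()
```

With Optuna, `suggest_params` is replaced by `trial.suggest_*` calls and the loop by `study.optimize(objective, n_trials=...)`, letting the TPE sampler bias new suggestions toward promising regions.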
Reviewed changes
Copilot reviewed 29 out of 29 changed files in this pull request and generated 17 comments.
Show a summary per file
| File | Description |
|---|---|
| mcp-server/src/tools/webSearch.ts | Adds SearxNG-based web search tool and an LLM-config–focused helper search. |
| mcp-server/src/tools/benchmark.ts | Adds benchmark + tuning job helper functions for MCP tools. |
| mcp-server/src/index.ts | Registers new MCP tools and updates deploy tool schema/handler to use gpu_indexes + backend. |
| mcp-server/src/formatters.ts | Removes emoji-based status indicators in formatted outputs. |
| mcp-server/src/client.ts | Extends client with deploy args changes + auto-tuning/benchmark endpoints. |
| frontend/src/contexts/ChatPanelContext.tsx | Adds global chat panel context to open panel and send agent messages from anywhere. |
| frontend/src/components/chat/ChatInput.tsx | Tweaks placeholder/disabled behavior during streaming. |
| frontend/src/components/chat-panel/useAgentChat.ts | Implements agent chat SSE client with step/tool event tracking and conversation loading. |
| frontend/src/components/chat-panel/index.ts | Exports agent chat view/hook and related types. |
| frontend/src/components/chat-panel/TuningJobView.tsx | Enhances tuning job view with progress, benchmark result display, and better scrolling behavior. |
| frontend/src/components/chat-panel/ChatPanel.tsx | Adds “Agent/Chat” mode switch; integrates agent chat view and context registration. |
| frontend/src/App.tsx | Wraps app in ChatPanelProvider and removes tuning-job storage event handling. |
| backend/requirements.txt | Adds Optuna dependency. |
| backend/app/services/tuning_agent.py | Updates agent rules + progress/status updates; tracks benchmark results; tightens deployment stop behavior. |
| backend/app/services/mcp/types.py | Adds MCP client/JSON-RPC and tool result type definitions. |
| backend/app/services/mcp/client.py | Implements MCP stdio JSON-RPC client + a simple client pool. |
| backend/app/services/mcp/agent.py | Implements MCP-driven agent that streams reasoning/tool execution events. |
| backend/app/services/mcp/__init__.py | Exposes MCP client/agent public API. |
| backend/app/services/deployer.py | Fixes SGLang startup command by explicitly running sglang.launch_server. |
| backend/app/services/bayesian_tuner.py | Adds Bayesian auto-tuning implementation (Optuna TPE) with deployment+benchmark loop and logging. |
| backend/app/schemas/tuning.py | Extends tuning schemas with tuning_config, richer progress fields, and logs. |
| backend/app/models/tuning.py | Extends TuningJob model with tuning_config, conversation_id, and logs. |
| backend/app/models/conversation.py | Extends Conversation/Message models for agent conversations and tool-call metadata. |
| backend/app/database.py | Adds SQLite migrations for new conversation/message/tuning columns (partial). |
| backend/app/api/auto_tuning.py | Adds log parsing to responses; switches auto-tuning runner to Bayesian tuning. |
| backend/app/api/agent.py | Adds SSE agent chat API + conversation CRUD endpoints. |
| backend/app/api/__init__.py | Registers /agent routes. |
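The MCP stdio client listed above speaks JSON-RPC 2.0 with the MCP server over the child process's stdin/stdout. A minimal sketch of building requests and matching responses, assuming newline-delimited JSON framing (the exact framing used by this PR's client is not shown here):

```python
import json
from itertools import count

_ids = count(1)  # monotonically increasing request ids

def make_request(method: str, params: dict) -> tuple[int, bytes]:
    """Build a JSON-RPC 2.0 request, newline-terminated for a stdio transport."""
    req_id = next(_ids)
    payload = {"jsonrpc": "2.0", "id": req_id, "method": method, "params": params}
    return req_id, (json.dumps(payload) + "\n").encode()

def parse_response(line: bytes) -> tuple[int, dict]:
    """Decode one response line; surface JSON-RPC errors as exceptions."""
    msg = json.loads(line)
    if "error" in msg:
        raise RuntimeError(f"JSON-RPC error: {msg['error']}")
    return msg["id"], msg.get("result", {})

# Example: a tool-call request (hypothetical tool name/arguments).
req_id, wire = make_request(
    "tools/call", {"name": "web_search", "arguments": {"query": "vllm flags"}}
)
```

A client pool like the one in `backend/app/services/mcp/client.py` would wrap this with per-process read loops that route each response back to the awaiting request by `id`.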
```python
raise HTTPException(status_code=400, detail="Deployment is not running")

worker = deployment.worker
llm_base_url = f"http://{worker.host}:{deployment.port}/v1"
```
Copilot AI · Jan 27, 2026
worker.host is not a field on the Worker model/schema (workers expose address/effective_address). This will raise an AttributeError when using provider="system". Build the LLM base URL from worker.effective_address (or worker.address.split(':')[0]) instead.
Suggested change:

```diff
- llm_base_url = f"http://{worker.host}:{deployment.port}/v1"
+ effective_address = getattr(worker, "effective_address", None)
+ if effective_address:
+     llm_base_url = f"{effective_address.rstrip('/')}/v1"
+ else:
+     host = (worker.address or "").split(":")[0]
+     llm_base_url = f"http://{host}:{deployment.port}/v1"
```
```python
# Count total
count_query = (
    select(Conversation)
    .where(Conversation.user_id == current_user.id)
    .where(Conversation.conversation_type == conversation_type)
)
count_result = await db.execute(count_query)
total = len(count_result.all())
```
Copilot AI · Jan 27, 2026
Total conversation count is computed by fetching all matching rows and doing len(...), which is inefficient for large histories. Use a SQL COUNT(*) (e.g., select(func.count()).select_from(...)) to compute total without loading all rows.
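The difference matters because `len(result.all())` transfers every matching row to the application just to count them. A minimal sqlite3 illustration of the same principle (hypothetical table; the actual code would use SQLAlchemy):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE conversations (id INTEGER PRIMARY KEY, user_id INTEGER)")
conn.executemany(
    "INSERT INTO conversations (user_id) VALUES (?)",
    [(1,)] * 1000 + [(2,)] * 5,
)

# Inefficient: ships all 1000 matching rows to Python just to count them.
total_slow = len(
    conn.execute("SELECT id FROM conversations WHERE user_id = ?", (1,)).fetchall()
)

# Efficient: the database returns a single scalar.
total_fast = conn.execute(
    "SELECT COUNT(*) FROM conversations WHERE user_id = ?", (1,)
).fetchone()[0]
```

With SQLAlchemy 2.x, the equivalent is roughly `select(func.count()).select_from(Conversation).where(...)` followed by `result.scalar_one()`.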
Copilot AI · Jan 27, 2026
The TuningJob model now includes a logs JSON column and the tuner writes to self.job.logs, but the SQLite migration list here never adds a logs column to tuning_jobs. On upgraded databases this will cause runtime SQL errors when tuning tries to persist logs. Add a migration step to ALTER TABLE tuning_jobs ADD COLUMN logs JSON (and any other new tuning_job columns that must exist).
Suggested change:

```python
# Migration: Add logs to tuning_jobs (for tuning job log persistence)
if not await column_exists("tuning_jobs", "logs"):
    logger.info("Adding 'logs' column to tuning_jobs table...")
    await conn.execute(text("ALTER TABLE tuning_jobs ADD COLUMN logs JSON"))
    logger.info("'logs' column added!")
```
```typescript
let output = `## Benchmark Results\n\n`;
output += `**Deployment:** ${deployment.name}\n`;
output += `**Model:** ${deployment.model?.name || "Unknown"}\n`;
output += `**Engine:** ${deployment.engine || "vllm"}\n\n`;
```
Copilot AI · Jan 27, 2026
Deployment objects from the backend use the backend field (see DeploymentResponse.backend), but this formatter references deployment.engine. As a result, the displayed engine will be incorrect/always fall back. Use deployment.backend when printing the engine/backend.
Suggested change:

```diff
- output += `**Engine:** ${deployment.engine || "vllm"}\n\n`;
+ output += `**Engine:** ${deployment.backend || "vllm"}\n\n`;
```
```typescript
}

// Send the message (async, but we don't wait)
fn(message, model);
```
Copilot AI · Jan 27, 2026
sendAgentMessage invokes the async fn(message, model) without awaiting or handling rejections. If the send fails, this can surface as an unhandled promise rejection in the browser. Consider void fn(...).catch(...) (and optionally surface an error toast) to ensure failures are handled.
Suggested change:

```diff
- fn(message, model);
+ // Attach a catch handler to avoid unhandled promise rejections.
+ void fn(message, model).catch((error) => {
+   console.error("ChatPanel: Failed to send agent message", error);
+ });
```
```python
if first_token_time is None:
    first_token_time = time.perf_counter()
token_count += 1
except json.JSONDecodeError:
```
Copilot AI · Jan 27, 2026
'except' clause does nothing but pass and there is no explanatory comment.
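For context, a common way to make such a skip explicit is to comment why malformed chunks are expected and `continue` deliberately. A small illustrative sketch (the line format is a hypothetical stand-in modeled on SSE `data:` chunks, not the PR's exact parser):

```python
import json

def count_tokens(lines: list[str]) -> int:
    """Count parseable data chunks, skipping keep-alives and malformed lines."""
    token_count = 0
    for line in lines:
        if not line.startswith("data: ") or line == "data: [DONE]":
            continue
        try:
            json.loads(line[len("data: "):])
        except json.JSONDecodeError:
            # Partial or non-JSON chunk (e.g. a keep-alive); skip it deliberately.
            continue
        token_count += 1
    return token_count
```

The behavior is identical to a bare `pass`, but the comment documents that the swallowed exception is expected rather than an oversight.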
```python
    "tokens": token_count,
    "total_time": end - start,
}
except Exception:
```
Copilot AI · Jan 27, 2026
'except' clause does nothing but pass and there is no explanatory comment.
```python
self._read_task.cancel()
try:
    await self._read_task
except asyncio.CancelledError:
```
Copilot AI · Jan 27, 2026
'except' clause does nothing but pass and there is no explanatory comment.
Suggested change:

```diff
  except asyncio.CancelledError:
+     # Expected when cancelling the read task during client cleanup; safe to ignore.
```
```python
logger.error(f"Error processing response: {e}")

except asyncio.CancelledError:
    pass
```
Copilot AI · Jan 27, 2026
'except' clause does nothing but pass and there is no explanatory comment.
Suggested change:

```diff
- pass
+ # Expected during normal shutdown when the response reader task is cancelled.
+ logger.debug("Response reader task cancelled")
+ return
```
```python
except Exception:
    pass
```
Copilot AI · Jan 27, 2026
'except' clause does nothing but pass and there is no explanatory comment.
Suggested change:

```diff
- except Exception:
-     pass
+ except Exception as inner_exc:
+     logger.warning(
+         f"Failed to mark deployment {deployment_id} as stopped after stop failure: {inner_exc}"
+     )
```
Summary
Changes
Auto-Tuning
`BayesianTuningService` using Optuna TPE sampler for efficient hyperparameter search