Conversation

@ricky-chaoju
Contributor

Summary

  • Replaced LLM Agent-based auto-tuning with Bayesian Optimization (Optuna TPE)
  • Added MCP Agent Chat for natural language interaction with the platform
  • Extended MCP Server with new tools (benchmark, web search)

Changes

Auto-Tuning

  • New BayesianTuningService using Optuna TPE sampler for efficient hyperparameter search (sketched below)
  • Supports vLLM and SGLang engine comparison
  • Real-time log display during the tuning process
  • Fixed SGLang container startup command
  • Simplified UI (removed manual parameter inputs)

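For reference, a minimal sketch of the kind of TPE-driven search loop a service like BayesianTuningService runs. The parameter ranges and the deploy_and_benchmark helper are illustrative assumptions, not the service's actual API:

import optuna

def objective(trial: optuna.Trial) -> float:
    # Sample a candidate engine configuration (ranges here are illustrative).
    params = {
        "max_num_seqs": trial.suggest_int("max_num_seqs", 64, 512),
        "gpu_memory_utilization": trial.suggest_float("gpu_memory_utilization", 0.70, 0.95),
    }
    # Hypothetical helper: deploy with the sampled params, run a benchmark,
    # and return the throughput metric to maximize.
    return deploy_and_benchmark(params)

study = optuna.create_study(direction="maximize", sampler=optuna.samplers.TPESampler())
study.optimize(objective, n_trials=20)
print("Best params:", study.best_params)
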
Agent Chat & MCP

  • Add Agent Chat panel with streaming support
  • Add MCP client for the backend
  • Add benchmark and web search tools
  • Refactor chat panel components
@ricky-chaoju ricky-chaoju requested a review from Copilot January 27, 2026 14:16
@ricky-chaoju ricky-chaoju merged commit 1ec9cb5 into main Jan 27, 2026
8 checks passed

Copilot AI left a comment


Pull request overview

Adds Bayesian Optimization–based auto-tuning and an MCP-powered “agent chat” experience, plus new MCP server tools (web search, benchmarking) to support more autonomous platform interaction.

Changes:

  • Introduces Optuna TPE-based Bayesian tuning service and wires auto-tuning runner to it.
  • Adds MCP agent chat (SSE streaming; sketched below) with conversation persistence and a new chat panel “Agent/Chat” mode.
  • Extends MCP server with web_search, run_benchmark, and tuning job tools; updates deploy tool args (name, gpu_indexes, backend).
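
A minimal sketch of the SSE streaming shape this involves, assuming FastAPI; the route path, event payloads, and the run_agent helper are illustrative, not the PR's exact API:

import json

from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()

@app.get("/agent/chat/stream")
async def chat_stream(message: str):
    async def event_gen():
        # Stream each agent reasoning/tool event as an SSE "data:" frame.
        async for event in run_agent(message):  # hypothetical agent runner
            yield f"data: {json.dumps(event)}\n\n"
        yield "data: [DONE]\n\n"
    return StreamingResponse(event_gen(), media_type="text/event-stream")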

Reviewed changes

Copilot reviewed 29 out of 29 changed files in this pull request and generated 17 comments.

File Description
mcp-server/src/tools/webSearch.ts Adds SearxNG-based web search tool and an LLM-config–focused helper search.
mcp-server/src/tools/benchmark.ts Adds benchmark + tuning job helper functions for MCP tools.
mcp-server/src/index.ts Registers new MCP tools and updates deploy tool schema/handler to use gpu_indexes + backend.
mcp-server/src/formatters.ts Removes emoji-based status indicators in formatted outputs.
mcp-server/src/client.ts Extends client with deploy args changes + auto-tuning/benchmark endpoints.
frontend/src/contexts/ChatPanelContext.tsx Adds global chat panel context to open panel and send agent messages from anywhere.
frontend/src/components/chat/ChatInput.tsx Tweaks placeholder/disabled behavior during streaming.
frontend/src/components/chat-panel/useAgentChat.ts Implements agent chat SSE client with step/tool event tracking and conversation loading.
frontend/src/components/chat-panel/index.ts Exports agent chat view/hook and related types.
frontend/src/components/chat-panel/TuningJobView.tsx Enhances tuning job view with progress, benchmark result display, and better scrolling behavior.
frontend/src/components/chat-panel/ChatPanel.tsx Adds “Agent/Chat” mode switch; integrates agent chat view and context registration.
frontend/src/App.tsx Wraps app in ChatPanelProvider and removes tuning-job storage event handling.
backend/requirements.txt Adds Optuna dependency.
backend/app/services/tuning_agent.py Updates agent rules + progress/status updates; tracks benchmark results; tightens deployment stop behavior.
backend/app/services/mcp/types.py Adds MCP client/JSON-RPC and tool result type definitions.
backend/app/services/mcp/client.py Implements MCP stdio JSON-RPC client + a simple client pool.
backend/app/services/mcp/agent.py Implements MCP-driven agent that streams reasoning/tool execution events.
backend/app/services/mcp/__init__.py Exposes MCP client/agent public API.
backend/app/services/deployer.py Fixes SGLang startup command by explicitly running sglang.launch_server (sketched below the table).
backend/app/services/bayesian_tuner.py Adds Bayesian auto-tuning implementation (Optuna TPE) with deployment+benchmark loop and logging.
backend/app/schemas/tuning.py Extends tuning schemas with tuning_config, richer progress fields, and logs.
backend/app/models/tuning.py Extends TuningJob model with tuning_config, conversation_id, and logs.
backend/app/models/conversation.py Extends Conversation/Message models for agent conversations and tool-call metadata.
backend/app/database.py Adds SQLite migrations for new conversation/message/tuning columns (partial).
backend/app/api/auto_tuning.py Adds log parsing to responses; switches auto-tuning runner to Bayesian tuning.
backend/app/api/agent.py Adds SSE agent chat API + conversation CRUD endpoints.
backend/app/api/__init__.py Registers /agent routes.

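As context for the deployer.py fix listed above: the container now runs the SGLang server module explicitly. A hypothetical sketch of that style of startup command (the exact flags used in the PR may differ):

# Hypothetical container command that invokes sglang.launch_server explicitly
# (model_path and port stand in for the deployment's actual values).
command = [
    "python", "-m", "sglang.launch_server",
    "--model-path", model_path,
    "--host", "0.0.0.0",
    "--port", str(port),
]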

raise HTTPException(status_code=400, detail="Deployment is not running")

worker = deployment.worker
llm_base_url = f"http://{worker.host}:{deployment.port}/v1"

Copilot AI Jan 27, 2026


worker.host is not a field on the Worker model/schema (workers expose address/effective_address). This will raise an AttributeError when using provider="system". Build the LLM base URL from worker.effective_address (or worker.address.split(':')[0]) instead.

Suggested change
- llm_base_url = f"http://{worker.host}:{deployment.port}/v1"
+ effective_address = getattr(worker, "effective_address", None)
+ if effective_address:
+     llm_base_url = f"{effective_address.rstrip('/')}/v1"
+ else:
+     host = (worker.address or "").split(":")[0]
+     llm_base_url = f"http://{host}:{deployment.port}/v1"

Comment on lines +593 to +601
# Count total
count_query = (
    select(Conversation)
    .where(Conversation.user_id == current_user.id)
    .where(Conversation.conversation_type == conversation_type)
)
count_result = await db.execute(count_query)
total = len(count_result.all())


Copilot AI Jan 27, 2026


Total conversation count is computed by fetching all matching rows and doing len(...), which is inefficient for large histories. Use a SQL COUNT(*) (e.g., select(func.count()).select_from(...)) to compute total without loading all rows.
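
A sketch of that COUNT(*) approach, assuming the same Conversation model and filters shown above:

from sqlalchemy import func, select

count_query = (
    select(func.count())
    .select_from(Conversation)
    .where(Conversation.user_id == current_user.id)
    .where(Conversation.conversation_type == conversation_type)
)
# scalar_one() returns the single COUNT value without loading any rows.
total = (await db.execute(count_query)).scalar_one()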

Comment on lines +115 to 116



Copilot AI Jan 27, 2026


The TuningJob model now includes a logs JSON column and the tuner writes to self.job.logs, but the SQLite migration list here never adds a logs column to tuning_jobs. On upgraded databases this will cause runtime SQL errors when tuning tries to persist logs. Add a migration step to ALTER TABLE tuning_jobs ADD COLUMN logs JSON (and any other new tuning_job columns that must exist).

Suggested change
+ # Migration: Add logs to tuning_jobs (for tuning job log persistence)
+ if not await column_exists("tuning_jobs", "logs"):
+     logger.info("Adding 'logs' column to tuning_jobs table...")
+     await conn.execute(text("ALTER TABLE tuning_jobs ADD COLUMN logs JSON"))
+     logger.info("'logs' column added!")

let output = `## Benchmark Results\n\n`;
output += `**Deployment:** ${deployment.name}\n`;
output += `**Model:** ${deployment.model?.name || "Unknown"}\n`;
output += `**Engine:** ${deployment.engine || "vllm"}\n\n`;

Copilot AI Jan 27, 2026


Deployment objects from the backend use the backend field (see DeploymentResponse.backend), but this formatter references deployment.engine. As a result, the displayed engine will be incorrect/always fall back. Use deployment.backend when printing the engine/backend.

Suggested change
- output += `**Engine:** ${deployment.engine || "vllm"}\n\n`;
+ output += `**Engine:** ${deployment.backend || "vllm"}\n\n`;

}

// Send the message (async, but we don't wait)
fn(message, model);

Copilot AI Jan 27, 2026


sendAgentMessage invokes the async fn(message, model) without awaiting or handling rejections. If the send fails, this can surface as an unhandled promise rejection in the browser. Consider void fn(...).catch(...) (and optionally surface an error toast) to ensure failures are handled.

Suggested change
- fn(message, model);
+ // Attach a catch handler to avoid unhandled promise rejections.
+ void fn(message, model).catch((error) => {
+   console.error("ChatPanel: Failed to send agent message", error);
+ });

    if first_token_time is None:
        first_token_time = time.perf_counter()
    token_count += 1
except json.JSONDecodeError:

Copilot AI Jan 27, 2026


'except' clause does nothing but pass and there is no explanatory comment.
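
One way to make the intent explicit (a sketch; the surrounding streaming loop is assumed from the snippet above):

except json.JSONDecodeError:
    # Ignore keep-alive or partial SSE lines that are not valid JSON.
    continue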

"tokens": token_count,
"total_time": end - start,
}
except Exception:

Copilot AI Jan 27, 2026


'except' clause does nothing but pass and there is no explanatory comment.
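
A sketch of one explicit alternative (assumes a module-level logger is available):

except Exception:
    # The benchmark request failed; log it so failed runs are diagnosable.
    logger.exception("Benchmark request failed")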

self._read_task.cancel()
try:
    await self._read_task
except asyncio.CancelledError:

Copilot AI Jan 27, 2026


'except' clause does nothing but pass and there is no explanatory comment.

Suggested change
- except asyncio.CancelledError:
+ except asyncio.CancelledError:
+     # Expected when cancelling the read task during client cleanup; safe to ignore.

    logger.error(f"Error processing response: {e}")

except asyncio.CancelledError:
    pass

Copilot AI Jan 27, 2026


'except' clause does nothing but pass and there is no explanatory comment.

Suggested change
- pass
+ # Expected during normal shutdown when the response reader task is cancelled.
+ logger.debug("Response reader task cancelled")
+ return

Comment on lines +876 to +877
except Exception:
    pass

Copilot AI Jan 27, 2026


'except' clause does nothing but pass and there is no explanatory comment.

Suggested change
- except Exception:
-     pass
+ except Exception as inner_exc:
+     logger.warning(
+         f"Failed to mark deployment {deployment_id} as stopped after stop failure: {inner_exc}"
+     )
