Open
Description
Config
# Frontier model configuration
export DEFAULT_MODEL__PROVIDER=openrouter
export DEFAULT_MODEL__NAME=anthropic/claude-3.5-sonnet
export DEFAULT_MODEL__TIMEOUT=30
export DEFAULT_MODEL__MAX_RETRIES=3
# API key (required for OpenRouter)
export OPENROUTER_API_KEY=sk-or-...
export TEF_API_KEY=sk-or-...
The GitHub MCP server has 26 tools, and the evaluation times out even with --timeout 120. The default is 60 seconds.
mtef tool-quality --server-urls http://localhost:8080/github/mcp --model-provider openrouter --model-name anthropic/claude-3.5-sonnet --url https://localhost:8000 --insecure --verbose --timeout 120
ℹ Using mcp-tef at https://localhost:8000
✗ Request timed out
The LLM evaluation may take longer for servers with many tools.
Try increasing the timeout with --timeout (e.g., --timeout 120)
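As a stopgap, the failing run could be retried with progressively larger timeouts to find a value that works for the 26-tool server. The following is only a sketch: it reuses the exact flags from the command above, while the helper names, the doubling schedule, and the assumption that a successful run exits with status 0 and does not print "Request timed out" are mine, not documented mtef behavior.

```python
import subprocess

# Flags copied from the failing command above; only --timeout varies.
BASE_CMD = [
    "mtef", "tool-quality",
    "--server-urls", "http://localhost:8080/github/mcp",
    "--model-provider", "openrouter",
    "--model-name", "anthropic/claude-3.5-sonnet",
    "--url", "https://localhost:8000",
    "--insecure", "--verbose",
]


def build_command(timeout_seconds):
    """Return the mtef command line with the given --timeout value."""
    return BASE_CMD + ["--timeout", str(timeout_seconds)]


def timeout_schedule(start=120, factor=2, attempts=3):
    """Hypothetical schedule: multiply the timeout by `factor` per attempt."""
    return [start * factor**i for i in range(attempts)]


def run_with_escalating_timeout(schedule):
    """Try each timeout in turn; return (timeout, stdout) for the first success.

    Assumption: success means exit status 0 and no "Request timed out"
    message in the output. Returns (None, "") if every attempt fails.
    """
    for timeout_seconds in schedule:
        result = subprocess.run(
            build_command(timeout_seconds), capture_output=True, text=True
        )
        output = result.stdout + result.stderr
        if result.returncode == 0 and "Request timed out" not in output:
            return timeout_seconds, result.stdout
    return None, ""


# Usage (requires mtef on PATH and the servers above to be running):
#   used_timeout, report = run_with_escalating_timeout(timeout_schedule())
```

Knowing which timeout finally succeeds would also give a data point for choosing a better default.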
The mcp-optimizer server works fine:
(mcp-tef) nigels-MacBook-Pro:mcp-tef nigel$ mtef tool-quality --server-urls http://localhost:8080/mcp-optimizer/mcp --model-provider openrouter --model-name anthropic/claude-3.5-sonnet --url https://localhost:8000 --insecure --verbose --timeout 120
ℹ Using mcp-tef at https://localhost:8000
Tool Quality Evaluation Results
============================================================
Tool: find_tool
Description: "
Find and return tools from RUNNING servers that can help accomplish the user's request.
This searches only currently running MCP servers. If no relevant tools are found,
use search_registry() to discover tools from servers available in the registry.
Use this function when you need to:
- Discover what tools are available for a specific task
- Find the right tool(s) before attempting to solve a problem
- Check if required functionality exists in the current environment
Args:
tool_description: Description of the task or capability needed
(e.g., "web search", "analyze CSV file", "send an email")
tool_keywords: Space-separated keywords of the task or capability needed.
These will be used for BM25 text search on available tools.
(e.g. "list issues github", "SQL query postgres", "Grafana requests slow").
Returns:
dict: A dictionary containing:
- tools: List of available tools matching the query, including:
* Tool names and descriptions
* Server names (in the mcp_server_name field)
* Required parameters and schemas
* Usage examples where applicable
- token_metrics: Token efficiency metrics showing:
* baseline_tokens: Total tokens for all running server tools
* returned_tokens: Total tokens for returned/filtered tools
* tokens_saved: Number of tokens saved by filtering
* savings_percentage: Percentage of tokens saved (0-100)
Example:
1) User query: "Find good restaurants in San Jose, California"
This query requires web search. Call find_tool with tool_description="search the web".
2) User query: "Get details of an issue in stacklok/toolhive github repository"
This query requires fetching issue details from github. Call find_tool with
tool_description="Get issue details from GitHub".
"
Clarity: 9/10 - The description clearly explains what the tool does (find tools from running servers), when to use it (for discovering
available tools and capabilities), and how to interpret the output (details the returned dictionary structure with tools and token metrics).
Completeness: 10/10 - The description is highly complete, covering all aspects: purpose, usage scenarios, detailed parameter explanations, return
value structure, and multiple practical usage examples. The input schema clearly defines required parameters.
Conciseness: 8/10 - The description is mostly concise but could be slightly more compact. The examples section, while valuable, could be
condensed without losing important information.
Suggested: "Find and return tools from RUNNING servers that match your requested capabilities. Searches only currently running MCP servers (use
search_registry() for offline servers).
Args:
tool_description: Task description (e.g., "web search", "analyze CSV file")
tool_keywords: Space-separated search keywords (e.g., "github issues", "postgres query")
Returns:
- List of matching tools with names, descriptions, server names, parameters, and examples
- Token efficiency metrics showing baseline, returned, and saved token counts
Use this to discover available tools for specific tasks or verify capabilities before solving problems. Falls back to search_registry() if no
matches found."
Tool: call_tool
Description: "
Execute a specific tool with the provided parameters.
Use this function to:
- Run a tool after identifying it with find_tool()
- Execute operations that require specific MCP server functionality
- Perform actions that go beyond your built-in capabilities
Args:
server_name: The name of the MCP server that provides the tool
(obtain this from find_tool() results - it's the mcp_server_name field)
tool_name: The name of the tool to execute
(obtain this from find_tool() results - it's the tool's name field)
parameters: Dictionary of arguments required by the tool
(structure must match the tool's schema from find_tool())
Returns:
CallToolResult: The output from the tool execution, which may include:
- Success/failure status
- Result data or content
- Error messages if execution failed
Important: Always use find_tool() first to get the correct server_name and tool_name
and parameter schema before calling this function.
"
Clarity: 9/10 - The description clearly explains the tool's purpose, when to use it, and its role in executing other tools. The input
requirements and return values are well explained with clear structure
Completeness: 8/10 - The description covers key aspects including purpose, usage, parameters, and returns. It includes important usage notes about
using find_tool() first. However, it could benefit from a concrete usage example
Conciseness: 9/10 - The description is well-organized and concise, with no unnecessary information. Each section serves a clear purpose in
explaining the tool's functionality
Suggested: "Execute a specific tool with the provided parameters.
Use this function to run tools identified through find_tool() that require specific MCP server functionality or go beyond built-in capabilities.
Required Parameters:
- server_name: MCP server name (from find_tool() mcp_server_name field)
- tool_name: Name of tool to execute (from find_tool() name field)
- parameters: Tool arguments matching schema from find_tool()
Returns CallToolResult containing:
- Success/failure status
- Result data/content
- Error messages (if failed)
Example:
result = call_tool(
server_name="server1",
tool_name="process_data",
parameters={"input": "data.txt"}
)
Note: Always use find_tool() first to get correct server name, tool name, and parameter schema."
Tool: list_tools
Description: "
List all available tools across all MCP servers.
Use this function when you need to:
- See all tools available in the current environment
- Browse the complete catalog of available tools
- Get an overview of all capabilities without filtering
Returns:
ListToolsResult: All available tools, including:
- Tool names and descriptions
- Server names (in the mcp_server_name field)
- Required parameters and schemas
- Usage examples where applicable
"
Clarity: 9/10 - The description clearly states what the tool does (list all tools), when to use it (for browsing catalog, getting overview),
and what to expect in the output (tool names, descriptions, server names, parameters, schemas, examples).
Completeness: 8/10 - The description covers main functionality, use cases, and output structure well. The input schema is appropriately empty
since no parameters are needed. However, it could benefit from a simple example of the returned data structure.
Conciseness: 10/10 - The description is very concise while covering all essential information. It uses bullet points effectively and has no
unnecessary information.
Suggested: "List all available tools across all MCP servers.
Use this function when you need to:
- See all tools available in the current environment
- Browse the complete catalog of available tools
- Get an overview of all capabilities without filtering
Returns a ListToolsResult object containing:
- Tool names and descriptions
- Server names (mcp_server_name)
- Required parameters and schemas
- Usage examples where applicable
Example return structure:
{
"tools": [
{
"name": "tool_name",
"description": "Tool description",
"mcp_server_name": "server_name",
"parameters": {...},
"examples": [...]
},
...
]
}"
✓ Evaluated 3 tool(s)
(mcp-tef) nigels-MacBook-Pro:mcp-tef nigel$
But is a three-tool server enough of a test?
Should we raise the default timeout substantially?
Do we want to use a session ID so the evaluation can run asynchronously?
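To make the asynchronous option concrete, here is a minimal in-process sketch of the session ID pattern: submitting an evaluation returns an ID immediately, and the caller polls for the result later instead of holding one long request open. Everything here is hypothetical, including the `EvaluationSessions` class, its method names, the status dictionary shape, and the `fake_evaluate` stand-in. It is not how mcp-tef works today, and a real version would sit behind HTTP endpoints and persist sessions.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor, wait


class EvaluationSessions:
    """Hypothetical registry mapping session IDs to background evaluations."""

    def __init__(self, evaluate, max_workers=4):
        self._evaluate = evaluate  # callable that performs the slow evaluation
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._futures = {}

    def submit(self, server_url):
        """Start an evaluation in the background and return its session ID."""
        session_id = str(uuid.uuid4())
        self._futures[session_id] = self._pool.submit(self._evaluate, server_url)
        return session_id

    def status(self, session_id):
        """Report the current state of a session without blocking."""
        future = self._futures[session_id]
        if not future.done():
            return {"state": "running"}
        error = future.exception()
        if error is not None:
            return {"state": "failed", "error": str(error)}
        return {"state": "done", "result": future.result()}

    def wait_for(self, session_id, timeout=None):
        """Block for up to `timeout` seconds, then report the state."""
        wait([self._futures[session_id]], timeout=timeout)
        return self.status(session_id)


def fake_evaluate(server_url):
    """Stand-in for the real LLM-backed evaluation (assumption)."""
    return {"server": server_url, "tools_evaluated": 3}


sessions = EvaluationSessions(fake_evaluate)
session_id = sessions.submit("http://localhost:8080/mcp-optimizer/mcp")
print(sessions.wait_for(session_id, timeout=5))
```

With this shape the client-side timeout only has to cover each short poll, so the number of tools on a server no longer dictates how long a single request must stay open.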