English | 中文
Introduction | Quick Start | Best Practices | Changelog | Exposed MCP Tools | AutoGen Multi Agent Example | Other Documentation
Roadmap · Upcoming (2026.1): Support for NoSQL
aka: SQL Safety Executor MCP for LLM
A secure database access gateway for AI Agents: Empowering LLM (Agents) with database access capabilities.
Enables Large Language Models (LLMs) to safely execute database queries via a standardized MCP interface, with every SQL statement validated before execution.
Provides protections such as allowlists, timeouts, and result truncation. Mitigates operational risks while preventing token cost overruns.
Supports MySQL and SQLite; NoSQL support is planned for future releases.
This project resolves the LLM database accessibility bottleneck. By coordinating with AI Agents, it expands the capability boundaries of LLMs and extends the application scope of large models in real-world business scenarios.
Traditional LLM-database integration faces the following limitations:
- Security Risks: LLM-generated SQL may contain dangerous operations (DELETE/UPDATE/DROP).
- Token Explosion: Large table queries returning massive amounts of data, leading to context overflow and uncontrolled costs.
- Tight Coupling: Database logic intertwined with LLM prompts and orchestration code, making maintenance and reuse difficult.
- Query Quality: Without table structure information, LLMs are prone to generating invalid or inefficient SQL.
- Scalability: Inability to guarantee personalized database connections and security configurations needed for each LLM instance.
- Enable safe SQL query execution for AI models (SELECT/SHOW/DESCRIBE/EXPLAIN only).
- Prevent Token Explosion: Result truncation (`MAX_RESULT_ROWS`) + table count limits (`MAX_OVERVIEW_TABLES`).
- Provide a standardized MCP interface across different AI platforms.
- Support ReAct Pattern: Provide table structure information for LLM decision-making.
- Configurable Security Policies: Allowlists, timeouts, UNION control.
This project implements a standard MCP (Model Context Protocol) service to provide secure database access for LLMs:
```
┌─────────────────────────────────────────────────────────────────┐
│                           MCP Client                            │
│         (VS Code Copilot / Claude Desktop / Gemini CLI)         │
└─────────────────────────┬───────────────────────────────────────┘
                          │ MCP Protocol (stdio)
                          ▼
┌─────────────────────────────────────────────────────────────────┐
│                sql-safety-executor (MCP Server)                 │
│  ┌────────────────────────────────────────────────────────────┐ │
│  │ Tool Layer:   query | list_tables | describe_table | ...   │ │
│  ├────────────────────────────────────────────────────────────┤ │
│  │ Safety Layer: SQL Valid. | Allowlist | Truncation | Timeout│ │
│  └────────────────────────────────────────────────────────────┘ │
└─────────────────────────┬───────────────────────────────────────┘
                          │ SQLAlchemy (Connection Pool)
                          ▼
┌─────────────────────────────────────────────────────────────────┐
│                            Database                             │
└─────────────────────────────────────────────────────────────────┘
```
1. Security Validation: Only allows SELECT/SHOW/DESCRIBE/EXPLAIN, automatically intercepting dangerous statements.
2. Token Protection: Result truncation (`MAX_RESULT_ROWS`) + table count limits (`MAX_OVERVIEW_TABLES`).
3. Tool Design:
   - Adopts a model-driven pattern, prioritizing decision rules over fixed workflows.
   - Tools return context such as `is_large`/`row_count` to enable LLM autonomy.
   - Supports configuration-based policy/prompt injection (e.g., `ALLOW_UNION`, `ALLOWED_TABLES`, truncation thresholds), using shorter, more relevant guidance to reduce invalid tool calls.
   - Error feedback optimized for LLMs: clearly identifies failure reasons (security interception / table not allowed / syntax / timeout / truncation, etc.) and offers correction suggestions, reducing trial-and-error and invalid calls while avoiding leakage of sensitive information (credentials, system table details, etc.).
   - Adapted to the ReAct pattern: Thought → Action → Observation → Rethink.
4. Typical Workflow:
   - Structure unknown: `list_tables()` → `describe_table(target)` → `query(sql)`
   - Structure known: `query(sql)` directly
   - Large table: observe `is_large=true` → use `LIMIT` or aggregation
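For illustration, the same workflow can be driven programmatically through the FastMCP client (the approach used by `test_mcp_client.py`). The sketch below assumes the server entry point is `start_server.py` and that the tool names match those listed in this README; adjust paths and table names to your setup.

```python
# Minimal sketch: ReAct-style exploration via the FastMCP client.
# Assumes the server entry point is start_server.py and the tool names
# shown above (list_tables / describe_table / query); adjust as needed.
import asyncio
from fastmcp import Client

async def explore():
    async with Client("start_server.py") as client:
        # Observation 1: what tables are visible?
        tables = await client.call_tool("list_tables", {})
        # Observation 2: inspect one table before querying it
        detail = await client.call_tool("describe_table", {"table_name": "users"})
        # Action: large tables get a LIMIT, as the is_large flag suggests
        rows = await client.call_tool("query", {"sql": "SELECT * FROM users LIMIT 10"})
        print(tables, detail, rows, sep="\n")

asyncio.run(explore())
```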
- Query Restrictions: Only SELECT / SHOW / DESCRIBE / EXPLAIN allowed.
- SQL Parsing Validation: Comprehensive query analysis using `sqlparse`.
- Connection Security: Environment-based credential management.
- Error Isolation: Comprehensive exception handling and reporting tailored for LLMs.
- Access Isolation: Access boundaries controlled by the host/runtime environment.
- Table Allowlist: Configurable restrictions on accessible tables.
- Result Truncation: `MAX_RESULT_ROWS`/`MAX_RESULT_CHARS` to prevent Token overflow.
- Query Timeout: `QUERY_TIMEOUT_SECONDS` to prevent slow queries.
- UNION Control: Disabled by default, requires allowlist to enable.
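For a rough sense of how the statement-type restriction works, here is a minimal `sqlparse`-based check; the actual `sql_safety_checker.py` is more comprehensive (table allowlist, UNION policy, identifier validation, and so on):

```python
# Minimal sketch of a read-only statement gate using sqlparse.
# The real checker in sql_safety_checker.py is more thorough.
import sqlparse

ALLOWED_STATEMENTS = {"SELECT", "SHOW", "DESCRIBE", "DESC", "EXPLAIN"}

def is_read_only(sql: str) -> bool:
    statements = sqlparse.parse(sql)
    if len(statements) != 1:
        return False  # reject empty or stacked (multi-statement) input
    first = statements[0].token_first(skip_cm=True)  # skip comments/whitespace
    return first is not None and first.normalized.upper() in ALLOWED_STATEMENTS

assert is_read_only("SELECT * FROM products LIMIT 5")
assert not is_read_only("DROP TABLE products")
assert not is_read_only("SELECT 1; DELETE FROM products")
```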
| File | Responsibility |
|---|---|
| `mcp_sql_server.py` | MCP tool definition, security validation, result processing |
| `sql_safety_checker.py` | SQL statement parsing and security checking |
| `db_adapter.py` | Database adapter abstraction (MySQL/SQLite support) |
| `start_server.py` | Service startup, environment verification |
The philosophy of this project originated in early 2025, partially influenced by GraphQL. Initially, the plan was for LLMs or Agents to serve as a frontend entry point, fetching information from the backend through explicit semantics. In reality, SQL itself is an excellent carrier for query information. Rather than passing GraphQL, it is better to pass SQL directly. Especially since current mainstream LLMs (as of late 2025) can generate common SQL quite stably without additional fine-tuning. However, three key issues need to be considered:
- Potential SQL Injection.
- The uncertainty of LLMs themselves: The safety of generated SQL.
- The accuracy, quality, and efficiency of the generated SQL queries.
Regarding Issue 1: This scenario is not actually "the frontend passing SQL directly to the backend". The LLMs (Agents) here are more like programs running on the server side. All SQL is generated in a controlled server environment and passed to other backend services via stdio (or via HTTP in a secure intranet environment). Agents in this scenario are programs with constrained input/output, interfacing with the frontend (external users) only via prompts. As of late 2025, preventing Prompt Injection is a widespread and mature practice, and Agent authors can readily use various methods to avoid external Prompt Injection.
Regarding Issue 2: LLM generation is uncertain. Even with a very low probability, this can lead to generated SQL safety not being guaranteed. A SQL safety check tool is needed to inspect and filter generated SQL.
Regarding Issue 3: LLMs (Agents) cannot generate SQL out of thin air; they need a certain context foundation. This context can be natural language prompts or database documentation, but more importantly, database structure, query examples, and the data itself. For high-quality queries, as things stand (end of 2025), a ReAct approach should be adopted: a cycle of Thought --> Action --> Observation --> Rethink to decide the next action.
In summary, this project is the solution to Issue 2 and Issue 3.
Early on, the goal was to write a simple SQL safety check tool to inspect and filter SQL statements before execution, to be called (FunctionCall) by LLMs (Agents).
- Later, in v1.0, the solution was refactored to add support for the standardized MCP service architecture. Decoupling was also performed to simplify maintenance while enhancing security and scalability.
- In v2.0, optimizations for query efficiency and call risks were made for actual MCP usage scenarios:
  - Optimized tool call efficiency by merging tools and adding new commonly used tools to reduce the number of tool calls.
  - Focused on specific optimizations for Token Explosion risks in real-world usage (which can lead to massive LLM API costs).
  - Added examples of multi-Agent calls to this MCP service, based on the AutoGen framework, to demonstrate the combination of Agents and this service.
- In v2.1, the focus was on improving tool design and output consistency. Many tools were optimized and refactored to adhere as closely as possible to industry best practices. The overall design adopts the ReAct paradigm (Thought → Action → Observation → Rethink), which improves query accuracy and multi-step query quality while ensuring query efficiency. The AutoGen-based multi-agent example was updated accordingly.
Through these iterations, this project evolved from an initial concept of an LLM/Agent-based user interface into a comprehensive MCP-backed SQL query service. It is worth acknowledging that, although the starting point was different, the current project shares similarities with today's Text2SQL solutions. In the early conception of this project (March-April 2025), such systems were relatively rare; similar Text2SQL practice mostly stopped at a single pass, where a person supplied a prompt and the LLM generated one SQL statement to assist their query. Ideally, the core motivation of this project is to enable LLMs (Agents) to replace traditional frontends and become the new "frontend", capable of dynamic interaction with users regarding both the data in the interface and the interface itself, making the entire system fully flexible and dynamic. For now, the core idea is to give LLMs (Agents) the ability to enter the database; combined with different Agents, different work scenarios can be developed and extended.
Agent Skills and Extensibility: In the future, we plan to introduce support for Agent Skills and to improve the extensibility of this project. From existing practice, we have come to appreciate the value of providing (encapsulated) scenario-specific semantic tools. The industry has corresponding best-practice discussions:
"Offload the burden from the model and use code where possible."
"Don't make the model fill arguments you already know."
"Combine functions that are always called in sequence." — OpenAI, "Best practices for defining functions" (December 2025)
This indicates that, in reasonable cases, a practical system should include and support adding extra tools tailored for specific scenarios. However, too many tools consume more context, reduce accuracy, and increase costs[1]. Progressive disclosure[2] of Agent Skills can avoid these issues. We can therefore envision a scheme where users or developers write any number of "plugins" (code snippets/tools) that depend on this MCP service, managed via SKILL.md, allowing tools to be added and configured dynamically; Agents can then load these "plugins" to flexibly extend their capabilities (a hypothetical sketch follows the notes below).
[1]: "Keep the number of functions small for higher accuracy."
[2]: "This filesystem-based architecture enables progressive disclosure: Claude loads information in stages as needed, rather than consuming context upfront."
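To make the envisioned plugin scheme concrete, a hypothetical loader could follow the same progressive-disclosure idea: expose only skill names and summaries up front, and read a full SKILL.md only when an Agent selects it. Everything in the sketch below (directory layout, file names, functions) is speculative and not part of the current project:

```python
# Hypothetical sketch of progressive disclosure for SKILL.md-managed plugins.
# Directory layout, file names, and functions are speculative.
from pathlib import Path

def list_skills(skills_dir: str = "skills") -> list[dict]:
    """Stage 1: expose only name + one-line summary (cheap on context)."""
    skills = []
    for skill_md in Path(skills_dir).glob("*/SKILL.md"):
        lines = skill_md.read_text(encoding="utf-8").splitlines()
        summary = lines[0].lstrip("# ").strip() if lines else skill_md.parent.name
        skills.append({"name": skill_md.parent.name, "summary": summary})
    return skills

def load_skill(name: str, skills_dir: str = "skills") -> str:
    """Stage 2: load the full SKILL.md only when the Agent selects it."""
    return (Path(skills_dir) / name / "SKILL.md").read_text(encoding="utf-8")
```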
Support for Multiple Database Types (SQLite, NoSQL, etc.)
- SQLite support added in v2.2 (January 2026)
- NoSQL support planned for future releases
Manually Defined Methods for Database Write Processes
Refined Permission Management Based on MCP Protocol
This project was originally created by Gemini CLI and primarily developed using GitHub Copilot after v1.0. When using AI to assist in developing this project, the following experiences were generally followed:
- Follow best practices on the web and GitHub as much as possible, such as those from Anthropic, Google, FastMCP, and Microsoft. Avoid hallucinations and local optima.
- Make the AI reflect on its own output as much as possible.
- Provided the first two points are satisfied, minimize constraints on the AI. Use the simplest prompts and steps to complete tasks, and let the AI complete the full workflow.
Keep context as full and complete as possible; keep prompts and constraints to a minimum.
This is why the original GEMINI.md is retained only for record-keeping and no AGENTS.md was added. However, SKILL.md or similar "progressive" documentation is good practice. The project's related documents REFACTORING_LOG.md and PROMPT_ENGINEERING_BEST_PRACTICES.md reflect this practice.
This project relied heavily on AI-assisted development. Although code reviews and tests have been conducted as much as possible, and a series of security settings have been added, the scope and experimental nature of the project mean it cannot cover all situations, especially where LLMs are involved. Therefore, do not connect it directly to a production environment or pair it with an untested Agent; this may lead to unexpected consequences such as instability, infinite loops, Token explosion, or massive queries.

Recent updates include multiple efficiency optimizations, mainly focused on reducing unnecessary tool calls and increasing speed, and some testing has been performed. However, due to the randomness of LLMs (Agents), unnecessary tool calls may still occur in actual use, although the probability is small.
- Row count fields may be imprecise: `list_tables()`/`describe_table()`/`get_full_schema()` return `row_count` as estimates:
  - MySQL: from `INFORMATION_SCHEMA.TABLES.TABLE_ROWS` (InnoDB may have significant deviation or lag)
  - SQLite: from `sqlite_stat1` (if ANALYZE has been run) or a sampling strategy
  - Only recommended for "order of magnitude judgment / whether to add LIMIT / whether it is a large table" strategies.
  - If an exact count is needed, please use `SELECT COUNT(*) ...`, or enable `ENABLE_TABLE_SUMMARY=1` and use `get_table_summary(exact_count=True)` (note that large tables may be slow). See the sketch after this list.
- Result truncation to avoid Token explosion: `query()`, `list_tables()`, `get_full_schema()` will truncate output based on `MAX_RESULT_ROWS`/`MAX_RESULT_CHARS`/`MAX_OVERVIEW_TABLES`/`MAX_SCHEMA_TABLES`; therefore, the "returned data/tables/columns" may not be the full set. When the full set is needed, please explicitly use smaller-scope queries (add `LIMIT`, paginate by condition), or adjust the relevant environment variables (at your own risk).
- Some "total" fields have "visible range" semantics: for example, `total_tables` in tool output represents "the number of visible tables after allowlist parameter filtering (and considering truncation)", which is not necessarily equal to the actual total number of tables in the database; please avoid misinterpreting it as "whole database statistics".
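To make the estimate-vs-exact distinction concrete, here is a small illustrative helper; the field names mirror the `describe_table()` output documented below, while the helper itself is hypothetical and not part of the project:

```python
# Illustrative sketch: use the estimated row_count for planning, and fall back
# to an explicit COUNT(*) query only when an exact figure is required.
# Field names mirror describe_table() output; the helper is hypothetical.
def row_count_hint(table_info: dict, need_exact: bool = False):
    if not need_exact or not table_info.get("row_count_approximate", False):
        return table_info["row_count"]            # estimate is good enough
    # Exact count: run through the query() tool (may be slow on large tables).
    return f'SELECT COUNT(*) AS total FROM `{table_info["table_name"]}`'

info = {"table_name": "users", "row_count": 1500,
        "row_count_approximate": True, "is_large": True}
print(row_count_hint(info))                   # -> 1500 (estimate)
print(row_count_hint(info, need_exact=True))  # -> SQL string for query()
```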
- Recommend trying with VS Code's GitHub Copilot first. GitHub Copilot in VS Code is a mature AI Agent tool. You can choose a free model (e.g., GPT-5 mini) to use against a test database, which offers higher security and avoids extra AI request costs.
- Another benefit of using GitHub Copilot: it empowers Copilot, the coding-assistant AI, with the ability to enter the database, so it understands the structure and data distribution of the target database. This provides better development assistance and suggestions when writing programs.
- (Taking GitHub Copilot as an example) When using it, you can add a reminder to the prompt such as "To ensure the data and reasoning are accurate and sufficient, you need to query step by step, multiple times." This guides the AI to refine over multiple steps, similar to a ReAct-pattern query, for better results. This is especially useful when solving complex problems.
- (Taking GitHub Copilot as an example) Explicitly attach the "#sql-safety-executor-mcp" tool when using it, which reminds the AI to prioritize using this tool.
- (Taking GitHub Copilot as an example) Make good use of the "todo" tool provided by the Agent, which helps the AI plan query steps, improving performance and efficiency.
- In recent modifications (as of Jan 7, 2026), multiple security optimizations have been made, such as truncation for large data volumes, controls on special keywords (like UNION), table allowlist settings, and dynamic prompts for different configurations. However, higher security means lower performance/efficiency and higher consumption (e.g., more request parameters and Token consumption), so configure security settings as appropriate.
- AI clients such as Claude Code, Codex, and Gemini CLI work similarly to GitHub Copilot, but be aware that AI calls may generate significant Token costs, and current tests (including capability tests) have mainly been done on GitHub Copilot.
By configuring mcp.json in VS Code for quick integration, you can directly call the SQL tools of this project in GitHub Copilot Chat. This empowers GitHub Copilot Chat with database-oriented capabilities. Of course, it can also be used in other MCP-supported AI assistants.
- Ensure VS Code is the latest version.
- Install the GitHub Copilot Chat extension.
- Ensure project dependencies are installed (run `pip install -r requirements.txt` in the project path).
- Configure the environment: `cp .env.example .env`, then edit `.env` with your database credentials.
Create a new folder `.vscode` in the project root directory (if it doesn't exist), and create a file `mcp.json` inside it.
Copy the following content into `mcp.json` (if `mcp.json` already exists, append the configuration to it; VS Code reads `mcp.json` to recognize MCP servers). Be sure to change the paths to your actual absolute paths:
```json
{
  "mcpServers": {
    "sql-safety-executor-mcp": {
      "type": "stdio",
      "command": "/absolute/path/to/python",
      "args": ["/absolute/path/to/start_server.py"],
      "cwd": "/absolute/path/to/project_root"
    }
  }
}
```

Configuration Details:
- `command`: Must point to the absolute path of the Python interpreter in the virtual environment (e.g., `.venv/bin/python`); do not use the system `python` directly.
- `args`: Point to the absolute path of `start_server.py`.
- `cwd`: The absolute path of the project root directory, ensuring the `.env` file can be read.
You can also configure via the VS Code GUI (recommended):
1. Complete "1. Preparation".
2. Open the VS Code Command Palette (`Ctrl+Shift+P` / `Cmd+Shift+P`).
3. Type and select `MCP: Add Server`.
4. Add the above content step by step following the guide (adjust to your actual paths).
Both methods achieve the same result; they generate the `mcp.json` file in the same location. In any case, you just need to ensure `.vscode/mcp.json` contains the above configuration.
1. Restart VS Code, or reload the window via the Command Palette.
2. Open GitHub Copilot Chat and ensure it is in Plan or Agent mode.
3. Click the tool icon next to the model selection box below the input field.
4. You should see `sql-safety-executor` and its provided tools (e.g., `query`, `list_tables`). Ensure they are all checked.
5. Send a question directly in the conversation: "List all tables" or "Query the first 5 rows of the users table".
6. You should then see the MCP tool being called.
Note: Although tool usage has been optimized, it is still recommended to use free models (e.g., GPT-5 mini) in GitHub Copilot Chat to avoid extra request consumption.
- Cannot find tools? Check the `Output` panel and switch to "GitHub Copilot" to see if there are errors.
- Path error: Windows users, note backslash escaping in JSON (e.g., `C:\\Users\\...`).
```bash
# Install dependencies including MCP support
pip install -r requirements.txt

# Configure environment
cp .env.example .env
# Edit .env with your database credentials

# Start MCP Server
python start_server.py

# Test MCP capabilities (internal functions)
python test_mcp_functions.py

# Test MCP Server via client (simulates a real MCP client)
python test_mcp_client.py
```

```bash
# Run AutoGen Multi-Agent SQL Query System
# Requires: GEMINI_API_KEY, OPENAI_API_KEY, or USE_OLLAMA=true
python autogen_sql_agent.py

# Or run with a specific task
python autogen_sql_agent.py "List all tables and describe their structure"
```

autogen_sql_agent.py demonstrates how to use a multi-Agent team (PlanningAgent, SQLExecutorAgent, AnalystAgent) from the Microsoft AutoGen framework to interact with the MCP server.
Add the server to your MCP-compatible client configuration (e.g., VS Code, Claude Desktop, or other MCP clients):
```json
{
  "mcpServers": {
    "sql-safety-executor-mcp": {
      "type": "stdio",
      "command": "python",
      "args": ["start_server.py"],
      "cwd": "/path/to/llm-sql-safety-executor-mcp"
    }
  }
}
```

- Replace `/path/to/` with your actual project path.
- The server loads credentials from the `.env` file in the working directory.
- For virtual environments, use the full path to the Python interpreter.
```bash
# Database Type: 'mysql' (default) or 'sqlite'
DB_TYPE=mysql
```

MySQL settings (`DB_TYPE=mysql`):

```bash
DB_USER=your_database_user
DB_PASSWORD=your_database_password
DB_HOST=your_database_host
DB_NAME=your_database_name
```

SQLite settings (`DB_TYPE=sqlite`):

```bash
# Path to SQLite database file, or ':memory:' for in-memory database
SQLITE_DATABASE_PATH=./sample_data/demo.db
# SQLITE_DATABASE_PATH=:memory:
```

Note 1: `./sample_data/demo.db` is a sample database provided for testing purposes.
Note 2: `SQLITE_DATABASE_PATH=:memory:` creates a temporary in-memory database (empty upon initialization). Data is lost when the server restarts. Suitable for testing and other specialized use cases.

```bash
# Optional: Query timeout progress handler interval (default: 100)
# Lower value = more responsive timeout, higher CPU overhead
# SQLITE_PROGRESS_HANDLER_INTERVAL=100
```

```bash
# Feature Switches (1=Enabled, 0=Disabled)
ENABLE_SCHEMA_TOOLS=1        # Controls sample() tool
ENABLE_TABLE_SUMMARY=0       # Controls get_table_summary() tool (Default: Disabled)
                             # describe_table() already provides estimated row counts

# Large Table Threshold, used for is_large flag and query suggestions
# Tables exceeding this row count will trigger LIMIT/Aggregation hints
LARGE_TABLE_THRESHOLD=1000

# Security Configuration (Recommended for Production)
QUERY_TIMEOUT_SECONDS=30     # Query timeout in seconds (P0 Safety)
CONNECT_TIMEOUT_SECONDS=10   # Connection timeout in seconds

# Table Allowlist (Comma separated, case insensitive)
# Only allow access to specific tables - Leave empty to allow all
# Use "*" to explicitly allow all tables (UNION requires this setting)
ALLOWED_TABLES=products,orders,customers

# UNION Query Policy
# Important: UNION requires double configuration to enable:
# 1. ALLOW_UNION=1
# 2. ALLOWED_TABLES=table1,table2 OR ALLOWED_TABLES=*
# If ALLOW_UNION=1 but ALLOWED_TABLES is empty, UNION will still be blocked.
ALLOW_UNION=0

# Token Optimization: Limit result size to prevent context overflow
# Set to 0 to disable truncation (for data export scenarios)
MAX_RESULT_ROWS=100          # Max rows returned per query (0=unlimited)
MAX_RESULT_CHARS=16000       # Max characters in response (0=unlimited)
MAX_SCHEMA_TABLES=50         # Max tables returned by get_full_schema (0=unlimited)
MAX_OVERVIEW_TABLES=100      # Max tables returned by list_tables (0=unlimited)
```

For a complete client configuration example, please refer to mcp_config.json.
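For orientation, the sketch below shows how settings like these are typically consumed with `python-dotenv`; the variable names and defaults mirror the list above, but the actual loading code in the server may differ:

```python
# Sketch: loading the configuration above with python-dotenv.
# Variable names and defaults mirror this README; the real server code
# may structure this differently.
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory (cwd in mcp.json)

DB_TYPE = os.getenv("DB_TYPE", "mysql")
QUERY_TIMEOUT_SECONDS = int(os.getenv("QUERY_TIMEOUT_SECONDS", "30"))
MAX_RESULT_ROWS = int(os.getenv("MAX_RESULT_ROWS", "100"))  # 0 = unlimited
ALLOW_UNION = os.getenv("ALLOW_UNION", "0") == "1"
ALLOWED_TABLES = {
    t.strip().lower() for t in os.getenv("ALLOWED_TABLES", "").split(",") if t.strip()
}

print(DB_TYPE, MAX_RESULT_ROWS, ALLOW_UNION, ALLOWED_TABLES)
```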
Added support for SQLite databases while maintaining full backward compatibility with MySQL:
- New Database Adapter Architecture: Introduced `db_adapter.py` with an Abstract Base Class pattern
  - `DatabaseAdapter` ABC defines a unified interface for all database backends
  - `MySQLAdapter`: Preserves all existing MySQL functionality
  - `SQLiteAdapter`: New SQLite support with a native timeout mechanism
  - `create_adapter()` factory function for automatic adapter selection
- SQLite-Specific Features:
  - Query timeout via `set_progress_handler()` (native SQLite callback)
  - `StaticPool` connection pooling (single connection, avoids file lock issues)
  - `sqlite_master` and `PRAGMA table_info()` for metadata queries
  - Row count estimation using `sqlite_stat1` or a sampling strategy
- New Environment Variables:
  - `DB_TYPE=mysql|sqlite` - Database type selection (default: mysql)
  - `SQLITE_DATABASE_PATH` - Path to SQLite file or `:memory:`
  - `SQLITE_PROGRESS_HANDLER_INTERVAL` - Timeout check frequency
- Backward Compatibility: All existing MySQL configurations continue to work unchanged
- New `db_type` Field: Tool responses now include a `db_type` field indicating the active database
- Comprehensive Test Suite: 53 tests covering both MySQL and SQLite adapters
For detailed design decisions, compromises, and implementation details, see SQLITE_ADAPTER_DESIGN.md. For change log details, see REFACTORING_LOG.md.
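Conceptually, the adapter layer looks roughly like the sketch below; class and method names here are illustrative rather than the exact `db_adapter.py` API (see SQLITE_ADAPTER_DESIGN.md for the real design):

```python
# Conceptual sketch of the adapter pattern described above.
# Class and method names are illustrative, not the exact db_adapter.py API.
import os
from abc import ABC, abstractmethod

class DatabaseAdapter(ABC):
    """Unified interface every backend must implement."""

    @abstractmethod
    def connect(self) -> None: ...

    @abstractmethod
    def execute_query(self, sql: str, timeout: int) -> list[dict]: ...

    @abstractmethod
    def estimate_row_count(self, table: str) -> int: ...

class MySQLAdapter(DatabaseAdapter):
    def connect(self) -> None: ...               # SQLAlchemy + PyMySQL pool
    def execute_query(self, sql, timeout): ...   # server-side timeout
    def estimate_row_count(self, table): ...     # INFORMATION_SCHEMA.TABLES

class SQLiteAdapter(DatabaseAdapter):
    def connect(self) -> None: ...               # StaticPool, single connection
    def execute_query(self, sql, timeout): ...   # set_progress_handler() timeout
    def estimate_row_count(self, table): ...     # sqlite_stat1 or sampling

def create_adapter() -> DatabaseAdapter:
    """Factory: pick an adapter from DB_TYPE (default: mysql)."""
    return SQLiteAdapter() if os.getenv("DB_TYPE", "mysql") == "sqlite" else MySQLAdapter()
```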
Focused on improvements in tool design and output consistency:
- `get_table_summary` is now optional: Disabled by default (`ENABLE_TABLE_SUMMARY=0`) as `describe_table()` already provides estimated row counts. Enable only when a precise COUNT(*) is needed.
- Enhanced `describe_table`: Now returns `row_count`, `row_count_approximate`, an `is_large` flag, and `recommendation` query suggestions.
- Refactored `list_tables` output: `data` → `tables`, clearer.
  - Added `database_name`, `returned_table_count`, `total_tables`, `truncated`, `truncation_note`.
- Consistent Field Naming: The `returned_table_count` vs `total_tables` convention applied to both `list_tables` and `get_full_schema`.
- New Environment Variables:
  - `ENABLE_TABLE_SUMMARY=0` - Controls the `get_table_summary()` tool.
  - `LARGE_TABLE_THRESHOLD=1000` - Threshold for the `is_large` flag.
  - `MAX_OVERVIEW_TABLES=100` - Max tables for `list_tables()`.
- AutoGen Agent Prompt Update: Removed the `get_table_summary()` reference; updated the workflow to the `list_tables() → describe_table()` pattern.
For detailed changes, please refer to REFACTORING_LOG.md.
Major improvements following FastMCP best practices:
- Expanded SQL Support: Now supports multiple read-only statement types
  - `SELECT`: Standard data retrieval
  - `SHOW`: Database metadata (SHOW TABLES, SHOW COLUMNS, etc.)
  - `DESCRIBE`: Table structure information
  - `EXPLAIN`: Query execution plan analysis
- Tool Consolidation: Reduced from 6 tools to 5, then expanded to 7 via new optimized tools
  - `validate_sql_query` + `execute_safe_sql` → Consolidated into `query` (automatic validation)
  - Added `list_tables` tool for database discovery
  - Renamed tools for clarity: `check_connection`, `describe_table`, `sample`
  - New (Dec 23): Added `get_full_schema` and `get_table_summary` for Token optimization
- Optimized Server Instructions: Reduced "exploratory behavior" (unnecessary tool calls) from LLMs
  - Explicit tool priority: `query` first, others only on error
  - Expected reduction: From 4-5 tool calls per query to 1-2 calls
- Security Enhancements (Dec 23, 2025):
  - Query timeout protection (P0 Safety)
  - Table Allowlist support (`ALLOWED_TABLES`)
  - Configurable UNION policy (`ALLOW_UNION`)
  - Result truncation to prevent Token overflow
- Code Quality: Reduced code by ~40% (approx. 460 → 280 lines) while maintaining functionality
- Enhanced Metadata: Added `ToolAnnotations` to improve LLM tool selection
- Lifecycle Management: Correct asynchronous resource lifecycle (FastMCP best practice)
- SQL Injection Protection: Identifier validation for dynamic table names
For detailed changes, please refer to REFACTORING_LOG.md.
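As an illustration of the `ToolAnnotations` point above, a read-only FastMCP tool can be declared roughly as follows; this is a hedged sketch, not the actual `mcp_sql_server.py` implementation:

```python
# Sketch of a FastMCP tool with read-only annotations, approximating the
# pattern described above. Not the exact mcp_sql_server.py implementation.
from fastmcp import FastMCP

mcp = FastMCP("sql-safety-executor")

@mcp.tool(
    annotations={
        "readOnlyHint": True,        # tool never modifies data
        "destructiveHint": False,
        "idempotentHint": True,
    }
)
def query(sql: str) -> dict:
    """Execute a read-only SQL query with automatic security validation."""
    # ... validate with sql_safety_checker, execute via the DB adapter,
    # truncate results to MAX_RESULT_ROWS / MAX_RESULT_CHARS ...
    return {"success": True, "data": [], "row_count": 0}

if __name__ == "__main__":
    mcp.run()  # stdio transport by default
```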
This project has transitioned from direct function calls to a standardized MCP service architecture, providing:
- Service-Oriented Architecture: Converted direct LLM function calls into an independent MCP server.
- Standardized Protocol: Implemented MCP tools for consistent AI model integration.
- Enhanced Separation of Concerns: Separated server startup logic into a dedicated `start_server.py`.
- Improved Scalability: A single server instance supports multiple concurrent LLM clients.
- Better Security: Service isolation and controlled access via MCP protocol.
The service exposes 5-7 standardized MCP tools (depending on configuration):
Tool: `query`
Usage: Execute read-only SQL queries with automatic security validation.
This is the main tool for all database operations. Security validation is automatic: only allowlisted statement types (SELECT, SHOW, DESCRIBE, EXPLAIN) are permitted.
Input:

```json
{
  "sql": "SELECT COUNT(*) as total FROM products"
}
```

Output:

```json
{
  "success": true,
  "query": "SELECT COUNT(*) as total FROM products",
  "data": [
    {"total": 150}
  ],
  "row_count": 1
}
```

Tool: `check_connection`
Usage: Test database connection and configuration.
Output:

```json
{
  "connected": true,
  "message": "Database connection successful"
}
```

Tool: `list_tables`
Usage: Database Overview - List all tables and their estimated row counts.
Lightweight initial exploration tool. Row counts are INFORMATION_SCHEMA estimates (InnoDB may have ±40% error).
Output:

```json
{
  "success": true,
  "database_name": "mydb",
  "returned_table_count": 2,
  "total_tables": 2,
  "tables": [
    {"table_name": "users", "row_count": 150},
    {"table_name": "products", "row_count": 500}
  ],
  "row_count_approximate": true,
  "truncated": false,
  "truncation_note": null,
  "hint": "Row counts are estimates (InnoDB ±40%). total_tables = visible after allowlist."
}
```

Tool: `describe_table`
Usage: Get Table Structure - Column info, estimated row count, and query suggestions.
Returns column details and estimated row counts from INFORMATION_SCHEMA (avoids COUNT(*) full table scans). Includes an `is_large` flag for query planning.
Input:

```json
{
  "table_name": "users"
}
```

Output:

```json
{
  "success": true,
  "table_name": "users",
  "row_count": 1500,
  "row_count_approximate": true,
  "column_count": 5,
  "columns": [
    {"column_name": "id", "data_type": "int", "nullable": "NO", "key_type": "PRI"},
    {"column_name": "name", "data_type": "varchar", "nullable": "YES", "key_type": ""}
  ],
  "is_large": true,
  "recommendation": "Large table (~1500 rows). Use LIMIT or aggregation (COUNT/GROUP BY)."
}
```

Tool: `sample`
Usage: Retrieve sample data from a specified table.
Note: This tool is controlled by the `ENABLE_SCHEMA_TOOLS` environment variable (default: enabled).
Input:

```json
{
  "table_name": "users",
  "limit": 5
}
```

Output:

```json
{
  "success": true,
  "table_name": "users",
  "data": [
    {"id": 1, "name": "Alice"},
    {"id": 2, "name": "Bob"}
  ],
  "row_count": 2,
  "query": "SELECT * FROM `users` LIMIT 5"
}
```

Tool: `get_full_schema`
Usage: Fetch complete database schema (all tables and columns) in a single call.
Suitable for multi-table JOINs or scenarios requiring all table structures at once. For single-table queries, `describe_table()` is recommended.
Output:

```json
{
  "success": true,
  "schema": {
    "users": {
      "row_count": 150,
      "columns": [
        {"name": "id", "type": "int", "nullable": "NO", "key": "PRI"},
        {"name": "name", "type": "varchar", "nullable": "YES", "key": ""}
      ]
    }
  },
  "returned_table_count": 1,
  "total_tables": 1,
  "total_columns": 2,
  "row_count_approximate": true,
  "truncated": false,
  "truncation_note": null,
  "hint": "Row counts are estimates (InnoDB ±40%). Use LIMIT for large tables (row_count > 1000). total_tables = visible after allowlist."
}
```

Tool: `get_table_summary`
Usage: Get table statistics, with optional exact row count calculation.
Note: This tool is controlled by the `ENABLE_TABLE_SUMMARY` environment variable (default: disabled). The `describe_table()` tool already provides estimated row counts, so this tool is needed only when precise counting is required.
Warning: `exact_count=True` will run COUNT(*), which may be slow on large InnoDB tables (full table scan).
Input:

```json
{
  "table_name": "users",
  "exact_count": false
}
```

Output:

```json
{
  "success": true,
  "table_name": "users",
  "row_count": 150,
  "row_count_approximate": true,
  "column_count": 5,
  "columns": [...],
  "is_large": true,
  "recommendation": "Large table (~150 rows). Use LIMIT or aggregation (COUNT/GROUP BY)."
}
```

- Python 3.12+
- MySQL or SQLite database
- Dependencies: `sqlparse`, `SQLAlchemy`, `PyMySQL`, `fastMCP`, `python-dotenv`
The project includes two complementary test scripts:
Tests underlying functions directly, without passing through the MCP protocol:
```bash
python test_mcp_functions.py
```

This script verifies:
- SQL validation logic (safe and unsafe queries)
- Database connection
- Query execution
- Schema introspection (if enabled)
- Sample data retrieval (if enabled)
Tests the server via MCP protocol using FastMCP Client:
```bash
python test_mcp_client.py
```

This script:
- Connects to the MCP Server using the FastMCP `Client` API
- Tests server info and database connection
- Executes multiple SQL queries (list tables, SELECT, COUNT)
- Tests optional schema tools (if enabled)
- Verifies data serialization format
- Feasibility Analysis: Detailed analysis of LLM to MCP conversion
- Original Context: Project background and development guide
- Refactoring Log: Dec 2025 Refactoring change documentation
- MCP Client Test Guide: Guide for testing MCP Server via client
- Prompt Engineering Best Practices: Guide for MCP tool descriptions and prompts
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Make changes and conduct appropriate testing
- Add tests for new features
- Update documentation as needed
- Submit a Pull Request
For technical issues, feature requests, or questions:
- Create an issue in the GitHub repository
- Include relevant error messages and configuration details
- Provide steps to reproduce the issue