# FunctionGemma MCP Fine-tuning

Fine-tune Google's FunctionGemma-270M model on custom Model Context Protocol (MCP) tools for advanced function calling. This project provides a complete pipeline to scan MCP tools, build training datasets, and fine-tune the model using efficient 4-bit quantization and LoRA (Low-Rank Adaptation).
## Features

- **Automatic MCP Tool Scanner** - Recursively scans Python codebases to extract tool definitions and convert them to OpenAI-compatible function schemas
- **Efficient Fine-tuning** - Uses 4-bit quantization (QLoRA) to train on consumer GPUs (16GB VRAM)
- **Dataset Generation** - Converts tool definitions into properly formatted training examples
- **Optimized for RTX 4060 Ti** - Configured for 16GB VRAM with gradient accumulation
- **Full Pipeline** - From tool scanning to model deployment
## Use Cases

- Train language models to understand and call your custom tools
- Adapt FunctionGemma to domain-specific APIs and functions
- Create specialized AI assistants with custom function calling capabilities
- Build MCP-compatible tool-using agents
## Prerequisites

- Python 3.10 or higher
- NVIDIA GPU with 16GB+ VRAM (recommended: RTX 4060 Ti)
- Hugging Face account with access to google/functiongemma-270m-it
- Hugging Face token with read access
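Once the dependencies below are installed, you can sanity-check the GPU and your Hugging Face token with a short script (a minimal sketch, assuming `torch` and `huggingface_hub` are importable):

```python
import torch
from huggingface_hub import whoami

# Confirm a CUDA-capable GPU is visible and report its VRAM.
assert torch.cuda.is_available(), "No CUDA-capable GPU detected"
props = torch.cuda.get_device_properties(0)
print(f"GPU: {props.name}, VRAM: {props.total_memory / 1024**3:.1f} GiB")

# Confirm a valid Hugging Face token is available (HF_TOKEN or cached login).
print(f"Logged in as: {whoami()['name']}")
```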
## Installation

```bash
git clone https://github.com/yourusername/functiongemma-mcp-finetune.git
cd functiongemma-mcp-finetune

# Create virtual environment
python -m venv gemma_finetune_env
# Activate (Windows)
gemma_finetune_env\Scripts\activate
# Activate (Linux/Mac)
source gemma_finetune_env/bin/activate
# Install dependencies
pip install torch transformers datasets peft trl bitsandbytes accelerate huggingface_hub

# Option 1: Use environment variable
set HF_TOKEN=your_hf_token_here         # Windows
export HF_TOKEN=your_hf_token_here      # Linux/Mac
# Option 2: Edit finetune.py and replace PASTE_YOUR_TOKEN_HERE
```

**Important:** Accept the terms at https://huggingface.co/google/functiongemma-270m-it before running.
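Alternatively, you can authenticate once from Python using `huggingface_hub`'s login helper, which caches the token so later runs pick it up automatically:

```python
from huggingface_hub import login

# Caches the token locally so finetune.py can download the gated model.
login(token="your_hf_token_here")
```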
## Usage

### 1. Scan Your MCP Tools

If you have your own MCP tools codebase to scan:
```bash
python scan_mcp_tools.py /path/to/your/codebase --output data/tools.json
```

This will recursively scan Python files and extract tool definitions.
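To spot-check the scanner output, load the generated file and list the extracted tools (a small sketch, assuming `data/tools.json` holds a JSON array of the OpenAI-compatible schemas described above):

```python
import json

# Load the schemas produced by scan_mcp_tools.py and list the tool names.
with open("data/tools.json", encoding="utf-8") as f:
    tools = json.load(f)

print(f"{len(tools)} tools extracted")
for tool in tools:
    fn = tool["function"]
    print(f"- {fn['name']}: {fn['description']}")
```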
### 2. Prepare Your Dataset

Place your training examples in `data/finetune_dataset.jsonl`. Each line of the file is a single JSON object (pretty-printed below for readability):

```json
{
  "id": "example-1",
  "tool_names": ["get_weather"],
  "messages": [
    {"role": "user", "content": "What's the weather in Paris?"},
    {"role": "assistant", "content": "", "tool_calls": [{"function": {"name": "get_weather", "arguments": {"city": "Paris", "unit": "celsius"}}}]},
    {"role": "tool", "name": "get_weather", "content": "{\"temperature\": 18, \"conditions\": \"cloudy\"}"},
    {"role": "assistant", "content": "It's 18°C and cloudy in Paris."}
  ]
}
```
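If you generate examples programmatically, a small helper can append records in the expected shape (a sketch; the `example` record below is hypothetical and simply mirrors the format above):

```python
import json

# A hypothetical record in the same shape as the example above.
example = {
    "id": "example-2",
    "tool_names": ["get_weather"],
    "messages": [
        {"role": "user", "content": "How warm is it in Tokyo?"},
        {"role": "assistant", "content": "", "tool_calls": [
            {"function": {"name": "get_weather",
                          "arguments": {"city": "Tokyo", "unit": "celsius"}}}
        ]},
        {"role": "tool", "name": "get_weather",
         "content": "{\"temperature\": 22, \"conditions\": \"sunny\"}"},
        {"role": "assistant", "content": "It's 22°C and sunny in Tokyo."},
    ],
}

# JSONL: exactly one JSON object per line.
with open("data/finetune_dataset.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps(example, ensure_ascii=False) + "\n")
```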
### 3. Run Fine-tuning

```bash
python finetune.py
```

The script will:

- Load the base FunctionGemma-270M model
- Apply 4-bit quantization for memory efficiency
- Fine-tune using LoRA adapters
- Save checkpoints to `functiongemma-finetuned/`
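After training, the LoRA adapter can be loaded on top of the base model for a quick smoke test (a hedged sketch, assuming the adapter was saved to `functiongemma-finetuned/` as configured above):

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "google/functiongemma-270m-it"
tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Attach the fine-tuned LoRA adapter saved by finetune.py.
model = PeftModel.from_pretrained(base, "functiongemma-finetuned")

inputs = tokenizer("What's the weather in Paris?", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```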
## Project Structure

```
.
├── finetune.py                 # Main fine-tuning script
├── scan_mcp_tools.py           # Tool definition scanner
├── data/
│   ├── tools.json              # Extracted tool definitions
│   └── finetune_dataset.jsonl  # Training examples
├── functiongemma-finetuned/    # Output directory (git-ignored)
│   ├── adapter_config.json
│   ├── adapter_model.safetensors
│   └── checkpoint-*/
└── tests/
    └── test_scan_mcp_tools.py  # Unit tests
```
## Configuration

Edit these in `finetune.py`:

```python
sft_config = SFTConfig(
per_device_train_batch_size=2, # Batch size per GPU
gradient_accumulation_steps=8, # Effective batch size = 2 * 8 = 16
learning_rate=2e-4, # Learning rate
max_steps=100, # Total training steps
save_steps=50, # Checkpoint frequency
bf16=True, # Use bfloat16 precision
)

peft_config = LoraConfig(
lora_alpha=16,
lora_dropout=0.1,
r=64, # LoRA rank
target_modules=["q_proj", "k_proj", "v_proj", ...],
)
```
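For context, these two configs are passed to TRL's trainer roughly like this (a simplified sketch of the wiring in `finetune.py`; `model` and `dataset` stand in for the quantized base model and the loaded JSONL dataset):

```python
from trl import SFTTrainer

# peft_config tells the trainer to wrap the quantized model in LoRA adapters;
# sft_config carries the training hyperparameters shown above.
trainer = SFTTrainer(
    model=model,
    args=sft_config,
    peft_config=peft_config,
    train_dataset=dataset,
)
trainer.train()
```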
## How the Tool Scanner Works

The `scan_mcp_tools.py` script automatically extracts:

- Function names and descriptions (from docstrings)
- Parameter types and descriptions
- Required vs optional parameters
- Nested type annotations (List, Dict, Optional, etc.)
Example:

```python
def get_weather(city: str, unit: str = "celsius") -> dict:
    """Get current weather for a city.

    Args:
        city: The city name
        unit: Temperature unit (celsius or fahrenheit)
    """
    pass
```

Converts to:

```json
{
  "type": "function",
  "function": {
    "name": "get_weather",
    "description": "Get current weather for a city.",
    "parameters": {
      "type": "object",
      "properties": {
        "city": {"type": "string", "description": "The city name"},
        "unit": {"type": "string", "description": "Temperature unit"}
      },
      "required": ["city"]
    }
  }
}
```
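For intuition, the core of such a conversion can be sketched with `inspect` and a small type map (a simplified illustration, not the actual `scan_mcp_tools.py` implementation; the real scanner also parses docstring parameter descriptions and nested annotations):

```python
import inspect

# Map basic Python annotations to JSON Schema type names.
TYPE_MAP = {str: "string", int: "integer", float: "number",
            bool: "boolean", list: "array", dict: "object"}

def to_schema(fn):
    """Build an OpenAI-style function schema from a plain Python function."""
    sig = inspect.signature(fn)
    props, required = {}, []
    for name, param in sig.parameters.items():
        props[name] = {"type": TYPE_MAP.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default value => required
    doc = inspect.getdoc(fn) or ""
    return {
        "type": "function",
        "function": {
            "name": fn.__name__,
            "description": doc.splitlines()[0] if doc else "",
            "parameters": {"type": "object", "properties": props,
                           "required": required},
        },
    }
```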
## Troubleshooting

- **GPU Memory Issues?** (see the low-memory sketch after this list)
  - Reduce `per_device_train_batch_size` to 1
  - Increase `gradient_accumulation_steps` to 16
  - Reduce `max_seq_length` in `dataset_kwargs`
- **Better Results?**
  - Increase `max_steps` (100 → 500+)
  - Add more diverse training examples
  - Tune `learning_rate` (try 1e-4 or 5e-5)
- **Faster Training?**
  - Use `fp16=True` instead of `bf16` (if supported)
  - Reduce `save_steps` to save less frequently
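As a concrete instance of the memory advice above, batch size can be traded against accumulation steps without changing the effective batch size (a hedged variant of the configuration shown earlier; `output_dir` is assumed to match the output directory above):

```python
from trl import SFTConfig

# Low-memory variant: effective batch size stays 1 * 16 = 16.
sft_config = SFTConfig(
    output_dir="functiongemma-finetuned",
    per_device_train_batch_size=1,   # halved from 2
    gradient_accumulation_steps=16,  # doubled from 8
    learning_rate=2e-4,
    max_steps=100,
    save_steps=50,
    bf16=True,
)
```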
## Benchmarks

Hardware: RTX 4060 Ti (16GB VRAM)
- Training Speed: ~5 steps/minute
- Memory Usage: ~14GB VRAM
- Total Time: ~20 minutes for 100 steps
## Testing

Run unit tests for the tool scanner:

```bash
python -m pytest tests/test_scan_mcp_tools.py -v
```

## License

This project is licensed under the MIT License - see the LICENSE file for details.
## Acknowledgments

- Google DeepMind for FunctionGemma-270M
- Hugging Face for transformers and PEFT libraries
- TRL for supervised fine-tuning utilities
## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

- Fork the repository
- Create your feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
## Support

For questions or issues, please open a GitHub issue or reach out via discussions.
If this project helps you, please consider giving it a star! ⭐

Made with ❤️ for the AI community