offload-mcp

MCP server for offloading routine coding-assistant work to a cheaper model.

The default model chain uses Gemma because the models are useful, open, and fun to experiment with. Running them locally can be heavy on RAM, GPU, and setup; the Gemini API (key from Google AI Studio) makes them easy to use for small routine tasks at almost no cost. You can use any supported model ID.

Install

Get a free API key from https://aistudio.google.com/apikey.

Choose one install method.

Option 1: npx (recommended)

npx downloads and runs offload-mcp@latest on demand. You do not need to install the package globally. Your MCP client runs this command whenever it starts the server.

JSON-style MCP config:

{
  "mcpServers": {
    "offload-mcp": {
      "command": "npx",
      "args": ["offload-mcp@latest"],
      "env": { "GOOGLE_AI_API_KEY": "your_key" }
    }
  }
}

TOML-style MCP config:

[mcp_servers.offload-mcp]
command = "npx"
args = ["offload-mcp@latest"]
env = { GOOGLE_AI_API_KEY = "your_key" }

To test that npm can resolve the package:

npx offload-mcp@latest

That starts an MCP stdio server, so it will wait for an MCP client instead of printing a normal CLI screen.

Option 2: global npm install

Install once:

npm install -g offload-mcp

Then use the binary directly in your MCP config.

JSON-style MCP config:

{
  "mcpServers": {
    "offload-mcp": {
      "command": "offload-mcp",
      "env": { "GOOGLE_AI_API_KEY": "your_key" }
    }
  }
}

TOML-style MCP config:

[mcp_servers.offload-mcp]
command = "offload-mcp"
env = { GOOGLE_AI_API_KEY = "your_key" }

To update a global install later:

npm update -g offload-mcp

Use

Ask your assistant to offload routine work:

offload a commit message for the current diff
offload this translation to Mexican Spanish: <text>
use offload to summarize src/index.ts

For local diffs and files, prefer offload_source: the MCP server reads the input directly, so the primary model does not need to load the content into its own context first:

offload_source(task="commit_message", source="git_diff")
offload_source(task="pr_description", source="git_staged_diff")
offload_source(task="code_summary", source="file", path="src/index.ts")
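
For clients that build requests by hand, the calls above map onto a standard MCP `tools/call` JSON-RPC request. The sketch below is illustrative only: the `OffloadSourceArgs` type and the `buildOffloadSourceCall` helper are hypothetical names, and the argument shape is inferred from the three calls shown here rather than taken from the server's actual schema, which may accept more fields.

```typescript
// Argument shapes inferred from the examples above (an assumption, not the
// server's published schema). "file" is the only source that needs a path.
type OffloadSourceArgs =
  | { task: string; source: "git_diff" | "git_staged_diff" }
  | { task: string; source: "file"; path: string };

// Minimal shape of an MCP tools/call request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: OffloadSourceArgs };
}

// Hypothetical helper: wraps the arguments in a tools/call envelope.
function buildOffloadSourceCall(id: number, args: OffloadSourceArgs): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "offload_source", arguments: args },
  };
}

// Equivalent of offload_source(task="code_summary", source="file", path="src/index.ts")
const request = buildOffloadSourceCall(1, {
  task: "code_summary",
  source: "file",
  path: "src/index.ts",
});
console.log(JSON.stringify(request));
```

In normal use the assistant issues this call for you; the sketch only shows what travels over the stdio connection.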

Footer example:

—— Offloaded via gemma-4-31b-it · 307 model tokens · ~1,420 primary input tokens avoided · [offload-mcp](https://github.com/peterhadorn/offload-mcp)

"model tokens" comes from the API response. "primary input tokens avoided" is an estimate and appears only when using offload_source.

Tasks

commit_message
pr_description
code_summary
translate
changelog_entry
naming_suggestion
classify
extract_data
code_review_single
docstring
subject_lines
freeform

Use freeform for anything else:

offload(task="freeform", content="ECONNREFUSED 10.0.1.5:5432", prompt="Rewrite as a user-friendly error message. Output only the message.")

Status

offload_status shows local usage counters:

Today: 47/14400 calls (0.3%), 28,500 model tokens processed
Month: 312 calls over 8 days (avg 39/day), 187,400 model tokens processed
Estimated primary input avoided: today ~12,800 tokens, month ~74,200 tokens
Tasks today:
  commit_message: 18
  docstring: 12
  code_summary: 9

Stats are stored locally at ~/.offload-mcp/usage.json by default. Only counters are stored, not task content.
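
The percentages and averages in the status output follow from the raw counters. The sketch below reproduces the first two lines of the example; the `UsageCounters` interface and function names are hypothetical and say nothing about how usage.json is actually laid out.

```typescript
// Hypothetical counter shape, for illustration only.
interface UsageCounters {
  callsToday: number;
  dailyLimit: number; // corresponds to OFFLOAD_RPD_LIMIT
  callsThisMonth: number;
  daysElapsed: number;
}

// Share of the daily limit used, as a percentage with one decimal place.
function dailyUsagePercent(callsToday: number, dailyLimit: number): string {
  if (dailyLimit <= 0) return "0.0";
  return ((callsToday / dailyLimit) * 100).toFixed(1);
}

// Average calls per day over the month so far, rounded to a whole number.
function averageCallsPerDay(callsThisMonth: number, daysElapsed: number): number {
  if (daysElapsed <= 0) return 0;
  return Math.round(callsThisMonth / daysElapsed);
}

function formatStatus(c: UsageCounters): string[] {
  const pct = dailyUsagePercent(c.callsToday, c.dailyLimit);
  const avg = averageCallsPerDay(c.callsThisMonth, c.daysElapsed);
  return [
    `Today: ${c.callsToday}/${c.dailyLimit} calls (${pct}%)`,
    `Month: ${c.callsThisMonth} calls over ${c.daysElapsed} days (avg ${avg}/day)`,
  ];
}

const lines = formatStatus({
  callsToday: 47,
  dailyLimit: 14400,
  callsThisMonth: 312,
  daysElapsed: 8,
});
console.log(lines.join("\n"));
```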

Config

| Env var | Default | Description |
| --- | --- | --- |
| `GOOGLE_AI_API_KEY` | - | Required |
| `OFFLOAD_MODEL` | `gemma-4-31b-it` | Preferred model |
| `OFFLOAD_FALLBACK_MODELS` | `gemma-4-26b-a4b-it` | Comma-separated fallback models |
| `OFFLOAD_TIMEOUT_MS` | `20000` | Per-model request timeout, in milliseconds |
| `OFFLOAD_RETRIES_PER_MODEL` | `1` | Attempts per model before falling back (1 = no retry) |
| `OFFLOAD_RPD_LIMIT` | `14400` | Local daily call limit. Lower it if your Gemini API account has a stricter quota. |
| `OFFLOAD_LOG_PATH` | `~/.offload-mcp/usage.json` | Local usage stats |

By default, requests try gemma-4-31b-it first and fall back to gemma-4-26b-a4b-it on timeouts, rate limits, and transient server errors. To disable fallback, set OFFLOAD_FALLBACK_MODELS to an empty value (OFFLOAD_FALLBACK_MODELS=).
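
The fallback behaviour can be pictured as a loop over an ordered model chain. This is a minimal sketch under stated assumptions, not the server's implementation: `buildModelChain`, `runWithFallback`, and the `CallModel` callback are hypothetical names, and for brevity the sketch moves on after any error, whereas the server falls back only on timeouts, rate limits, and transient server errors.

```typescript
// Hypothetical callback standing in for a real API request.
type CallModel = (model: string, prompt: string) => Promise<string>;

// Build the ordered chain from env-style values. An empty
// OFFLOAD_FALLBACK_MODELS leaves only the preferred model in the chain.
function buildModelChain(env: Record<string, string | undefined>): string[] {
  const preferred = env.OFFLOAD_MODEL || "gemma-4-31b-it";
  const rawFallbacks = env.OFFLOAD_FALLBACK_MODELS ?? "gemma-4-26b-a4b-it";
  const fallbacks = rawFallbacks
    .split(",")
    .map((m) => m.trim())
    .filter((m) => m.length > 0);
  return [preferred, ...fallbacks];
}

// Try each model up to attemptsPerModel times, then move to the next one.
// Assumption: every error is treated as retryable in this sketch.
async function runWithFallback(
  chain: string[],
  attemptsPerModel: number,
  prompt: string,
  callModel: CallModel
): Promise<{ model: string; text: string }> {
  let lastError: unknown = new Error("model chain is empty");
  for (const model of chain) {
    for (let attempt = 1; attempt <= attemptsPerModel; attempt++) {
      try {
        const text = await callModel(model, prompt);
        return { model, text };
      } catch (err) {
        lastError = err; // remember the failure and keep going
      }
    }
  }
  throw lastError;
}
```

With the defaults, `buildModelChain({})` yields the two-model chain described above, and `buildModelChain({ OFFLOAD_FALLBACK_MODELS: "" })` yields a chain with only the preferred model.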

Data

offload-mcp sends task content to the configured Gemini API model. Do not offload secrets, private customer data, proprietary code, or regulated data unless your policy allows it.

offload_source with source="file" reads any file path the MCP server process can access. Treat the path and cwd parameters as trusted local input from your MCP client.

MIT
