
Conversation

@allozaur
Collaborator

@allozaur allozaur commented Jan 7, 2026

WIP

ServeurpersoCom and others added 30 commits January 5, 2026 09:00
@ServeurpersoCom
Collaborator

The simplified popover is great!

@bennmann

Sure would be nice to have an MCP server built into llama.cpp, instead of a client, or at least an intuitive, user-controlled way to install an Apache 2.0 MCP server (from elsewhere) from the webui.

@ServeurpersoCom
Collaborator

Sure would be nice to have an MCP server built into llama.cpp, instead of a client, or at least an intuitive, user-controlled way to install an Apache 2.0 MCP server (from elsewhere) from the webui.

Oh yes, we absolutely have to integrate an MCP server, at the bare minimum an example in Python/Node.js. There are also several C++ server implementations that could be integrated. Even a relay over SSH, so that anyone can turn a Raspberry Pi or a VM into a sandbox, is the ultimate in terms of fun, education, and utility. An LLM with access to bash in a dedicated environment is just the MCP server to rule them all. I've validated scraping, RAG over data, and even machine learning inside a sandbox simply by prompting the LLM correctly. Nothing else is needed, since everything is possible!
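As a rough illustration of how small such a bash-exposing server could be, here is a minimal sketch speaking newline-delimited JSON-RPC over stdio in the style of MCP's stdio transport. This is not the official SDK, and the `initialize` handshake and capability negotiation of the real protocol are omitted; it only handles `tools/list` and `tools/call`:

```python
import json
import subprocess
import sys

# Tool description in the shape MCP uses for tools/list results.
TOOLS = [{
    "name": "bash",
    "description": "Run a shell command in the sandbox and return its output.",
    "inputSchema": {
        "type": "object",
        "properties": {"command": {"type": "string"}},
        "required": ["command"],
    },
}]

def handle_request(req: dict) -> dict:
    """Dispatch a single JSON-RPC request dict to a JSON-RPC response dict."""
    method = req.get("method")
    if method == "tools/list":
        result = {"tools": TOOLS}
    elif method == "tools/call":
        cmd = req["params"]["arguments"]["command"]
        proc = subprocess.run(cmd, shell=True, capture_output=True,
                              text=True, timeout=30)
        result = {"content": [{"type": "text",
                               "text": proc.stdout + proc.stderr}]}
    else:
        return {"jsonrpc": "2.0", "id": req.get("id"),
                "error": {"code": -32601,
                          "message": f"unknown method: {method}"}}
    return {"jsonrpc": "2.0", "id": req.get("id"), "result": result}

def serve():
    # One JSON-RPC message per line on stdin, one response per line on stdout.
    for line in sys.stdin:
        if line.strip():
            print(json.dumps(handle_request(json.loads(line))), flush=True)

if __name__ == "__main__":
    serve()
```

In a real deployment the `subprocess.run` call is exactly where the sandbox boundary matters: swapping it for an SSH exec channel to a Raspberry Pi or VM, as suggested above, keeps the rest of the loop unchanged.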

@strawberrymelonpanda
Contributor

strawberrymelonpanda commented Jan 21, 2026

An LLM with access to bash in a dedicated environment is just the MCP server to rule them all.

Honestly, I still like having some well-defined MCPs, but "only access to bash" seems to be exactly how SWE-bench evaluates using their "100 lines of Python" mini-swe-agent. If it's good enough for SWE-bench...

Performant: Scores >74% on the SWE-bench verified benchmark
Does not have any tools other than bash

@ServeurpersoCom
Collaborator

ServeurpersoCom commented Jan 22, 2026

An LLM with access to bash in a dedicated environment is just the MCP server to rule them all.

Honestly, I still like having some well-defined MCPs, but "only access to bash" seems to be exactly how SWE-bench evaluates using their "100 lines of Python" mini-swe-agent. If it's good enough for SWE-bench...

Performant: Scores >74% on the SWE-bench verified benchmark
Does not have any tools other than bash

Yes, this approach is interesting!
On current LLMs, we can see that they handle the "ed" command better than "sed".
In addition to "bash_tool", a small set of context-optimizing commands gives better results than bash alone over long loops: "str_replace", "create_file", and a universal "view" that handles file-system listing, line numbering of text files, and image transfer. This is the command set Claude Web chose, and it works very well across models. Forcing the model to complete a justification argument before each tool call is essential to limit "context rot" and adds an extra "reasoning effect".

The tool documentation exposed to the model suggests prioritizing these commands over "bash_tool". This reduces entropy.
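To make the three-command set concrete, here is a minimal sketch of what "str_replace", "create_file", and "view" might look like. This is an assumption about their shape, not Claude Web's actual implementation; image transfer and path sandboxing are omitted:

```python
from pathlib import Path

def create_file(path: str, content: str) -> str:
    """Create (or overwrite) a text file, creating parent directories."""
    p = Path(path)
    p.parent.mkdir(parents=True, exist_ok=True)
    p.write_text(content)
    return f"created {path} ({len(content)} bytes)"

def str_replace(path: str, old: str, new: str) -> str:
    """Replace a unique occurrence of `old`; refuse ambiguous edits."""
    p = Path(path)
    text = p.read_text()
    count = text.count(old)
    if count != 1:
        return f"error: `old` occurs {count} times, expected exactly 1"
    p.write_text(text.replace(old, new))
    return f"edited {path}"

def view(path: str) -> str:
    """Directory listing for directories; numbered lines for text files."""
    p = Path(path)
    if p.is_dir():
        return "\n".join(sorted(child.name for child in p.iterdir()))
    lines = p.read_text().splitlines()
    return "\n".join(f"{i + 1:6}\t{line}" for i, line in enumerate(lines))
```

The uniqueness check in `str_replace` is the context-saving trick: the model must quote enough surrounding text to make the edit unambiguous, and the numbered output of `view` lets it do so without re-reading whole files.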

It works so well that one day, as an off-topic idea for this PR, we could integrate a similar set of optimized commands into a "virtual MCP server / fake terminal OS" for client-side creation, editing, and execution of JavaScript, giving the client a good embedded computer-use sandbox mode. If we want the model to have memory, files from "create_file" are stored in IndexedDB. This could be very powerful. On the fake OS, "bash_tool" would essentially become "run_file": simply the execution of a .js file within the sandbox iframe, and so on.

@allozaur
Collaborator Author

Sure would be nice to have an MCP server built into llama.cpp, instead of a client, or at least an intuitive, user-controlled way to install an Apache 2.0 MCP server (from elsewhere) from the webui.

First we focus on the MCP Client and having a solid core implementation. Then we can expand.

@allozaur
Collaborator Author

I have WIP code for Prompts, Elicitation, and Sampling locally, but before I push it I need to re-review the PR and potentially clean up and remove some out-of-scope code.

