Prompt injection #11

@bwalderman

Tracking issue to discuss prompt injection and how we might mitigate it. Prompt injection in LLMs appears to be a largely unsolved problem.

A great blog post on the topic: https://simonwillison.net/2025/Jun/16/the-lethal-trifecta/. To recap, the "lethal trifecta" of agent capabilities is:

- Access to your private data—one of the most common purposes of tools in the first place!
- Exposure to untrusted content—any mechanism by which text (or images) controlled by a malicious attacker could become available to your LLM
- The ability to externally communicate in a way that could be used to steal your data
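To make the three legs concrete, here is a minimal sketch of how a client might test whether a set of enabled tools covers all of them. The `Leg` labels, the tool names, and the idea that each tool carries such a label are assumptions made for illustration. Nothing here is part of MCP or WebMCP, and labels declared by an untrusted tool provider could not be relied on.

```python
from enum import Flag, auto


class Leg(Flag):
    """Hypothetical labels for the three legs of the lethal trifecta."""
    NONE = 0
    PRIVATE_DATA = auto()       # tool can read the user's private data
    UNTRUSTED_CONTENT = auto()  # tool can bring attacker-controlled text into context
    EXTERNAL_COMMS = auto()     # tool can send data to an outside party


TRIFECTA = Leg.PRIVATE_DATA | Leg.UNTRUSTED_CONTENT | Leg.EXTERNAL_COMMS


def combined_legs(tools: dict[str, Leg]) -> Leg:
    """Union of the legs exposed by every enabled tool."""
    total = Leg.NONE
    for legs in tools.values():
        total |= legs
    return total


def has_lethal_trifecta(tools: dict[str, Leg]) -> bool:
    """True when the enabled tools together cover all three legs."""
    return (combined_legs(tools) & TRIFECTA) == TRIFECTA


# Hypothetical tool set: no single tool has all three legs, but the combination does.
enabled = {
    "read_inbox": Leg.PRIVATE_DATA,
    "fetch_url": Leg.UNTRUSTED_CONTENT,
    "send_email": Leg.EXTERNAL_COMMS,
}
print(has_lethal_trifecta(enabled))  # True
```

The point of the sketch is that the risk belongs to the combination of tools. Each tool above looks harmless when reviewed alone.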

Another interesting quote from a related presentation: https://simonwillison.net/2025/Aug/9/bay-area-ai/

> Which brings me to my biggest problem with how MCP works today. MCP is all about mix-and-match: users are encouraged to combine whatever MCP servers they like.
>
> This means we are outsourcing critical security decisions to our users! They need to understand the lethal trifecta and be careful not to enable multiple MCPs at the same time that introduce all three legs, opening them up [to] data stealing attacks.

WebMCP, being essentially a port of MCP capabilities to the web, is subject to these risks as well.
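One possible direction, sketched here only as a starting point for discussion, is to move the mix-and-match decision from the user to the client. The client would refuse to enable a server, or a site's tools in the WebMCP case, if doing so completes the trifecta for the current session. The `Session` class, the leg strings, and the server names below are hypothetical. The sketch also assumes a trustworthy source for each server's legs, which does not exist today.

```python
from dataclasses import dataclass, field

# Hypothetical leg names; not defined by MCP or WebMCP.
ALL_LEGS = frozenset({"private_data", "untrusted_content", "external_comms"})


@dataclass
class Session:
    """Tracks which servers are enabled and which legs they expose."""
    enabled: dict[str, frozenset[str]] = field(default_factory=dict)

    def legs(self) -> frozenset[str]:
        """Union of the legs exposed by all currently enabled servers."""
        return frozenset().union(*self.enabled.values())

    def try_enable(self, server: str, legs: frozenset[str]) -> bool:
        """Enable the server unless it would complete the trifecta."""
        if ALL_LEGS <= (self.legs() | legs):
            return False  # refuse: all three legs would be present at once
        self.enabled[server] = legs
        return True


session = Session()
print(session.try_enable("mail", frozenset({"private_data"})))         # True
print(session.try_enable("browser", frozenset({"untrusted_content"})))  # True
print(session.try_enable("webhook", frozenset({"external_comms"})))     # False
```

A gate like this is deliberately coarse. It blocks some legitimate combinations, and it does nothing about a single server that misreports its own capabilities, so it would complement other mitigations and not replace them.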
