LLM Web Browser

A browser-like tool for LLMs that serves web pages as markdown, with grep-like filtering to trim content before it reaches the model and SQLite caching to avoid repeated fetches.

Features

Web to Markdown Conversion: Converts websites to clean, readable markdown format
Advanced Grep-like Filtering: Filter content with powerful pattern matching before sending to LLMs
Permanent SQLite Caching: Cache websites for faster access and bandwidth conservation
MCP Integration: Uses the Model Context Protocol to provide a standardized interface for LLMs
Platform-Appropriate Storage: Stores cache in platform-standard locations

Installation

git clone https://github.com/fredrikangelsen/llm-browser.git
cd llm-browser
uv tool install --editable .

Usage

Running the MCP Server

# Run the server with default settings
llm-browser server

# Specify a custom database location
llm-browser server --db-path /path/to/your/custom_cache.db

Managing the Cache

# View cache statistics
llm-browser cache stats

# Clear the cache
llm-browser cache clear

Using with LLMs

The browser provides the following MCP tools:

1. browse_url

Fetches a webpage, converts it to markdown, and optionally filters with grep-like functionality.

browse_url(
    url="https://example.com",
    grep_pattern="optional regex pattern",
    context_lines=2,         # Show 2 lines before and after matches
    invert_match=False,      # If True, show non-matching lines
    show_line_numbers=True,  # Show line numbers with results
    whole_words=False        # Match whole words only
)
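To make the filtering options concrete, here is a minimal sketch of how grep-like filtering over markdown lines could behave. It is not the tool's actual implementation: the function name grep_lines and the "N: line" output format are assumptions for illustration, and only the parameter names mirror the browse_url call above.

```python
import re


def grep_lines(text, pattern, context_lines=0, invert_match=False,
               show_line_numbers=False, whole_words=False):
    """Return the lines of `text` selected by `pattern`, grep-style (hypothetical helper)."""
    if whole_words:
        # Wrap the pattern in word boundaries, similar to grep -w.
        pattern = rf"\b(?:{pattern})\b"
    regex = re.compile(pattern)
    lines = text.splitlines()

    # Indices of lines that match (or that do not match, when inverted).
    hits = [i for i, line in enumerate(lines)
            if bool(regex.search(line)) != invert_match]

    # Expand each hit by the requested context; the set removes overlaps.
    keep = set()
    for i in hits:
        start = max(0, i - context_lines)
        end = min(len(lines), i + context_lines + 1)
        keep.update(range(start, end))

    selected = sorted(keep)
    if show_line_numbers:
        # Line numbers are 1-based; the "N: " prefix is an assumed format.
        return [f"{i + 1}: {lines[i]}" for i in selected]
    return [lines[i] for i in selected]


page = "# Title\nintro text\ninstall with uv\nusage notes\nlicense: MIT"
result = grep_lines(page, "install", context_lines=1, show_line_numbers=True)
print("\n".join(result))
```

Filtering before the content reaches the LLM keeps only the relevant lines plus a little context, which is the main reason to combine conversion and grep in one call.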

2. search_cached_content

Searches all cached pages for content matching a grep pattern with advanced filtering options.

search_cached_content(
    grep_pattern="search term",
    context_lines=0,         # Number of context lines to show
    invert_match=False,      # If True, show non-matching lines
    show_line_numbers=False, # Show line numbers with results
    whole_words=False        # Match whole words only
)
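As a rough illustration of searching across cached pages, the sketch below scans every stored page for matching lines. It uses the standard-library sqlite3 module rather than the SQLAlchemy ORM the tool uses, and the table name pages, its columns, and the function search_cache are all hypothetical.

```python
import re
import sqlite3


def search_cache(conn, grep_pattern):
    """Return (url, line_number, line) for every cached line matching the pattern."""
    regex = re.compile(grep_pattern)
    matches = []
    rows = conn.execute("SELECT url, markdown FROM pages ORDER BY url")
    for url, markdown in rows:
        for number, line in enumerate(markdown.splitlines(), start=1):
            if regex.search(line):
                matches.append((url, number, line))
    return matches


# Hypothetical schema: one row per cached page, keyed by URL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (url TEXT PRIMARY KEY, markdown TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO pages VALUES (?, ?)",
    [
        ("https://example.com/a", "# Alpha\ncaching notes"),
        ("https://example.com/b", "# Beta\nno match here\ncaching again"),
    ],
)

found = search_cache(conn, r"caching")
for url, number, line in found:
    print(f"{url}:{number}: {line}")
```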

3. clear_cache

Clears the entire web browsing cache.

clear_cache()

4. get_cache_stats

Gets statistics about the cache, including location, size, and stored URLs.

get_cache_stats()

Technical Implementation Details

URL Normalization

Before caching, the tool normalizes each URL by removing its fragment (the part after #) while preserving the rest of the URL, so that links to different anchors on the same page share a single cache entry.
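The fragment removal described above can be sketched with the standard library. This is an illustration only: normalize_url is a hypothetical name, and the tool's own normalization may do more than strip fragments.

```python
from urllib.parse import urldefrag


def normalize_url(url):
    """Drop the #fragment so anchors on the same page map to one cache key."""
    return urldefrag(url).url


first = normalize_url("https://example.com/docs?page=2#install")
second = normalize_url("https://example.com/docs?page=2#usage")
print(first)
print(first == second)
```

The query string is kept because it usually selects different content, whereas the fragment only points at a position within the same page.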

Database Implementation

The tool uses SQLAlchemy ORM for database interactions, with the database stored in platform-specific standard locations:

Linux: ~/.local/share/llm-browser/web_cache.db
macOS: ~/Library/Application Support/llm-browser/web_cache.db
Windows: %LOCALAPPDATA%\llm-browser\web_cache.db

You can specify a custom database location by setting the LLM_BROWSER_DB_PATH environment variable or by passing --db-path to llm-browser server.
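One way the default location could be resolved is sketched below. The function default_db_path is hypothetical, and two behaviors are assumptions rather than documented facts: that LLM_BROWSER_DB_PATH takes precedence over the platform default, and that Windows falls back to AppData\Local under the home directory when LOCALAPPDATA is unset.

```python
import os
import sys
from pathlib import Path


def default_db_path(platform=sys.platform, env=os.environ, home=None):
    """Resolve the cache database path for the given platform (illustrative only)."""
    home = Path(home) if home is not None else Path.home()

    # Assumption: the environment variable overrides the platform default.
    override = env.get("LLM_BROWSER_DB_PATH")
    if override:
        return Path(override)

    if platform.startswith("win"):
        # Assumption: fall back to ~/AppData/Local if LOCALAPPDATA is unset.
        base = Path(env.get("LOCALAPPDATA", home / "AppData" / "Local"))
    elif platform == "darwin":
        base = home / "Library" / "Application Support"
    else:
        base = home / ".local" / "share"
    return base / "llm-browser" / "web_cache.db"


print(default_db_path())
```

Passing platform, env, and home explicitly makes the function easy to check for each operating system without running on it.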

HTML to Markdown Conversion

The tool uses markdownify for HTML to Markdown conversion, preserving formatting while focusing on the main content.

Example Integration with Claude or Other LLMs

When integrating with Claude or other LLMs:

1. Run the web browser MCP server in one process.
2. Connect your LLM client to the server.
3. The LLM can then request web content through MCP.

See the examples directory for detailed integration examples.

Adding to Claude Code

You can add llm-browser as an MCP server to Claude Code:

# Start by installing the tool
git clone https://github.com/fredrikangelsen/llm-browser.git
cd llm-browser
uv tool install --editable .

# Add the MCP server to Claude Code
# Basic syntax (note the -- separator when using flags)
claude mcp add llm-browser -- llm-browser server

# With custom options (e.g., custom database path)
claude mcp add llm-browser -s global -- llm-browser server --db-path /custom/path/cache.db

# Verify it was added correctly
claude mcp list

# Use Claude Code with the browser tool
claude

The -s global flag stores the configuration globally rather than just for the current project.

Once added, Claude will have access to web browsing capabilities through these tools:

browse_url: Fetch and convert webpages to markdown
search_cached_content: Search across all cached webpages
get_cache_stats: View cache statistics
clear_cache: Clear the webpage cache

MCP Resources

License

MIT License
