feat: integrate llms.txt documentation tools #137
base: main
Conversation
… guidance
- Add new llms_txt.py module with domain-secured documentation fetching
- Implement list_doc_sources, fetch_llms_txt_content, fetch_documentation, and add_doc_source functions
- Add MCP tool definitions in server.py with clear 'WHEN TO USE' guidance for clients
- Update tool descriptions to guide MCP clients (Claude, Cursor) on proper usage for Atlan product questions
- Add httpx dependency for async HTTP requests
- Configure default Atlan documentation source with domain security
- All examples updated to reflect Atlan-specific use cases
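For readers unfamiliar with what "domain-secured" fetching usually means, here is a minimal sketch of an allow-list check. It is not the code from llms_txt.py: the function name is_allowed_url and the host docs.example.com are assumptions made for illustration, since the thread does not show the actual allow-list.

```python
from urllib.parse import urlparse

# Hypothetical allow-list; the real configured domain is not shown in this thread.
ALLOWED_DOMAINS = {"docs.example.com"}


def is_allowed_url(url: str, allowed_domains: set = ALLOWED_DOMAINS) -> bool:
    """Return True only for https URLs on an allowed domain or one of its subdomains."""
    parsed = urlparse(url)
    if parsed.scheme != "https" or parsed.hostname is None:
        return False
    host = parsed.hostname.lower()
    # Compare against the parsed hostname, never a substring of the raw URL,
    # so that "docs.example.com.evil.test" is rejected.
    return any(host == d or host.endswith("." + d) for d in allowed_domains)
```

Matching on the parsed hostname rather than the raw string is the design choice that matters here; a plain substring test would accept look-alike hosts.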
- Fix trailing whitespace in llms_txt.py and server.py
- Apply Ruff formatting and linting fixes
- All pre-commit checks now passing
- Increase HTTP timeout from 10s to 30s for documentation fetching
- Enable redirects for documentation URLs
- Improve HTTP exception handling with specific error types
- Fix trailing comma syntax in HTTP client configuration
- Remove unsupported request_timeout parameter from FastMCP initialization
- Add Documentation Tools section to Available Tools table
- Include list_doc_sources, fetch_llms_txt, fetch_docs, and add_doc_source tools
- Update Tool Access Control section with documentation tool restrictions
- Organize tools into Asset Management and Documentation categories
- Add usage guidance for documentation tools with domain security notes
I believe we should try merging all the docs-related functions into a single tool rather than having 4 tools here.
- Combine list_doc_sources, fetch_llms_txt, fetch_docs, and add_doc_source into one tool
- Add action-based interface with list_sources, fetch_index, fetch_content, add_source actions
- Simplify API from 4 tools to 1 unified tool while maintaining all functionality
- Update README documentation to reflect single documentation tool
- Streamline tool access control configuration
- Tested successfully with streamable HTTP transport
- Remove add_source action to restrict tool to Atlan documentation only
- Eliminate llms_txt_url and allowed_domains parameters
- Update documentation to reflect Atlan-only focus
- Simplify available actions to list_sources, fetch_index, fetch_content
- Prevent customers from adding external documentation sources
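An action-based single tool of the kind described in the two commits above can be sketched as a dispatch table. This is an illustration under stated assumptions, not the PR's server.py: the handler bodies are placeholders, DOC_SOURCES and its URL are hypothetical, and the real tool would be registered with FastMCP rather than called directly.

```python
from typing import Any, Callable, Dict

# Hypothetical registry with a single source; the real URL is not given in this thread.
DOC_SOURCES = [{"name": "Atlan Docs", "llms_txt": "https://docs.example.com/llms.txt"}]


def _list_sources(**_: Any) -> Dict[str, Any]:
    return {"sources": DOC_SOURCES}


def _fetch_index(**_: Any) -> Dict[str, Any]:
    # Placeholder: a real handler would download and return the llms.txt index.
    return {"index_url": DOC_SOURCES[0]["llms_txt"]}


def _fetch_content(urls: Any = None, **_: Any) -> Dict[str, Any]:
    if not urls:
        return {"error": "fetch_content requires at least one URL"}
    # Placeholder: a real handler would validate and fetch each URL.
    return {"requested": list(urls)}


# add_source is deliberately absent, mirroring its removal in the commit above.
ACTIONS: Dict[str, Callable[..., Dict[str, Any]]] = {
    "list_sources": _list_sources,
    "fetch_index": _fetch_index,
    "fetch_content": _fetch_content,
}


def documentation_tool(action: str, **params: Any) -> Dict[str, Any]:
    """Route one tool call to the handler for the requested action."""
    handler = ACTIONS.get(action)
    if handler is None:
        return {"error": f"Unknown action '{action}'", "available_actions": sorted(ACTIONS)}
    return handler(**params)
```

With this shape, removing an action is a one-line change to the table, and an unknown action returns the list of valid ones so the client can correct itself.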
@rahul-madaan I've combined all tools into a single doc tool. Thanks.
- Remove add_source() method from LLMSTxtManager class
- Remove add_doc_source() function from docs.py
- Update tools/__init__.py imports to exclude add_doc_source
- Remove add_doc_source from __all__ exports
- Ensure Atlan-only documentation access with no external source addition
- Successfully tested with streamable HTTP transport
modelcontextprotocol/server.py
Outdated
    APIs, integrations, or need any documentation-related assistance.
    This powerful unified tool handles all Atlan documentation operations through different actions:
    - list_sources: Discover available Atlan documentation sources
Can't we just add list_sources? Since our llms.txt is at the default path, we can return that path and the client will fetch and use it automatically. Do we really need to fetch it in our source code and return it?
But some MCP clients can't fetch content from URLs the way Claude does.
…action
Remove complex llms.txt parsing, update FastMCP to 2.12.3, reduce timeouts for better MCP compatibility
modelcontextprotocol/tools/docs.py
Outdated
    async def _fetch_content(self, url: str) -> str:
        """Fetch content from URL with proper error handling."""
        try:
I agree with @firecast: we can just give out the URL, and Cursor or Claude should be able to fetch the relevant context from it.
There is only a single source, and we give out all the links in llms.txt anyway. Instead, we can give the client the llms.txt endpoint itself and let the agents do the job.
I know Claude can fetch content from URLs natively. Can Cursor do that too? @Hk669
I'm not sure if it's possible in other MCP clients.
Let's let clients fetch the URL. I don't think we should do that ourselves.
@firecast I've removed fetch_content. Claude is not fetching the llms.txt endpoint and starts a web search instead. It happens every time.
Clients should fetch content directly via provided llms.txt URLs
This reverts commit 9b97711.
This reverts commit bb96da5.
Align with supported version for Atlan MCP server
- Remove beautifulsoup4 dependency, use markdownify directly
- Add comprehensive JavaScript and navigation cleanup
- Achieve 88% token reduction while preserving content links
- Improve security with better domain validation
- Clean up unnecessary return fields for API response
- Update README.md with correct documentation tool information
This pull request introduces a new unified documentation tool for Atlan documentation access, enhances security for documentation retrieval, and updates dependencies to support these features. The changes include both backend implementation and documentation updates to describe the new tool and its usage.
Key changes:
Documentation Tool Addition and Security
- Added documentation_tool to the MCP server, providing unified access to Atlan documentation with actions for listing sources and fetching content, including domain validation for security.
- Added modelcontextprotocol/tools/docs.py with a DocumentationManager that manages documentation sources, enforces allowed domains, and handles secure fetching and error reporting.
- Added list_doc_sources and fetch_documentation in the tools module and server imports for use by the MCP server.
Documentation and Tooling Updates
- Updated README.md to document the new "Documentation Tools" section, including usage instructions, available actions, and security features. Also updated the tool access control section to include the new documentation tool.
Dependency and Build Updates
- Updated pyproject.toml to require fastmcp==2.12.3 and add httpx for async HTTP requests. Added a dev dependency group for pre-commit.