Add llms.txt support for AI/LLM website crawling #15361

Open
wants to merge 1 commit into
base: master

asafashirov
Contributor

Summary

Implements the llms.txt standard proposed at llmstxt.org, giving Large Language Models and other AI systems a structured index of the site's content.

Features

  • Standards Compliant: Follows the llms.txt specification published at llmstxt.org
  • Organized Content: Structured sections including documentation, tutorials, blog posts, guides, and case studies
  • Smart Filtering: Front matter exclusion support with llms_exclude: true
  • CI Optimized: Environment-aware generation that is skipped for preview builds
  • Performance Focused: Minimal impact (~29 KB output file, negligible build time)
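
The filtering and environment rules above can be sketched in a few lines. This is an illustration of the logic only, not the Hugo template in this PR: the `Page` type, the function names, and the environment name `"preview"` are assumptions made for the example. Only the `llms_exclude` front matter key comes from the description.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    """Hypothetical stand-in for a site page and its front matter."""
    title: str
    url: str
    section: str
    front_matter: dict = field(default_factory=dict)


def should_generate(environment: str) -> bool:
    """Skip llms.txt for preview builds ("preview" is an assumed name)."""
    return environment != "preview"


def included_pages(pages):
    """Drop pages whose front matter sets llms_exclude: true."""
    return [p for p in pages if not p.front_matter.get("llms_exclude", False)]


if __name__ == "__main__":
    pages = [
        Page("Get Started", "/docs/get-started/", "docs"),
        Page("Internal Notes", "/docs/internal/", "docs", {"llms_exclude": True}),
    ]
    if should_generate("production"):
        for page in included_pages(pages):
            print(page.title, page.url)
```

Pages without the key are included by default, so existing content needs no front matter changes to appear in the file.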

Implementation Details

  • Generates /llms.txt at site root using Hugo's TXT output format
  • Markdown format with proper H1/H2 structure per specification
  • Comprehensive content coverage organized by section type
  • Blog post list limited to the 20 most recent entries
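
As a rough sketch of the output shape described above, the following builds an llms.txt body with one H1, a blockquote summary, and one H2 link list per section, capping blog posts at the 20 newest. The function name, argument layout, and the "Blog Posts" heading are assumptions for illustration; the actual file in this PR is produced by a Hugo template.

```python
from datetime import date, timedelta


def render_llms_txt(site_title, summary, sections, blog_posts, max_posts=20):
    """Render an llms.txt body: H1 title, blockquote summary, H2 link lists.

    sections:   dict mapping a heading to a list of (title, url) pairs
    blog_posts: list of (title, url, published_date) tuples
    """
    lines = [f"# {site_title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines += [f"## {heading}", ""]
        lines += [f"- [{title}]({url})" for title, url in links]
        lines.append("")
    # Keep only the newest posts, newest first.
    recent = sorted(blog_posts, key=lambda post: post[2], reverse=True)[:max_posts]
    if recent:
        lines += ["## Blog Posts", ""]
        lines += [f"- [{title}]({url})" for title, url, _published in recent]
        lines.append("")
    return "\n".join(lines).rstrip("\n") + "\n"


if __name__ == "__main__":
    posts = [
        (f"Post {i}", f"https://example.com/blog/{i}/", date(2024, 1, 1) + timedelta(days=i))
        for i in range(25)
    ]
    docs = {"Documentation": [("Docs home", "https://example.com/docs/")]}
    print(render_llms_txt("Example Site", "Example summary.", docs, posts))
```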

Benefits

  • Helps AI systems understand and navigate site content more effectively
  • Improves discoverability for LLM-powered tools and applications
  • Maintains compatibility with existing web standards (robots.txt, sitemaps)
  • Zero impact on existing functionality

Test Plan

  • Verify llms.txt generation in production builds
  • Confirm preview builds skip generation
  • Test front matter exclusion functionality
  • Validate the output against the llmstxt.org specification
  • Measure performance impact (negligible)
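
Part of the compliance check could be automated with a small structural linter. The sketch below is not a full validator for the specification: it checks only that the file opens with a single H1 and that list items under H2 sections are Markdown links, optionally followed by `: notes`. The function name and messages are assumptions.

```python
import re

# A list entry of the form "- [name](url)" with optional ": notes" after it.
LINK_ITEM = re.compile(r"^- \[[^\]]+\]\([^)\s]+\)(: .+)?$")


def check_llms_txt(text):
    """Return a list of structural problems; an empty list means none found."""
    problems = []
    lines = text.splitlines()
    if not lines or not lines[0].startswith("# "):
        problems.append("first line must be an H1 title")
    if sum(1 for line in lines if line.startswith("# ")) > 1:
        problems.append("only one H1 is expected")
    in_section = False
    for number, line in enumerate(lines, start=1):
        if line.startswith("## "):
            in_section = True
        elif in_section and line.startswith("- ") and not LINK_ITEM.match(line):
            problems.append(f"line {number}: list item is not a Markdown link")
    return problems


if __name__ == "__main__":
    sample = "# Example\n\n> Summary.\n\n## Docs\n\n- [Intro](https://example.com/docs/): Getting started\n"
    print(check_llms_txt(sample) or "no structural problems found")
```

A check like this could run against the production build output in CI, alongside a manual review of the generated sections.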

Addresses growth experiment requirements for enhanced AI/LLM content accessibility.

Implements the llms.txt standard proposed at llmstxt.org to provide
structured content for Large Language Models.

Features:
- Generates /llms.txt at site root following official specification
- Organized sections: docs, tutorials, blog posts, guides, case studies
- Front matter exclusion support with llms_exclude: true
- Environment-aware generation (skips for preview builds)
- Minimal performance impact (~29KB file, negligible build time)

The llms.txt file helps AI systems understand and navigate the
site's content more effectively while maintaining compatibility
with existing web standards.
@pulumi-bot
Collaborator

2 participants