
Add llms.txt file to F# documentation. #47144


Open
wants to merge 1 commit into
base: main

ScottArbeit

Summary

This PR adds a new llms.txt file to the F# documentation.

llms.txt is a proposed new standard that is meant to allow AI and large language models to get a summary and guidance for how to use a web site effectively. The major frontier LLM providers have indicated their support for the file and for using markdown as the format for it.

Notes

In addition to replicating the high-level table of contents for F# documentation, I've included two other sections.

Coding Standards Guidance

This section is intended to reinforce modern F# coding standards and best practices for LLM code generation. My hope is that this guidance overrides some of the older format F# code found in open-source repositories.

I particularly welcome feedback on this section... additional guidance is more than welcome. This is just what I could think of quickly, and mostly comes from the instructions I've ended up giving to ChatGPT about F#.

Additional F# Community Resources

F# has a strong community, and these websites are the ones I see most often in web searches and the ones I relied on while learning the language.

Because we've referred to these sites in other documentation and blog posts (and videos and conference sessions, etc.) I feel safe adding them, but if referring to non-Microsoft sites in official documentation is a problem, I can remove them.

@ScottArbeit ScottArbeit requested review from BillWagner and a team as code owners July 8, 2025 21:49
@dotnetrepoman dotnetrepoman bot added this to the July 2025 milestone Jul 8, 2025
@dotnet-policy-service dotnet-policy-service bot added dotnet-fsharp/svc community-contribution Indicates PR is created by someone from the .NET community. labels Jul 8, 2025
@ScottArbeit
Author

@dotnet-policy-service agree company="GitHub"

@T-Gro
Member

T-Gro commented Jul 9, 2025

@BillWagner :

Are there any centralized ideas/efforts/plans for doing the equivalent of this for other docs as well?
The sources are in markdown already, so perhaps a script could flatten them into a single massive file per category (a category being F#, VB.NET, etc.).
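
A minimal sketch of what such a flattening script might look like, written in Python purely for illustration. It assumes each category is a top-level folder under a docs root and that plain concatenation in path order is acceptable; the function names and the `<!-- source: ... -->` marker are hypothetical and not part of any existing docs tooling.

```python
from pathlib import Path


def flatten_category(category_dir: Path) -> str:
    """Concatenate every Markdown file under category_dir into one string."""
    parts = []
    for md_file in sorted(category_dir.rglob("*.md")):
        rel = md_file.relative_to(category_dir).as_posix()
        body = md_file.read_text(encoding="utf-8").strip()
        # Hypothetical marker so each chunk can be traced back to its source file.
        parts.append(f"<!-- source: {rel} -->\n{body}")
    return "\n\n".join(parts) + "\n"


def flatten_all(docs_root: Path, out_dir: Path) -> list[Path]:
    """Write one flattened file per top-level folder (assumed to be a category)."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for category_dir in sorted(p for p in docs_root.iterdir() if p.is_dir()):
        target = out_dir / f"{category_dir.name}.md"
        target.write_text(flatten_category(category_dir), encoding="utf-8")
        written.append(target)
    return written
```

Calling `flatten_all(Path("docs"), Path("flattened"))` would produce one file per category folder; both paths are placeholders.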

@BillWagner
Member

Adding @IEvangelist @gewarren @adegeo on my team.

Yes, we are looking at a way to do this for the full repo.

I like the direction. Please give us a few days to work through some details and we'll get a review and next steps.

@IEvangelist
Member

Hi @ScottArbeit — thank you for this PR! And hey @T-Gro, appreciate the thoughtful question and discussion.

Ideally, this kind of functionality should be integrated into our internal build system. Some modern docs platforms — like Starlight — already support it out of the box via plugins like this one. While many tools today scrape rendered HTML to generate llms.txt files, we have a key advantage: we control the source Markdown. So rather than hand-authoring or maintaining this file separately, it really should be auto-generated as part of the docs build process, from the source MD.

Separately, the Microsoft Learn platform team has been discussing expectations and ecosystem-wide guidance around llms.txt. I'll follow up with them to see where that landed. For this PR, I'm totally fine moving forward — just noting that we likely won’t maintain this manually long-term.

@adegeo
Contributor

adegeo commented Jul 9, 2025

The docfx will need to be updated to ship llms.txt in any folder. Right now it's only adding image files as extra content resources.

@IEvangelist Looks like Gopher is making a comeback 😜

@ScottArbeit
Author

Thanks for the discussion, everyone. I just wanted to get the ball rolling on this, expecting that it will evolve.

> So rather than hand-authoring or maintaining this file separately, it really should be auto-generated as part of the docs build process, from the source MD.

I understand the temptation to want to auto-generate llms.txt, but here's where I'll push back. To me, auto-generating the file doesn't add a lot to what LLMs would already know about the site from pre-training + using a web search tool.

I encourage you all to look at llms.txt not as a new box to be checked, but as a new design surface that we, as humans, can use to concisely communicate guidance to LLMs, especially for code generation. We've entered an age where the most popular languages and frameworks get the best code generation because of the sheer number of examples. F# code generation - and F# has been my primary language for years - hasn't been nearly as good, because there's just less of it. llms.txt is our opportunity to tell the LLMs exactly how they should approach generating code.

This applies very well to C# - where newer functional constructs should be preferred - and to C/C++ where the weight of older C99 / C++98 style code still affects code gen. I could go on, but you get the point. And not just languages, think llms.txt for Blazor and ASP.NET Core MVC and Aspire and everything else where we have experience and opinions about what works best.

> Separately, the Microsoft Learn platform team has been discussing expectations and ecosystem-wide guidance around llms.txt. I'll follow up with them to see where that landed.

Although I'm not on the Microsoft Learn team, I'm in DevDiv and happy to participate. If I could design a process around llms.txt we would continually refine it by hand, with our standard measures for staleness to indicate when it should be looked at, because how effective it is in terms of influencing code generation is the primary indicator of its usefulness. Bonus points for having some eval that runs occasionally to see what the impact of it is.

If there's any automation behind creating the file, I'd love for it to involve a combination of automatically generating the "table of contents" part + a hand-edited, well-discussed section about language (or framework or technology) standards that makes it explicit which constructs are preferred. That bit could come from another page in the documentation where we explicitly share those standards, and we could use an agentic workflow to take that page and summarize it for including in llms.txt, but it should be embedded in llms.txt somehow to make sure the LLMs are seeing it where they expect to.
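
To make the hybrid idea concrete, here is a rough Python sketch that combines a generated table of contents with a hand-edited guidance section. The layout (title, summary, link list, then a "Coding Standards Guidance" section) is an assumption for illustration; it is not taken from the file in this PR and has not been checked against the llms.txt proposal. The function name and the example.com URL are hypothetical.

```python
def build_llms_txt(title: str, summary: str,
                   toc: list[tuple[str, str]], guidance: str) -> str:
    """Assemble an llms.txt-style Markdown document.

    toc      -- (link text, URL) pairs, assumed to come from an automated step
    guidance -- hand-edited Markdown, assumed to be maintained by humans
    """
    lines = [f"# {title}", "", f"> {summary}", "", "## Docs", ""]
    # Generated part: one Markdown link per table-of-contents entry.
    lines += [f"- [{name}]({url})" for name, url in toc]
    # Hand-edited part: embedded verbatim so it is always present in the file.
    lines += ["", "## Coding Standards Guidance", "", guidance.strip(), ""]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_llms_txt(
        "F# documentation",
        "Placeholder summary.",
        [("Tour of F#", "https://example.com/tour")],  # placeholder URL
        "Hand-edited guidance goes here.",
    ))
```

Keeping the guidance as a separate input means the automated step can be rerun without overwriting the part that people have reviewed and discussed.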

Anyway, that's my 2¢. I just don't want to lose the opportunity to apply human design thinking here. It's soon-to-be the most important place for humans to tell the models how to use our technologies in their best possible ways. And, frankly, for F#, I need it. I'm very tired of seeing code generated with async { } and myArray.[0].

> Looks like Gopher is making a comeback 😜

[GIF: Now I Feel Old, Ed Quinn]

@T-Gro
Member

T-Gro commented Jul 11, 2025

@adegeo :

Do I understand it correctly that right now we cannot ship any .txt to docs production anyway, until that update is done?

Otherwise I would be happy to approve this to get it in, before a more centralized solution comes in.

@ScottArbeit :
We can also consider adding the "best practices short summary" as a regular docs page, for BOTH humans and LLMs.
And then re-include it directly inside the llms.txt to make sure it is read every time.

@adegeo
Contributor

adegeo commented Jul 11, 2025

> Do I understand it correctly that right now we cannot ship any .txt to docs production anyway, until that update is done?

I believe so. It doesn't just mirror what's in GitHub; it must be marked as a resource to be published, something like **/llms.txt

docs/docfx.json

Lines 118 to 127 in 3d22cff

"resource": [
  {
    "files": [
      "images/**",
      "**/*.png",
      "**/*.svg",
      "**/*.jpg",
      "**/*.gif",
      "**/*.bmp"
    ],
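
As a rough illustration of the change described above, the Python sketch below appends an llms.txt pattern to the first resource entry. The dictionary mirrors only the fragment quoted here; the rest of docfx.json is omitted. The pattern `**/llms.txt` is the "something like" suggestion from the comment, not a verified setting, and `add_resource_pattern` is a hypothetical helper.

```python
import json

# Only the fragment quoted above; the real docfx.json has more keys.
docfx_fragment = {
    "resource": [
        {
            "files": [
                "images/**",
                "**/*.png",
                "**/*.svg",
                "**/*.jpg",
                "**/*.gif",
                "**/*.bmp",
            ]
        }
    ]
}


def add_resource_pattern(config: dict, pattern: str) -> dict:
    """Append pattern to the first resource entry's file list (modifies config in place)."""
    files = config["resource"][0]["files"]
    if pattern not in files:  # avoid adding the same pattern twice
        files.append(pattern)
    return config


updated = add_resource_pattern(docfx_fragment, "**/llms.txt")
print(json.dumps(updated, indent=2))
```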

@adegeo
Contributor

adegeo commented Jul 11, 2025

@IEvangelist

> Separately, the Microsoft Learn platform team has been discussing expectations and ecosystem-wide guidance around llms.txt. I'll follow up with them to see where that landed. For this PR, I'm totally fine moving forward, just noting that we likely won’t maintain this manually long-term.

We can also update one of the repo health scripts/checks we have to discover any llms.txt and check for valid links until the org catches up.
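
One possible shape for such a check, sketched in Python. It is not based on the repo's existing health scripts. It assumes llms.txt uses Markdown-style links, and it only verifies relative links against files on disk; absolute URLs, in-page anchors, and site-root-relative links are skipped because checking them would need network access or knowledge of the published site root. All names are hypothetical.

```python
import re
from pathlib import Path

# Matches Markdown links of the form [text](target) and captures the target.
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")
# Matches targets that start with a URL scheme such as https: or mailto:.
SCHEME_RE = re.compile(r"^[a-z][a-z0-9+.-]*:", re.IGNORECASE)


def find_llms_files(repo_root: Path) -> list[Path]:
    """Return every llms.txt under repo_root, in sorted order."""
    return sorted(repo_root.rglob("llms.txt"))


def broken_relative_links(llms_file: Path) -> list[str]:
    """Return relative link targets in llms_file that do not exist on disk."""
    text = llms_file.read_text(encoding="utf-8")
    broken = []
    for target in LINK_RE.findall(text):
        if SCHEME_RE.match(target) or target.startswith(("#", "/")):
            continue  # not checked: absolute URL, in-page anchor, or root-relative
        path_part = target.split("#", 1)[0]  # drop any trailing fragment
        if not (llms_file.parent / path_part).exists():
            broken.append(target)
    return broken
```

A health check could loop over `find_llms_files(repo_root)` and fail when `broken_relative_links` returns a non-empty list for any file.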

Labels
community-contribution Indicates PR is created by someone from the .NET community. dotnet-fsharp/svc

5 participants