Reinstate caching in schema generation #5908
Conversation
🎉 Good job! The coverage increased 🎉
Full code coverage report: https://dev.azure.com/dnceng-public/public/_build/results?buildId=953290&view=codecoverage-tab
We expect that to be something folks do, recreating an AIFunction for the same delegate over and over? I was expecting our guidance would be to create them once and cache them.
What do the numbers before/after look like if you first add 512 methods to the cache and then benchmark calling Create with a different one? I'm wondering what the overhead of accessing the ConditionalWeakTable, accessing the ConcurrentDictionary, running the GetHashCode, etc. looks like when it doesn't pay dividends. Presumably the common case will be that across the whole process, typically the default JsonSerializerOptions will be used, so the 512 limit will end up applying to all use of AIFunctionFactory across the process.
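The two-level cache being discussed here can be sketched roughly as follows. This is a hedged illustration, not the PR's actual implementation: the outer `ConditionalWeakTable` and inner `ConcurrentDictionary` types match the comment, but the class name, method names, key choice, and the naive "stop adding at 512" policy are all illustrative assumptions.

```csharp
using System.Collections.Concurrent;
using System.Reflection;
using System.Runtime.CompilerServices;
using System.Text.Json;
using System.Text.Json.Nodes;

internal static class SchemaCacheSketch
{
    private const int MaxCacheSize = 512; // illustrative limit from the discussion

    // Outer level: keyed on the JsonSerializerOptions instance, so cached entries
    // can be collected when the options instance dies. Inner level: keyed on the
    // method whose schema was generated.
    private static readonly ConditionalWeakTable<
        JsonSerializerOptions,
        ConcurrentDictionary<MethodInfo, JsonNode>> s_cache = new();

    public static JsonNode GetOrCreateSchema(
        JsonSerializerOptions options,
        MethodInfo method,
        Func<MethodInfo, JsonNode> generate)
    {
        ConcurrentDictionary<MethodInfo, JsonNode> perOptions =
            s_cache.GetValue(options, static _ => new());

        if (perOptions.TryGetValue(method, out JsonNode? cached))
        {
            return cached;
        }

        JsonNode schema = generate(method);

        // Naive size cap: once the limit is reached, new methods are generated
        // but never cached, so the cache cannot grow without bound. This is the
        // "doesn't pay dividends" path the comment above is probing.
        if (perOptions.Count < MaxCacheSize)
        {
            perOptions.TryAdd(method, schema);
        }

        return schema;
    }
}
```

The benchmark Stephen describes would pre-populate `perOptions` with 512 distinct methods, then measure `GetOrCreateSchema` for a 513th: every call then pays for the weak-table lookup, dictionary probe, and hashing, while still running full generation.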
When working on the demo I felt enticed to declare the AIFunction directly on the method, like so: `chatOptions.Tools = [AIFunctionFactory.Create(GetNearestPorts)];`, which got me thinking that users would likely fall into the same performance trap.
That scenario should be represented by the benchmarks creating a new `JsonSerializerOptions` every time. As you can see, it does introduce a performance regression compared to what we have in main today.
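The guidance from earlier in the thread (create `AIFunction` instances once and reuse them, rather than calling `AIFunctionFactory.Create` inline per request) can be sketched like this. `GetNearestPorts` is the sample method from the comment above; the class, field name, and placeholder body are illustrative assumptions:

```csharp
using Microsoft.Extensions.AI;

public sealed class PortLookupService
{
    // Created once and reused across requests, rather than calling
    // AIFunctionFactory.Create on every chat turn. This avoids paying for
    // reflection and schema generation repeatedly for the same delegate.
    private static readonly AIFunction s_getNearestPorts =
        AIFunctionFactory.Create(GetNearestPorts);

    public ChatOptions CreateChatOptions() => new()
    {
        // Reuses the cached AIFunction instead of re-creating it inline.
        Tools = [s_getNearestPorts]
    };

    private static string GetNearestPorts(double latitude, double longitude) =>
        "..."; // placeholder implementation for the sample
}
```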
Should we cache the AI Functions instead then?
I tried that and have a working prototype locally. What I discovered is that schema generation accounts for ~95% of the cost (as can be seen from the benchmark results), and caching on the
Those numbers are very suspect to me. For example:

| Method | Mean | Error | StdDev | Gen0 | Allocated |
|--------|------|-------|--------|------|-----------|
| CreateAIFunction | 56.31 ns | 0.404 ns | 0.378 ns | 0.0172 | 72 B |

That's suggesting that with this change, `AIFunctionFactory.Create(delegate)` only allocates 72 bytes and only takes 56 nanoseconds, without caching the AIFunction? It's doing a non-trivial amount of reflection. The AIFunction object alone must be close to 72 bytes, plus the delegates that get created for marshaling. So I pulled down your change, and instead I see numbers like this:

Can you re-confirm what it is you're measuring?
Here are the updated numbers from a new run:
And here's the same benchmarks with additional caching implemented on the AIFunction layer:
So the numbers I quoted originally were most likely recorded with those changes accidentally unstashed. I've updated the OP to reflect the correct numbers. |
@stephentoub as discussed, I pushed updates using caching on the AIFunctionFactory layer only. Here are the updated benchmark numbers:
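The benchmarks being compared above presumably look something like the following BenchmarkDotNet sketch. This is an assumption about the harness, not the actual benchmark code from the PR: the class and method names are invented, and it assumes the `AIFunctionFactory.Create` overload that accepts a `JsonSerializerOptions`.

```csharp
using BenchmarkDotNet.Attributes;
using Microsoft.Extensions.AI;
using System.Text.Json;

[MemoryDiagnoser]
public class CreateAIFunctionBenchmarks
{
    private static string Add(int x, int y) => (x + y).ToString();

    // Common case: the process-wide default serializer options are used,
    // so a cache keyed on the options instance can pay off.
    [Benchmark]
    public AIFunction CreateWithSharedOptions() =>
        AIFunctionFactory.Create(Add);

    // The "performance trap" case from the discussion: a fresh
    // JsonSerializerOptions per call defeats any cache keyed on that instance.
    [Benchmark]
    public AIFunction CreateWithFreshOptions() =>
        AIFunctionFactory.Create(Add, serializerOptions: new JsonSerializerOptions());
}
```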
Nice, thanks.
I spent a bit of time investigating the performance of `AIFunctionFactory.Create`, where I discovered that the most expensive factor by far is schema generation. Even though the schema generation routines used to employ a caching mechanism, it was removed during the recent refactorings since it is no longer necessary for leaf clients to resolve schemas on the fly. This PR reinstates a different version of the caching mechanism for the purpose of speeding up `AIFunctionFactory.Create` when it is applied multiple times to the same delegate.

Benchmarks

Main
Main
PR
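As a rough illustration of the approach the thread converged on (caching at the `AIFunctionFactory` layer rather than inside schema generation), a memoizing wrapper might look like the sketch below. This is an assumption about the shape of the change, not the PR's actual code; the class name and key choice are illustrative.

```csharp
using System.Collections.Concurrent;
using System.Reflection;
using Microsoft.Extensions.AI;

internal static class CachedAIFunctionFactory
{
    // Keyed on the delegate's MethodInfo: repeated Create calls for the same
    // method group reuse one AIFunction, skipping repeated reflection and
    // schema generation. Note this deliberately simplified sketch ignores
    // closed-over state (delegate.Target), which a real cache must account for.
    private static readonly ConcurrentDictionary<MethodInfo, AIFunction> s_functions = new();

    public static AIFunction Create(Delegate method) =>
        s_functions.GetOrAdd(method.Method, _ => AIFunctionFactory.Create(method));
}
```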