Reinstate caching in schema generation #5908


Merged
merged 7 commits into dotnet:main on Feb 19, 2025

Conversation

eiriktsarpalis
Member

@eiriktsarpalis eiriktsarpalis commented Feb 15, 2025

I spent a bit of time investigating the performance of AIFunctionFactory.Create and discovered that schema derivation is by far the most expensive factor. Even though the schema generation routines used to employ a caching mechanism, it was removed during the recent refactorings since leaf clients no longer need to resolve schemas on the fly.

This PR reinstates a different version of the caching mechanism to speed up AIFunctionFactory.Create when it is applied multiple times to the same delegate.
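As a rough illustration of the approach (not the PR's actual code), a delegate-keyed cache can be sketched as below; `DeriveSchema` is a hypothetical stand-in for the real reflection-based schema generation routine.

```csharp
using System;
using System.Collections.Concurrent;
using System.Reflection;

// Minimal sketch, assuming the expensive work is keyed purely on the target
// MethodInfo: repeated Create calls on the same delegate skip re-derivation.
static class SchemaCache
{
    private static readonly ConcurrentDictionary<MethodInfo, string> s_cache = new();

    public static string GetOrCreateSchema(Delegate del) =>
        s_cache.GetOrAdd(del.Method, static method =>
        {
            // Expensive derivation happens at most once per method.
            return DeriveSchema(method);
        });

    // Hypothetical stand-in for the real schema derivation logic.
    private static string DeriveSchema(MethodInfo method) =>
        $"{{\"title\":\"{method.Name}\",\"type\":\"object\"}}";
}
```

Calling `GetOrCreateSchema` twice with the same delegate returns the same cached instance, which is the effect the benchmarks below are measuring.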

Benchmarks

```csharp
[MemoryDiagnoser]
public class Benchmark
{
    private static readonly Func<string, int, MyPoco?, MyPoco> _delegate = MyFunc;

    [Benchmark]
    public JsonElement CreateFunctionSchema() => AIJsonUtilities.CreateFunctionJsonSchema(_delegate.Method);

    [Benchmark]
    public JsonElement CreateFunctionSchema_CustomDescription() => AIJsonUtilities.CreateFunctionJsonSchema(_delegate.Method, description: "custom description");

    [Benchmark]
    public JsonElement CreateFunctionSchema_NewOptions() => AIJsonUtilities.CreateFunctionJsonSchema(_delegate.Method, serializerOptions: new(JsonSerializerOptions.Default));

    [Benchmark]
    public AIFunction CreateAIFunction() => AIFunctionFactory.Create(_delegate);

    [Benchmark]
    public AIFunction CreateAIFunction_CustomDescription() => AIFunctionFactory.Create(_delegate, description: "custom description");

    [Benchmark]
    public AIFunction CreateAIFunction_NewOptions() => AIFunctionFactory.Create(_delegate, serializerOptions: new(JsonSerializerOptions.Default));

    [Description("My awesome function")]
    public static MyPoco MyFunc([Description("Parameter 1")]string param1, int param2 = 42, MyPoco? other = null) => throw new NotImplementedException();

    public record MyPoco(int X, int Y);
}
```

Main

| Method | Mean | Error | StdDev | Gen0 | Allocated |
|---|---|---|---|---|---|
| CreateFunctionSchema | 6.345 us | 0.0848 us | 0.0708 us | 2.4109 | 9.95 KB |
| CreateFunctionSchema_CustomDescription | 6.041 us | 0.0357 us | 0.0298 us | 2.3499 | 9.7 KB |
| CreateFunctionSchema_NewOptions | 6.438 us | 0.0260 us | 0.0230 us | 2.1973 | 9.1 KB |
| CreateAIFunction | 6.706 us | 0.0344 us | 0.0322 us | 2.5940 | 10.65 KB |
| CreateAIFunction_CustomDescription | 6.298 us | 0.0393 us | 0.0368 us | 2.5330 | 10.45 KB |
| CreateAIFunction_NewOptions | 6.683 us | 0.0300 us | 0.0281 us | 2.3804 | 9.85 KB |

PR

| Method | Mean | Error | StdDev | Gen0 | Allocated |
|---|---|---|---|---|---|
| CreateFunctionSchema | 6,679.98 ns | 35.900 ns | 31.824 ns | 2.4109 | 10193 B |
| CreateFunctionSchema_CustomDescription | 6,126.56 ns | 43.308 ns | 38.391 ns | 2.3499 | 9929 B |
| CreateFunctionSchema_NewOptions | 6,494.00 ns | 59.036 ns | 52.334 ns | 2.1973 | 9318 B |
| CreateAIFunction | 20.20 ns | 0.097 ns | 0.081 ns | - | - |
| CreateAIFunction_CustomDescription | 28.19 ns | 0.182 ns | 0.171 ns | 0.0134 | 56 B |
| CreateAIFunction_NewOptions | 7,309.19 ns | 72.739 ns | 64.481 ns | 2.5635 | 10854 B |

@dotnet-comment-bot
Collaborator

‼️ Found issues ‼️

| Project | Coverage Type | Expected | Actual |
|---|---|---|---|
| Microsoft.Extensions.AI.OpenAI | Line | 77 | 67.92 🔻 |
| Microsoft.Extensions.AI.OpenAI | Branch | 77 | 49.8 🔻 |
| Microsoft.Extensions.Caching.Hybrid | Line | 86 | 82.77 🔻 |
| Microsoft.Extensions.AI.Ollama | Line | 80 | 78.2 🔻 |
| Microsoft.Extensions.AI.Evaluation.Quality | Line | 88 | 7.57 🔻 |
| Microsoft.Extensions.AI.Evaluation.Quality | Branch | 88 | 16.42 🔻 |
| Microsoft.Extensions.AI.Evaluation | Line | 88 | 58.67 🔻 |
| Microsoft.Extensions.AI.Evaluation | Branch | 88 | 56.67 🔻 |
| Microsoft.Extensions.AI.Evaluation.Console | Line | 88 | 8.26 🔻 |
| Microsoft.Extensions.AI.Evaluation.Console | Branch | 88 | 17.07 🔻 |
| Microsoft.Extensions.AI.AzureAIInference | Line | 91 | 90.09 🔻 |
| Microsoft.Extensions.AI.AzureAIInference | Branch | 91 | 86.76 🔻 |
| Microsoft.Gen.MetadataExtractor | Line | 98 | 57.35 🔻 |
| Microsoft.Gen.MetadataExtractor | Branch | 98 | 62.5 🔻 |
| Microsoft.Extensions.AI.Evaluation.Reporting | Line | 88 | 72.06 🔻 |
| Microsoft.Extensions.AI.Evaluation.Reporting | Branch | 88 | 64.8 🔻 |

🎉 Good job! The coverage increased 🎉
Update MinCodeCoverage in the project files.

| Project | Expected | Actual |
|---|---|---|
| Microsoft.Extensions.AI | 88 | 89 |
| Microsoft.Extensions.AI.Abstractions | 83 | 84 |

Full code coverage report: https://dev.azure.com/dnceng-public/public/_build/results?buildId=953290&view=codecoverage-tab

@stephentoub
Member

This PR reinstates a different version of the caching mechanism for the purpose of speeding up AIFunctionFactory.Create when applied multiple times on the same delegate.

We expect that to be something folks do, recreating an AIFunction for the same delegate over and over? I was expecting our guidance would be to create them once and cache them.
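The "create them once and cache them" guidance can be sketched with a `Lazy<T>` holder. `ExpensiveCreate` below is a hypothetical stand-in for the `AIFunctionFactory.Create(GetNearestPorts)` call, instrumented so the run-at-most-once behavior is observable; it is not the library's API.

```csharp
using System;

// Sketch of the cache-once pattern: hoist the expensive factory call into a
// static Lazy<T> so it executes at most once per process, not per request.
static class Tools
{
    public static readonly Lazy<object> NearestPorts =
        new(() => ExpensiveCreate(), isThreadSafe: true);

    private static int s_createCalls;
    public static int CreateCalls => s_createCalls;

    private static object ExpensiveCreate()
    {
        s_createCalls++;       // counts invocations, to show caching works
        return new object();   // placeholder for the created AIFunction
    }
}
```

With this shape, `chatOptions.Tools = [Tools.NearestPorts.Value]` (or similar) pays the creation cost only on first access.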

@stephentoub
Member

Benchmark

What do the numbers before/after look like if you first add 512 methods to the cache and then benchmark calling Create with a different one? I'm wondering what the overhead of accessing the ConditionalWeakTable, accessing the ConcurrentDictionary, running the GetHashCode, etc. looks like when it doesn't pay dividends. Presumably the common case will be that across the whole process, typically the default JsonSerializerOptions will be used, so the 512 limit will end up applying to all use of AIFunctionFactory across the process.
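The two-level cache shape described above (a ConditionalWeakTable keyed on the options instance, holding a size-bounded ConcurrentDictionary) can be sketched as follows. The type and its names are illustrative, not the PR's actual code; the point is that once the bound is hit, lookups on a full cache still pay the probe and hash cost without the benefit.

```csharp
using System;
using System.Collections.Concurrent;
using System.Runtime.CompilerServices;

// Sketch, assuming a 512-entry default bound as mentioned in the discussion.
// Entries die with the context object (e.g. a JsonSerializerOptions) because
// the outer table holds it weakly.
sealed class BoundedSchemaCache<TContext, TKey, TValue>
    where TContext : class
    where TKey : notnull
{
    private readonly int _maxEntries;
    private readonly ConditionalWeakTable<TContext, ConcurrentDictionary<TKey, TValue>> _table = new();

    public BoundedSchemaCache(int maxEntries = 512) => _maxEntries = maxEntries;

    public TValue GetOrAdd(TContext context, TKey key, Func<TKey, TValue> factory)
    {
        ConcurrentDictionary<TKey, TValue> cache = _table.GetOrCreateValue(context);
        if (cache.TryGetValue(key, out TValue? value))
        {
            return value!; // cache hit: no derivation
        }

        value = factory(key);
        if (cache.Count < _maxEntries)
        {
            cache.TryAdd(key, value); // only stored while under the bound
        }
        return value;
    }
}
```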

@eiriktsarpalis
Member Author

We expect that to be something folks do, recreating an AIFunction for the same delegate over and over?

When working on the demo I felt enticed to incorporate the AIFunction declaration directly on the method like so:

```csharp
chatOptions.Tools = [AIFunctionFactory.Create(GetNearestPorts)];
```

Which got me thinking that users would likely fall into the same performance trap.

I'm wondering what the overhead of accessing the ConditionalWeakTable, accessing the ConcurrentDictionary, running the GetHashCode, etc. looks like when it doesn't pay dividends.

That scenario should be represented by the benchmarks that create a new JSO every time. As you can see, it does introduce a performance regression compared to what we have in main today.

@dotnet-comment-bot
Collaborator

‼️ Found issues ‼️

| Project | Coverage Type | Expected | Actual |
|---|---|---|---|
| Microsoft.Extensions.AI.Ollama | Line | 80 | 78.2 🔻 |
| Microsoft.Extensions.AI.Evaluation | Line | 88 | 58.67 🔻 |
| Microsoft.Extensions.AI.Evaluation | Branch | 88 | 56.67 🔻 |
| Microsoft.Extensions.AI.Evaluation.Quality | Line | 88 | 7.57 🔻 |
| Microsoft.Extensions.AI.Evaluation.Quality | Branch | 88 | 16.42 🔻 |
| Microsoft.Extensions.AI.AzureAIInference | Line | 91 | 90.09 🔻 |
| Microsoft.Extensions.AI.AzureAIInference | Branch | 91 | 86.76 🔻 |
| Microsoft.Extensions.AI.Evaluation.Reporting | Line | 88 | 72.06 🔻 |
| Microsoft.Extensions.AI.Evaluation.Reporting | Branch | 88 | 64.8 🔻 |
| Microsoft.Gen.MetadataExtractor | Line | 98 | 57.35 🔻 |
| Microsoft.Gen.MetadataExtractor | Branch | 98 | 62.5 🔻 |
| Microsoft.Extensions.AI.OpenAI | Line | 77 | 67.92 🔻 |
| Microsoft.Extensions.AI.OpenAI | Branch | 77 | 49.8 🔻 |
| Microsoft.Extensions.AI.Evaluation.Console | Line | 88 | 8.26 🔻 |
| Microsoft.Extensions.AI.Evaluation.Console | Branch | 88 | 17.07 🔻 |

🎉 Good job! The coverage increased 🎉
Update MinCodeCoverage in the project files.

| Project | Expected | Actual |
|---|---|---|
| Microsoft.Extensions.AI.Abstractions | 83 | 84 |
| Microsoft.Extensions.Caching.Hybrid | 86 | 87 |
| Microsoft.Extensions.AI | 88 | 89 |

Full code coverage report: https://dev.azure.com/dnceng-public/public/_build/results?buildId=953615&view=codecoverage-tab

@stephentoub
Member

When working on the demo I felt enticed to incorporate the AIFunction declaration directly on the method like so:

Should we cache the AI Functions instead then?

@eiriktsarpalis
Member Author

When working on the demo I felt enticed to incorporate the AIFunction declaration directly on the method like so:

Should we cache the AI Functions instead then?

I tried that and have a working prototype locally. What I discovered is that schema generation accounts for ~95% of the cost (as can be seen from the benchmark results), and caching at the AIFunctionMetadata level is trickier since there are more variables at play. It seemed like an obvious win to do it in the schema generation APIs instead.

@stephentoub
Member

stephentoub commented Feb 15, 2025

Those numbers are very suspect to me, e.g.

CreateAIFunction | 56.31 ns | 0.404 ns | 0.378 ns | 0.0172 | 72 B

That's suggesting that with this change, AIFunctionFactory.Create(delegate) only allocates 72 bytes and only takes 56 nanoseconds? Without caching the AIFunction? It's doing a non-trivial amount of reflection. The AIFunction object alone must be close to 72 bytes, plus the delegates that get created for marshaling.

So I pulled down your change, and instead I see numbers like this:

| Method | Mean | Error | StdDev | Allocated |
|---|---|---|---|---|
| CreateAIFunction | 1.214 us | 0.0194 us | 0.0182 us | 816 B |

Can you re-confirm what it is you're measuring?

@eiriktsarpalis
Member Author

Here are updated numbers from a new run:

| Method | Mean | Error | StdDev | Gen0 | Gen1 | Allocated |
|---|---|---|---|---|---|---|
| CreateFunctionSchema | 26.01 ns | 0.093 ns | 0.087 ns | - | - | - |
| CreateFunctionSchema_CustomDescription | 38.41 ns | 0.172 ns | 0.161 ns | - | - | - |
| CreateFunctionSchema_NewOptions | 7,505.62 ns | 139.908 ns | 130.870 ns | 2.3804 | - | 10167 B |
| CreateAIFunction | 612.79 ns | 5.975 ns | 5.589 ns | 0.1945 | - | 816 B |
| CreateAIFunction_CustomDescription | 224.36 ns | 1.081 ns | 0.903 ns | 0.1452 | - | 608 B |
| CreateAIFunction_NewOptions | 7,657.49 ns | 61.634 ns | 54.637 ns | 2.5635 | 0.0610 | 10779 B |

And here are the same benchmarks with additional caching implemented at the AIFunction layer:

| Method | Mean | Error | StdDev | Gen0 | Allocated |
|---|---|---|---|---|---|
| CreateFunctionSchema | 27.15 ns | 0.085 ns | 0.079 ns | - | - |
| CreateFunctionSchema_CustomDescription | 32.35 ns | 0.215 ns | 0.201 ns | - | - |
| CreateFunctionSchema_NewOptions | 7,648.95 ns | 140.465 ns | 196.912 ns | 2.4109 | 10167 B |
| CreateAIFunction | 60.11 ns | 0.307 ns | 0.287 ns | 0.0172 | 72 B |
| CreateAIFunction_CustomDescription | 66.00 ns | 0.256 ns | 0.227 ns | 0.0305 | 128 B |
| CreateAIFunction_NewOptions | 8,604.40 ns | 60.222 ns | 53.385 ns | 2.7466 | 11692 B |

So the numbers I quoted originally were most likely recorded with those changes accidentally unstashed. I've updated the OP to reflect the correct numbers.

@eiriktsarpalis
Member Author

@stephentoub as discussed, I pushed updates using caching on the AIFunctionFactory layer only. Here are the updated benchmark numbers:

| Method | Mean | Error | StdDev | Gen0 | Allocated |
|---|---|---|---|---|---|
| CreateFunctionSchema | 6,679.98 ns | 35.900 ns | 31.824 ns | 2.4109 | 10193 B |
| CreateFunctionSchema_CustomDescription | 6,126.56 ns | 43.308 ns | 38.391 ns | 2.3499 | 9929 B |
| CreateFunctionSchema_NewOptions | 6,494.00 ns | 59.036 ns | 52.334 ns | 2.1973 | 9318 B |
| CreateAIFunction | 20.20 ns | 0.097 ns | 0.081 ns | - | - |
| CreateAIFunction_CustomDescription | 28.19 ns | 0.182 ns | 0.171 ns | 0.0134 | 56 B |
| CreateAIFunction_NewOptions | 7,309.19 ns | 72.739 ns | 64.481 ns | 2.5635 | 10854 B |

@dotnet-comment-bot
Collaborator

🎉 Good job! The coverage increased 🎉
Update MinCodeCoverage in the project files.

| Project | Expected | Actual |
|---|---|---|
| Microsoft.Extensions.Caching.Hybrid | 82 | 87 |
| Microsoft.Gen.MetadataExtractor | 57 | 70 |

Full code coverage report: https://dev.azure.com/dnceng-public/public/_build/results?buildId=955543&view=codecoverage-tab

@dotnet-comment-bot
Collaborator

🎉 Good job! The coverage increased 🎉
Update MinCodeCoverage in the project files.

| Project | Expected | Actual |
|---|---|---|
| Microsoft.Extensions.Caching.Hybrid | 82 | 87 |
| Microsoft.Gen.MetadataExtractor | 57 | 70 |

Full code coverage report: https://dev.azure.com/dnceng-public/public/_build/results?buildId=955619&view=codecoverage-tab

@eiriktsarpalis eiriktsarpalis enabled auto-merge (squash) February 19, 2025 13:53
Member

@stephentoub stephentoub left a comment

Nice, thanks.

@dotnet-comment-bot
Collaborator

‼️ Found issues ‼️

| Project | Coverage Type | Expected | Actual |
|---|---|---|---|
| Microsoft.Extensions.AI | Line | 89 | 88.7 🔻 |

🎉 Good job! The coverage increased 🎉
Update MinCodeCoverage in the project files.

| Project | Expected | Actual |
|---|---|---|
| Microsoft.Gen.MetadataExtractor | 57 | 70 |

Full code coverage report: https://dev.azure.com/dnceng-public/public/_build/results?buildId=956635&view=codecoverage-tab

@eiriktsarpalis eiriktsarpalis merged commit 66eca03 into dotnet:main Feb 19, 2025
6 checks passed
@eiriktsarpalis eiriktsarpalis deleted the aifunctionfactory-caching branch February 19, 2025 17:27
@jeffhandley jeffhandley added the area-ai Microsoft.Extensions.AI libraries label Mar 7, 2025
@github-actions github-actions bot locked and limited conversation to collaborators Apr 6, 2025