Reinstate caching in schema generation #5908


Merged
merged 7 commits into dotnet:main on Feb 19, 2025

Conversation

eiriktsarpalis
Member

@eiriktsarpalis eiriktsarpalis commented Feb 15, 2025

I spent a bit of time investigating the performance of AIFunctionFactory.Create and discovered that schema derivation is by far the most expensive factor. Even though the schema generation routines used to employ a caching mechanism, it was removed during the recent refactorings since leaf clients no longer need to resolve schemas on the fly.

This PR reinstates a different version of the caching mechanism to speed up AIFunctionFactory.Create when it is applied multiple times to the same delegate.
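As a rough illustration of the approach (not the PR's actual code), a delegate-keyed cache can be sketched as below; `DeriveSchema` is a hypothetical stand-in for the real reflection-based schema generation routine.

```csharp
using System;
using System.Collections.Concurrent;
using System.Reflection;

// Minimal sketch, assuming the expensive work is keyed purely on the target
// MethodInfo: repeated Create calls on the same delegate skip re-derivation.
static class SchemaCache
{
    private static readonly ConcurrentDictionary<MethodInfo, string> s_cache = new();

    public static string GetOrCreateSchema(Delegate del) =>
        s_cache.GetOrAdd(del.Method, static method =>
        {
            // Expensive derivation happens at most once per method.
            return DeriveSchema(method);
        });

    // Hypothetical stand-in for the real schema derivation logic.
    private static string DeriveSchema(MethodInfo method) =>
        $"{{\"title\":\"{method.Name}\",\"type\":\"object\"}}";
}
```

Calling `GetOrCreateSchema` twice with the same delegate returns the same cached instance, which is the effect the benchmarks below are measuring.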

Benchmarks

```csharp
[MemoryDiagnoser]
public class Benchmark
{
    private static readonly Func<string, int, MyPoco?, MyPoco> _delegate = MyFunc;

    [Benchmark]
    public JsonElement CreateFunctionSchema() => AIJsonUtilities.CreateFunctionJsonSchema(_delegate.Method);

    [Benchmark]
    public JsonElement CreateFunctionSchema_CustomDescription() => AIJsonUtilities.CreateFunctionJsonSchema(_delegate.Method, description: "custom description");

    [Benchmark]
    public JsonElement CreateFunctionSchema_NewOptions() => AIJsonUtilities.CreateFunctionJsonSchema(_delegate.Method, serializerOptions: new(JsonSerializerOptions.Default));

    [Benchmark]
    public AIFunction CreateAIFunction() => AIFunctionFactory.Create(_delegate);

    [Benchmark]
    public AIFunction CreateAIFunction_CustomDescription() => AIFunctionFactory.Create(_delegate, description: "custom description");

    [Benchmark]
    public AIFunction CreateAIFunction_NewOptions() => AIFunctionFactory.Create(_delegate, serializerOptions: new(JsonSerializerOptions.Default));

    [Description("My awesome function")]
    public static MyPoco MyFunc([Description("Parameter 1")]string param1, int param2 = 42, MyPoco? other = null) => throw new NotImplementedException();

    public record MyPoco(int X, int Y);
}
```

Main

| Method | Mean | Error | StdDev | Gen0 | Allocated |
|---|---|---|---|---|---|
| CreateFunctionSchema | 6.345 us | 0.0848 us | 0.0708 us | 2.4109 | 9.95 KB |
| CreateFunctionSchema_CustomDescription | 6.041 us | 0.0357 us | 0.0298 us | 2.3499 | 9.7 KB |
| CreateFunctionSchema_NewOptions | 6.438 us | 0.0260 us | 0.0230 us | 2.1973 | 9.1 KB |
| CreateAIFunction | 6.706 us | 0.0344 us | 0.0322 us | 2.5940 | 10.65 KB |
| CreateAIFunction_CustomDescription | 6.298 us | 0.0393 us | 0.0368 us | 2.5330 | 10.45 KB |
| CreateAIFunction_NewOptions | 6.683 us | 0.0300 us | 0.0281 us | 2.3804 | 9.85 KB |

PR

| Method | Mean | Error | StdDev | Gen0 | Allocated |
|---|---|---|---|---|---|
| CreateFunctionSchema | 6,679.98 ns | 35.900 ns | 31.824 ns | 2.4109 | 10193 B |
| CreateFunctionSchema_CustomDescription | 6,126.56 ns | 43.308 ns | 38.391 ns | 2.3499 | 9929 B |
| CreateFunctionSchema_NewOptions | 6,494.00 ns | 59.036 ns | 52.334 ns | 2.1973 | 9318 B |
| CreateAIFunction | 20.20 ns | 0.097 ns | 0.081 ns | - | - |
| CreateAIFunction_CustomDescription | 28.19 ns | 0.182 ns | 0.171 ns | 0.0134 | 56 B |
| CreateAIFunction_NewOptions | 7,309.19 ns | 72.739 ns | 64.481 ns | 2.5635 | 10854 B |

@dotnet-comment-bot
Collaborator

‼️ Found issues ‼️

| Project | Coverage Type | Expected | Actual |
|---|---|---|---|
| Microsoft.Extensions.AI.OpenAI | Line | 77 | 67.92 🔻 |
| Microsoft.Extensions.AI.OpenAI | Branch | 77 | 49.8 🔻 |
| Microsoft.Extensions.Caching.Hybrid | Line | 86 | 82.77 🔻 |
| Microsoft.Extensions.AI.Ollama | Line | 80 | 78.2 🔻 |
| Microsoft.Extensions.AI.Evaluation.Quality | Line | 88 | 7.57 🔻 |
| Microsoft.Extensions.AI.Evaluation.Quality | Branch | 88 | 16.42 🔻 |
| Microsoft.Extensions.AI.Evaluation | Line | 88 | 58.67 🔻 |
| Microsoft.Extensions.AI.Evaluation | Branch | 88 | 56.67 🔻 |
| Microsoft.Extensions.AI.Evaluation.Console | Line | 88 | 8.26 🔻 |
| Microsoft.Extensions.AI.Evaluation.Console | Branch | 88 | 17.07 🔻 |
| Microsoft.Extensions.AI.AzureAIInference | Line | 91 | 90.09 🔻 |
| Microsoft.Extensions.AI.AzureAIInference | Branch | 91 | 86.76 🔻 |
| Microsoft.Gen.MetadataExtractor | Line | 98 | 57.35 🔻 |
| Microsoft.Gen.MetadataExtractor | Branch | 98 | 62.5 🔻 |
| Microsoft.Extensions.AI.Evaluation.Reporting | Line | 88 | 72.06 🔻 |
| Microsoft.Extensions.AI.Evaluation.Reporting | Branch | 88 | 64.8 🔻 |

🎉 Good job! The coverage increased 🎉
Update MinCodeCoverage in the project files.

| Project | Expected | Actual |
|---|---|---|
| Microsoft.Extensions.AI | 88 | 89 |
| Microsoft.Extensions.AI.Abstractions | 83 | 84 |

Full code coverage report: https://dev.azure.com/dnceng-public/public/_build/results?buildId=953290&view=codecoverage-tab

@stephentoub
Member

This PR reinstates a different version of the caching mechanism for the purpose of speeding up AIFunctionFactory.Create when applied multiple times on the same delegate.

We expect that to be something folks do, recreating an AIFunction for the same delegate over and over? I was expecting our guidance would be to create them once and cache them.
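The "create them once and cache them" guidance can be sketched with a `Lazy<T>` holder. `ExpensiveCreate` below is a hypothetical stand-in for the `AIFunctionFactory.Create(GetNearestPorts)` call, instrumented so the run-at-most-once behavior is observable; it is not the library's API.

```csharp
using System;

// Sketch of the cache-once pattern: hoist the expensive factory call into a
// static Lazy<T> so it executes at most once per process, not per request.
static class Tools
{
    public static readonly Lazy<object> NearestPorts =
        new(() => ExpensiveCreate(), isThreadSafe: true);

    private static int s_createCalls;
    public static int CreateCalls => s_createCalls;

    private static object ExpensiveCreate()
    {
        s_createCalls++;       // counts invocations, to show caching works
        return new object();   // placeholder for the created AIFunction
    }
}
```

With this shape, `chatOptions.Tools = [Tools.NearestPorts.Value]` (or similar) pays the creation cost only on first access.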

@stephentoub
Member

Benchmark

What do the numbers before/after look like if you first add 512 methods to the cache and then benchmark calling Create with a different one? I'm wondering what the overhead of accessing the ConditionalWeakTable, accessing the ConcurrentDictionary, running the GetHashCode, etc. looks like when it doesn't pay dividends. Presumably the common case will be that across the whole process, typically the default JsonSerializerOptions will be used, so the 512 limit will end up applying to all use of AIFunctionFactory across the process.
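The two-level cache shape described above (a ConditionalWeakTable keyed on the options instance, holding a size-bounded ConcurrentDictionary) can be sketched as follows. The type and its names are illustrative, not the PR's actual code; the point is that once the bound is hit, lookups on a full cache still pay the probe and hash cost without the benefit.

```csharp
using System;
using System.Collections.Concurrent;
using System.Runtime.CompilerServices;

// Sketch, assuming a 512-entry default bound as mentioned in the discussion.
// Entries die with the context object (e.g. a JsonSerializerOptions) because
// the outer table holds it weakly.
sealed class BoundedSchemaCache<TContext, TKey, TValue>
    where TContext : class
    where TKey : notnull
{
    private readonly int _maxEntries;
    private readonly ConditionalWeakTable<TContext, ConcurrentDictionary<TKey, TValue>> _table = new();

    public BoundedSchemaCache(int maxEntries = 512) => _maxEntries = maxEntries;

    public TValue GetOrAdd(TContext context, TKey key, Func<TKey, TValue> factory)
    {
        ConcurrentDictionary<TKey, TValue> cache = _table.GetOrCreateValue(context);
        if (cache.TryGetValue(key, out TValue? value))
        {
            return value!; // cache hit: no derivation
        }

        value = factory(key);
        if (cache.Count < _maxEntries)
        {
            cache.TryAdd(key, value); // only stored while under the bound
        }
        return value;
    }
}
```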

@eiriktsarpalis
Member Author

We expect that to be something folks do, recreating an AIFunction for the same delegate over and over?

When working on the demo I felt enticed to incorporate the AIFunction declaration directly on the method like so:

```csharp
chatOptions.Tools = [AIFunctionFactory.Create(GetNearestPorts)];
```

Which got me thinking that users would likely fall into the same performance trap.

I'm wondering what the overhead of accessing the ConditionalWeakTable, accessing the ConcurrentDictionary, running the GetHashCode, etc. looks like when it doesn't pay dividends.

That scenario should be represented by the benchmarks that create a new JSO every time. As you can see, it does introduce a performance regression compared to what we have in main today.

@dotnet-comment-bot
Collaborator

‼️ Found issues ‼️

| Project | Coverage Type | Expected | Actual |
|---|---|---|---|
| Microsoft.Extensions.AI.Ollama | Line | 80 | 78.2 🔻 |
| Microsoft.Extensions.AI.Evaluation | Line | 88 | 58.67 🔻 |
| Microsoft.Extensions.AI.Evaluation | Branch | 88 | 56.67 🔻 |
| Microsoft.Extensions.AI.Evaluation.Quality | Line | 88 | 7.57 🔻 |
| Microsoft.Extensions.AI.Evaluation.Quality | Branch | 88 | 16.42 🔻 |
| Microsoft.Extensions.AI.AzureAIInference | Line | 91 | 90.09 🔻 |
| Microsoft.Extensions.AI.AzureAIInference | Branch | 91 | 86.76 🔻 |
| Microsoft.Extensions.AI.Evaluation.Reporting | Line | 88 | 72.06 🔻 |
| Microsoft.Extensions.AI.Evaluation.Reporting | Branch | 88 | 64.8 🔻 |
| Microsoft.Gen.MetadataExtractor | Line | 98 | 57.35 🔻 |
| Microsoft.Gen.MetadataExtractor | Branch | 98 | 62.5 🔻 |
| Microsoft.Extensions.AI.OpenAI | Line | 77 | 67.92 🔻 |
| Microsoft.Extensions.AI.OpenAI | Branch | 77 | 49.8 🔻 |
| Microsoft.Extensions.AI.Evaluation.Console | Line | 88 | 8.26 🔻 |
| Microsoft.Extensions.AI.Evaluation.Console | Branch | 88 | 17.07 🔻 |

🎉 Good job! The coverage increased 🎉
Update MinCodeCoverage in the project files.

| Project | Expected | Actual |
|---|---|---|
| Microsoft.Extensions.AI.Abstractions | 83 | 84 |
| Microsoft.Extensions.Caching.Hybrid | 86 | 87 |
| Microsoft.Extensions.AI | 88 | 89 |

Full code coverage report: https://dev.azure.com/dnceng-public/public/_build/results?buildId=953615&view=codecoverage-tab

@stephentoub
Member

When working on the demo I felt enticed to incorporate the AIFunction declaration directly on the method like so:

Should we cache the AI Functions instead then?

@eiriktsarpalis
Member Author

When working on the demo I felt enticed to incorporate the AIFunction declaration directly on the method like so:

Should we cache the AI Functions instead then?

I tried that and have a working prototype locally. What I discovered is that schema generation accounts for ~95% of the cost (as can be seen from the benchmark results), and caching at the AIFunctionMetadata level is trickier since there are more variables at play. It seemed like an obvious win to do it in the schema generation APIs instead.

@stephentoub
Member

stephentoub commented Feb 15, 2025

Those numbers are very suspect to me, e.g.

CreateAIFunction | 56.31 ns | 0.404 ns | 0.378 ns | 0.0172 | 72 B

That's suggesting that with this change, AIFunctionFactory.Create(delegate) only allocates 72 bytes and only takes 56 nanoseconds? Without caching the AIFunction? It's doing a non-trivial amount of reflection. The AIFunction object alone must be close to 72 bytes, plus the delegates that get created for marshaling.

So I pulled down your change, and instead I see numbers like this:

| Method | Mean | Error | StdDev | Allocated |
|---|---|---|---|---|
| CreateAIFunction | 1.214 us | 0.0194 us | 0.0182 us | 816 B |

Can you re-confirm what it is you're measuring?

@eiriktsarpalis
Member Author

Here are updated numbers from a new run:

| Method | Mean | Error | StdDev | Gen0 | Gen1 | Allocated |
|---|---|---|---|---|---|---|
| CreateFunctionSchema | 26.01 ns | 0.093 ns | 0.087 ns | - | - | - |
| CreateFunctionSchema_CustomDescription | 38.41 ns | 0.172 ns | 0.161 ns | - | - | - |
| CreateFunctionSchema_NewOptions | 7,505.62 ns | 139.908 ns | 130.870 ns | 2.3804 | - | 10167 B |
| CreateAIFunction | 612.79 ns | 5.975 ns | 5.589 ns | 0.1945 | - | 816 B |
| CreateAIFunction_CustomDescription | 224.36 ns | 1.081 ns | 0.903 ns | 0.1452 | - | 608 B |
| CreateAIFunction_NewOptions | 7,657.49 ns | 61.634 ns | 54.637 ns | 2.5635 | 0.0610 | 10779 B |

And here are the same benchmarks with additional caching implemented at the AIFunction layer:

| Method | Mean | Error | StdDev | Gen0 | Allocated |
|---|---|---|---|---|---|
| CreateFunctionSchema | 27.15 ns | 0.085 ns | 0.079 ns | - | - |
| CreateFunctionSchema_CustomDescription | 32.35 ns | 0.215 ns | 0.201 ns | - | - |
| CreateFunctionSchema_NewOptions | 7,648.95 ns | 140.465 ns | 196.912 ns | 2.4109 | 10167 B |
| CreateAIFunction | 60.11 ns | 0.307 ns | 0.287 ns | 0.0172 | 72 B |
| CreateAIFunction_CustomDescription | 66.00 ns | 0.256 ns | 0.227 ns | 0.0305 | 128 B |
| CreateAIFunction_NewOptions | 8,604.40 ns | 60.222 ns | 53.385 ns | 2.7466 | 11692 B |

So the numbers I quoted originally were most likely recorded with those changes accidentally unstashed. I've updated the OP to reflect the correct numbers.

@eiriktsarpalis
Member Author

@stephentoub as discussed, I pushed updates using caching on the AIFunctionFactory layer only. Here are the updated benchmark numbers:

| Method | Mean | Error | StdDev | Gen0 | Allocated |
|---|---|---|---|---|---|
| CreateFunctionSchema | 6,679.98 ns | 35.900 ns | 31.824 ns | 2.4109 | 10193 B |
| CreateFunctionSchema_CustomDescription | 6,126.56 ns | 43.308 ns | 38.391 ns | 2.3499 | 9929 B |
| CreateFunctionSchema_NewOptions | 6,494.00 ns | 59.036 ns | 52.334 ns | 2.1973 | 9318 B |
| CreateAIFunction | 20.20 ns | 0.097 ns | 0.081 ns | - | - |
| CreateAIFunction_CustomDescription | 28.19 ns | 0.182 ns | 0.171 ns | 0.0134 | 56 B |
| CreateAIFunction_NewOptions | 7,309.19 ns | 72.739 ns | 64.481 ns | 2.5635 | 10854 B |

@dotnet-comment-bot
Collaborator

🎉 Good job! The coverage increased 🎉
Update MinCodeCoverage in the project files.

| Project | Expected | Actual |
|---|---|---|
| Microsoft.Extensions.Caching.Hybrid | 82 | 87 |
| Microsoft.Gen.MetadataExtractor | 57 | 70 |

Full code coverage report: https://dev.azure.com/dnceng-public/public/_build/results?buildId=955543&view=codecoverage-tab

@dotnet-comment-bot
Collaborator

🎉 Good job! The coverage increased 🎉
Update MinCodeCoverage in the project files.

| Project | Expected | Actual |
|---|---|---|
| Microsoft.Extensions.Caching.Hybrid | 82 | 87 |
| Microsoft.Gen.MetadataExtractor | 57 | 70 |

Full code coverage report: https://dev.azure.com/dnceng-public/public/_build/results?buildId=955619&view=codecoverage-tab

@eiriktsarpalis eiriktsarpalis enabled auto-merge (squash) February 19, 2025 13:53
Member

@stephentoub stephentoub left a comment

Nice, thanks.

@dotnet-comment-bot
Collaborator

‼️ Found issues ‼️

| Project | Coverage Type | Expected | Actual |
|---|---|---|---|
| Microsoft.Extensions.AI | Line | 89 | 88.7 🔻 |

🎉 Good job! The coverage increased 🎉
Update MinCodeCoverage in the project files.

| Project | Expected | Actual |
|---|---|---|
| Microsoft.Gen.MetadataExtractor | 57 | 70 |

Full code coverage report: https://dev.azure.com/dnceng-public/public/_build/results?buildId=956635&view=codecoverage-tab

@eiriktsarpalis eiriktsarpalis merged commit 66eca03 into dotnet:main Feb 19, 2025
6 checks passed
@eiriktsarpalis eiriktsarpalis deleted the aifunctionfactory-caching branch February 19, 2025 17:27
@jeffhandley jeffhandley added the area-ai Microsoft.Extensions.AI libraries label Mar 7, 2025
@github-actions github-actions bot locked and limited conversation to collaborators Apr 6, 2025