Eliminate ingestion cache from AI Chat Web template #6428
Looks great!
We'll need to update template snapshots before the template tests start passing again. Let me know if you'd like any help with that.
Thanks for tackling this, @SteveSandersonMS. I'll update the integration test snapshots and push to your branch to get the tests passing.
Changes to eng/Versions.props LGTM
* Begin updating to latest MEVD
* Reimplement JsonVectorStore to match updated MEVD APIs
* Remove ingestion cache and track ingestion status inside the vector DB
* Track the document metadata in a separate collection so we don't have to fetch literally everything from the vector DB in order to update ingestion
* Fix equality comparison issue with Qdrant connector
* Tidying
* More tidying
* Update MEAI.Templates test snapshots
---------
Co-authored-by: Jeff Handley <jeffhandley@users.noreply.github.com>
…es (#6451) * Translate OpenAI refusals to ErrorContent (#6393) Refusals in OpenAI are errors reported when the service can't generate an output that matches the requested schema. Translate refusals to ErrorContent now that we have it. * Add JSON schema transformation functionality to `AIJsonUtilities` (#6383) * Add initial schema transformation functionality and incorporate into the OpenAI leaf client. * Update all leaf client implementions, improve naming, add testing. * Remove redundant suppressions * Address feedback. * Add ChatOptions.RawRepresentationFactory (#6319) * Look for OpenAI.ChatCompletionOptions in top-level additional properties and stop looking for individually specific additional properties * Add RawRepresentation to ChatOptions and use it in OpenAI and AzureAIInference * Remove now unused locals * Add [JsonIgnore] and update roundtrip tests * Overwirte properties only if the underlying model don't specify it already * Clone RawRepresentation * Reflection workaround for ToolChoice not being cloned * Style changes * AI.Inference: Bring back propagation of additional properties * Don't use 0.1f, it doesn't roundtrip properly in .NET Framework * Add RawRepresentationFactory instead of object? property * Augment remarks to discourage returning shared instances * Documentation feedback * AI.Inference: keep passing TopK as AdditionalProperty if not already there * Fix streaming chat response example (#6408) * Move AIFunctionFactory down to M.E.AI.Abstractions (#6412) * Remove AIFunctionFactory dependency on M.E.DI This means reverting the recent changes to it that: - Special-cased KeyedServices - Special-cased IServiceProviderIsService - Used ActivatorUtilities.CreateInstance * Move AIFunctionFactory down to M.E.AI.Abstractions * Add CreateInstance delegate to AIFunctionFactoryOptions To enable use of ActivatorUtilities.CreateInstance or alternative. 
* Add some comments * Fix handling of tool calls with some OpenAI endpoints (#6405) * Fix handling of tool calls with some endpoints Most assistant messages containing tool calls don't contain text as well (though some can). In such a case, we were still creating the assistant with empty text. While OpenAI's service permits that, some other endpoints are more finicky about it. This avoids doing so. * Reduce to single iteration through assistant content * Delete Microsoft.Extensions.AI.Abstractions APIs marked [Obsolete] during preview (#6414) * Add WriteAsync overrides to stream helper in AIFunctionFactory (#6419) We use JsonSerializer.SerializeAsync but were missing the async overrides. As with MemoryStream, these don't need to queue. * Replace Type targetType AIFunctionFactory.Create parameter with a func (#6424) * Add an AIJsonSchemaTransformOptions property inside AIJsonSchemaCreateOptions and mark redundant properties in the latter as obsolete. (#6427) * Add an AIJsonSchemaTransformOptions property inside AIJsonSchemaCreateOptions and mark redundant properties in the latter as obsolete. 
* s/inferred/created * Eliminate ingestion cache from AI Chat Web template (#6428) * Begin updating to latest MEVD * Reimplement JsonVectorStore to match updated MEVD APIs * Remove ingestion cache and track ingestion status inside the vector DB * Track the document metadata in a separate collection so we don't have to fetch literally everything from the vector DB in order to update ingestion * Fix equality comparison issue with Qdrant connector * Tidying * More tidying * Update MEAI.Templates test snapshots --------- Co-authored-by: Jeff Handley <jeffhandley@users.noreply.github.com> * Update the template test README with snapshot update instructions (#6431) * AI Chat Web template fixes for Azure AI Search (#6429) * AI Chat Web template fixes for Azure AI Search * Update snapshots * Add security comments for chat clients (#6386) * Remove unused select param (#6341) CreateRecordsForDocumentAsync includes `Select((pair, index) =>` but index is never used Co-authored-by: Jeff Handley <jeffhandley@users.noreply.github.com> * Add RawRepresentationFactory to other options types (#6433) * Add RawRepresentationFactory to other options types * Undo changes in Azure.AI.Inference * Address documentation feedback * Ensure the type keyword is included when generating schemas for nullable enums. (#6440) * Remove obsolete members from AIJsonSchemaCreateOptions (#6432) * Use RawRepresentationFactory on AzureAIInference embedding generators (#6445) * Mark Microsoft.Extensions.AI and Microsoft.Extensions.AI.Abstractions as stable (#6446) * Bump ICSharpCode.Decompiler for record struct support in ApiChief tool Needed the fix for icsharpcode/ILSpy#3159 to fix "record struct" formatting (it was "recordstruct" before the fix). * Generate ApiChief baselines for MEAI libraries Ran .\scripts\MakeApiBaselines.ps1 and discarded other libraries' updates. 
* Hand-edit MEAI ApiChief baseline to fix params ReadOnlySpan Params collections are not yet supported in ICSharpCode.Decompiler: icsharpcode/ILSpy#829 The result is an emitted 'scoped' keyword instead of 'params'. This was edited by hand in the baseline MEAI file. * Mark Microsoft.Extensions.AI and Microsoft.Extensions.AI.Abstractions as stable * Update MEAI and MEAI.Abstractions NuGet package documentation * Update NuGet package documentation for MEAI implementation packages * Update MEAI.Templates package references, including SemanticKernel for a coherent build. * Lower OllamaSharp for integration tests to use version available on feed * Empty the ApiChief baselines for Ollama, AzureAIInference, and OpenAI adapters since they are not shipping stable * Apply code review feedback to the MEAI package READMEs * Update MEAI.Templates test snapshots for version bumps * Apply documentation review feedback to the MEAI package READMEs * Add comments to the MEAI API baseline file for the hand-editing required * Restore documentation blurb into Microsoft.Extensions.AI.AzureAIInference per other feedback * Use stable version for MEAI in templates * Mark Microsoft.Extensions.AI.Evaluation.* Libraries as stable (#6450) * Mark Microsoft.Extensions.AI packages stable All packages except Microsoft.Extensions.AI.Evaluation.Safety are being marked stable. * Remove primary constructors from API json files. 
* Remove more primary constructors from API Chief json --------- Co-authored-by: Jeff Handley <jeffhandley@users.noreply.github.com> --------- Co-authored-by: Stephen Toub <stoub@microsoft.com> Co-authored-by: Eirik Tsarpalis <eirik.tsarpalis@gmail.com> Co-authored-by: David Cantú <dacantu@microsoft.com> Co-authored-by: Genevieve Warren <24882762+gewarren@users.noreply.github.com> Co-authored-by: Steve Sanderson <SteveSandersonMS@users.noreply.github.com> Co-authored-by: Mackinnon Buck <mackinnon.buck@gmail.com> Co-authored-by: Jon Galloway <jongalloway@gmail.com> Co-authored-by: Peter Waldschmidt <pewaldsc@microsoft.com>
This makes use of MEVD's new `GetAsync(...)` overload that can retrieve records using a search expression without doing nearest-neighbour search. It means we can eliminate the SQLite and EF dependencies at least in the Qdrant and Azure AI Search cases, and it eliminates cases where ingestion tracking can get out of sync with the chunk storage.

Because of updating to newer MEVD, many APIs changed, so I had to do a lot of tangential updates, including reimplementing `JsonVectorStore` almost entirely, hence the PR diff looking complex there. But conceptually I haven't changed how that works. I just had to implement a different interface.

I did take the opportunity to do various renames and cleanups. It's still far from perfect but is a significant step forwards, and we can make it a whole lot better still when we can eliminate `JsonVectorStore`.

@roji I looked into using the InMemoryVectorStore's JSON dumping capability as discussed, but it didn't work out. It's not a good match for what we're doing here because it works on a per-collection basis and doesn't auto-write updates to disk when the vector store is changed, so it leaks out of the `IVectorStore` abstraction. While there may be ways to make it work well enough, since we're planning to eliminate it in favour of SQLite anyway, I concluded that it was a distraction for now.
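
For readers skimming the description, here is a rough sketch of the ingestion-tracking idea: document metadata lives in its own small collection, so deciding what needs re-ingesting only requires a filtered lookup on that collection, not a scan of every chunk. This is not the template's code and does not call MEVD. `InMemoryCollection<TRecord>`, `IngestedDocument`, `IngestedChunk`, and `IngestionSketch.Sync` are hypothetical names invented for illustration, and the in-memory `Get(filter, top)` merely stands in for a filtered, non-nearest-neighbour retrieval.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical record shapes; the template's real types may differ.
public sealed record IngestedDocument(string SourceId, string DocumentId, string Version);
public sealed record IngestedChunk(string DocumentId, int PageNumber, string Text);

// Hypothetical in-memory stand-in for a vector store collection.
// Get(...) mimics a filtered lookup: a predicate only, no similarity search.
public sealed class InMemoryCollection<TRecord>
{
    private readonly List<TRecord> _records = new();

    public IReadOnlyList<TRecord> Get(Func<TRecord, bool> filter, int top) =>
        _records.Where(filter).Take(top).ToList();

    public void Upsert(TRecord record) => _records.Add(record);

    public int Delete(Func<TRecord, bool> filter) => _records.RemoveAll(r => filter(r));
}

public static class IngestionSketch
{
    // Brings the store in line with the source and returns the IDs of
    // documents whose chunks were (re)written.
    public static List<string> Sync(
        string sourceId,
        IReadOnlyDictionary<string, string> currentVersions, // documentId -> version
        InMemoryCollection<IngestedDocument> documents,
        InMemoryCollection<IngestedChunk> chunks,
        Func<string, IEnumerable<IngestedChunk>> createChunks)
    {
        // Only the small metadata collection is read in full.
        var known = documents
            .Get(d => d.SourceId == sourceId, top: int.MaxValue)
            .ToDictionary(d => d.DocumentId);
        var updated = new List<string>();

        // Remove documents that have disappeared from the source.
        foreach (var stale in known.Values.Where(d => !currentVersions.ContainsKey(d.DocumentId)))
        {
            chunks.Delete(c => c.DocumentId == stale.DocumentId);
            documents.Delete(d => d.SourceId == sourceId && d.DocumentId == stale.DocumentId);
        }

        foreach (var (documentId, version) in currentVersions)
        {
            if (known.TryGetValue(documentId, out var existing) && existing.Version == version)
            {
                continue; // Unchanged since the last ingestion.
            }

            // New or modified: replace the chunks, then record the new version.
            chunks.Delete(c => c.DocumentId == documentId);
            documents.Delete(d => d.SourceId == sourceId && d.DocumentId == documentId);
            foreach (var chunk in createChunks(documentId))
            {
                chunks.Upsert(chunk);
            }
            documents.Upsert(new IngestedDocument(sourceId, documentId, version));
            updated.Add(documentId);
        }

        return updated;
    }
}
```

Because the version marker sits in the same store as the chunks, there is no separate cache that can drift from the chunk storage. A real store would still need to handle a failure between the chunk writes and the metadata write, which this sketch ignores.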