Introduce set of built-in Enrichers #6957
Pull Request Overview
This PR introduces a suite of built-in enricher classes for the data ingestion pipeline, leveraging AI chat models to enhance document chunks with additional metadata. These enrichers provide semantic analysis capabilities including summarization, sentiment analysis, keyword extraction, content classification, and image alternative text generation.
Key changes:
- Added five new enricher implementations that process ingestion chunks/documents using AI chat clients
- Implemented a test suite with comprehensive coverage for all enrichers
- Added utility method ToListAsync for async enumerable testing support
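
For readers unfamiliar with the pattern, a helper like ToListAsync could look roughly like the sketch below. This is an illustration only: the class name AsyncEnumerableSketch and the exact signature are assumptions, and the PR's actual IAsyncEnumerableExtensions.cs may differ.

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch; not copied from the PR.
internal static class AsyncEnumerableSketch
{
    // Drains an async sequence into a List<T> so a test can assert on the materialized items.
    public static async Task<List<T>> ToListAsync<T>(
        this IAsyncEnumerable<T> source,
        CancellationToken cancellationToken = default)
    {
        var result = new List<T>();

        // WithCancellation forwards the token to the underlying async iterator.
        await foreach (T item in source.WithCancellation(cancellationToken).ConfigureAwait(false))
        {
            result.Add(item);
        }

        return result;
    }
}
```

In a test, the output of an enricher (an async sequence of chunks) could then be awaited once and checked with ordinary list assertions.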
Reviewed Changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| test/Libraries/Microsoft.Extensions.DataIngestion.Tests/Utils/IAsyncEnumerableExtensions.cs | Added ToListAsync helper method for converting async enumerables to lists in tests |
| test/Libraries/Microsoft.Extensions.DataIngestion.Tests/Processors/SummaryEnricherTests.cs | Test suite for summary text generation enricher |
| test/Libraries/Microsoft.Extensions.DataIngestion.Tests/Processors/SentimentEnricherTests.cs | Test suite for sentiment analysis enricher |
| test/Libraries/Microsoft.Extensions.DataIngestion.Tests/Processors/KeywordEnricherTests.cs | Test suite for keyword extraction enricher |
| test/Libraries/Microsoft.Extensions.DataIngestion.Tests/Processors/ClassificationEnricherTests.cs | Test suite for content classification enricher |
| test/Libraries/Microsoft.Extensions.DataIngestion.Tests/Processors/AlternativeTextEnricherTests.cs | Test suite for image alternative text generation enricher |
| test/Libraries/Microsoft.Extensions.DataIngestion.Tests/Microsoft.Extensions.DataIngestion.Tests.csproj | Added reference to shared TestChatClient test utility |
| src/Libraries/Microsoft.Extensions.DataIngestion/Processors/SummaryEnricher.cs | Enricher implementation that generates summary text for chunks |
| src/Libraries/Microsoft.Extensions.DataIngestion/Processors/SentimentEnricher.cs | Enricher implementation that analyzes sentiment (Positive/Negative/Neutral/Unknown) |
| src/Libraries/Microsoft.Extensions.DataIngestion/Processors/KeywordEnricher.cs | Enricher implementation that extracts keywords from chunk content |
| src/Libraries/Microsoft.Extensions.DataIngestion/Processors/ImageAlternativeTextEnricher.cs | Enricher implementation that generates alternative text descriptions for images |
| src/Libraries/Microsoft.Extensions.DataIngestion/Processors/ClassificationEnricher.cs | Enricher implementation that classifies chunks into predefined categories |
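
The SentimentEnricher row lists a closed set of labels (Positive/Negative/Neutral/Unknown), so a model reply can be checked against that set before it is stored as metadata. The sketch below shows one possible way to do that. Everything in it is hypothetical: SentimentSketch, Validate, EnrichAsync, the prompt wording, and the askModel delegate (a stand-in for a chat client call) are not taken from the PR, and throwing on an unexpected reply is an assumption rather than the PR's confirmed behavior.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical sketch; names and behavior are illustrative, not the PR's implementation.
internal static class SentimentSketch
{
    // The four labels named in the file summary table.
    private static readonly string[] Allowed = { "Positive", "Negative", "Neutral", "Unknown" };

    // Returns the canonical label for a model reply, or throws if the reply is not an allowed label.
    public static string Validate(string modelResponse)
    {
        string text = (modelResponse ?? string.Empty).Trim();
        foreach (string label in Allowed)
        {
            if (string.Equals(label, text, StringComparison.OrdinalIgnoreCase))
            {
                return label;
            }
        }

        throw new InvalidOperationException($"Unexpected sentiment response: '{text}'.");
    }

    // askModel is a stand-in for a chat client call: it takes a prompt and returns the reply text.
    public static async Task<string> EnrichAsync(string chunkText, Func<string, Task<string>> askModel)
    {
        string prompt =
            "Classify the sentiment of the text below. Reply with exactly one of: " +
            string.Join(", ", Allowed) + ".\n\n" + chunkText;

        string response = await askModel(prompt).ConfigureAwait(false);
        return Validate(response);
    }
}
```

Keeping validation separate from the model call makes it testable with a canned reply, which matches the role of the shared TestChatClient utility referenced in the test project.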
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
- use ChatOptions.Instructions
- validate the responses
…e prompt message to better handle wordCount = 1
@stephentoub I believe I've addressed all your blocking concerns. Could you PTAL? Thanks!