Skip to content

Conversation

@alzarei
Copy link

@alzarei alzarei commented Sep 28, 2025

Modernize TavilyTextSearch and BraveTextSearch connectors with ITextSearch interface

Problem Statement

The TavilyTextSearch and BraveTextSearch connectors currently implement only the legacy ITextSearch interface, forcing users to use clause-based TextSearchFilter instead of modern type-safe LINQ expressions. Additionally, the existing LINQ support is limited to basic expressions (equality, AND operations).

Technical Approach

This PR modernizes both connectors with generic interface implementation and extends LINQ filtering to support OR operations, negation, and inequality operators. The implementation adds type-safe model classes and enhanced expression tree analysis capabilities.

Implementation Details

Core Changes

  • Both connectors now implement ITextSearch (legacy) and ITextSearch (modern)
  • Added type-safe model classes: TavilyWebPage and BraveWebPage
  • Extended AnalyzeExpression() methods to handle additional expression node types
  • Added support for OrElse, NotEqual, and UnaryExpression operations
  • Implemented array.Contains(property) pattern recognition
  • Enhanced error messaging with contextual examples

Enhanced LINQ Expression Support

  • OR Operations (||): Maps to multiple API parameter values or OR logic
  • NOT Operations (!): Converts to exclusion parameters where supported
  • Inequality Operations (!=): Provides helpful error messages suggesting NOT alternatives
  • Array Contains Pattern: Supports array.Contains(property) for multi-value filtering

Code Examples

Before (Legacy Interface)

var legacyOptions = new TextSearchOptions
{
    Filter = new TextSearchFilter()
        .Equality("topic", "general")
        .Equality("time_range", "week")
};

After (Generic Interface)

// Simple filtering
var modernOptions = new TextSearchOptions<TavilyWebPage>
{
    Filter = page => page.Topic == "general" && page.TimeRange == "week"
};

// Advanced filtering with OR and array Contains
var advancedOptions = new TextSearchOptions<BraveWebPage>
{
    Filter = page => (page.Country == "US" || page.Country == "GB") &&
                     new[] { "moderate", "strict" }.Contains(page.SafeSearch) &&
                     !(page.ResultFilter == "adult")
};

Implementation Benefits

Interface Modernization

  • Type-safe filtering with compile-time validation prevents property name typos
  • IntelliSense support for TavilyWebPage and BraveWebPage properties
  • Consistent LINQ-based filtering across all text search implementations

Enhanced Filtering Capabilities

  • OR operations enable multi-value property matching
  • NOT operations provide exclusion filtering where API supports it
  • Array Contains patterns simplify multi-value filtering syntax
  • Improved error messages reduce debugging time

Developer Experience

  • Better debugging experience with type information
  • Reduced learning curve - same patterns across all connectors
  • Enhanced error messages with usage examples and supported properties

Validation Results

Build Verification

  • Configuration: Release
  • Target Framework: .NET 8.0
  • Command: dotnet build --configuration Release --interactive
  • Result: Build succeeded - all projects compiled successfully

Test Results
Full Test Suite:

  • Passed: 8,829 (core functionality tests)
  • Failed: 1,361 (external API configuration issues)
  • Skipped: 389
  • Duration: 4 minutes 57 seconds

Core Unit Tests:

  • Command: dotnet test src\SemanticKernel.UnitTests\SemanticKernel.UnitTests.csproj --configuration Release
  • Result: 1,574 passed, 0 failed (100% core framework functionality)

Test Failure Analysis
The 1,361 test failures are infrastructure/configuration issues, not code defects:

  • Azure OpenAI Configuration: Missing API keys for external service integration tests
  • Docker Dependencies: Vector database containers not available in development environment
  • External Service Dependencies: Integration tests requiring live API services (Bing, Google, Brave, Tavily, etc.)
  • AWS/Azure Configuration: Missing credentials for cloud service integration tests

These failures are expected in development environments without external API configurations.

Code Quality

  • Formatting: Applied via dotnet format SK-dotnet.slnx
  • Enhanced documentation follows XML documentation conventions
  • Consistent with established LINQ expression handling patterns

Files Modified

dotnet/src/Plugins/Plugins.Web/Tavily/TavilyWebPage.cs (NEW)
dotnet/src/Plugins/Plugins.Web/Brave/BraveWebPage.cs (NEW)
dotnet/src/Plugins/Plugins.Web/Brave/BraveTextSearch.cs (MODIFIED)
dotnet/src/Plugins/Plugins.Web/Tavily/TavilyTextSearch.cs (MODIFIED)

Breaking Changes

None. All existing LINQ expressions continue to work unchanged with enhanced error message generation.

Multi-PR Context

This is PR 5 of 6 in the structured implementation approach for Issue #10456. This PR completes the modernization of remaining text search connectors with enhanced LINQ expression capabilities while maintaining full backward compatibility.

@moonbox3 moonbox3 added the .NET Issue or Pull requests regarding .NET code label Sep 28, 2025
@github-actions github-actions bot changed the title Net: feat: Modernize TavilyTextSearch and BraveTextSearch connectors with ITextSearch<TRecord> interface (microsoft#10456) .Net: Net: feat: Modernize TavilyTextSearch and BraveTextSearch connectors with ITextSearch<TRecord> interface (microsoft#10456) Sep 28, 2025
@alzarei alzarei marked this pull request as ready for review September 28, 2025 07:33
@alzarei alzarei requested a review from a team as a code owner September 28, 2025 07:33
@alzarei alzarei force-pushed the feature-text-search-linq-pr5 branch from c7a2b06 to caa67ec Compare October 20, 2025 10:47
@alzarei alzarei force-pushed the feature-text-search-linq-pr5 branch from edeb8bf to d771436 Compare October 31, 2025 06:21
…rchResults<TRecord>

- Change interface return type from KernelSearchResults<object> to KernelSearchResults<TRecord>
- Update VectorStoreTextSearch implementation with new GetResultsAsTRecordAsync helper
- Keep GetResultsAsRecordAsync for legacy ITextSearch interface backward compatibility
- Update 3 unit tests to use strongly-typed DataModel instead of object

Benefits:
- Improved type safety - no more casting required
- Better IntelliSense and developer experience
- Zero breaking changes to legacy ITextSearch interface
- All 19 unit tests pass

This is Part 2.1 of the Issue microsoft#10456 multi-PR chain, refining the API .
[Ifeat: Modernize TavilyTextSearch and BraveTextSearch with ITextSearch<TRecord> interface

- Add TavilyWebPage and BraveWebPage model classes for type-safe LINQ filtering
- Implement dual interface support (ITextSearch + ITextSearch<TRecord>) in both connectors
- Add LINQ-to-API conversion logic for Tavily and Brave search parameters
- Maintain 100% backward compatibility with existing ITextSearch usage
- Enable compile-time type safety and IntelliSense support for web search filters

Addresses microsoft#10456
…rror messages

- Added support for OrElse (||) operators in LINQ expressions
- Added NotEqual (!=) operator support with appropriate error handling
- Added UnaryExpression (NOT) support with proper limitations
- Enhanced Contains method support for array.Contains(property) patterns
- Improved error messages with usage examples
- Added constants for API parameter names for better maintainability
- Added TODO comments for potential code sharing opportunities

Changes apply to both BraveTextSearch and TavilyTextSearch implementations.
- Changed Url property from string? to Uri? in TavilyWebPage
- Changed Url property from string? to Uri? in BraveWebPage
- Updated constructors to accept Uri? parameters
- Added null-safe string-to-Uri conversions in factory methods
- Fixed image URL handling in TavilyTextSearch

Resolves all 31 CA1056 violations (URI properties should not be strings).
…h connectors

- Implement IsMemoryExtensionsContains() detection in TavilyTextSearch and BraveTextSearch
- Handle C# 14+ MemoryExtensions.Contains method resolution (static calls with 2-3 parameters)
- Provide clear error messages explaining API limitations and suggesting alternatives
- Maintain backward compatibility with C# 13- Enumerable.Contains patterns

Validation:  Build (0 warnings),  Tests (1574 passed),  AOT compatible

Fixes microsoft#10456
…on handling

- Added comprehensive generic interface tests for both Tavily and Brave connectors
  - TavilyTextSearchTests: 3 new tests for ITextSearch<TavilyWebPage> interface
  - BraveTextSearchTests: 3 new tests for ITextSearch<BraveWebPage> interface
  - Tests cover SearchAsync, GetSearchResultsAsync, and GetTextSearchResultsAsync methods
  - Added proper mocking with MultipleHttpMessageHandlerStub and test data

- Fixed collection Contains exception handling to prevent graceful degradation
  - Modified ConvertToLegacyOptions in both connectors to re-throw critical exceptions
  - Collection Contains patterns now properly throw NotSupportedException instead of degrading
  - Maintains graceful degradation for other unsupported LINQ patterns
  - Fixed previously failing CollectionContainsFilterThrowsNotSupportedExceptionAsync test

- Enhanced C# 14 MemoryExtensions.Contains compatibility
  - Improved error messages to cover both C# 13- (Enumerable.Contains) and C# 14+ (MemoryExtensions.Contains)
  - All collection Contains patterns now properly handled regardless of C# language version

Test Results:
- Tavily: 30 tests passed (including 3 new generic interface tests)
- Brave: 24 tests passed (including 3 new generic interface tests)
- All collection Contains exception tests now pass correctly
- Update BraveTextSearch and TavilyTextSearch to properly implement ITextSearch<TRecord>
- Fix unit tests to use strongly-typed generic interfaces
- Ensure backward compatibility with legacy ITextSearch interface
- All 105 text search tests passing, full validation completed

Resolves API type safety issues described in How-to-Fix-API-Type-Change-to-TRecord-PR5.md
@alzarei alzarei force-pushed the feature-text-search-linq-pr5 branch from d771436 to 4caa829 Compare November 3, 2025 05:41
@markwallace-microsoft markwallace-microsoft added kernel Issues or pull requests impacting the core kernel kernel.core labels Nov 3, 2025
…ame Generic→Linq

- Revert GetSearchResultsReturnsSuccessfullyAsync to use non-generic interface (var textSearch, KernelSearchResults<object>)
- This test should test backward compatibility with legacy ITextSearch interface
- Only generic interface tests should use ITextSearch<TRecord> and KernelSearchResults<TRecord>
- Rename 'Generic' test methods to 'Linq' to match naming convention from other PRs:
  - GenericSearchAsyncReturnsResultsSuccessfullyAsync → LinqSearchAsyncReturnsResultsSuccessfullyAsync
  - GenericGetSearchResultsAsyncReturnsResultsSuccessfullyAsync → LinqGetSearchResultsAsyncReturnsResultsSuccessfullyAsync
  - GenericGetTextSearchResultsAsyncReturnsResultsSuccessfullyAsync → LinqGetTextSearchResultsAsyncReturnsResultsSuccessfullyAsync
- Applied to both BraveTextSearchTests.cs and TavilyTextSearchTests.cs
- All 105 text search tests still passing

Fix maintains proper test separation:
- Non-generic tests: Test legacy ITextSearch interface compatibility
- LINQ tests: Test new ITextSearch<TRecord> strongly-typed interface
…arch

- Add SearchQueryFilterClause for query enhancement pattern
- Implement Title.Contains() → query modification for Brave and Tavily APIs
- Support LINQ expressions: results.Where(r => r.Title.Contains('AI'))
- Query enhancement: 'base query' + ' AI' for improved relevance
- Maintains backward compatibility with legacy ITextSearch interface
- All 130 web plugin tests passing

Technical approach:
- SearchQueryFilterClause: New filter clause type for search term addition
- ConvertToLegacyOptionsWithQuery: Extract and append search terms to query
- Different from Bing (direct operators) and Google (API parameters)
- Optimized for APIs without field-specific search capabilities
- Fix RCS1037 trailing whitespace errors on lines 12, 16, 18
- Add missing final newline to prevent FINALNEWLINE error
- Fix file encoding issues (BOM removal)
- Addresses GitHub CI/CD build failures for net8.0, net462, netstandard2.0 targets

All formatting issues resolved to match GitHub Actions requirements.
Changed SearchQueryFilterClause from public to internal in VectorData.Abstractions
to reduce public API surface area per reviewer feedback.

Since FilterClause has an internal constructor, SearchQueryFilterClause cannot be
moved to Plugins.Web. Using InternalsVisibleTo pattern (already used 20+ times in
codebase) to grant Plugins.Web access to this implementation detail.

- Changed 'public sealed class' to 'internal sealed class' in SearchQueryFilterClause.cs
- Added InternalsVisibleTo for Microsoft.SemanticKernel.Plugins.Web in VectorData.Abstractions.csproj
- Verified: 0 warnings, 0 errors, 54 web plugin tests passing
Reverting the internal + InternalsVisibleTo approach per reviewer feedback.
Making SearchQueryFilterClause public is the cleanest solution:

- Legitimate, reusable abstraction for text search scenarios
- Respects FilterClause architecture (stays in VectorData.Abstractions)
- No InternalsVisibleTo complexity or CS0436 warnings
- Precedent: Other FilterClause types are public (EqualToFilterClause, etc.)
- Other search connectors can benefit from this pattern

Trade-off: Adds one class to VectorData.Abstractions public API surface,
but it's a well-scoped, documented filter clause pattern with clear use cases.

Addresses reviewer question about SearchQueryFilterClause visibility.
…structor public

Per reviewer feedback, moving SearchQueryFilterClause from Microsoft.Extensions.VectorData to Microsoft.SemanticKernel.Plugins.Web namespace as internal class, where it's actually used (Brave/Tavily connectors).

Changes:
- FilterClause.cs: Changed constructor from 'internal' to 'public' to allow external inheritance
- SearchQueryFilterClause.cs: Moved to Plugins.Web/FilterClauses/ as internal sealed class
- VectorData.Abstractions.csproj: Added CA1012 suppression (abstract types with public constructors)

Trade-off: Opening FilterClause to external inheritance is accepted to minimize MEVD public API surface.
alzarei added a commit to alzarei/semantic-kernel that referenced this pull request Nov 16, 2025
…terClause decision in ADR-0065

Documents the architectural role of FilterClause as the common translation layer and updates the SearchQueryFilterClause visibility decision to reflect the implemented approach.

Changes:
- Added FilterClause architecture section with diagrams showing dual path convergence
- Updated SearchQueryFilterClause decision: Made FilterClause constructor public and moved SearchQueryFilterClause to Plugins.Web (internal)
- Clarified SearchQueryFilterClause is LINQ-only (never created by users)
- Documented trade-off: Opening FilterClause to external inheritance to minimize MEVD API surface
- Explained why FilterClause stays after legacy removal (LINQ translator needs it)
- Distinguished user-facing APIs (removed in Phase 3) from internal translation layer (stays)
- Added cross-reference between architecture explanation and decision rationale sections

This captures the architectural discussion and final decision from PR microsoft#13191 review with @westey-m.
@westey-m westey-m merged commit a5ba264 into microsoft:feature-text-search-linq Nov 21, 2025
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

kernel.core kernel Issues or pull requests impacting the core kernel .NET Issue or Pull requests regarding .NET code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants