Fix wildcard view resolution losing duplicate copies #149418

Open
craigtaverner wants to merge 6 commits into elastic:main from craigtaverner:fix_deduplication_wildcard_views

@craigtaverner
Contributor

Previous work to ensure views and indexes are not de-duplicated was done in #145091; however, it did not properly account for wildcards.

Fixes #149416

Problem

When a wildcard pattern matches both a concrete index and a view that sources from that same index, the query should produce two independent copies of the underlying data — one from the direct index match, one from the view. This is the same behaviour as explicitly naming both (e.g. FROM logs-1, logs-2), which already worked correctly.

For example, given index logs-1 and view logs-2 = FROM logs-1:

  • FROM logs-1, logs-2 → two copies of logs-1 data ✅ (worked before)
  • FROM logs-* → should also produce two copies, but produced only one ❌

Root cause

Both ViewResolver and ViewCompaction contain a mergeIfPossible helper that decides whether two UnresolvedRelation nodes can be safely merged into one (combining their index pattern lists). The check only prevented merging when two patterns were identical strings. It did not account for wildcard overlap.

When resolving FROM logs-*:

  1. View logs-2 resolves to UnresolvedRelation("logs-1")
  2. The wildcard retains UnresolvedRelation("logs-*") for the direct index matches
  3. mergeIfPossible("logs-1", "logs-*") found no exact-string match, so it merged them into UnresolvedRelation("logs-1,logs-*")
  4. Index resolution then deduplicated logs-1 (since logs-* already covers it), silently dropping one copy
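The four steps above can be reproduced with a small model. This is only a sketch under stated assumptions: `OldMergeSketch`, `canMergeOld` and `resolve` are hypothetical names, not the actual ViewResolver or ViewCompaction code, and index resolution is reduced to expanding `*` patterns into a set of names.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical stand-in for the pre-fix behaviour; not the real Elasticsearch classes.
final class OldMergeSketch {

    // Old rule (as described above): merging is refused only when both
    // pattern lists contain an identical string.
    static boolean canMergeOld(List<String> left, List<String> right) {
        for (String pattern : left) {
            if (right.contains(pattern)) {
                return false;
            }
        }
        return true;
    }

    // Simplified index resolution (an assumption): expand every pattern against
    // the known indices and collect matches in a set, so an index matched by
    // two patterns in the same list is kept only once.
    static Set<String> resolve(List<String> patterns, List<String> indices) {
        Set<String> resolved = new LinkedHashSet<>();
        for (String pattern : patterns) {
            // Quote the pattern literally, then turn each '*' into ".*".
            Pattern regex = Pattern.compile(Pattern.quote(pattern).replace("*", "\\E.*\\Q"));
            for (String index : indices) {
                if (regex.matcher(index).matches()) {
                    resolved.add(index);
                }
            }
        }
        return resolved;
    }
}
```

In this model, `canMergeOld(List.of("logs-1"), List.of("logs-*"))` returns true, and resolving the merged list `["logs-1", "logs-*"]` against the single index `logs-1` yields one entry. Resolving the two lists separately yields one entry each, which corresponds to the two copies the query should produce.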

The bug required fixing in both places:

  • ViewResolver.mergeCompatibleUnresolvedRelations — performs the initial merge during resolution
  • ViewCompaction.mergeUnresolvedRelationEntries — re-runs after compaction unwraps NamedSubquery nodes, where the same merge would otherwise occur again

Fix

Extended mergeIfPossible to refuse merging when a wildcard pattern in one UnresolvedRelation matches a concrete (non-wildcard) name in the other. Because the two helpers are now identical, the one in ViewCompaction is promoted to package-private and ViewResolver delegates to it, eliminating the duplication.
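A minimal sketch of the extended check, assuming patterns are plain strings in which `*` is the only wildcard and ignoring exclusion patterns. `WildcardAwareMergeSketch` and its methods are hypothetical illustrations, not the signature of the real mergeIfPossible helper.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration of the extended rule; not the real helper.
final class WildcardAwareMergeSketch {

    static boolean isWildcard(String pattern) {
        return pattern.indexOf('*') >= 0;
    }

    // Glob match supporting '*' only: quote the pattern, then turn '*' into ".*".
    static boolean matches(String wildcard, String name) {
        String regex = Pattern.quote(wildcard).replace("*", "\\E.*\\Q");
        return Pattern.compile(regex).matcher(name).matches();
    }

    // True when some wildcard in 'patterns' matches a concrete name in 'names'.
    static boolean wildcardCoversConcrete(List<String> patterns, List<String> names) {
        for (String pattern : patterns) {
            if (!isWildcard(pattern)) {
                continue;
            }
            for (String name : names) {
                if (!isWildcard(name) && matches(pattern, name)) {
                    return true;
                }
            }
        }
        return false;
    }

    // Merge is refused for identical strings (the old rule) and, additionally,
    // when a wildcard on either side covers a concrete name on the other side.
    static boolean canMerge(List<String> left, List<String> right) {
        for (String pattern : left) {
            if (right.contains(pattern)) {
                return false;
            }
        }
        return !wildcardCoversConcrete(left, right) && !wildcardCoversConcrete(right, left);
    }
}
```

With this rule, `canMerge(List.of("logs-1"), List.of("logs-*"))` is false, so the two relations stay separate and each contributes its own copy, while unrelated lists such as `["logs-1"]` and `["metrics-*"]` can still be merged.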

Tests

  • testWildcardAndViewSharingSourceIndexProduceTwoCopies — new test covering both the explicit form (regression guard, was already correct) and the wildcard form (was broken, now fixed)
  • testExclusionPreservedForIndexResolution, testReplaceViewsWildcardAll, testReplaceViewsIndexWildcardAll — updated expected plans: these tests assumed the old merged behaviour, but the merged form was always incorrect — it caused index-resolution to collapse what should be independent data sources into one

@craigtaverner craigtaverner added >bug Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) :Analytics/ES|QL AKA ESQL v9.5.0 branch:9.4 labels May 19, 2026
@craigtaverner craigtaverner changed the title from "ievFix wildcard view resolution losing duplicate copies" to "Fix wildcard view resolution losing duplicate copies" May 19, 2026
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@elasticsearchmachine
Collaborator

Hi @craigtaverner, I've created a changelog YAML for you.

Contributor

@idegtiarenko idegtiarenko left a comment

This prevents merging a pattern with a concrete expression that the pattern matches.
I think this is a good improvement to merge so we can move forward; however, some cases are still not covered:

(1) Merging index and alias pointing to the same index

index-1
alias-1 -> index-1
View(view-1, FROM index-1)
Query: FROM alias-1,view-1

The example above contains 2 distinct concrete (non-pattern) expressions that resolve to the same index; however, the result set would contain only one copy of the data.

(2) Merging 2 distinct patterns

index-1
View(view-1, FROM index-*)
Query: FROM index*,view-1

The example above contains 2 distinct patterns targeting the same index, which leads to the same issue.

(3) Merging patterns with exclusions

index-1
index-2
View(view-1, FROM index-*,-*1)
Query: FROM index-1,view-1

The patterns above could be merged today, effectively excluding index-1 from the result set.

We should create a separate issue to track them.
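
Cases (1) and (2) can be illustrated with a small model of a purely string-based check. This is a sketch under assumptions: `UncoveredCasesSketch` is a hypothetical class, `*` is the only wildcard, and the check knows nothing about aliases. It is not the real mergeIfPossible implementation.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical model of a string-only merge check; not the real helper.
final class UncoveredCasesSketch {

    static boolean isWildcard(String pattern) {
        return pattern.indexOf('*') >= 0;
    }

    // Glob match supporting '*' only.
    static boolean matches(String wildcard, String name) {
        String regex = Pattern.quote(wildcard).replace("*", "\\E.*\\Q");
        return Pattern.compile(regex).matcher(name).matches();
    }

    // True when a wildcard in 'patterns' matches a concrete name in 'names'.
    static boolean wildcardCoversConcrete(List<String> patterns, List<String> names) {
        for (String pattern : patterns) {
            for (String name : names) {
                if (isWildcard(pattern) && !isWildcard(name) && matches(pattern, name)) {
                    return true;
                }
            }
        }
        return false;
    }

    // Refuses identical strings and wildcard-versus-concrete overlap only.
    static boolean canMerge(List<String> left, List<String> right) {
        for (String pattern : left) {
            if (right.contains(pattern)) {
                return false;
            }
        }
        return !wildcardCoversConcrete(left, right) && !wildcardCoversConcrete(right, left);
    }
}
```

In this model, `canMerge(List.of("alias-1"), List.of("index-1"))` is true because the check only sees two different concrete strings, and `canMerge(List.of("index*"), List.of("index-*"))` is true because neither side is a concrete name. Both merges are therefore allowed even though the expressions target the same index.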

…view/ViewCompaction.java

Co-authored-by: Ievgen Degtiarenko <ievgen.degtiarenko@gmail.com>

Labels

:Analytics/ES|QL AKA ESQL >bug serverless-linked Added by automation, don't add manually Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v9.4.3 v9.5.0

Development

Successfully merging this pull request may close these issues.

Views does not produce correct duplicates when wildcard matches view and index

3 participants