Fix wildcard view resolution losing duplicate copies#149418
Fix wildcard view resolution losing duplicate copies#149418craigtaverner wants to merge 6 commits into
Conversation
|
Pinging @elastic/es-analytical-engine (Team:Analytics) |
|
Hi @craigtaverner, I've created a changelog YAML for you. |
🔍 Preview links for changed docs⏳ Building and deploying preview... View progress This comment will be updated with preview links when the build is complete. |
ℹ️ Important: Docs version tagging👋 Thanks for updating the docs! Just a friendly reminder that our docs are now cumulative. This means all 9.x versions are documented on the same page and published off of the main branch, instead of creating separate pages for each minor version. We use applies_to tags to mark version-specific features and changes. Expand for a quick overviewWhen to use applies_to tags:✅ At the page level to indicate which products/deployments the content applies to (mandatory) What NOT to do:❌ Don't remove or replace information that applies to an older version 🤔 Need help?
|
idegtiarenko
left a comment
There was a problem hiding this comment.
This prevents merging a pattern and a concrete expression matching the same pattern.
I think that this is a good improvement to merge and move forward, however we still have some cases that are not covered:
(1) Merging index and alias pointing to the same index
index-1
alias-1 -> index-1
View(view-1, FROM index-1)
Query: FROM alias-1,view-1
Above example contains 2 distinct concrete (non pattern) expressions matching the same index, however result set would have only one data copy
(2) Merging 2 distinct patterns
index-1
View(view-1, FROM index-*)
Query: FROM index*,view-1
Above contains 2 distinct patterns targeting the same index leading to the same issue.
(3) Merging patterns with exclusions
index-1
index-2
View(view-1, FROM index-*,-*1)
Query: FROM index-1,view-1
Above patterns could be merged today effectively excluding index-1 from the result set.
We should create a separate issue to track them.
…view/ViewCompaction.java Co-authored-by: Ievgen Degtiarenko <ievgen.degtiarenko@gmail.com>
Previous work to ensure views and indexes are not de-duplicated was done at #145091, however that did not properly take into account wildcards.
Fixes #149416
Problem
When a wildcard pattern matches both a concrete index and a view that sources from that same index, the query should produce two independent copies of the underlying data — one from the direct index match, one from the view. This is the same behaviour as explicitly naming both (e.g.
FROM logs-1, logs-2), which already worked correctly.For example, given index
logs-1and viewlogs-2 = FROM logs-1:FROM logs-1, logs-2→ two copies oflogs-1data ✅ (worked before)FROM logs-*→ should also produce two copies, but produced only one ❌Root cause
Both
ViewResolverandViewCompactioncontain amergeIfPossiblehelper that decides whether twoUnresolvedRelationnodes can be safely merged into one (combining their index pattern lists). The check only prevented merging when two patterns were identical strings. It did not account for wildcard overlap.When resolving
FROM logs-*:logs-2resolves toUnresolvedRelation("logs-1")UnresolvedRelation("logs-*")for the direct index matchesmergeIfPossible("logs-1", "logs-*")found no exact-string match, so it merged them intoUnresolvedRelation("logs-1,logs-*")logs-1(sincelogs-*already covers it), silently dropping one copyThe bug required fixing in both places:
ViewResolver.mergeCompatibleUnresolvedRelations— performs the initial merge during resolutionViewCompaction.mergeUnresolvedRelationEntries— re-runs after compaction unwrapsNamedSubquerynodes, where the same merge would otherwise occur againFix
Extended
mergeIfPossibleto refuse merging when a wildcard pattern in oneUnresolvedRelationmatches a concrete (non-wildcard) name in the other. Since the two methods were now identical, the one inViewCompactionis promoted to package-private andViewResolverdelegates to it, eliminating the duplication.Tests
testWildcardAndViewSharingSourceIndexProduceTwoCopies— new test covering both the explicit form (regression guard, was already correct) and the wildcard form (was broken, now fixed)testExclusionPreservedForIndexResolution,testReplaceViewsWildcardAll,testReplaceViewsIndexWildcardAll— updated expected plans: these tests assumed the old merged behaviour, but the merged form was always incorrect — it caused index-resolution to collapse what should be independent data sources into one