-
Notifications
You must be signed in to change notification settings - Fork 25.3k
Exclude doc-value-only fields from all fields queries #82712
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Exclude doc-value-only fields from all fields queries #82712
Conversation
Pinging @elastic/es-search (Team:Search) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Aha, the failing tests are because this new restriction will also exclude runtime fields. Which is an interesting question - do we want to exclude them, given that they also effectively do a table scan and are therefore very slow? Or do we want to be cleverer here? |
I think runtime fields should be excluded as well. If not, then I don't think doc-value-only fields should be excluded either, as runtime fields are generally even slower than doc-value-only fields. |
I think this is a bug in the query parsing code, or at least an inconsistency. According to #25726, we expand fields in the following ways:
But what actually happens seems to happen for
In other words, partial field expansions are excluding the wrong thing. On I'll change the doc-values (and runtime fields) exclusion to work on |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've left one comment. While I understand what the intent behind the current logic is, I find it highly confusing how it is currently represented / named.
@@ -132,6 +132,10 @@ private QueryParserHelper() {} | |||
// Ignore metadata fields | |||
continue; | |||
} | |||
if (acceptMetadataField == false && fieldType.isIndexed() == false) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The acceptMetadataField
variable name is highly confusing if we're using it here to denote full
vs partial
wilcard. The javadocs for acceptMetadataField
say "Whether metadata fields should be added when a pattern is expanded.". Non-indexed fields shouldn't be added though, independently of whether they're metadata fields or not? Which metadata fields exist that are not indexed? I find this code now highly confusing, as the relation between indexed
and acceptMetadataField
isn't clear, at least with how things are currently named.
Do I understand correctly that a doc-value-only field would be excluded from |
Yes, and I agree that it's not ideal, but there's precedent in that this is what we already do for metadata fields. I'm mainly concerned that after #82409 kibana searches may get much slower because all of sudden we're doing a table scan for every query |
Thinking out loud, I wonder if there are other ways how we could do this, such as using a placeholder like |
So if you explicitly set |
Oh yes, something like that sounds nice! |
Hi @romseygeek, I've created a changelog YAML for you. |
Closing, we decided not to proceed with this change. See a summary of the discussion at #82710 |
Doc-value-only fields can be searched (slowly) by doing a table scan; this
commit ensures that these fields are not included in an all-field query, ie
a
query_string
or similar query against*
, to prevent these queriesbecoming unreasonably slow.
Fixes #82710