Profile the fetch phase #77064

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account


Merged: nik9000 merged 40 commits from profile_fetch_simple into master on Sep 13, 2021.

Commits (40)

8ef3af2 Profile the fetch phase (nik9000, Aug 30, 2021)
2c4803d Skip bwc (nik9000, Aug 31, 2021)
22f698f Don't compare fetch profiles (nik9000, Aug 31, 2021)
97d4111 Merge branch 'master' into profile_fetch_simple (nik9000, Aug 31, 2021)
860764f Use passed one (nik9000, Aug 31, 2021)
e764344 no npe (nik9000, Aug 31, 2021)
ffbad75 Do last rename (nik9000, Aug 31, 2021)
a4fea75 Move method down (nik9000, Aug 31, 2021)
ae9373a serialization tests (nik9000, Aug 31, 2021)
e38d269 Fix sneaky serialization (nik9000, Aug 31, 2021)
680226c Test for sneaky bug (nik9000, Aug 31, 2021)
5ee43ca license header (nik9000, Aug 31, 2021)
86ca46a Merge branch 'master' into profile_fetch_simple (nik9000, Sep 1, 2021)
ebe2417 Document (nik9000, Sep 1, 2021)
0c3b0a0 Fix test (nik9000, Sep 1, 2021)
c4c9284 Merge branch 'master' into profile_fetch_simple (nik9000, Sep 1, 2021)
10ea57d newline (nik9000, Sep 1, 2021)
74b2493 Merge branch 'master' into profile_fetch_simple (nik9000, Sep 8, 2021)
da79b66 Restore assertion (nik9000, Sep 8, 2021)
6feb027 unit test merging (nik9000, Sep 8, 2021)
45a46e2 Merge branch 'master' into profile_fetch_simple (nik9000, Sep 8, 2021)
c9df5ab Handle inner hits (nik9000, Sep 8, 2021)
cbf3a71 Fixup (nik9000, Sep 8, 2021)
86c79e8 Merge branch 'master' into profile_fetch_simple (nik9000, Sep 9, 2021)
d22facf Revert unneeded (nik9000, Sep 9, 2021)
9c7b01f Merge branch 'master' into profile_fetch_simple (nik9000, Sep 9, 2021)
0b865cc Revert inner hits profiling (nik9000, Sep 9, 2021)
4901034 Fix names (nik9000, Sep 9, 2021)
b00d008 Fixup names (nik9000, Sep 9, 2021)
851fd43 Move results building (nik9000, Sep 9, 2021)
48b352e Drop loaded_nested (nik9000, Sep 9, 2021)
0577931 Checkstyle (nik9000, Sep 9, 2021)
d4d27b4 Merge branch 'master' into profile_fetch_simple (nik9000, Sep 10, 2021)
b6dd869 Fixup more (nik9000, Sep 10, 2021)
381002a Finish writeable cleanup (nik9000, Sep 10, 2021)
1ad108e Remove null checking builder (nik9000, Sep 10, 2021)
013799a Fix wire mistake (nik9000, Sep 10, 2021)
b452a29 Rename (nik9000, Sep 10, 2021)
1235c88 Remove funny builder (nik9000, Sep 10, 2021)
c9b8d62 Remove name munging (nik9000, Sep 10, 2021)

Changes from all commits

@@ -48,9 +48,9 @@
import org.elasticsearch.common.document.DocumentField;
import org.elasticsearch.common.text.Text;
import org.elasticsearch.common.unit.Fuzziness;
-import org.elasticsearch.core.TimeValue;
import org.elasticsearch.common.xcontent.XContentFactory;
import org.elasticsearch.common.xcontent.XContentType;
+import org.elasticsearch.core.TimeValue;
import org.elasticsearch.index.get.GetResult;
import org.elasticsearch.index.query.MatchQueryBuilder;
import org.elasticsearch.index.query.QueryBuilder;
@@ -89,7 +89,7 @@
import org.elasticsearch.search.fetch.subphase.highlight.HighlightBuilder;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightField;
import org.elasticsearch.search.profile.ProfileResult;
-import org.elasticsearch.search.profile.SearchProfileQueryPhaseResult;
+import org.elasticsearch.search.profile.SearchProfileShardResult;
import org.elasticsearch.search.profile.aggregation.AggregationProfileShardResult;
import org.elasticsearch.search.profile.query.CollectorResult;
import org.elasticsearch.search.profile.query.QueryProfileShardResult;
@@ -499,15 +499,15 @@ public void testSearchRequestProfiling() throws IOException {

SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
// tag::search-request-profiling-get
-Map<String, SearchProfileQueryPhaseResult> profilingResults =
+Map<String, SearchProfileShardResult> profilingResults =
searchResponse.getProfileResults(); // <1>
-for (Map.Entry<String, SearchProfileQueryPhaseResult> profilingResult : profilingResults.entrySet()) { // <2>
+for (Map.Entry<String, SearchProfileShardResult> profilingResult : profilingResults.entrySet()) { // <2>
String key = profilingResult.getKey(); // <3>
-SearchProfileQueryPhaseResult profileShardResult = profilingResult.getValue(); // <4>
+SearchProfileShardResult profileShardResult = profilingResult.getValue(); // <4>
}
// end::search-request-profiling-get

-SearchProfileQueryPhaseResult profileShardResult = profilingResults.values().iterator().next();
+SearchProfileShardResult profileShardResult = profilingResults.values().iterator().next();
assertNotNull(profileShardResult);

// tag::search-request-profiling-queries
docs/reference/search/profile.asciidoc (144 additions & 22 deletions)

@@ -163,7 +163,37 @@ The API returns the following result:
]
}
],
"aggregations": []
"aggregations": [],
"fetch": {
"type": "fetch",
"description": "",
"time_in_nanos": 660555,
> nik9000 (author): This is the actual nanoTime for the entire fetch phase
> rather than the sum of the breakdown. Because we can do that in fetch. We
> couldn't really do that for aggs or queries.

"breakdown": {
"next_reader": 7292,
"next_reader_count": 1,
"load_stored_fields": 299325,
"load_stored_fields_count": 5
},
"debug": {
"stored_fields": ["_id", "_routing", "_source"]
> nik9000 (author): This bit right here is important to me. Some folks want to
> only fetch doc values and avoid stored fields entirely. With this PR you can
> profile the query and check that this array is empty. If it is then we
> skipped loading stored fields. Unless something tricky is happening in the
> nested case that I didn't dig into.

> Contributor reply: I like the debug section too!

},
"children": [
{
"type": "FetchSourcePhase",
"description": "",
"time_in_nanos": 20443,
"breakdown": {
"next_reader": 745,
"next_reader_count": 1,
"process": 19698,
"process_count": 5
},
"debug": {
"fast_path": 5
> nik9000 (author): The fetch phases can emit extra debugging information if
> they decide to do so. This has proven super useful in aggs land. I'm not sure
> how useful it is here, but I figured it'd be nice to be able to glance at a
> profile result and know if we were able to just shovel the bytes out the
> other side or if we had to parse the source.

}
}
]
}
}
]
}
@@ -196,7 +226,8 @@ The overall structure of the profile response is as follows:
"collector": [...] <4>
}
],
"aggregations": [...] <5>
"aggregations": [...], <5>
"fetch": {...} <6>
}
]
}
@@ -208,15 +239,14 @@
// TESTRESPONSE[s/"query": \[...\]/"query": $body.$_path/]
// TESTRESPONSE[s/"collector": \[...\]/"collector": $body.$_path/]
// TESTRESPONSE[s/"aggregations": \[...\]/"aggregations": []/]
// TESTRESPONSE[s/"fetch": \{...\}/"fetch": $body.$_path/]
<1> A profile is returned for each shard that participated in the response, and
is identified by a unique ID.
-<2> Each profile contains a section which holds details about the query
-execution.
-<3> Each profile has a single time representing the cumulative rewrite time.
-<4> Each profile also contains a section about the Lucene Collectors which run
-the search.
-<5> Each profile contains a section which holds the details about the
-aggregation execution.
+<2> Query timings and other debugging information.
+<3> The cumulative rewrite time.
+<4> Names and invocation timings for each collector.
+<5> Aggregation timings, invocation counts, and debug information.
+<6> Fetch timing and debug information.

Because a search request may be executed against one or more shards in an index,
and a search may cover one or more indices, the top level element in the profile
@@ -295,7 +325,7 @@ Using our previous `match` query example, let's analyze the `query` section:
]
--------------------------------------------------
// TESTRESPONSE[s/^/{\n"took": $body.took,\n"timed_out": $body.timed_out,\n"_shards": $body._shards,\n"hits": $body.hits,\n"profile": {\n"shards": [ {\n"id": "$body.$_path",\n"searches": [{\n/]
// TESTRESPONSE[s/]$/],"rewrite_time": $body.$_path, "collector": $body.$_path}], "aggregations": []}]}}/]
// TESTRESPONSE[s/]$/],"rewrite_time": $body.$_path, "collector": $body.$_path}], "aggregations": [], "fetch": $body.$_path}]}}/]
// TESTRESPONSE[s/(?<=[" ])\d+(\.\d+)?/$body.$_path/]
// TESTRESPONSE[s/"breakdown": \{...\}/"breakdown": $body.$_path/]
<1> The breakdown timings are omitted for simplicity.
@@ -347,7 +377,7 @@ Lucene execution:
}
--------------------------------------------------
// TESTRESPONSE[s/^/{\n"took": $body.took,\n"timed_out": $body.timed_out,\n"_shards": $body._shards,\n"hits": $body.hits,\n"profile": {\n"shards": [ {\n"id": "$body.$_path",\n"searches": [{\n"query": [{\n"type": "BooleanQuery",\n"description": "message:get message:search",\n"time_in_nanos": $body.$_path,/]
-// TESTRESPONSE[s/}$/},\n"children": $body.$_path}],\n"rewrite_time": $body.$_path, "collector": $body.$_path}], "aggregations": []}]}}/]
+// TESTRESPONSE[s/}$/},\n"children": $body.$_path}],\n"rewrite_time": $body.$_path, "collector": $body.$_path}], "aggregations": [], "fetch": $body.$_path}]}}/]
// TESTRESPONSE[s/(?<=[" ])\d+(\.\d+)?/$body.$_path/]

Timings are listed in wall-clock nanoseconds and are not normalized at all. All
@@ -448,7 +478,7 @@ Looking at the previous example:
]
--------------------------------------------------
// TESTRESPONSE[s/^/{\n"took": $body.took,\n"timed_out": $body.timed_out,\n"_shards": $body._shards,\n"hits": $body.hits,\n"profile": {\n"shards": [ {\n"id": "$body.$_path",\n"searches": [{\n"query": $body.$_path,\n"rewrite_time": $body.$_path,/]
-// TESTRESPONSE[s/]$/]}], "aggregations": []}]}}/]
+// TESTRESPONSE[s/]$/]}], "aggregations": [], "fetch": $body.$_path}]}}/]
// TESTRESPONSE[s/(?<=[" ])\d+(\.\d+)?/$body.$_path/]


@@ -569,7 +599,7 @@ GET /my-index-000001/_search
}
--------------------------------------------------
// TEST[setup:my_index]
-// TEST[s/_search/_search\?filter_path=profile.shards.id,profile.shards.searches,profile.shards.aggregations/]
+// TEST[s/_search/_search\?filter_path=profile.shards.id,profile.shards.searches,profile.shards.aggregations,profile.shards.fetch/]


This example has:
@@ -673,13 +703,15 @@ The API returns the following result:
]
}
],
"aggregations": [...] <1>
"aggregations": [...], <1>
"fetch": {...}
}
]
}
}
--------------------------------------------------
// TESTRESPONSE[s/"aggregations": \[\.\.\.\]/"aggregations": $body.$_path/]
// TESTRESPONSE[s/"fetch": \{\.\.\.\}/"fetch": $body.$_path/]
> nik9000 (author): If we like this PR I'll update the docs to add a section
> describing the fetch profile. The guts of a profile are in a kind of grey
> area where they aren't quite part of the public API, but folks can look at
> them and Kibana even visualizes them, and we will change them when the
> underlying implementations change. It's a weird place.

// TESTRESPONSE[s/\.\.\.//]
// TESTRESPONSE[s/(?<=[" ])\d+(\.\d+)?/$body.$_path/]
// TESTRESPONSE[s/"id": "\[P6-vulHtQRWuD4YnubWb7A\]\[my-index-000001\]\[0\]"/"id": $body.profile.shards.0.id/]
@@ -918,6 +950,99 @@ to give you a feel for A) what machinery in {es} is actually eating time, and B)
the magnitude of differences in times between the various components. Like the
overall time, the breakdown is inclusive of all children times.

[[profiling-fetch]]
===== Profiling Fetch


All shards that fetched documents will have a `fetch` section in the profile.
Let's execute a small search and have a look at the fetch profile:

[source,console]
----
GET /my-index-000001/_search?filter_path=profile.shards.fetch
{
"profile": true,
"query": {
"term": {
"user.id": {
"value": "elkbee"
}
}
}
}
----
// TEST[continued]

And here is the fetch profile:

[source,console-result]
----
{
"profile": {
"shards": [
{
"fetch": {
"type": "fetch",
"description": "",
"time_in_nanos": 660555,
"breakdown": {
"next_reader": 7292,
"next_reader_count": 1,
"load_stored_fields": 299325,
"load_stored_fields_count": 5
},
"debug": {
"stored_fields": ["_id", "_routing", "_source"]
},
"children": [
{
"type": "FetchSourcePhase",
"description": "",
"time_in_nanos": 20443,
"breakdown": {
"next_reader": 745,
"next_reader_count": 1,
"process": 19698,
"process_count": 5
},
"debug": {
"fast_path": 4
}
}
]
}
}
]
}
}
----
// TESTRESPONSE[s/(?<=[" ])\d+(\.\d+)?/$body.$_path/]

Since this is debugging information about the way that {es} executes the
fetch, it can change from request to request and version to version. Even
patch versions may change the output here. That lack of consistency is what
makes it useful for debugging.

Anyway! `time_in_nanos` measures the total time of the fetch phase. The
`breakdown` counts and times our
per-link:{glossary}/terms.html#glossary-segment[segment] preparation in
`next_reader` and the time taken loading stored fields in `load_stored_fields`.
The `debug` section contains miscellaneous non-timing information; in
particular, `stored_fields` lists the stored fields that fetch will have to
load. If it is an empty list then fetch will entirely skip loading stored
fields. Note that the top-level `time_in_nanos` is a single wall-clock
measurement of the whole phase rather than the sum of the breakdown, so it can
be larger than its parts: in the sample above, `next_reader` (7,292ns),
`load_stored_fields` (299,325ns), and the `FetchSourcePhase` child (20,443ns)
together account for only about half of the 660,555ns total, with the
remainder spent in parts of fetch that have no dedicated counter.

The `children` section lists the sub-phases that do the actual fetching work,
and each child's `breakdown` has counts and timings for the
per-link:{glossary}/terms.html#glossary-segment[segment] preparation in
`next_reader` and the per-document fetching in `process`. Sub-phases can also
expose their own `debug` counters; for example, `FetchSourcePhase`'s
`fast_path` above counts (per the discussion on this PR) the hits whose
`_source` bytes could be shoveled out as-is instead of being parsed and
rebuilt.
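
To look at just the sub-phase timings you can filter the response down to the
`children`. This is an untested sketch that reuses the example index and query
from above:

[source,console]
----
GET /my-index-000001/_search?filter_path=profile.shards.fetch.children
{
  "profile": true,
  "query": {
    "term": {
      "user.id": {
        "value": "elkbee"
      }
    }
  }
}
----
// TEST[skip:untested sketch]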

NOTE: We try hard to load all of the stored fields that we will need for the
fetch up front. This tends to make the `_source` phase take only a couple of
microseconds per hit. In that case the true cost of the `_source` phase is
hidden in the `load_stored_fields` component of the breakdown. It's possible
to entirely skip loading stored fields by setting
`"_source": false, "stored_fields": ["_none_"]`.

[[profiling-considerations]]
===== Profiling Considerations

@@ -936,16 +1061,13 @@ have a drastic effect compared to other components in the profiled query.
[[profile-limitations]]
===== Limitations

-- Profiling currently does not measure the search fetch phase nor the network
-overhead.
+- Profiling currently does not measure the network overhead.
- Profiling also does not account for time spent in the queue, merging shard
responses on the coordinating node, or additional work such as building global
ordinals (an internal data structure used to speed up search).
- Profiling statistics are currently not available for suggestions,
highlighting, `dfs_query_then_fetch`.
- Profiling of the reduce phase of aggregation is currently not available.
-- The Profiler is still highly experimental. The Profiler is instrumenting parts
-of Lucene that were never designed to be exposed in this manner, and so all
-results should be viewed as a best effort to provide detailed diagnostics. We
-hope to improve this over time. If you find obviously wrong numbers, strange
-query structures, or other bugs, please report them!
+- The Profiler is instrumenting internals that can change from version to
+version. The resulting JSON should be considered mostly unstable, especially
+things in the `debug` section.

@@ -116,3 +116,39 @@ teardown:
query:
match_all: {}
inner_hits: {}

---
profile fetch:
- skip:
version: ' - 7.99.99'
reason: fetch profiling implemented in 8.0.0 to be backported to 7.16.0

- do:
search:
index: test
body:
profile: true
query:
has_parent:
parent_type: question
query:
match_all: {}
inner_hits: {}

- gt: { profile.shards.0.fetch.time_in_nanos: 0 }
- gt: { profile.shards.0.fetch.breakdown.next_reader_count: 0 }
- gt: { profile.shards.0.fetch.breakdown.next_reader: 0 }
- gt: { profile.shards.0.fetch.breakdown.load_stored_fields_count: 0 }
- gt: { profile.shards.0.fetch.breakdown.load_stored_fields: 0 }
- match: { profile.shards.0.fetch.debug.stored_fields: [_id, _routing, _source] }
- length: { profile.shards.0.fetch.children: 2 }
- match: { profile.shards.0.fetch.children.0.type: FetchSourcePhase }
- gt: { profile.shards.0.fetch.children.0.breakdown.next_reader_count: 0 }
- gt: { profile.shards.0.fetch.children.0.breakdown.next_reader: 0 }
- gt: { profile.shards.0.fetch.children.0.breakdown.process_count: 0 }
- gt: { profile.shards.0.fetch.children.0.breakdown.process: 0 }
- match: { profile.shards.0.fetch.children.1.type: InnerHitsPhase }
- gt: { profile.shards.0.fetch.children.1.breakdown.next_reader_count: 0 }
- gt: { profile.shards.0.fetch.children.1.breakdown.next_reader: 0 }
- gt: { profile.shards.0.fetch.children.1.breakdown.process_count: 0 }
- gt: { profile.shards.0.fetch.children.1.breakdown.process: 0 }

@@ -97,10 +97,13 @@
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

import static java.util.stream.Collectors.toList;
import static org.hamcrest.CoreMatchers.equalTo;
import static org.hamcrest.CoreMatchers.instanceOf;
import static org.hamcrest.Matchers.empty;
import static org.hamcrest.Matchers.greaterThan;
import static org.hamcrest.Matchers.greaterThanOrEqualTo;
import static org.hamcrest.Matchers.not;

/**
* This test class executes twice, first against the remote cluster, and then against another cluster that has the remote cluster
@@ -405,6 +408,10 @@ public void testProfile() throws Exception {
duelSearch(searchRequest, response -> {
assertHits(response);
assertFalse(response.getProfileResults().isEmpty());
assertThat(
response.getProfileResults().values().stream().filter(sr -> sr.getFetchPhase() != null).collect(toList()),
not(empty())
);
});
}

@@ -813,6 +820,14 @@ private static Map<String, Object> responseToMap(SearchResponse response) throws
List<Map<String, Object>> shards = (List <Map<String, Object>>)profile.get("shards");
for (Map<String, Object> shard : shards) {
replaceProfileTime(shard);
/*
* The way we try to reduce round trips is by fetching all
* of the results we could possibly need from the remote
* cluster and then merging *those* together locally. This
* will end up fetching more documents total. So we can't
* really compare the fetch profiles here.
*/
shard.remove("fetch");
}
}
return responseMap;