Use chunked REST serialization for large REST responses #88311
Conversation
Adds chunked rest serialization infrastructure that tries to serialize only what can be flushed to the channel right away instead of fully materializing a response on heap first and then writing it to the channel. Makes use of the new infrastructure for get-snapshots as an example use case. Follow-ups will extend the usage of this logic to other problematic APIs like `/_cluster/state`.
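The idea described above can be sketched with a minimal, self-contained model (all types below are hypothetical simplifications for illustration, not the actual Elasticsearch `ChunkedRestResponseBody` API): the body hands out one encoded chunk at a time, so the writer only ever holds a bounded slice of the response on heap instead of the fully materialized body.

```java
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.List;

// Hypothetical simplified stand-in for a chunked response body: the channel
// pulls one encoded chunk at a time instead of materializing the whole
// response up front.
interface ChunkedBody {
    boolean isDone();
    byte[] encodeChunk(int sizeHint); // next chunk, roughly sizeHint bytes at most
}

public class ChunkedWriteDemo {
    static ChunkedBody fromParts(List<String> parts) {
        Iterator<String> it = parts.iterator();
        return new ChunkedBody() {
            public boolean isDone() { return it.hasNext() == false; }
            public byte[] encodeChunk(int sizeHint) {
                return it.next().getBytes(StandardCharsets.UTF_8);
            }
        };
    }

    public static void main(String[] args) {
        ChunkedBody body = fromParts(List.of("{\"snapshots\":[", "{\"name\":\"s1\"}", "]}"));
        StringBuilder wire = new StringBuilder();
        while (body.isDone() == false) {
            byte[] chunk = body.encodeChunk(8192);
            // a real channel would flush each chunk before pulling the next one
            wire.append(new String(chunk, StandardCharsets.UTF_8));
        }
        System.out.println(wire);
    }
}
```

Reassembling all chunks yields the same bytes a fully materialized response would have produced; only the peak heap footprint changes.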
I went through one more round here and left a few more comments now that I looked at tests too.
```java
);
final ByteBuf content = Netty4Utils.toByteBuf(bytes);
final boolean done = body.isDone();
final ChannelFuture f = ctx.write(done ? new DefaultLastHttpContent(content) : new DefaultHttpContent(content));
```
Can we assert `assert done || length > 0;` here, to ensure progress of chunking in the wide variety of implementations we will eventually see?
We can be more extreme IMO and just assert `length > 0` and `done == false` before that. We shouldn't be calling this once we're done, and should never emit an empty buffer, not even for the last invocation, because that would just be needless overhead for an empty last chunk. Adding those assertions.
I am not sure we can always guarantee that for the last chunk, but as long as we emit JSON it sounds plausible, so OK by me for now.
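The invariant agreed above can be sketched as follows (simplified hypothetical types, not the real pipelining handler): the writer never calls `encodeChunk` once the body is done, and every chunk must be non-empty so the write loop always makes progress.

```java
// Hedged sketch of the chunking progress invariant: encodeChunk is never
// called after the body reports done, and must never return an empty chunk.
public class ChunkInvariantDemo {
    interface Body {
        boolean isDone();
        byte[] encodeChunk(int sizeHint);
    }

    static int writeAll(Body body) {
        int chunks = 0;
        while (body.isDone() == false) {
            byte[] chunk = body.encodeChunk(8192);
            // an empty chunk would mean no progress (or a needless empty last chunk)
            assert chunk.length > 0 : "chunked body must make progress";
            chunks++;
        }
        return chunks;
    }

    public static void main(String[] args) {
        byte[][] parts = { "foo".getBytes(), "bar".getBytes() };
        int[] next = { 0 };
        Body body = new Body() {
            public boolean isDone() { return next[0] == parts.length; }
            public byte[] encodeChunk(int sizeHint) { return parts[next[0]++]; }
        };
        System.out.println(writeAll(body));
    }
}
```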
```java
assertTrue(promise.isDone());
assertThat(messagesSeen, hasSize(chunks + 1));
assertThat(messagesSeen.get(0), instanceOf(Netty4ChunkedHttpResponse.class));
assertThat(messagesSeen.get(chunks), instanceOf(LastHttpContent.class));
```
I'd find it nice to also validate that the bytes we see are right.
Pushed dcf73e4 for this
```java
assertThat(messagesSeen.get(0), instanceOf(Netty4ChunkedHttpResponse.class));
assertThat(messagesSeen.get(chunks), instanceOf(LastHttpContent.class));
```
Can we verify the in-between messages too? At least in this first basic test, but perhaps we can add a helper method to include that in all tests?
Yes, pushed dcf73e4 for this, which adds that helper method.
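For illustration, the kind of helper discussed here could look roughly like this (a hedged sketch with stand-in types, not the actual test code from dcf73e4): reassemble every content chunk seen on the channel and compare the result against the expected response body.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.List;

// Hypothetical sketch of a test helper that validates the in-between chunks:
// concatenate all content chunks in arrival order and compare against the
// expected full body.
public class ChunkReassemblyDemo {
    static byte[] concat(List<byte[]> chunksSeen) throws IOException {
        ByteArrayOutputStream all = new ByteArrayOutputStream();
        for (byte[] chunk : chunksSeen) {
            all.write(chunk); // order matters: chunks must arrive in sequence
        }
        return all.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        List<byte[]> seen = List.of("cont".getBytes(), "ent".getBytes());
        // in a real test this would be compared against the expected body bytes
        System.out.println(new String(concat(seen)));
    }
}
```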
```java
final HttpResponse response1 = request1.createResponse(RestStatus.OK, getRepeatedChunkResponseBody(chunks1, chunk));
final HttpResponse response2 = request2.createResponse(RestStatus.OK, getRepeatedChunkResponseBody(chunks2, chunk));
final ChannelPromise promise1 = embeddedChannel.newPromise();
embeddedChannel.write(response1, promise1);
```
Can we make this (and the other tests) randomly write the second response first?
Sure, added some randomisation to that effect in 911e91d.
```java
 * @param channel channel the response will be written to
 * @return chunked rest response body
 */
static ChunkedRestResponseBody fromXContent(ChunkedToXContent chunkedToXContent, ToXContent.Params params, RestChannel channel)
```
Can we add a test of this method as well as the `encodeChunk` method?
Sure thing, how about 2943f1b?
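For context, a hedged sketch of the shape such a test could take (the real test in 2943f1b lives in the Elasticsearch repo; everything below is a simplified stand-in): slice a serialized body into size-hinted chunks and assert that reassembling them restores the original serialization.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical round-trip test sketch for chunk encoding: chunking a
// serialized object and concatenating the chunks must yield the original
// serialization, regardless of the size hint.
public class EncodeChunkTestDemo {
    public static void main(String[] args) {
        String serialized = "{\"a\":1,\"b\":2,\"c\":3}";
        byte[] bytes = serialized.getBytes(StandardCharsets.UTF_8);

        // stand-in for encodeChunk(sizeHint): bounded slices of the body
        int sizeHint = 5;
        List<byte[]> chunks = new ArrayList<>();
        for (int from = 0; from < bytes.length; from += sizeHint) {
            int to = Math.min(from + sizeHint, bytes.length);
            byte[] chunk = new byte[to - from];
            System.arraycopy(bytes, from, chunk, 0, chunk.length);
            chunks.add(chunk);
        }

        // round-trip: concatenating all chunks restores the original body
        StringBuilder roundTrip = new StringBuilder();
        for (byte[] chunk : chunks) {
            roundTrip.append(new String(chunk, StandardCharsets.UTF_8));
        }
        System.out.println(roundTrip.toString().equals(serialized));
    }
}
```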
```diff
@@ -322,6 +322,33 @@ public void testSortAfterStartTime() throws Exception {
         assertThat(allBeforeStartTimeDescending(startTime1 - 1), empty());
     }
 
+    public void testLargeChunkedResponses() throws Exception {
```
I think there is still value in this. For one, `encodeChunk` does not run there. Not saying that the integration will lead to issues (I see none), but some verification of pipelining with chunking end-to-end seems valuable.
Thanks again Henning! I think I once again addressed all points :)
LGTM.
Left a few comments that I hope you will address too.
```java
}

private void finishChunkedWrite() {
    currentChunkedWrite.combiner.finish(currentChunkedWrite.onDone);
```
I think netty does catch-all here, but it would feel safer to clear `currentChunkedWrite` and increment `writeSequence` either before calling finish or in a finally, in case any of the listeners throws.
Yeah, unless we have a bug where we finish the same thing twice we shouldn't get any throwing here, but makes sense to me still -> put a finally here.
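A minimal sketch of the finally pattern agreed here (the state fields are hypothetical simplifications of the pipelining handler's internals, not the actual code): clear the per-channel chunked-write state in a finally block so a throwing completion listener cannot leave the handler pointing at an already-finished write.

```java
// Hedged sketch: even if a completion listener throws, the chunked-write
// state is cleared and the write sequence advances.
public class FinishChunkedWriteDemo {
    static Object currentChunkedWrite = new Object(); // stand-in for the in-flight write
    static int writeSequence = 0;

    static void finishChunkedWrite(Runnable onDone) {
        try {
            onDone.run(); // completion listeners may misbehave and throw
        } finally {
            currentChunkedWrite = null; // always clear, even on throw
            writeSequence++;
        }
    }

    public static void main(String[] args) {
        try {
            finishChunkedWrite(() -> { throw new RuntimeException("listener bug"); });
        } catch (RuntimeException expected) {
            // the exception propagates, but the state was still advanced
        }
        System.out.println(currentChunkedWrite == null && writeSequence == 1);
    }
}
```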
```diff
@@ -322,6 +322,33 @@ public void testSortAfterStartTime() throws Exception {
         assertThat(allBeforeStartTimeDescending(startTime1 - 1), empty());
     }
 
+    public void testLargeChunkedResponses() throws Exception {
```
OK to skip for now. I only thought this might be possible to provoke by doing random requests in parallel (i.e., serially but without waiting for the response) on the same client, but it might not support pipelining.
```java
channel.sendResponse(
    new RestResponse(
        RestStatus.OK,
        (Objects.requireNonNullElse(request.getXContentType(), XContentType.JSON)).mediaType(),
```
`ChunkedRestResponseBody` finds the xcontent-type through `AbstractRestChannel.newBuilder`; can we instead use that, by building the `ChunkedRestResponseBody` first?
Edit: never mind, will fix this; found a neat way to do so.
Sorry I haven't had a chance to look at this in more detail. I have one question tho: what happens if we add a response header (especially a deprecation warning) during REST serialization, which might now be after we've finished sending response headers to the client? I don't think we can do much in practice but I think it'd be wise to at least cause a test failure if this happens.
Good point. We could add these as trailers though, which should be understood by most clients? Maybe in a follow-up that I could get on asap?
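For reference, HTTP/1.1 chunked transfer coding does allow trailer fields after the final chunk, announced up front via the `Trailer` header. A schematic response that deferred a deprecation warning into a trailer might look like this (header values are illustrative only):

```
HTTP/1.1 200 OK
Content-Type: application/json
Transfer-Encoding: chunked
Trailer: Warning

b
{"ok":true}
0
Warning: 299 Elasticsearch "deprecation message"

```

The `b` is the hex length of the chunk that follows, and the `0` chunk terminates the body before the trailer section.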
Jenkins run elasticsearch-ci/part-1 (known + unrelated)
TIL yet another dark corner of the HTTP spec...
Alrighty, I'm going to merge this one. Thanks Henning + David! :) I'll follow up for the footer thing before I use this for anything that adds headers (
@original-brownbear @DaveCTurner I would avoid using trailers if possible; many HTTP libraries don't process them or make them available to users, so they wouldn't be actionable for the clients team.
Let's continue this on #89839. |
A working version of this with the cluster state crudely chunked can be found in https://github.com/original-brownbear/elasticsearch/pull/new/chunked-rest-response-with-cs
relates #77466