
Use chunked REST serialization for large REST responses #88311

Merged

Conversation

@original-brownbear (Member) commented Jul 6, 2022

Adds chunked REST serialization infrastructure that tries to serialize only what can be flushed to the channel right away, instead of fully materializing a response on heap first and then writing it to the channel.

Makes use of the new infrastructure for get-snapshots as an example use case. Follow-ups will extend the usage of this logic to other problematic APIs like `/_cluster/state`.

A working version of this with the cluster state crudely chunked can be found in https://github.com/original-brownbear/elasticsearch/pull/new/chunked-rest-response-with-cs

relates #77466
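
To make the idea concrete, here is a minimal sketch in plain Java of how a chunked body might be consumed: only the chunk currently being flushed is ever materialized on heap. The interface and writer below are illustrative assumptions, not the PR's actual `ChunkedRestResponseBody` API; only the `isDone()`/`encodeChunk` naming is borrowed from the review discussion below.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;

// Sketch only: mirrors the shape of a chunked response body without claiming the exact API.
interface ChunkedBody {
    boolean isDone();                                         // true once every chunk has been emitted
    ByteBuffer encodeChunk(int sizeHint) throws IOException;  // serialize roughly sizeHint bytes of the response
}

final class ChunkedWriter {
    static void writeAll(ChunkedBody body, WritableByteChannel channel) throws IOException {
        while (body.isDone() == false) {
            ByteBuffer chunk = body.encodeChunk(8192); // only this chunk is materialized on heap
            while (chunk.hasRemaining()) {
                channel.write(chunk);                  // flush before asking for the next chunk
            }
        }
    }
}
```

The effect is that heap usage is bounded by the chunk size rather than by the size of the full response.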

@henningandersen (Contributor) left a comment

I went through one more round here and left a few more comments now that I looked at tests too.

);
final ByteBuf content = Netty4Utils.toByteBuf(bytes);
final boolean done = body.isDone();
final ChannelFuture f = ctx.write(done ? new DefaultLastHttpContent(content) : new DefaultHttpContent(content));
Contributor

Can we assert that:

assert done || length > 0;

to ensure progress of chunking in the wide variety of implementations we will eventually see?

Member Author

We can be more extreme IMO and just assert length > 0 and done == false before that. We shouldn't be calling this once we're done, and we should never emit an empty buffer, not even for the last invocation, because that would just be needless overhead for an empty last chunk. Adding those assertions now.

@henningandersen (Contributor) commented Sep 6, 2022

I am not sure we can always have that guarantee for the last chunk, but as long as we emit JSON it sounds plausible, so OK by me for now.

assertTrue(promise.isDone());
assertThat(messagesSeen, hasSize(chunks + 1));
assertThat(messagesSeen.get(0), instanceOf(Netty4ChunkedHttpResponse.class));
assertThat(messagesSeen.get(chunks), instanceOf(LastHttpContent.class));
Contributor

I'd find it nice to also validate that the bytes we see are right.

Member Author

Pushed dcf73e4 for this

Comment on lines 227 to 228
assertThat(messagesSeen.get(0), instanceOf(Netty4ChunkedHttpResponse.class));
assertThat(messagesSeen.get(chunks), instanceOf(LastHttpContent.class));
Contributor

Can we verify the in between messages too? At least in this first basic test, but perhaps we can add a helper method to include that in all tests?

Member Author

Yes, pushed dcf73e4 for this; it adds that helper method.
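
For readers without the diff at hand, a helper of the kind discussed could look roughly like this (purely illustrative; the actual helper added in dcf73e4 may differ, and this sketch assumes Netty's `HttpContent` plus JUnit's `assertEquals`):

```java
// Hypothetical test helper: concatenate the payloads of all HttpContent messages seen on the
// embedded channel and check that they add up to the expected response body.
private static void assertContentMatches(List<Object> messagesSeen, String expectedBody) {
    final StringBuilder seen = new StringBuilder();
    for (Object message : messagesSeen) {
        if (message instanceof HttpContent httpContent) {
            seen.append(httpContent.content().toString(StandardCharsets.UTF_8));
        }
    }
    assertEquals(expectedBody, seen.toString());
}
```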

final HttpResponse response1 = request1.createResponse(RestStatus.OK, getRepeatedChunkResponseBody(chunks1, chunk));
final HttpResponse response2 = request2.createResponse(RestStatus.OK, getRepeatedChunkResponseBody(chunks2, chunk));
final ChannelPromise promise1 = embeddedChannel.newPromise();
embeddedChannel.write(response1, promise1);
Contributor

Can we make this (and the other tests) randomly write the second response first?

Member Author

Sure, added some randomisation to that effect in 911e91d.
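
Something along these lines, sketched against the quoted test (illustrative only; `promise2` and the exact shape of the change in 911e91d are assumed, and `randomBoolean()` is the usual test-framework helper):

```java
// Randomly pick which response gets written first so pipelining is exercised in both orders.
final boolean firstResponseFirst = randomBoolean();
if (firstResponseFirst) {
    embeddedChannel.write(response1, promise1);
    embeddedChannel.write(response2, promise2);
} else {
    embeddedChannel.write(response2, promise2);
    embeddedChannel.write(response1, promise1);
}
```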

* @param channel channel the response will be written to
* @return chunked rest response body
*/
static ChunkedRestResponseBody fromXContent(ChunkedToXContent chunkedToXContent, ToXContent.Params params, RestChannel channel)
Contributor

Can we add a test of this method as well as the encodeChunk method?

@original-brownbear (Member Author) commented Sep 1, 2022

Sure thing, how about 2943f1b?

@@ -322,6 +322,33 @@ public void testSortAfterStartTime() throws Exception {
assertThat(allBeforeStartTimeDescending(startTime1 - 1), empty());
}

public void testLargeChunkedResponses() throws Exception {
Contributor

I think there is still value in this. For one, encodeChunk does not run there. Not saying that that integration will lead to issues (I see none), but some verification of pipelining with chunking end-to-end seems valuable.

@original-brownbear (Member Author)

Thanks again Henning! I think I once again addressed all points :)

@henningandersen (Contributor) left a comment

LGTM.

Left a few comments that I hope you will address too.

}

private void finishChunkedWrite() {
currentChunkedWrite.combiner.finish(currentChunkedWrite.onDone);
Contributor

I think Netty does catch all exceptions here, but it would feel safer to clear currentChunkedWrite and increment writeSequence either before calling finish or in a finally, in case any of the listeners throws.

Member Author

Yeah, unless we have a bug where we finish the same thing twice we shouldn't get anything throwing here, but it still makes sense to me -> put a finally here.
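
Concretely, the finally-based variant would look roughly like this (a sketch reusing only the names from the quoted snippet and the review comment; the surrounding class and whether "clearing" means nulling the field are assumptions):

```java
private void finishChunkedWrite() {
    try {
        currentChunkedWrite.combiner.finish(currentChunkedWrite.onDone);
    } finally {
        // runs even if one of the completion listeners throws
        currentChunkedWrite = null;
        writeSequence++;
    }
}
```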

@@ -322,6 +322,33 @@ public void testSortAfterStartTime() throws Exception {
assertThat(allBeforeStartTimeDescending(startTime1 - 1), empty());
}

public void testLargeChunkedResponses() throws Exception {
Contributor

OK to skip for now. I only thought this might be possible to provoke by doing random requests in parallel (i.e., serially but without waiting for the response) on the same client, but it might not support pipelining.

channel.sendResponse(
new RestResponse(
RestStatus.OK,
(Objects.requireNonNullElse(request.getXContentType(), XContentType.JSON)).mediaType(),
Contributor

ChunkedRestResponseBody finds the xcontent-type through AbstractRestChannel.newBuilder; can we instead use that by building the ChunkedRestResponseBody first?

@original-brownbear (Member Author) commented Sep 6, 2022

Edit: never mind, will fix this; found a neat way to do so.
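
The shape being hinted at is roughly the following (a sketch only: the `RestResponse` constructor overload and the content-type accessor on the body are assumptions, not confirmed API):

```java
// Build the chunked body first so it resolves the response content type via the channel,
// then hand both to the response instead of recomputing the media type by hand.
final ChunkedRestResponseBody body = ChunkedRestResponseBody.fromXContent(chunkedToXContent, params, channel);
channel.sendResponse(new RestResponse(RestStatus.OK, body.getResponseContentTypeString(), body)); // accessor name assumed
```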

@DaveCTurner (Contributor) left a comment

Sorry, I haven't had a chance to look at this in more detail. I have one question though: what happens if we add a response header (especially a deprecation warning) during REST serialization, which might now be after we've finished sending response headers to the client? I don't think we can do much in practice, but I think it'd be wise to at least cause a test failure if this happens.

@original-brownbear (Member Author)

@DaveCTurner

if we add a response header (especially a deprecation warning) during REST serialization, which might now be after we've finished sending response headers to the client

Good point. We could add these as trailers though, which should be understood by most clients? Maybe in a follow-up that I could get on asap?
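
For context, HTTP/1.1 chunked transfer encoding does allow headers in the trailer part, and Netty exposes this on the last chunk; a rough sketch in the handler quoted earlier (illustrative only, and the Warning value is made up):

```java
// Attach late-discovered warnings as HTTP trailers on the final chunk (sketch only).
final LastHttpContent lastContent = new DefaultLastHttpContent(content);
lastContent.trailingHeaders().add("Warning", "299 Elasticsearch \"this feature is deprecated\"");
ctx.write(lastContent);
```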

@original-brownbear (Member Author)

Jenkins run elasticsearch-ci/part-1 (known + unrelated)

@DaveCTurner (Contributor)

We could add these as trailers though

TIL yet another dark corner of the HTTP spec...

@original-brownbear (Member Author)

Alrighty, I'm going to merge this one. Thanks Henning + David! :) I'll follow up on the footer/trailer thing before I use this for anything that adds headers (GetSnapshotsResponse seems safe to me).

@sethmlarson (Contributor)

@original-brownbear @DaveCTurner I would avoid using trailers if possible; many HTTP libraries don't process them or make them available to users, so they wouldn't be actionable for the clients team.

@DaveCTurner (Contributor)

avoid using trailers if possible

Let's continue this on #89839.

Labels: :Distributed/Network, >enhancement, Team:Distributed, v8.5.0