
fix for bug: https://github.com/opensearch-project/OpenSearch/issues/… #3665

Closed
wants to merge 5 commits into from

Conversation

kumjiten
Contributor

@kumjiten kumjiten commented Jun 23, 2022

Description

Overrides the isChunked method of GzipCompressingEntity to return false, since chunked transfer encoding did not work with SigV4 AuthN/AuthZ. To preserve the data, also overrides getContentLength of GzipCompressingEntity to return the content length of the gzipped entity, which is then sent in the Content-Length HTTP header.
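A minimal sketch of that approach (an illustration under assumptions, not the exact PR diff; it presumes Apache HttpClient 4.x and a repeatable wrapped entity):

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import org.apache.http.HttpEntity;
import org.apache.http.client.entity.GzipCompressingEntity;

class ContentLengthGzipEntity extends GzipCompressingEntity {

    ContentLengthGzipEntity(HttpEntity wrapped) {
        super(wrapped);
    }

    @Override
    public boolean isChunked() {
        // Force Content-Length framing; chunked bodies broke SigV4 signing.
        return false;
    }

    @Override
    public long getContentLength() {
        // Compress once into a buffer to learn the gzipped size. Assumes the
        // wrapped entity is repeatable, since writeTo() runs again on send.
        try (ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            writeTo(out);
            return out.size();
        } catch (IOException ex) {
            return -1L;
        }
    }
}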

Issues Resolved

#3640

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@kumjiten kumjiten requested review from a team and reta as code owners June 23, 2022 11:24
@opensearch-ci-bot
Collaborator

✅   Gradle Check success 469c5b664114d866540dd80deb03235b4739e328
Log 6246

Reports 6246

Signed-off-by: Jitendra Kumar <jkjitend@amazon.com>
@opensearch-ci-bot
Collaborator

✅   Gradle Check success fdfd6c3021294b7f3858d5a2da71d534f89f9853
Log 6251

Reports 6251

@opensearch-ci-bot
Collaborator

✅   Gradle Check success 45ec5ed
Log 6252

Reports 6252

Member

@dblock dblock left a comment

LGTM. The licensing header update is a must; everything else is up to you.

Do we need to do similar/related work in https://github.com/opensearch-project/opensearch-java?

 * KIND, either express or implied. See the License for the
 * specific language governing permissions and limitations
 * under the License.
 */
Member

This is new code and doesn't require a license to Elasticsearch, just the SPDX header.
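For reference, the standard OpenSearch SPDX header (quoted from other OpenSearch source files, not from this diff) is:

/*
 * SPDX-License-Identifier: Apache-2.0
 *
 * The OpenSearch Contributors require contributions made to
 * this file be licensed under the Apache-2.0 license or a
 * compatible open source license.
 */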

Contributor Author

ack will update in next commit

Contributor Author

Do we need to do similar/related work in https://github.com/opensearch-project/opensearch-java?

Looking at the dependencies, opensearch-java is also using opensearch-rest-client
(https://github.com/opensearch-project/opensearch-java/blob/main/java-client/build.gradle.kts#L131),
so it only requires a version change.

Member

Do check if there's anything else to do, I am not sure.

Contributor Author

I only see the dependency; not much change is required in opensearch-java apart from that.

@dblock dblock added backport 1.2 backport 2.x Backport to 2.x branch and removed backport 1.2 labels Jun 23, 2022
@dblock
Member

dblock commented Jun 23, 2022

@jiten1551 Reading https://stackoverflow.com/questions/5280633/gzip-compression-of-chunked-encoding-response, compressed/gzipped content can be sent chunked as well. Are we preventing that with this change?

@kumjiten
Contributor Author

Are we preventing that with this change?

Yes. With this change the Transfer-Encoding header (https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Transfer-Encoding) will be removed from the final request by the code below:
https://github.com/apache/httpcomponents-core/blob/master/httpcore5/src/main/java/org/apache/hc/core5/http/protocol/RequestContent.java#L115
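A simplified paraphrase of that interceptor's decision (illustrative only; see the linked HttpCore source for the real logic):

// The framing header is derived from the entity's reported state, so a
// non-chunked entity with a known length gets Content-Length and no
// Transfer-Encoding header on the final request.
if (entity.isChunked() || entity.getContentLength() < 0) {
    request.addHeader("Transfer-Encoding", "chunked");
} else {
    request.addHeader("Content-Length", Long.toString(entity.getContentLength()));
}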

Signed-off-by: Jitendra Kumar <jkjitend@amazon.com>
@opensearch-ci-bot
Collaborator

✅   Gradle Check success d023308
Log 6259

Reports 6259

try (InputStream is = getContent()) {
    size = is.readAllBytes().length;
} catch (IOException ex) {
    size = -1L;
Member

Are there other places where we return -1 when we cannot figure out the content length? I would expect an exception, not a negative number causing all kinds of wonderful problems of doing x.size() + content.getContentLength().


Contributor Author

Yes, that's why I think @reta said to return -1.
I don't see any other place where we're doing x.size() + content.getContentLength()
or returning -1.

Member

It's fine. The interface would have been "unsigned long" if it was meant to be otherwise.

@dblock
Member

dblock commented Jun 23, 2022

@jiten1551 Meta question. Is this the right fix? We say "a gzip compressing entity doesn't work with chunked encoding with sigv4" - is that a limitation of sigv4? Feels like we're making a change that works around a problem that should be fixed elsewhere. Thus, naively, looking at this code change I'd expect a test that uses sigv4, but fails without this change, so I understand what is being fixed and don't break it in the future.

@kumjiten
Contributor Author

understand what is being fixed and don't break it in the future.

We've already tested the above change against a SigV4 AuthN/AuthZ OpenSearch cluster in AWS.

Not sure whether this is a limitation of SigV4 or not.

@dblock
Member

dblock commented Jun 23, 2022

@reta WDYT of my concerns?

Signed-off-by: Jitendra Kumar <jkjitend@amazon.com>
@reta
Collaborator

reta commented Jun 23, 2022

@reta WDYT of my concerns?

@dblock 💯 share your concerns. I am trying to find out what causes the signature validation to fail with chunked requests (is it AWS [1] specific?), but I have a strong feeling we should not disable chunked transfer by default (which is essentially what this change does).

[1] https://docs.aws.amazon.com/AmazonS3/latest/API/sigv4-streaming.html

@dblock
Member

dblock commented Jun 23, 2022

I googled it. Reading https://docs.aws.amazon.com/AmazonS3/latest/API/sigv4-streaming.html it says to either (a) explicitly specify the total content length (object length in bytes plus metadata in each chunk), or (b) specify the Transfer-Encoding HTTP header, omitting content length. @jiten1551 have you tried either? It sounds like the code was doing (b), but still didn't work?

@opensearch-ci-bot
Collaborator

✅   Gradle Check success 65a590c
Log 6260

Reports 6260

@reta
Collaborator

reta commented Jun 23, 2022

understand what is being fixed and don't break it in the future.

We've already tested the above change against a SigV4 AuthN/AuthZ OpenSearch cluster in AWS.

Not sure whether this is a limitation of SigV4 or not.

It seems to be 100% SigV4-related (as you mentioned in the issue), but I think we should have considered proper support for chunked + SigV4. Alternatively, we could introduce a flavor of CompressingEntity without chunking, so users could opt in to it if necessary, but not make it the default.

@kumjiten
Contributor Author

kumjiten commented Jun 23, 2022

I googled it. Reading https://docs.aws.amazon.com/AmazonS3/latest/API/sigv4-streaming.html it says to either (a) explicitly specify the total content length (object length in bytes plus metadata in each chunk), or (b) specify the Transfer-Encoding HTTP header, omitting content length. @jiten1551 have you tried either? It sounds like the code was doing (b), but still didn't work?

The existing code is doing (b), and that still fails; we already tried it.
For (a), this change was made, and it is working.

But as per https://docs.aws.amazon.com/AmazonS3/latest/API/sigv4-streaming.html,
both (a) and (b) should have an x-amz-decoded-content-length header, and we hadn't tried adding that header.
Let me quickly try it with both (a) and (b).

Just tried both (a) and (b) with x-amz-decoded-content-length; (b) still fails.

@dblock
Member

dblock commented Jun 23, 2022

Just tried both (a) and (b) with x-amz-decoded-content-length; (b) still fails.

Could you please post the code/repro for this somewhere?

@kumjiten
Contributor Author

Just tried both (a) and (b) with x-amz-decoded-content-length; (b) still fails.

Could you please post the code/repro for this somewhere?

here is the code for repro: https://github.com/jiten1551/Java-Client/tree/master

@dblock
Member

dblock commented Jun 24, 2022

@jiten1551 Take a look at kumjiten/Java-Client#1. I set x-Amz-Decoded-Content-Length to the length of the decompressed content and removed the header-omitting logic from the signature, and it seems to work.

@kumjiten
Contributor Author

@jiten1551 Take a look at kumjiten/Java-Client#1. I set x-Amz-Decoded-Content-Length to the length of the decompressed content and removed the header-omitting logic from the signature, and it seems to work.

Hi @dblock, it didn't work with your changes in the Apache interceptor.

please see kumjiten/Java-Client#1

Member

@dblock dblock left a comment

The first obvious problem here is that chunkedEnabled only works for a compressed entity as implemented: setting it to true with compressionEnabled=false has no effect. Chunked transfer encoding is supported for non-compressed entities too AFAIK, so this looks misleading. I think you have to implement chunked for non-compressed entities too.

I think the isChunked and getContentLength parts are no longer needed, as the original SigV4 problem would be solved by just setting chunked to false. But correct me if I'm wrong.

We also need tests that are clearly chunked vs. non-chunked.
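A sketch of the gap being pointed out (the shape is assumed, not the PR's exact code): the flag is consulted only on the compressed path.

import org.apache.http.HttpEntity;
import org.apache.http.client.entity.GzipCompressingEntity;

class EntityWrapping {
    static HttpEntity wrap(HttpEntity entity, boolean compressionEnabled, boolean chunkedEnabled) {
        if (compressionEnabled) {
            return new GzipCompressingEntity(entity) {
                @Override
                public boolean isChunked() {
                    return chunkedEnabled; // honored only when compressing
                }
            };
        }
        return entity; // a plain entity ignores chunkedEnabled entirely
    }
}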

@@ -131,6 +131,7 @@ public class RestClient implements Closeable {
    private volatile NodeTuple<List<Node>> nodeTuple;
    private final WarningsHandler warningsHandler;
    private final boolean compressionEnabled;
    private final boolean chunkedEnabled;
Member

I think this should be called chunkedTransferEncodingEnabled. I know it's a mouthful, but it's the name of the HTTP feature.

Contributor Author

ack will update in next commit

@Override
public boolean isChunked() {
    return false;
}
Member

This is no longer always the case. A compressing entity can be both chunked or not chunked.

Contributor Author

Yes, but we require that flag to tell whether it's chunked or not and to set the correct headers.

    }

    return size;
}
Member

Is there a way to get the size of the compressed entity directly without reading all the bytes?

Contributor Author

I didn't see any method that gives us that info directly.
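For what it's worth, gzip offers no length oracle, so the compressed stream must still be consumed; a sketch of a cheaper variant (an assumption, not from this PR) counts the bytes while discarding them instead of buffering everything with readAllBytes():

long size;
try (InputStream is = getContent()) {
    long count = 0;
    byte[] buffer = new byte[8192];
    int read;
    // Stream and count the compressed bytes without holding them in memory.
    while ((read = is.read(buffer)) != -1) {
        count += read;
    }
    size = count;
} catch (IOException ex) {
    size = -1L;
}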

@kumjiten
Contributor Author

The first obvious problem here is that chunkedEnabled only works for a compressed entity as implemented: setting it to true with compressionEnabled=false has no effect. Chunked transfer encoding is supported for non-compressed entities too AFAIK, so this looks misleading. I think you have to implement chunked for non-compressed entities too.

@dblock I checked the current implementation of the rest client; it's using Content-Length, not transfer encoding, for non-compressed requests.
Chunking for non-compressed entities would require significant changes to support.

I think the isChunked and getContentLength parts are no longer needed, as the original SigV4 problem would be solved by just setting chunked to false. But correct me if I'm wrong.

@dblock No, it won't work by only setting chunked to false; it would still require a Content-Length, and an incorrect length will fail on the server side. See https://github.com/apache/httpcomponents-core/blob/master/httpcore5/src/main/java/org/apache/hc/core5/http/protocol/RequestContent.java#L115

We also need tests that are clearly chunked vs. non-chunked.

@dblock The tests we added cover non-chunked, and the existing tests already use chunked.

Signed-off-by: Jitendra Kumar <jkjitend@amazon.com>
@dblock
Member

dblock commented Jun 30, 2022

@jiten1551 Thanks for the update and for hanging in here, this looks a lot better!

  1. Can you please clarify/confirm what happens when I create an instance of RestClient with chunkedTransferEncoding = true and compressed = false?
  2. Please update the title/description of this PR with what we ended up implementing. The original idea of quietly disabling chunking is no longer what we have here.
  3. Check gradle check tests' failures, those need to pass.

@kumjiten
Contributor Author

kumjiten commented Jul 1, 2022

@jiten1551 Thanks for the update and for hanging in here, this look a lot better!

  1. Can you please clarify/confirm what happens when I create an instance of RestClient with chunkedTransferEncoding = true and compressed = false?

@dblock We'll leave chunkedTransferEncoding out of the picture for uncompressed requests for now and document that it's not supported. If we throw an exception saying it's not supported, that will break the current behaviour and require modifying existing tests; more importantly, users will now start getting an exception that it's not supported.

Implementing chunked support properly will require analysis and testing, and that will take time.

  1. Please update the title/description of this PR with what we ended up implementing. The original idea of quietly disabling chunking is no longer what we have here.

@dblock ack will update it.

  1. Check gradle check tests' failures, those need to pass.

@dblock The tests are not failing; see Jenkins Randomized Gradle Check Failed on 2.x #3700.
It's failing while performing :distribution:bwc:minor:buildBwcLinuxTar,
where it checks out the 2.x branch, which didn't have the rest client interface change:
Building 2.1.0 didn't generate expected file /var/jenkins/workspace/gradle-check/search/distribution/bwc/minor/build/bwc/checkout-2.x/distribution/archives/linux-tar/build/distributions/opensearch-min-2.1.0-SNAPSHOT-linux-x64.tar.gz
Not sure why it's checking out 2.x for bwc.

Signed-off-by: Jitendra Kumar <jkjitend@amazon.com>
@kumjiten
Contributor Author

kumjiten commented Jul 1, 2022

@dblock I tried to add chunked support for non-compressed requests, but after my change some tests are failing. If you get some time, can you please check?

org.apache.http.MalformedChunkCodingException: Bad chunk header: { "field": "value" }
	at __randomizedtesting.SeedInfo.seed([7BD850CF1A951B23:D269F22E0DF64B1F]:0)
	at app//org.apache.http.impl.nio.codecs.ChunkDecoder.readChunkHead(ChunkDecoder.java:134)
	at app//org.apache.http.impl.nio.codecs.ChunkDecoder.read(ChunkDecoder.java:205)
	at app//org.apache.http.nio.util.SimpleInputBuffer.consumeContent(SimpleInputBuffer.java:66)
	at app//org.opensearch.client.HeapBufferedAsyncResponseConsumer.onContentReceived(HeapBufferedAsyncResponseConsumer.java:113)
	at app//org.apache.http.nio.protocol.AbstractAsyncResponseConsumer.consumeContent(AbstractAsyncResponseConsumer.java:147)
	at app//org.apache.http.impl.nio.client.MainClientExec.consumeContent(MainClientExec.java:329)
	at app//org.apache.http.impl.nio.client.DefaultClientExchangeHandlerImpl.consumeContent(DefaultClientExchangeHandlerImpl.java:157)
	at app//org.apache.http.nio.protocol.HttpAsyncRequestExecutor.inputReady(HttpAsyncRequestExecutor.java:336)
	at app//org.apache.http.impl.nio.DefaultNHttpClientConnection.consumeInput(DefaultNHttpClientConnection.java:265)
	at app//org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:81)
	at app//org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:39)
	at app//org.apache.http.impl.nio.reactor.AbstractIODispatch.inputReady(AbstractIODispatch.java:114)
	at app//org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:162)
	at app//org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:337)
	at app//org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:315)
	at app//org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:276)
	at app//org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
	at app//org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:591)
	at java.base@17.0.3/java.lang.Thread.run(Thread.java:833)

@github-actions
Contributor

github-actions bot commented Jul 1, 2022

Gradle Check (Jenkins) Run Completed with:

@dblock
Member

dblock commented Jul 1, 2022

@dblock I tried to add chunked support for non-compressed requests, but after my change some tests are failing. If you get some time, can you please check?

org.apache.http.MalformedChunkCodingException: Bad chunk header: { "field": "value" }

I am not more familiar with this code than you are. What have you tried so far to fix this?

@dblock
Member

dblock commented Jul 8, 2022

@jiten1551 We do want this feature. Are you still working on this?

@kumjiten
Contributor Author

@jiten1551 We do want this feature. Are you still working on this?

@dblock I'm busy with other tasks and not actively working on this one.

@dblock
Member

dblock commented Jul 12, 2022

As is, this PR disables chunked transfer encoding for all compressed requests. However, chunking works just fine, except when used with SigV4, because we don't know how to generate a correct SigV4-signed payload in that case in the request interceptor. This library knows nothing about SigV4, so I think we want to only add an option so that the user can do .setChunkedTransferEncoding(true/false). I tried to finish this PR along those lines in #3864. @jiten1551 can you please try it out and see if that works?
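A hypothetical usage sketch of that opt-in (the builder method names here are assumptions; see #3864 for the API that was actually merged):

import org.apache.http.HttpHost;
import org.opensearch.client.RestClient;

class ClientSetup {
    static RestClient build() {
        return RestClient.builder(new HttpHost("localhost", 9200, "http"))
            .setCompressionEnabled(true)       // gzip request bodies
            .setChunkedTransferEncoding(false) // opt out of chunking when signing with SigV4
            .build();
    }
}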

@kumjiten
Contributor Author

As is, this PR disables chunked transfer encoding for all compressed requests. However, chunking works just fine, except when used with SigV4, because we don't know how to generate a correct SigV4-signed payload in that case in the request interceptor. This library knows nothing about SigV4, so I think we want to only add an option so that the user can do .setChunkedTransferEncoding(true/false). I tried to finish this PR along those lines in #3864. @jiten1551 can you please try it out and see if that works?

@dblock sure will check #3864

@dblock
Member

dblock commented Jul 13, 2022

Finished in #3864

@dblock dblock closed this Jul 13, 2022
Labels
backport 2.x Backport to 2.x branch

4 participants