Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

correctly update ContentLength for uncompressed response body #498

Merged
merged 2 commits into from
Mar 2, 2024

Conversation

ruitianzhong
Copy link
Contributor

fix the bug that is found when reproducing in issue #491 .

When Content-Encoding is gzip, the uncompressed response body length is different from the compressed one. So the Content-Length need to be updated to stay consistent with the uncompressed body length.

More detail in #491 (comment)

Signed-off-by: ruitianzhong <ruitian-zhong@outlook.com>
Copy link
Member

@cfc4n cfc4n left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a unit test?

* fix the wrong Content-Length in 952282712204824.bin, which leads to , and this test  should fail but actually not because of there is not detection in unit test for it.

* add unit test for gzipped http body and add detection for it.

Signed-off-by: ruitianzhong <ruitian-zhong@outlook.com>
@ruitianzhong
Copy link
Contributor Author

ruitianzhong commented Mar 1, 2024

  • first I add three *.bin files. In particular, 952253698002000.bin file is response body which is compressed with gzip to detect the bug we fix in this PR.
    To see its content, run the following content
cat 952253698002000.bin | gunzip
  • Also error detection is added in TestEventProcessor_Serve() function to explicitly detect error through log message.

With all steps above, it can detect all related bugs on version that is only added unit test related code with the bug not fixed.

=== RUN   TestEventProcessor_Serve
    processor_test.go:81: 2024/03/01 20:00:21 [http response] DumpResponse error: unexpected EOF
    processor_test.go:81: 2024/03/01 20:00:21 [http response] DumpResponse error: http: ContentLength=1145 with Body length 2443
    processor_test.go:89: some errors occurred
--- FAIL: TestEventProcessor_Serve (3.00s)

It can be verified through my demo branch https://github.com/ruitianzhong/ecapture/tree/demo

Copy link
Member

@cfc4n cfc4n left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PTAL

pkg/event_processor/processor_test.go Show resolved Hide resolved
@ruitianzhong
Copy link
Contributor Author

ruitianzhong commented Mar 2, 2024

Further Explanation

On previous commit(before this PR):

When running:

wget -d --header 'Accept-Encoding: gzip' https://1.1.1.1

It works well and no DumpError occurred.

However, when running:

wget -d --header 'Accept-Encoding: gzip' https://www.baidu.com

DumpError occurred(mentioned in #491).

After further investigation, I found that it's because of the response structure.

for baidu, its response header looks like

HTTP/1.1 200 OK
Content-Encoding: gzip
Content-Length: 1145
Server: bfe
Date: Fri, 01 Mar 2024 08:24:27 GMT
Content-Type: text/html;charset=UTF-8

for 1.1.1.1, its response header looks like:

HTTP/1.1 200 OK
Content-Type: text/plain
Transfer-Encoding: chunked``

Obviously, Content-Length is different. According to the HTTP/1.1 standard, with Transfer-Encoding set, Content-Length is not set and in go it is zero, according to comment in http package comment:

// ContentLength records the length of the associated content. The
	// value -1 indicates that the length is unknown. Unless Request.Method
	// is "HEAD", values >= 0 indicate that the given number of bytes may
	// be read from Body.
	ContentLength [int64](https://pkg.go.dev/builtin#int64)

So DumpResponse() work well under this condition after response body is uncompressed.

However, when the Content-Length is set like first example, if we just uncompressed the response body without adjusting the Content-Length, error would be returned from DumpResponse.

Just to give more context info here.

Thanks!

@cfc4n
Copy link
Member

cfc4n commented Mar 2, 2024

LGTM, thanks.

In fact, when a larger string is called with SSL_Write, it will be split into multiple fragments and SSL_Write will be called multiple times. Moreover, there are concurrent scenarios in the upper-level business logic.
Therefore, there is an issue of disorder in the order of HOOK results for SSL_Write. The current solution cannot guarantee a perfect resolution.

Therefore, a better solution is to use the -m pcap mode.

Copy link
Member

@cfc4n cfc4n left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

merged

@cfc4n cfc4n merged commit 987b3f5 into gojue:master Mar 2, 2024
6 checks passed
@ruitianzhong ruitianzhong deleted the bug-fix branch March 2, 2024 12:01
@ruitianzhong
Copy link
Contributor Author

ruitianzhong commented Mar 2, 2024

I agree with you. There exists a semantic gap between eCapture and the application which means that eCapture might not fully understand the application-specific behavior (like parsing different interleaving protocol or concurrent scenario). When that really happen, some errors occur which is expected though.

@cfc4n cfc4n added the bug Something isn't working label Mar 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants