Botocore 1.35.45 breaks S3:SelectObjectContent #3284

Open
1 task done
bpandola opened this issue Oct 22, 2024 · 2 comments
Labels
  • bug: This issue is a confirmed bug.
  • p0: This issue is the highest priority.
  • potential-regression: Marking this issue as a potential regression to be checked by team member.
  • s3

Comments

@bpandola

Describe the bug

The S3 action SelectObjectContent fails with botocore 1.35.45, the latest version at the time of this report. A TypeError is raised in the recently added _handle_200_error handler (see PR #3276).

Regression Issue

  • Select this option if this issue appears to be a regression.

Expected Behavior

SelectObjectContent completes without raising an exception. This is confirmed to work in previous versions of botocore.

Current Behavior

_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../env/python3.11/lib/python3.11/site-packages/botocore/client.py:569: in _api_call
    return self._make_api_call(operation_name, kwargs)
../env/python3.11/lib/python3.11/site-packages/botocore/client.py:1005: in _make_api_call
    http, parsed_response = self._make_request(
../env/python3.11/lib/python3.11/site-packages/botocore/client.py:1029: in _make_request
    return self._endpoint.make_request(operation_model, request_dict)
../env/python3.11/lib/python3.11/site-packages/botocore/endpoint.py:119: in make_request
    return self._send_request(request_dict, operation_model)
../env/python3.11/lib/python3.11/site-packages/botocore/endpoint.py:197: in _send_request
    success_response, exception = self._get_response(
../env/python3.11/lib/python3.11/site-packages/botocore/endpoint.py:239: in _get_response
    success_response, exception = self._do_get_response(
../env/python3.11/lib/python3.11/site-packages/botocore/endpoint.py:306: in _do_get_response
    self._event_emitter.emit(
../env/python3.11/lib/python3.11/site-packages/botocore/hooks.py:412: in emit
    return self._emitter.emit(aliased_event_name, **kwargs)
../env/python3.11/lib/python3.11/site-packages/botocore/hooks.py:256: in emit
    return self._emit(event_name, kwargs)
../env/python3.11/lib/python3.11/site-packages/botocore/hooks.py:239: in _emit
    response = handler(**kwargs)
../env/python3.11/lib/python3.11/site-packages/botocore/handlers.py:1252: in _handle_200_error
    if _looks_like_special_case_error(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

status_code = 200, body = <urllib3.response.HTTPResponse object at 0x10f687a60>

    def _looks_like_special_case_error(status_code, body):
        if status_code == 200 and body:
            try:
                parser = ETree.XMLParser(
                    target=ETree.TreeBuilder(), encoding='utf-8'
                )
>               parser.feed(body)
E               TypeError: a bytes-like object is required, not 'HTTPResponse'

../env/python3.11/lib/python3.11/site-packages/botocore/handlers.py:174: TypeError
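
The TypeError in the traceback can be reproduced without any AWS calls. The following is a minimal sketch, assuming only that the body reaching the parser is a stream-like object rather than bytes. FakeStreamBody and try_feed are hypothetical names used for illustration; they are not part of botocore or urllib3.

```python
import xml.etree.ElementTree as ETree


class FakeStreamBody:
    """Hypothetical stand-in for an unread streaming response body."""

    def read(self):
        return b"<Payload/>"


def try_feed(body):
    """Feed `body` to an XML parser; return the exception type name, or None."""
    parser = ETree.XMLParser(target=ETree.TreeBuilder(), encoding="utf-8")
    try:
        parser.feed(body)
        parser.close()
    except TypeError as exc:
        return type(exc).__name__
    return None


# Raw bytes are accepted by the parser.
print(try_feed(b"<Error><Code>InternalError</Code></Error>"))
# A stream object is rejected, because feed() expects str or bytes-like data.
print(try_feed(FakeStreamBody()))
```

This suggests the handler is being handed the raw response object for this operation, not its contents.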

Reproduction Steps

import uuid
import gzip
import json
import boto3

NESTED_JSON = {"a1": {"b1": "b2"}, "a2": [True, False], "a3": True, "a4": [1, 5]}

client = boto3.client("s3")
bucket_name = str(uuid.uuid4())
client.create_bucket(Bucket=bucket_name)
client.put_object(
    Bucket=bucket_name,
    Key="json.gzip",
    Body=gzip.compress(json.dumps(NESTED_JSON).encode("utf-8")),
)
client.select_object_content(
    Bucket=bucket_name,
    Key="json.gzip",
    Expression="SELECT count(*) FROM S3Object",
    ExpressionType="SQL",
    InputSerialization={"JSON": {"Type": "DOCUMENT"}, "CompressionType": "GZIP"},
    OutputSerialization={"JSON": {"RecordDelimiter": ","}},
)

Possible Solution

The new handler has a guard clause that checks operation_model.has_streaming_output, but it may also need to guard against operation_model.has_event_stream_output, since SelectObjectContent returns an event stream whose body is not bytes.
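
A rough sketch of what such a guard could look like is below. It is self-contained and does not import botocore: FakeOperationModel, looks_like_error and handle_200_error are hypothetical stand-ins that only mirror the two flags named above, and the actual fix may be structured differently.

```python
import xml.etree.ElementTree as ETree
from dataclasses import dataclass


@dataclass
class FakeOperationModel:
    """Hypothetical stand-in exposing the two flags discussed in this issue."""

    has_streaming_output: bool = False
    has_event_stream_output: bool = False


def looks_like_error(status_code, body):
    """Return True if a 200 response body parses as XML with an <Error> root."""
    if status_code != 200 or not isinstance(body, (bytes, bytearray)) or not body:
        return False
    try:
        root = ETree.fromstring(body)
    except ETree.ParseError:
        return False
    return root.tag == "Error"


def handle_200_error(operation_model, response_dict):
    """Rewrite an embedded-error 200 to a 500, skipping streamed responses."""
    if (
        not response_dict
        or operation_model.has_streaming_output
        or operation_model.has_event_stream_output  # assumed extra guard
    ):
        # Streamed bodies have not been read yet, so they cannot be inspected.
        return
    if looks_like_error(response_dict["status_code"], response_dict["body"]):
        response_dict["status_code"] = 500


# An event-stream operation is skipped, so the unread body is never parsed.
streamed = {"status_code": 200, "body": object()}
handle_200_error(FakeOperationModel(has_event_stream_output=True), streamed)
print(streamed["status_code"])

# A regular operation with an error document in the body is rewritten.
embedded = {"status_code": 200, "body": b"<Error><Code>InternalError</Code></Error>"}
handle_200_error(FakeOperationModel(), embedded)
print(embedded["status_code"])
```

The isinstance check in looks_like_error is an extra defensive measure in this sketch, so that an unexpected body type is ignored instead of raising.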

Additional Information/Context

No response

SDK version used

1.35.45

Environment details (OS name and version, etc.)

macOS, Python 3.11

@bpandola added the bug and needs-triage labels Oct 22, 2024
@github-actions bot added the potential-regression label Oct 22, 2024
@zsaltys

zsaltys commented Oct 22, 2024

+1 seeing the same issue, breaking a lot of stuff

@tim-finnigan added the p0 label Oct 22, 2024
@tim-finnigan
Contributor

Thanks for reporting this issue. We were able to reproduce it in 1.35.45. A fix is pending release here: #3285.

@tim-finnigan added the s3 label and removed the needs-triage label Oct 22, 2024