Retry s3 reads on socket exceptions. #2992

Merged
merged 2 commits into opensearch-project:main on Jul 11, 2023

Conversation

asuresh8
Contributor

@asuresh8 asuresh8 commented Jul 7, 2023

S3 frequently resets the connection on its end. To avoid losing data, Data Prepper should retry on all socket exceptions by attempting to re-open the stream.

Description

Retries S3 reads on all socket exceptions.
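
The retry described here can be sketched roughly as follows. This is a minimal illustration, not the code from this pull request: `ReopeningStreamReader`, `StreamOpener`, and `maxRetries` are hypothetical names, and the sketch assumes the source can be re-opened at a byte offset (for S3, something like a ranged GET), which the description does not confirm.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.SocketException;

/** Hypothetical sketch: re-opens a stream at the last good offset after a socket error. */
class ReopeningStreamReader {

    /** Hypothetical factory that opens the source starting at a byte offset. */
    interface StreamOpener {
        InputStream openAt(long offset) throws IOException;
    }

    /** Reads the whole source, re-opening the stream when a SocketException interrupts a read. */
    static byte[] readAll(StreamOpener opener, int maxRetries) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        long position = 0;            // bytes successfully consumed so far
        int consecutiveFailures = 0;  // reset whenever a read makes progress
        // For brevity, the initial open is not retried in this sketch.
        InputStream in = opener.openAt(position);
        try {
            while (true) {
                try {
                    int n = in.read(buffer);
                    if (n < 0) {
                        return out.toByteArray();
                    }
                    out.write(buffer, 0, n);
                    position += n;
                    consecutiveFailures = 0;
                } catch (SocketException e) {
                    if (++consecutiveFailures > maxRetries) {
                        throw e; // give up and let the caller see the failure
                    }
                    closeQuietly(in);
                    // Resume from the last byte that was read successfully.
                    in = opener.openAt(position);
                }
            }
        } finally {
            closeQuietly(in);
        }
    }

    private static void closeQuietly(InputStream in) {
        try {
            in.close();
        } catch (IOException ignored) {
            // The stream is being discarded, so a close failure is not actionable.
        }
    }
}
```

Tracking the offset means a retry does not re-emit bytes that were already consumed. Bounding the retries keeps a persistently broken connection from looping forever.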

Issues Resolved

S3 socket exceptions causing records to get dropped

Check List

  • New functionality includes testing.
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed with a real name per the DCO

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Collaborator

@engechas engechas left a comment

Changes look good to me.

I think we also need to bubble up any fatal exceptions from this class here to prevent deleting the SQS message on read failure

@asuresh8
Contributor Author

asuresh8 commented Jul 7, 2023

Changes look good to me.

I think we also need to bubble up any fatal exceptions from this class here to prevent deleting the SQS message on read failure

Good call. I've added a commit to bubble up that exception.
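
A rough sketch of the idea discussed in this thread: when the read fails, the exception propagates instead of being swallowed, and the caller skips the delete so the SQS message can be redelivered. All names here (`AckOnSuccessWorker`, `ObjectReader`, `MessageQueue`) are hypothetical stand-ins, not the classes changed in this pull request.

```java
import java.io.IOException;

/** Hypothetical sketch: delete the queue message only after the object was read successfully. */
class AckOnSuccessWorker {

    /** Hypothetical reader that lets fatal read errors propagate to the caller. */
    interface ObjectReader {
        void read(String objectKey) throws IOException;
    }

    /** Hypothetical stand-in for the queue client's delete call. */
    interface MessageQueue {
        void delete(String receiptHandle);
    }

    /** Returns true if the object was read and the message deleted, false if the read failed. */
    static boolean process(String objectKey, String receiptHandle,
                           ObjectReader reader, MessageQueue queue) {
        try {
            reader.read(objectKey);
        } catch (IOException e) {
            // Read failed: do not delete, so the message can be delivered again later.
            return false;
        }
        queue.delete(receiptHandle);
        return true;
    }
}
```

If the reader swallowed the exception instead, `process` would have no way to tell a failed read from a successful one and would delete the message either way.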

engechas
engechas previously approved these changes Jul 7, 2023
S3 will reset the connection on their end frequently. To not lose data,
data prepper should retry all socket exceptions by attempting to re-open
the stream.

Signed-off-by: Adi Suresh <adsuresh@amazon.com>
Signed-off-by: Adi Suresh <adsuresh@amazon.com>
@engechas engechas merged commit 9f78542 into opensearch-project:main Jul 11, 2023
26 checks passed
@opensearch-trigger-bot
Contributor

The backport to 2.3 failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-2.3 2.3
# Navigate to the new working tree
cd .worktrees/backport-2.3
# Create a new branch
git switch --create backport/backport-2992-to-2.3
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 9f78542533dd24ed21e29a12950938c0c4b23636
# Push it to GitHub
git push --set-upstream origin backport/backport-2992-to-2.3
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-2.3

Then, create a pull request where the base branch is 2.3 and the compare/head branch is backport/backport-2992-to-2.3.

opensearch-trigger-bot bot pushed a commit that referenced this pull request Jul 11, 2023
* Retry s3 reads on socket exceptions.

S3 will reset the connection on their end frequently. To not lose data,
data prepper should retry all socket exceptions by attempting to re-open
the stream.

Signed-off-by: Adi Suresh <adsuresh@amazon.com>

* Bubble up parquet exceptions.

Signed-off-by: Adi Suresh <adsuresh@amazon.com>

---------

Signed-off-by: Adi Suresh <adsuresh@amazon.com>
(cherry picked from commit 9f78542)
asifsmohammed pushed a commit that referenced this pull request Jul 11, 2023
* Retry s3 reads on socket exceptions.

S3 will reset the connection on their end frequently. To not lose data,
data prepper should retry all socket exceptions by attempting to re-open
the stream.

Signed-off-by: Adi Suresh <adsuresh@amazon.com>

* Bubble up parquet exceptions.

Signed-off-by: Adi Suresh <adsuresh@amazon.com>

---------

Signed-off-by: Adi Suresh <adsuresh@amazon.com>
(cherry picked from commit 9f78542)

Co-authored-by: Adi Suresh <adsuresh@amazon.com>
chenqi0805 pushed a commit that referenced this pull request Jul 19, 2023
* Retry s3 reads on socket exceptions.

S3 will reset the connection on their end frequently. To not lose data,
data prepper should retry all socket exceptions by attempting to re-open
the stream.

Signed-off-by: Adi Suresh <adsuresh@amazon.com>

* Bubble up parquet exceptions.

Signed-off-by: Adi Suresh <adsuresh@amazon.com>

---------

Signed-off-by: Adi Suresh <adsuresh@amazon.com>
Signed-off-by: George Chen <qchea@amazon.com>
MaGonzalMayedo pushed a commit to MaGonzalMayedo/data-prepper that referenced this pull request Jul 25, 2023
* Retry s3 reads on socket exceptions.

S3 will reset the connection on their end frequently. To not lose data,
data prepper should retry all socket exceptions by attempting to re-open
the stream.

Signed-off-by: Adi Suresh <adsuresh@amazon.com>

* Bubble up parquet exceptions.

Signed-off-by: Adi Suresh <adsuresh@amazon.com>

---------

Signed-off-by: Adi Suresh <adsuresh@amazon.com>
Signed-off-by: Marcos Gonzalez Mayedo <alemayed@amazon.com>