Add S3 input to retrieve logs from AWS S3 buckets #12640
Nice to see this going on! I left a few comments & questions
Question: When an SQS message points to an S3 object that is not readable or no longer exists, we get an error message.
This sounds like the kind of error I would like to be notified about. If it happens, something is probably misconfigured?
I saw it happen when:
Maybe I should have a WARN log message when this happens and report an event with empty
It sounds like this should be at ERROR level, as it's really something the user should look into. Since no message was retrieved, I don't think we need to send anything to the output.
Sounds good to me! Thanks! This is what I currently have, so I will keep the same behavior :-)
jenkins test this please
Jenkins, test this please
I'm excited to see this feature. I left a few minor issues and some questions.
This reverts commit a53de38c7912b2bbdd3eea72b2316e330c560a75.
Different logs from different services can be stored in S3. For example:
With logs from so many different services stored in S3, it would be good to have a dedicated Filebeat input that retrieves raw lines from S3 objects. A purely polling-based S3 input could lag significantly, so we agreed on a combination of notification-based and polling-based approaches: an s3-sqs Filebeat input. This requires extra setup in AWS S3 and SQS: a notification configuration must be added that asks S3 to publish specific types of events to an SQS queue.
With this PR, once the s3 input is enabled, events for log messages retrieved from S3 buckets start appearing in Elasticsearch:
For configuration:
Ideally, with this config, the s3 input will go to the specified queueURLs to get messages and read them. If a message has eventSource == aws:s3 && eventName == ObjectCreated:Put, and its bucket.name is included in bucketNames from the config, then the input reads the S3 object specified in that SQS message.
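The filtering rule above can be sketched roughly as follows. This is an illustration only, not the code in this PR: the names `s3Event`, `s3ObjectRef`, and `objectsToRead` are hypothetical, and the sketch assumes the SQS message body is the standard S3 event notification JSON (a `Records` array with `eventSource`, `eventName`, `s3.bucket.name`, and `s3.object.key`).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// s3Event mirrors only the fields of an S3 event notification that the
// filter needs. Hypothetical type for this sketch.
type s3Event struct {
	Records []struct {
		EventSource string `json:"eventSource"`
		EventName   string `json:"eventName"`
		S3          struct {
			Bucket struct {
				Name string `json:"name"`
			} `json:"bucket"`
			Object struct {
				Key string `json:"key"`
			} `json:"object"`
		} `json:"s3"`
	} `json:"Records"`
}

// s3ObjectRef identifies one S3 object to read. Hypothetical type.
type s3ObjectRef struct {
	Bucket string
	Key    string
}

// objectsToRead decodes an SQS message body and returns the objects that
// pass the filter: eventSource == aws:s3, eventName == ObjectCreated:Put,
// and bucket name listed in bucketNames.
func objectsToRead(body string, bucketNames []string) ([]s3ObjectRef, error) {
	var ev s3Event
	if err := json.Unmarshal([]byte(body), &ev); err != nil {
		return nil, fmt.Errorf("decoding SQS message body: %w", err)
	}

	// Build a set for constant-time bucket name lookups.
	allowed := make(map[string]struct{}, len(bucketNames))
	for _, name := range bucketNames {
		allowed[name] = struct{}{}
	}

	var refs []s3ObjectRef
	for _, r := range ev.Records {
		if r.EventSource != "aws:s3" || r.EventName != "ObjectCreated:Put" {
			continue // not an S3 "object created by PUT" event
		}
		if _, ok := allowed[r.S3.Bucket.Name]; !ok {
			continue // bucket not listed in the config
		}
		refs = append(refs, s3ObjectRef{Bucket: r.S3.Bucket.Name, Key: r.S3.Object.Key})
	}
	return refs, nil
}

func main() {
	// Made-up message body for illustration.
	body := `{"Records":[{"eventSource":"aws:s3","eventName":"ObjectCreated:Put",
		"s3":{"bucket":{"name":"my-logs"},"object":{"key":"elb/2019/06/app.log"}}}]}`

	refs, err := objectsToRead(body, []string{"my-logs"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, ref := range refs {
		fmt.Printf("would read s3://%s/%s\n", ref.Bucket, ref.Key)
	}
}
```

Messages that fail any of the three checks are simply skipped in this sketch; how the real input treats them (for example, whether it deletes them from the queue) is not specified here.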