-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[filebeat][azure-blob-storage] - Added support for new features and removed partial save mechanism #36690
Conversation
This pull request does not have a backport label.
To fixup this pull request, you need to add the backport labels for the needed
|
Pinging @elastic/security-external-integrations (Team:Security-External Integrations) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Initial look. I will have a deeper read tomorrow.
if j.offset == 0 { | ||
r, j.isRootArray, err = evaluateJSON(bufio.NewReader(r)) | ||
if err != nil { | ||
return fmt.Errorf("failed to evaluate json for blob: %s, with error: %w", *j.blob.Name, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is j.blob.Name
always non-nil?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, a file/blob name will always exist, unless there is some unexpected issue from azure's end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we check that at the site that the value becomes ours?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once we fetch the blob pager, this value becomes localized on the machine.
x-pack/filebeat/input/azureblobstorage/mock/testdata/array-at-root.json
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added few nits..
Overall the unit testing seems to be limited. How is this tested otherwise?
This pull request is now in conflicts. Could you fix it? 🙏
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. But please wait for @efd6 approval
[float] | ||
==== `timestamp_epoch` | ||
|
||
This attribute can be used to filter out files/blobs which have a timestamp greater than the specified value. The value of this attribute should be in unix `epoch` (seconds) format. The timestamp value is compared with the `LastModified Timestamp` obtained from the blob metadata. This attribute can be specified both at the root level of the configuration as well at the container level. The container level values will always take priority and override the root level values if both are specified. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
v.Properties.LastModified.Unix() < *s.src.TimeStampEpoch { continue
The description doesn't match up with the code I saw. It seems like it filters blobs that are older than the value.
What is the use case for filtering based on a static value? I've seen inputs offer a relative time based filter when user specifies a duration since now like ignore_older: 168h
or since: -168h
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have updated the description. The use case here is that some customers in the past specially asked for the ability to filter out based on a specific date time, hence added this
// isValidUnixTimestamp checks if the timestamp is a valid Unix timestamp | ||
func isValidUnixTimestamp(timestamp int64) bool { | ||
// checks if the timestamp is within the valid range | ||
return minTimestamp <= timestamp && timestamp <= maxTimestamp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What determines the valid range? I can understand validating that the value is non-negative. But why is there an upper range?
The only reason I can think of for checking the upper value is as a sanity check. It probably does not make sense to configure any value that is in the future.
The bound checking can be added to the struct tag and ucfg will enforce it. Like validate:"min=1"
for >=1. https://pkg.go.dev/github.com/elastic/go-ucfg#Config.Unpack
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Due to this being an optional pointer value, struct tags validation causes a nil pointer deference. And yes this is used as a sanity check for permissible timestamp values.
@efd6 updated the PR with the removal of the partial saves |
…emoved partial save mechanism (elastic#36690) * added new debug logs and updated deprecated library methods * added support for path prefix, date filter and root level array splitting * added support for fileselectors, timestamp filtering and expand event field * updated asciidoc, updated comments * updated changelog * addressed PR suggestions * addressed linting errors * addressed further PR suggestions * updated FileSelectors assignment condition * partial save mechanism removed for now due to concurrency issues * updated changelog
Type of change
Proposed commit message
Added support for the following features as part of a planned enhancement update:
These additions help make the input more robust for further use. Also added a couple of debug logs for better troubleshooting.
Checklist
- [ ] I have made corresponding change to the default configuration filesCHANGELOG.next.asciidoc
orCHANGELOG-developer.next.asciidoc
.Author's Checklist
How to test this PR locally
Related issues
Use cases
Screenshots
Logs