-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[filelogreceiver]: Add ability to sort by mtime #28691
[filelogreceiver]: Add ability to sort by mtime #28691
Conversation
@@ -133,11 +169,12 @@ func (m Matcher) MatchFiles() ([]string, error) { | |||
if len(files) == 0 { | |||
return files, errors.Join(fmt.Errorf("no files match the configured criteria"), errs) | |||
} | |||
if len(m.filterOpts) == 0 { | |||
|
|||
if m.f == nil || m.f.SkipFiltering() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need SkipFiltering
? Isn't it enough to check if m.f == nil
?
@@ -50,7 +50,8 @@ Tails and parses logs from files. | |||
| `retry_on_failure.initial_interval` | `1s` | [Time](#time-parameters) to wait after the first failure before retrying. | | |||
| `retry_on_failure.max_interval` | `30s` | Upper bound on retry backoff [interval](#time-parameters). Once this value is reached the delay between consecutive retries will remain constant at the specified value. | | |||
| `retry_on_failure.max_elapsed_time` | `5m` | Maximum amount of [time](#time-parameters) (including retries) spent trying to send a logs batch to a downstream consumer. Once this value is reached, the data is discarded. Retrying never stops if set to `0`. | |||
| `ordering_criteria.regex` | | Regular expression used for sorting, should contain a named capture groups that are to be used in `regex_key`. | | |||
| `ordering_criteria.method` | `regex` | Specifies the mode to use for ordering. Can be `regex` (sort using regex and sort_by rules) or `mtime` (sort by file modified time). | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't this serve basically the same purpose as ordering_criteria.sort_by.sort_type
? Why not add mtime
as an additional option there?
This would also allow users to compose multiple types of ordering and filtering.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I got a little lost looking at the current implementation where every sort type requires the regex to be set, I'll take a stab at reworking to have to be a separate sort type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New implementation for sort_type here: #28850
Closing in favor of #28850 |
Description:
ordering_criteria.mode
config option to theordering_criteria
config, which can be used to select the way ordering criteria is applied (regex
ormtime
)mtime
modeIn order to do this, filtering was abstracted out into its own
Filter
interface, where there are now two implementations (a "Regex" implementation and a "mtime" implementation).An optional follow-up performance improvement may be made here, to have the finder return fs.DirEntry directly to query the mtime without making an extra call to os.Stat for each file.
Link to tracking Issue: #27812
Testing:
Documentation:
mode
parameter to filelogreceiver docs