file_chunk: add stricter checks for broken meta files #4998

Open
wants to merge 2 commits into
base: master

@Watson1978 Watson1978 commented Jun 9, 2025

Which issue(s) this PR fixes:
Fixes #

What this PR does / why we need it:
This PR improves meta file corruption checking.

The meta file contains at least the following field values.

data = @metadata.to_h.merge({
  id: @unique_id,
  s: (update ? @size + @adding_size : @size),
  c: @created_at,
  m: (update ? Fluent::Clock.real_now : @modified_at),
})

@size may legitimately be 0.
However, @unique_id, @created_at, and @modified_at are set when FileChunk is initialized, so they always have values.
These fields should therefore always be present in the meta file.

So this PR adds checks for the id, c, and m fields.

This PR reinforces #1874.

Without this change, a broken meta file can cause the following error every time fluentd launches:

2025-06-06 12:11:26 +0900 [error]: unexpected error while checking flushed chunks. ignored. error_class=NoMethodError error="undefined method '<' for nil"
  2025-06-06 12:11:26 +0900 [error]: /Users/watson/src/fluentd/lib/fluent/plugin/output.rb:1479:in 'block in Fluent::Plugin::Output#enqueue_thread_run'
  2025-06-06 12:11:26 +0900 [error]: /Users/watson/src/fluentd/lib/fluent/plugin/buffer.rb:548:in 'block in Fluent::Plugin::Buffer#enqueue_all'
  2025-06-06 12:11:26 +0900 [error]: /Users/watson/src/fluentd/lib/fluent/plugin/buffer.rb:542:in 'Array#each'
  2025-06-06 12:11:26 +0900 [error]: /Users/watson/src/fluentd/lib/fluent/plugin/buffer.rb:542:in 'Fluent::Plugin::Buffer#enqueue_all'
  2025-06-06 12:11:26 +0900 [error]: /Users/watson/src/fluentd/lib/fluent/plugin/output.rb:1479:in 'Fluent::Plugin::Output#enqueue_thread_run'
  2025-06-06 12:11:26 +0900 [error]: /Users/watson/src/fluentd/lib/fluent/plugin_helper/thread.rb:78:in 'block in Fluent::PluginHelper::Thread#thread_create'
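
As a minimal illustration of the failure mode in this trace: in Ruby, calling < on nil raises NoMethodError. The sketch below assumes, without confirmation from the trace itself, that a timestamp missing from a broken meta file ends up as nil and is later compared against a threshold. older_than? is a hypothetical helper, not fluentd code.

```ruby
# Hypothetical helper (not part of fluentd): compares a chunk timestamp
# against a threshold, the way a flush check might.
def older_than?(modified_at, threshold)
  modified_at < threshold
end

# Assumption: a meta file without the :m field leaves the timestamp as nil.
modified_at = nil

begin
  older_than?(modified_at, Time.now.to_i)
rescue NoMethodError => e
  # The exact message wording depends on the Ruby version.
  puts "#{e.class}: #{e.message}"
end
```

Because the broken file stays on disk, a comparison like this would fail again on every launch, which is why rejecting the file at restore time is the more robust place to catch it.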

Docs Changes:

Release Note:

@Watson1978 Watson1978 force-pushed the file_chunk branch 4 times, most recently from 2487088 to ae638cb Compare June 10, 2025 08:16
Signed-off-by: Shizuo Fujita <fujita@clear-code.com>
Signed-off-by: Shizuo Fujita <fujita@clear-code.com>
- @bufdir = File.expand_path('../../tmp/broken_buffer_file', __FILE__)
- FileUtils.rm_rf @bufdir rescue nil
- FileUtils.mkdir_p @bufdir
+ @bufdir = Dir.mktmpdir

@Watson1978 Watson1978 Jun 11, 2025

Hmm, on Windows, FileUtils.rm_rf sometimes fails if @bufdir already exists, which makes the test unstable.

Therefore, Dir.mktmpdir is used here instead.
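
A minimal sketch of the pattern described in this comment, assuming a setup/teardown style test. with_tmp_bufdir and the chunk.meta file name are hypothetical and are not taken from the fluentd test suite. Each call gets a fresh, uniquely named directory, so nothing has to be deleted before the test starts.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper: creates a unique temporary buffer directory,
# yields it to the block, and removes it afterwards.
def with_tmp_bufdir(prefix = "broken_buffer_file")
  bufdir = Dir.mktmpdir(prefix)
  yield bufdir
ensure
  FileUtils.remove_entry(bufdir) if bufdir && File.directory?(bufdir)
end

with_tmp_bufdir do |bufdir|
  # Illustrative stand-in for a broken (empty) meta file.
  meta_path = File.join(bufdir, "chunk.meta")
  File.binwrite(meta_path, "")
  puts "created #{meta_path} (#{File.size(meta_path)} bytes)"
end
```

Dir.mktmpdir also accepts a block and removes the directory itself when the block returns. The explicit helper is shown only to mirror separate setup and teardown steps.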

@Watson1978 Watson1978 marked this pull request as ready for review June 11, 2025 04:08
@Watson1978 Watson1978 requested review from kenhys and daipom June 11, 2025 04:27
Comment on lines +223 to +225
raise FileChunkError, "invalid unique_id" unless data[:id]
raise FileChunkError, "invalid created_at" unless data[:c].to_i > 0
raise FileChunkError, "invalid modified_at" unless data[:m].to_i > 0

@Watson1978 Watson1978 Jun 12, 2025

restore_metadata is invoked only when the *.meta file exists.
This means that write_metadata was invoked earlier and created the meta file.

Here, the check verifies that each field written by write_metadata has an appropriate value.
If a value is not appropriate, some kind of error likely occurred during writing.
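
The three checks quoted above can be tried in isolation with a small sketch. It assumes the restored metadata is a Hash with symbol keys. The FileChunkError class defined here is a local stand-in for the real one, and validate_restored_metadata is a hypothetical wrapper, not the actual restore_metadata method.

```ruby
# Local stand-in for fluentd's FileChunkError (assumption for this sketch).
class FileChunkError < StandardError; end

# Hypothetical wrapper around the three checks from the diff.
# nil.to_i is 0, so a missing :c or :m field fails the "> 0" test.
def validate_restored_metadata(data)
  raise FileChunkError, "invalid unique_id" unless data[:id]
  raise FileChunkError, "invalid created_at" unless data[:c].to_i > 0
  raise FileChunkError, "invalid modified_at" unless data[:m].to_i > 0
  data
end

# A size of 0 is acceptable, so :s is deliberately not checked.
good = { id: "abc", s: 0, c: 1_749_000_000, m: 1_749_000_100 }
# Illustrative broken case: :m is missing, as after an interrupted write.
broken = { id: "abc", s: 0, c: 1_749_000_000 }

validate_restored_metadata(good)

begin
  validate_restored_metadata(broken)
rescue FileChunkError => e
  puts "rejected: #{e.message}"
end
```

Raising at restore time surfaces the corruption once, when the file is read, instead of letting a nil field propagate to later comparisons.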
