file_chunk: add stricter checks for broken meta files #4998

Open
wants to merge 2 commits into
base: master
10 changes: 7 additions & 3 deletions lib/fluent/plugin/buffer/file_chunk.rb
@@ -219,13 +219,17 @@ def restore_metadata(bindata)
 # old type of restore
 data = Fluent::MessagePackFactory.msgpack_unpacker(symbolize_keys: true).feed(bindata).read rescue {}
 end
 raise FileChunkError, "invalid meta data" if data.nil? || !data.is_a?(Hash)
+raise FileChunkError, "invalid unique_id" unless data[:id]
+raise FileChunkError, "invalid created_at" unless data[:c].to_i > 0
+raise FileChunkError, "invalid modified_at" unless data[:m].to_i > 0
Comment on lines +223 to +225
@Watson1978 (Contributor, Author) commented on Jun 12, 2025

restore_metadata is invoked only when a *.meta file exists.
This means that write_metadata was invoked earlier and created the meta file.

Here, it checks that the fields written by write_metadata hold appropriate values.
If a value is not appropriate, some kind of error likely occurred during writing.
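
The checks described in this comment can be tried in isolation. The following is a minimal stand-alone sketch, not the plugin code: `MetaError` and `validate_meta!` are hypothetical names standing in for `FileChunkError` and the checks inside `restore_metadata`, and the input is assumed to be an already-decoded hash with symbol keys.

```ruby
# Hypothetical stand-in for Fluent's FileChunkError.
class MetaError < StandardError; end

# Mirrors the checks from the diff on an already-decoded metadata hash.
# (Hypothetical helper; the real checks live inside restore_metadata.)
def validate_meta!(data)
  raise MetaError, "invalid meta data" unless data.is_a?(Hash)
  raise MetaError, "invalid unique_id" unless data[:id]
  # nil.to_i is 0, so a missing timestamp fails the same way as a zeroed one.
  raise MetaError, "invalid created_at" unless data[:c].to_i > 0
  raise MetaError, "invalid modified_at" unless data[:m].to_i > 0
  data
end

# A well-formed hash passes through unchanged (values are made up).
validate_meta!({ id: "abc", s: 10, c: 1_749_686_400, m: 1_749_686_400 })

# A zeroed created_at is rejected.
begin
  validate_meta!({ id: "abc", c: 0, m: 1_749_686_400 })
rescue MetaError => e
  puts e.message
end
```

Because the timestamps are compared after `to_i`, a field that is absent and a field that was overwritten with zeros are both treated as broken.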


+
 now = Fluent::Clock.real_now

-@unique_id = data[:id] || self.class.unique_id_from_path(@path) || @unique_id
+@unique_id = data[:id]
 @size = data[:s] || 0
-@created_at = data.fetch(:c, now.to_i)
-@modified_at = data.fetch(:m, now.to_i)
+@created_at = data[:c]
+@modified_at = data[:m]

 @metadata.timekey = data[:timekey]
 @metadata.tag = data[:tag]
31 changes: 28 additions & 3 deletions test/plugin/test_buf_file.rb
@@ -1180,9 +1180,7 @@ def write_metadata(path, chunk_id, metadata, size, ctime, mtime)
 sub_test_case 'there are existing broken file chunks' do
 setup do
 @id_output = 'backup_test'
-@bufdir = File.expand_path('../../tmp/broken_buffer_file', __FILE__)
-FileUtils.rm_rf @bufdir rescue nil
-FileUtils.mkdir_p @bufdir
+@bufdir = Dir.mktmpdir
@Watson1978 (Contributor, Author) commented on Jun 11, 2025

Hmm, FileUtils.rm_rf sometimes fails on Windows when @bufdir already exists, which makes the test unstable.

Therefore, Dir.mktmpdir is used here instead.
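
The pattern described in this comment (a fresh temporary directory per test, removed afterwards) might look like the sketch below. `with_tmp_bufdir` is a hypothetical helper written for illustration; the actual test uses `setup` and `teardown` blocks instead.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: yields a newly created temporary directory and
# removes it afterwards, even if the block raises.
def with_tmp_bufdir
  dir = Dir.mktmpdir            # unique directory, so no stale files to delete first
  yield dir
ensure
  FileUtils.rm_rf(dir) if dir   # best-effort cleanup, as in the teardown
end

with_tmp_bufdir do |bufdir|
  bufpath = File.join(bufdir, 'broken_test.*.log')
  puts bufpath
end
```

Since each run gets its own directory, the setup step never has to delete a pre-existing path, which is the operation reported as unreliable on Windows.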

 @bufpath = File.join(@bufdir, 'broken_test.*.log')

 Fluent::Test.setup
@@ -1197,6 +1195,7 @@ def write_metadata(path, chunk_id, metadata, size, ctime, mtime)
 @p.close unless @p.closed?
 @p.terminate unless @p.terminated?
 end
+FileUtils.rm_rf(@bufdir) rescue nil
 end

 def setup_plugins(buf_conf)
@@ -1320,6 +1319,32 @@ def compare_log(plugin, msg)
 compare_log(@p, 'enqueued meta file is broken')
 assert { not File.exist?(p2) }
 assert { File.exist?("#{@bufdir}/backup/worker0/#{@id_output}/#{@d.dump_unique_id_hex(c2id)}.log") }
+
+# broken id, c, m fields in meta data
+c3id, p3 = create_first_chunk('q')
+metadata = File.read(p3 + '.meta')
+File.open(p3 + '.meta', 'wb') { |f| f.write(metadata[0..6] + "\0" * (metadata.size - 6)) } # create enqueued broken meta file
+
+Fluent::SystemConfig.overwrite_system_config('root_dir' => @bufdir) do
+@p.start
+end
+
+compare_log(@p, 'enqueued meta file is broken')
+assert { not File.exist?(p3) }
+assert { File.exist?("#{@bufdir}/backup/worker0/#{@id_output}/#{@d.dump_unique_id_hex(c3id)}.log") }
+
+# truncate meta data
+c4id, p4 = create_first_chunk('q')
+metadata = File.read(p4 + '.meta')
+File.open(p4 + '.meta', 'wb') { |f| f.write(metadata[0..-2]) } # create enqueued broken meta file with last byte truncated
+
+Fluent::SystemConfig.overwrite_system_config('root_dir' => @bufdir) do
+@p.start
+end
+
+compare_log(@p, 'enqueued meta file is broken')
+assert { not File.exist?(p4) }
+assert { File.exist?("#{@bufdir}/backup/worker0/#{@id_output}/#{@d.dump_unique_id_hex(c4id)}.log") }
 end

 test '#resume throws away broken chunk with disable_chunk_backup' do
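
The two corruption expressions used in the added test cases can be tried on a plain string to see what they produce. This is only a sketch: `zero_fill_tail` and `drop_last_byte` are hypothetical helper names, and the sample is a 16-byte stand-in rather than real msgpack-encoded metadata.

```ruby
# Hypothetical helper reproducing the "broken id, c, m fields" expression:
# keeps bytes 0..6 (seven bytes) and appends (size - 6) NUL bytes.
def zero_fill_tail(bytes)
  bytes[0..6] + "\0" * (bytes.size - 6)
end

# Hypothetical helper reproducing the "truncate meta data" expression:
# drops only the final byte.
def drop_last_byte(bytes)
  bytes[0..-2]
end

sample = "0123456789ABCDEF".b   # 16-byte stand-in, not real meta data

puts zero_fill_tail(sample).bytesize   # 7 kept bytes + 10 NUL bytes
puts drop_last_byte(sample).bytesize   # one byte shorter than the input
```

With these expressions, the zero-filled result is one byte longer than the input (seven bytes kept, `size - 6` bytes appended), while the truncated result is one byte shorter.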