This repository was archived by the owner on Feb 2, 2021. It is now read-only.

duplicate bytes are sent to blob when a buffer exceeds the append block size #24

Open
@avoltz

Description

There seems to be an off-by-one error when a buffer exceeds the append block size and is sent to storage in two append operations: a byte at the block boundary ends up duplicated in the blob.
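For illustration only, here is a minimal Python sketch of one way such an off-by-one could produce exactly this symptom (the function and names are hypothetical, not the plugin's actual Ruby code): every chunk after the first starts one byte early, so the last byte of each full block is sent twice.

```python
BLOCK_LIMIT = 4 * 1024 * 1024  # Azure caps an append block at 4 MiB

def split_with_off_by_one(data, limit):
    """Buggy chunker: each chunk after the first starts one byte
    early, re-sending the last byte of the previous chunk."""
    chunks, pos = [], 0
    while pos < len(data):
        end = min(pos + limit, len(data))
        chunks.append(data[pos:end])
        pos = end - 1 if end < len(data) else end  # bug: should be pos = end
    return chunks

# Tiny stand-in for the 10 MB log: 8 lines of b"a\n", 4-byte "blocks".
joined = b"".join(split_with_off_by_one(b"a\n" * 8, 4))
# The reassembled stream gains one duplicate byte per block boundary:
# when the boundary falls on a "\n" you get an extra blank line, and
# when it falls on an "a" you get an "aa" line, matching the diff below.
```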

To repro this, I made a large text file:

$ yes 'a' | head -n 5000000 > bytetest.txt
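Each line is two bytes ("a" plus a newline), so this file is 10,000,000 bytes. Assuming the plugin splits uploads at Azure's 4 MiB append-block limit, that spans three append blocks and therefore two block boundaries, which matches the two hunks in the diff further down:

```python
import math

LINES = 5_000_000
LINE_BYTES = 2                      # "a" + "\n"
BLOCK = 4 * 1024 * 1024             # Azure append-block size limit: 4 MiB

total = LINES * LINE_BYTES          # total bytes written to the tailed file
blocks = math.ceil(total / BLOCK)   # append operations needed
boundaries = blocks - 1             # places where a duplicate can appear
print(total, blocks, boundaries)    # 10000000 3 2
```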

Then configured fluentd with this:

<source>
  @type tail
  path /var/log/bytetest
  pos_file /var/log/fluentd/bytetest.pos
  tag bytetest
  read_from_head true
  <parse>
    @type none
  </parse>
</source>
<match bytetest>
  @type azure-storage-append-blob
  azure_storage_account     <account>
  azure_storage_access_key  <key>
  azure_container           mycontainer
  auto_create_container     true
  path                      /
  azure_object_key_format   bytetest.log
  time_slice_format         %Y-%m-%d/%Y-%m-%dT%H:%M:00
  <format>
    @type single_value
  </format>
  <buffer tag,time>
    @type file
    path /var/log/fluentd/azblob.bytetest
    flush_mode interval
    flush_at_shutdown false
    timekey 60 # 1 minute
    timekey_wait 60
  </buffer>
</match>

Then cat the contents into the tailed file so that fluentd buffers the entire thing:

$ sudo touch /var/log/bytetest
$ cat bytetest.txt | sudo tee -a /var/log/bytetest > /dev/null

The resulting blob, downloaded as bytetest2.txt, has extra bytes:

aadmin@atf5f7ce0c04c-linux-1:~$ diff -u bytetest.txt bytetest2.txt
--- bytetest.txt	2020-10-07 13:06:30.144485029 +0000
+++ bytetest2.txt	2020-10-07 13:16:54.164703237 +0000
@@ -2097150,6 +2097150,7 @@
 a
 a
 a
+
 a
 a
 a
@@ -4194301,7 +4194302,7 @@
 a
 a
 a
-a
+aa
 a
 a
 a
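The corruption sites line up exactly with 4 MiB append-block boundaries. In the corrupted file the extra blank line follows line 2,097,152 of the original two-byte-per-line stream, putting the first duplicate byte at offset 4,194,304 (4 MiB), and the duplicated "a" lands at offset 8,388,608 (8 MiB). A quick sanity check of that arithmetic, with the line numbers read off the diff above:

```python
MIB = 1024 * 1024

# First hunk: an extra "\n" line appears after line 2_097_152 of the
# original stream; every original line is b"a\n" (2 bytes).
first_dup_offset = 2_097_152 * 2
assert first_dup_offset == 4 * MIB   # exactly the 4 MiB block limit

# Second hunk: "a" becomes "aa" on line 4_194_305 of the corrupted file.
# Bytes before that line: 4_194_303 two-byte lines plus the 1-byte extra
# blank line; the duplicated 'a' follows the genuine one at that line.
second_dup_offset = 4_194_303 * 2 + 1 + 1
assert second_dup_offset == 8 * MIB  # exactly two block limits
```

So each duplicated byte sits precisely where one 4 MiB append ends and the next begins, consistent with the last byte of each block being re-sent.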
