Skip to content

Conversation

@dietWall
Copy link

@dietWall dietWall commented Nov 19, 2025

This PR adds some additional file handling in case a randomly generated machine-id is required.
This is required for some Unit-Tests in Docker environments, where the tests consistently fails if /etc/machine-id is not mounted. (And mounting it would cause too many issues on other systems)

The behaviour should match the implementation in calyptia plugin, implemented in #9691

More specifically:
first, flb tries to read the machine-id from the default locations (/etc/machine-id, /var/lib/dbus/machine-id)
if none of them exists, on the first run a new machine-id is generated and saved to a file
on subsequent runs, the default locations fail as well. In this case fluent-bit tries to read from the previous file and returns the machine-id that has been generated earlier.

if writing a newly generated uuid to a file fails, fluent-bit would generate a different uuid in subsequent runs, in this case users should see same behaviour as before


Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change; please submit the following in a comment:

  • Example configuration file for the change
  • Debug log output from testing the change
  • Attached Valgrind output that shows no leaks or memory corruption was found

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • Run local packaging test showing all targets (including any new ones) build.
  • Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • Documentation required for this feature

Backporting

  • Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Summary by CodeRabbit

  • New Features

    • Adds persistent fallback storage for generated UUIDs so newly generated IDs are saved for reuse; logs a warning if saving fails.
  • Improvements

    • Enhanced machine-ID retrieval to check and use a stored fallback ID when present.
    • Improved validation and logging around UUID generation and storage.

@coderabbitai
Copy link

coderabbitai bot commented Nov 19, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Added a new static helper write_uuid_to_file(...) and integrated a /tmp/flb-machine-id fallback into flb_utils_get_machine_id(): read and sanitize that file if present, otherwise generate a UUID and attempt to persist it (log a warning if write fails).

Changes

Cohort / File(s) Summary
UUID persistence & machine-id updates
src/flb_utils.c
Added static int write_uuid_to_file(const char *filename, char *uuid) (null checks, create/truncate, write, return FLB_TRUE/FLB_FALSE). Introduced fallback_id_file = "/tmp/flb-machine-id" in flb_utils_get_machine_id(), read/sanitize the fallback file if present, and after generating a new UUID attempt to persist it to that file; on write failure log a warning and continue returning the generated UUID. Minor formatting adjustments around the new flow.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant App as Application
    participant Utils as flb_utils_get_machine_id
    participant FS as Filesystem (/tmp/flb-machine-id)

    App->>Utils: request machine id
    Utils->>FS: check/read `/tmp/flb-machine-id`
    alt valid id in file
        FS-->>Utils: return sanitized id
    else missing/invalid
        Utils->>Utils: generate new UUID
        Utils->>FS: write_uuid_to_file(new UUID)
        alt write succeeds
            FS-->>Utils: FLB_TRUE
        else write fails
            FS-->>Utils: FLB_FALSE (warning logged)
            Utils->>Utils: continue using generated UUID
        end
    end
    Utils-->>App: return machine id
Loading

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

  • Verify write_uuid_to_file() null checks, return codes, and static visibility.
  • Confirm file open flags/mode, write length checks, and proper cleanup (close, errno handling).
  • Validate fallback file read/sanitize logic and logging on write failure.

Poem

🐰 I tucked a tiny UUID in /tmp tonight,
Quietly written by moonbeam light,
If the pen slips and the write won't stay,
I hum a warning and bounce away,
The id lives on — snug, small, and bright.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title directly describes the main change: adding file handling for machine-id persistence in Fluent Bit.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

src/flb_utils.c Outdated
Comment on lines 2114 to 2128
if (access(fallback_id_file, F_OK) == 0)
{
ret = machine_id_read_and_sanitize(fallback_id_file, &id, &bytes);
if(ret == 0)
{
if (bytes == 0) {
/* guid is somewhat corrupted */
fallback = FLB_TRUE;
goto fallback;
}
}

*out_id = id;
*out_size = bytes;
return 0;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Guard against failed read of fallback machine-id file

When the fallback machine-id file exists but machine_id_read_and_sanitize() returns an error (for example due to permissions or a short read), the code still writes *out_id = id; and *out_size = bytes; even though id/bytes were never assigned. This returns an uninitialized pointer to callers and can lead to crashes or undefined behaviour. The outputs should only be filled after a successful read or the flow should jump to the fallback generation path.

Useful? React with 👍 / 👎.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see this argument, but it should apply for other locations (dbus_var and dbus_etc) as well?
Also testing for bytes==0 does not include any ids with shorter length (what if the file was not written fully?)

I need to check if machine-id files are always in the same format and length.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
src/flb_utils.c (2)

2046-2069: Use const char* for parameters and consider matching the reference signature.

The parameters filename and uuid are not modified within the function, so they should be declared as const char*. Additionally, the reference implementation in plugins/custom_calyptia/calyptia.c uses flb_sds_t for the filename parameter, which may provide better consistency across the codebase.

Apply this diff to use const parameters:

-int write_uuid_to_file(char* filename, char* uuid) {
+int write_uuid_to_file(const char* filename, const char* uuid) {

2114-2129: Minor formatting inconsistency.

Line 2117 is missing a space between if and the opening parenthesis, which is inconsistent with the rest of the codebase style.

Apply this diff:

-        if(ret == 0)
+        if (ret == 0)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4f8c50b and 8eae3c0.

📒 Files selected for processing (1)
  • src/flb_utils.c (4 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/flb_utils.c (1)
plugins/custom_calyptia/calyptia.c (1)
  • write_uuid_to_file (470-493)

src/flb_utils.c Outdated
size_t bytes;
char *uuid;
int fallback = FLB_FALSE;
char *fallback_id_file = "machine-id"; //should reside in current working directory
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Fallback file path is unreliable — use an absolute or well-known directory.

Using "machine-id" without a directory path means the file will be created in the current working directory, which is unpredictable and environment-dependent. This can lead to:

  • Permission denied errors in restricted directories
  • Files scattered across unexpected locations
  • Inability to persist the machine-id reliably across restarts

Consider using a well-defined location such as /var/lib/fluent-bit/machine-id, /tmp/fluent-bit-machine-id, or a configurable path.

Example fix:

-    char *fallback_id_file = "machine-id";           //should reside in current working directory
+    char *fallback_id_file = "/var/lib/fluent-bit/machine-id";

Note: You may need to ensure the directory exists or handle directory creation.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
char *fallback_id_file = "machine-id"; //should reside in current working directory
char *fallback_id_file = "/var/lib/fluent-bit/machine-id";
🤖 Prompt for AI Agents
In src/flb_utils.c around line 2078, the hardcoded fallback filename
"machine-id" is unreliable because it creates files in the current working
directory; change it to use a well-defined absolute path or a configurable
location (e.g. /var/lib/fluent-bit/machine-id or /tmp/fluent-bit-machine-id),
ensure the directory exists (create it if necessary with safe permissions), fall
back to a safe temp location if creation fails, and expose the path as a
configurable option so callers can override it.

Signed-off-by: dietWall <dietrich7@gmx.de>
@dietWall dietWall marked this pull request as draft November 19, 2025 13:51
@dietWall
Copy link
Author

implementing suggested changes

…ll/fluent-bit into dietwall/test_machine_id_in_docker
@dietWall
Copy link
Author

@coderabbitai review

@coderabbitai
Copy link

coderabbitai bot commented Nov 19, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/flb_utils.c (1)

2079-2128: Fix use of uninitialized id/bytes when reading fallback_id_file fails.

When machine_id_read_and_sanitize(fallback_id_file, &id, &bytes) returns non‑zero, id and bytes are uninitialized, but the code still assigns them to *out_id / *out_size and returns, leading to undefined behaviour and potential crashes. The outputs should only be used on ret == 0; otherwise fall back to the UUID path.

Also, note that machine_id_read_and_sanitize() currently dereferences buf + bytes - 1 even when bytes == 0, so an empty file can cause out‑of‑bounds access before you ever see bytes == 0 here.

Suggested restructuring:

-    if (access(fallback_id_file, F_OK) == 0){
-        ret = machine_id_read_and_sanitize(fallback_id_file, &id, &bytes);
-        if (ret == 0){
-            if (bytes == 0) {
-                /* guid is somewhat corrupted */
-                fallback = FLB_TRUE;
-                goto fallback;
-            }
-        }
-
-        *out_id = id;
-        *out_size = bytes;
-        return 0;
-    }
+    if (access(fallback_id_file, F_OK) == 0) {
+        ret = machine_id_read_and_sanitize(fallback_id_file, &id, &bytes);
+        if (ret == 0) {
+            if (bytes == 0) {
+                /* guid is somewhat corrupted */
+                fallback = FLB_TRUE;
+                goto fallback;
+            }
+
+            *out_id = id;
+            *out_size = bytes;
+            return 0;
+        }
+
+        /* Failed to read/parse fallback file: treat as corrupted and
+         * fall back to a fresh UUID. */
+        fallback = FLB_TRUE;
+        goto fallback;
+    }

Separately, consider hardening machine_id_read_and_sanitize() so that a zero‑length file is handled without touching buf[-1] (e.g., detect bytes == 0 right after the read and return an error).

🧹 Nitpick comments (1)
src/flb_utils.c (1)

2232-2241: Guard machine-id persistence to Linux and update stale comment.

The review concern is valid. The fallback_id_file is defined globally at line 2080 as "/tmp/flb-machine-id" outside all platform guards, and the fallback code at lines 2232-2241 runs on all platforms. Non-Linux systems (FreeBSD, Windows, macOS, etc.) that fall through to the fallback label will attempt to write to /tmp, which is inappropriate for those platforms and will fail with unnecessary warning logs. Additionally, the comment claiming the file "should reside in current working directory" is outdated and contradicts the absolute /tmp path now used.

Implement the suggested guard:

  • Wrap the write operation (lines 2232-2241) with #ifdef __linux__ / #endif
  • Update or remove the stale comment at line 2080
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8eae3c0 and b08cc3b.

📒 Files selected for processing (1)
  • src/flb_utils.c (4 hunks)
🧰 Additional context used
🪛 ast-grep (0.40.0)
src/flb_utils.c

[warning] 2055-2055: This call makes a world-writable file which allows any user on a machine to write to the file. This may allow attackers to influence the behaviour of this process by writing to the file.
Context: 0666
Note: [CWE-732]: Incorrect Permission Assignment for Critical Resource [REFERENCES]
- https://wiki.sei.cmu.edu/confluence/display/c/FIO06-C.+Create+files+with+appropriate+access+permissions

(world-writable-file-c)

Signed-off-by: dietWall <dietrich7@gmx.de>
@dietWall dietWall marked this pull request as ready for review November 19, 2025 19:58
Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +2123 to +2127
}

*out_id = id;
*out_size = bytes;
return 0;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Return uninitialized fallback ID on read failure

If a fallback machine-id file exists but machine_id_read_and_sanitize fails (e.g., unreadable or malformed), ret remains non‑zero yet the code still assigns id/bytes to the outputs and returns success because the return block sits outside the ret == 0 guard. In that scenario id and bytes are uninitialized, so callers can receive a garbage pointer/size or crash instead of regenerating an ID. The return should stay within the successful read path or jump to the fallback path when the read fails.

Useful? React with 👍 / 👎.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
src/flb_utils.c (1)

2115-2128: Critical: Unguarded assignment of output parameters on read failure.

When machine_id_read_and_sanitize fails (returns non-zero), the code still assigns *out_id = id and *out_size = bytes at lines 2125-2127, even though id and bytes may be uninitialized or contain error values. This can lead to crashes or undefined behavior.

Apply this diff to only assign outputs on successful read:

     if (access(fallback_id_file, F_OK) == 0){
         ret = machine_id_read_and_sanitize(fallback_id_file, &id, &bytes);
         if (ret == 0){
             if (bytes == 0) {
                 /* guid is somewhat corrupted */
                 fallback = FLB_TRUE;
                 goto fallback;
             }
+            *out_id = id;
+            *out_size = bytes;
+            return 0;
         }
-
-        *out_id = id;
-        *out_size = bytes;
-        return 0;
     }
🧹 Nitpick comments (2)
src/flb_utils.c (2)

2046-2070: Use const parameter and wrapped I/O functions for consistency.

The uuid parameter should be const char * since it's not modified, and the function should use flb_open, flb_write, and flb_close instead of the raw POSIX calls for consistency with the similar implementation in the calyptia plugin.

Apply this diff:

-static int write_uuid_to_file(const char* filename, char* uuid)
+static int write_uuid_to_file(const char* filename, const char* uuid)
 {
     int fd;
     size_t uuid_len;
 
     if (filename == NULL || uuid == NULL) {
         return FLB_FALSE;
     }
 
     /* write uuid to file */
-    fd = open(filename, O_CREAT | O_WRONLY | O_TRUNC, 0600);
+    fd = flb_open(filename, O_CREAT | O_WRONLY | O_TRUNC, 0600);
     if (fd == -1) {
         return FLB_FALSE;
     }
 
     uuid_len = strlen(uuid);
 
-    if (write(fd, uuid, uuid_len) != uuid_len) {
-        close(fd);
+    if (flb_write(fd, uuid, uuid_len) != uuid_len) {
+        flb_close(fd);
         return FLB_FALSE;
     }
 
-    close(fd);
+    flb_close(fd);
     return FLB_TRUE;
 }

Based on learnings


2079-2079: Remove or correct the misleading comment.

The comment states "should reside in current working directory" but the path is absolute (/tmp/flb-machine-id), not relative to the current working directory.

Apply this diff:

-    char *fallback_id_file = "/tmp/flb-machine-id";           /*should reside in current working directory*/
+    char *fallback_id_file = "/tmp/flb-machine-id";
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b08cc3b and 7164855.

📒 Files selected for processing (1)
  • src/flb_utils.c (4 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/flb_utils.c (1)
plugins/custom_calyptia/calyptia.c (1)
  • write_uuid_to_file (470-493)
🔇 Additional comments (1)
src/flb_utils.c (1)

2234-2241: LGTM: Correct failure handling for file write.

The logic correctly checks for write failure with if (ret == FLB_FALSE) and logs an appropriate warning message. The function continues to return the generated UUID even if persistence fails, which matches the intended behavior described in the PR objectives.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant