Fix stderr closure bug in LogFile::check_fd() using fstat() #12407

JakeChampion · 2025-07-31T08:00:57Z

Potentially fixes #12197 and #8955.

The check_fd() function had a potential bug where it would close and reopen
log files to verify existence, which could inadvertently close stderr/stdout
when logging to these special files, breaking error reporting for the entire
process.

Closing stderr also frees fd 2 for whichever caller invokes open() next. In our case that was a RocksDB SST file, and our RocksDB instance became corrupted as a result.
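
The descriptor reuse described above follows from POSIX, which has open() return the lowest-numbered descriptor that is not currently open. A minimal sketch of that behaviour, using /dev/null rather than stderr so nothing important is closed; the function name is hypothetical and not part of the PR:

```cpp
#include <cassert>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical demo (not from the PR): closes a descriptor and checks
// whether the very next open() hands back the same descriptor number.
bool next_open_reuses_closed_fd()
{
  int first = open("/dev/null", O_WRONLY);
  assert(first >= 0);
  close(first); // the number held by `first` is now free again

  // open() returns the lowest free descriptor, which is the one just closed
  // (assuming no other thread opens a file in between).
  int second = open("/dev/null", O_WRONLY);
  assert(second >= 0);

  bool reused = (second == first);
  close(second);
  return reused;
}
```

If fd 2 is the descriptor that was closed, the next file opened anywhere in the process takes its place, and anything still writing to "stderr" writes into that file instead.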

Fixed by replacing the access() + close/reopen pattern with fstat() on the open file descriptor. This approach:

- Uses st_nlink == 0 to detect when regular files have been unlinked
- Never triggers for special files like stderr/stdout (they maintain st_nlink > 0)
- Eliminates string comparisons and special-case handling
- Provides universal safety across all file descriptor types
- Maintains detection of externally moved/deleted log files for rotation
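
To illustrate the st_nlink check in isolation, here is a minimal standalone sketch. It is not the PR's code: `fd_is_unlinked` and `demo_detects_unlink` are hypothetical names, and the temporary file path is an assumption chosen for the demo.

```cpp
#include <cassert>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper: reports whether the file behind an open descriptor
// has no remaining directory entries. Returns false if fstat() fails.
bool fd_is_unlinked(int fd)
{
  struct stat st;
  if (fstat(fd, &st) != 0) {
    return false;
  }
  return st.st_nlink == 0;
}

// Hypothetical demo: creates a temporary file, checks it, unlinks it while
// the descriptor stays open, then checks again.
bool demo_detects_unlink()
{
  char path[] = "/tmp/logfile-sketch-XXXXXX";
  int fd = mkstemp(path);
  assert(fd >= 0);

  bool before = fd_is_unlinked(fd); // file still has a name
  unlink(path);                     // remove the only directory entry
  bool after = fd_is_unlinked(fd);  // descriptor is open, link count dropped

  close(fd);
  return !before && after;
}
```

The check never closes or reopens anything, so it is safe to run against any descriptor, including one that refers to a terminal or pipe.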

Copilot

Pull Request Overview

This PR fixes a critical bug in the LogFile::check_fd() function where closing and reopening log files could inadvertently close stderr/stdout when logging to these special files, potentially corrupting the process's error reporting and other file descriptors.

- Replaces the access() + close/reopen pattern with fstat() on the open file descriptor
- Uses st_nlink == 0 to detect unlinked regular files while preserving special files like stderr/stdout
- Adds fallback logic for cases where fstat() fails or no valid file descriptor exists

Copilot · 2025-09-02T21:46:38Z

src/proxy/logging/LogFile.cc

+    if (m_name && is_open()) {
+      int fd = get_fd();
+      if (fd >= 0) {
+        struct stat st;
+        if (fstat(fd, &st) == 0) {
+          // If the file has been unlinked, st_nlink will be 0
+          // This only happens for regular files, not special files like stderr/stdout
+          if (st.st_nlink == 0) {
+            close_file();
+          }
+        } else {
+          // fstat failed, the file descriptor may be invalid
+          // Fall back to path-based existence check
+          if (!LogFile::exists(m_name)) {
+            close_file();
+          }
+        }
+      } else {
+        // No valid fd, fall back to path-based check
+        if (!LogFile::exists(m_name)) {
+          close_file();
+        }
+      }
    }


The nested conditional structure creates excessive complexity with duplicated fallback logic. Consider extracting the fallback check into a helper function or restructuring the conditions to reduce nesting and eliminate the duplicate LogFile::exists(m_name) checks.
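
One possible shape for that restructuring, as a standalone sketch rather than a drop-in patch: the function name `should_close_log` is hypothetical, and a plain access() call stands in for LogFile::exists(), which is an assumption about what that method does.

```cpp
#include <cassert>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical free function (not the PR's code): decides whether an open
// log file should be closed so that it can be recreated.
bool should_close_log(int fd, const char *path)
{
  assert(path != nullptr);

  struct stat st;
  if (fd >= 0 && fstat(fd, &st) == 0) {
    // Descriptor is usable: trust the link count and skip the path check.
    return st.st_nlink == 0;
  }

  // Single fallback for both "no valid fd" and "fstat failed".
  // access() is an assumed stand-in for LogFile::exists().
  return access(path, F_OK) != 0;
}
```

Folding the two conditions into one `if` leaves a single fallback path, so the existence check appears once instead of twice.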

bneradt

Looks good. You might be fixing an issue commented in the autests too. Can you please try running with this applied to your patch:

diff --git a/tests/gold_tests/logging/log-filenames.test.py b/tests/gold_tests/logging/log-filenames.test.py
index a4fdbc73b..ec6e8928a 100644
--- a/tests/gold_tests/logging/log-filenames.test.py
+++ b/tests/gold_tests/logging/log-filenames.test.py
@@ -257,11 +257,4 @@ class stderrTest(LogFilenamesTest):
 DefaultNamedTest()
 CustomNamedTest()
 stdoutTest()
-
-# The following stderr test can be run successfully by hand using the replay
-# files from the sandbox. All the expected output goes to stderr. However, for
-# some reason during the AuTest run, the stderr output stops emitting after the
-# logging.yaml file is parsed. This is left here for now because it is valuable
-# for use during development, but it is left commented out so that it doesn't
-# produce the false failure in CI and developer test runs.
-# stderrTest()
+stderrTest()

That patch works fine for me locally with or without your patch, but I wonder whether it's related to what you are fixing.

JakeChampion · 2025-09-03T10:02:13Z

I pushed that patch up, the tests passed with it ☺️

bneradt

Thank you for contributing the fix!

JakeChampion force-pushed the jake/logfile-check_fd branch from 76b1841 to 7081c6c Compare July 31, 2025 08:21

JakeChampion changed the title ~~Fix potential stderr closure bug in LogFile::check_fd()~~ Fix stderr closure bug in LogFile::check_fd() using fstat() Jul 31, 2025

JakeChampion force-pushed the jake/logfile-check_fd branch from 7081c6c to c1faa2d Compare July 31, 2025 08:22

JakeChampion force-pushed the jake/logfile-check_fd branch from c1faa2d to e4157be Compare July 31, 2025 08:23

JakeChampion marked this pull request as ready for review July 31, 2025 09:58

bneradt self-requested a review August 4, 2025 22:11

bneradt assigned JakeChampion Aug 4, 2025

bneradt added Logging Bug labels Aug 4, 2025

bneradt added this to the 10.2.0 milestone Aug 4, 2025

bneradt requested a review from Copilot September 2, 2025 21:45

Copilot AI reviewed Sep 2, 2025

bneradt reviewed Sep 2, 2025

Enables stderr test in logging test suite

912fae0

bneradt approved these changes Sep 3, 2025

bneradt merged commit b13b985 into apache:master Sep 3, 2025
15 checks passed

JakeChampion deleted the jake/logfile-check_fd branch September 3, 2025 14:56

JakeChampion mentioned this pull request Sep 15, 2025

Logs to stdout or stderr stops after a few minutes #8955

Closed

bneradt linked an issue Dec 1, 2025 that may be closed by this pull request

Logs to stdout or stderr stops after a few minutes #8955

Closed
