Skip to content

RollingFileAppender rotates files incorrectly on ext4 due to filesystem timestamp caching #3068

Open
@kelunik

Description

@kelunik

Description

RollingFile rotates files incorrectly if an event is logged shortly (in our case up to 10ms) after the rollover time and the application is restarted within the same rollover period on ext4 (Linux).

Configuration

Version: 2.24.1

Operating system: Debian 11 / Linux 5.10.0-32-amd64

JDK: 21.0.4 Corretto

<RollingFile name="app" fileName="app.log" filePattern="app-%d{yyyy-MM-dd}.log">
   <EcsLayout serviceName="foobar"/><!-- layout doesn't matter -->
   <Policies>
      <TimeBasedTriggeringPolicy modulate="true" interval="1"/>
   </Policies>
</RollingFile>

Reproduction

We've been into a deep rabbit hole to analyze these conditions, but here's what happens and how to reproduce:

  • Let's say we use daily rotation and an event gets logged at 2024-10-07 00:00:000.000000000 (you can also use minute based rotations for easier reproduction)
  • TimeBasedTriggeringPolicy#isTriggeringEvent will return true, which results in a rotation
    • The current app.log file will rotate to app-2024-10-07.log
    • RollingFileManager#createFileAfterRollover will create a new app.log (in FileManager#createOutputStream)
    • The new file apparently gets a creationTime set, but that call is ignored on ext3 / ext4 filesystems.
    • Instead, the created file will get its creation timestamp assigned by the OS, which happens to use current_fs_time() on ext4 – current_fs_time() returns a cached timestamp that's only updated on kernel clock ticks (see getconf CLK_TCK)
    • The default frequency for kernel clock ticks is 100 / second, so the cached timestamp from current_fs_time() might lag behind up to 10 milliseconds
    • We end up with a file creation time of 2024-10-06 23:59:59.997423036 (use stat app.log to confirm)
  • After the night, we restart the application (e.g. due to a deployment)
  • TimeBasedTriggeringPolicy#initialize reads RollingFileManager#getFileTime (initalized by RollingFileManager#initialFileTime), which returns the above borked timestamp.
  • It will also calculate the next rollover to be 2024-10-07 00:00:000.000000000, which in turn will trigger a new rollover right on application start.
  • This "unplanned" rollover will move the current app.log to app-2024-10-06.log, therefore leading to:
    • Loss of the original app-2024-10-06.log file contents
    • Log events of 2024-10-07 00:00:00 until the restart being unexpectedly in app-2024-10-06.log instead of app.log (if still on the same day) or app-2024-10-07.log (later)
    • Unexpected file rotation, causing tail -f app.log to hang (our original symptom) – tail will (by default) follow the file descriptor on the rename to app-2024-10-06.log and new events will be written to app.log instead.

Metadata

Metadata

Assignees

Labels

appenders:RollingAffects log file rolling functionalitybugIncorrect, unexpected, or unintended behavior of existing codewaiting-for-maintainer

Type

No type

Projects

Status

To triage

Milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions