Production-grade atomic file writer with crash safety, SHA-256 verification, and failure-aware design

edbzed/fs-write-safe

fs-write-safe

This tool demonstrates file handling in Perl, focusing on correctness, safety, and failure-aware design rather than convenience shortcuts.

Problem It Solves

Partial writes corrupt configs, exports, and reports during crashes. Traditional file writes are not atomic:

# DANGEROUS - Multiple failure points
open(my $fh, '>', 'config.json');  # File truncated - data lost if crash here
print $fh $data;                    # Partial data if crash here
close($fh);                         # Data may not reach disk yet

If power fails or the process crashes at any point, you get a corrupted or empty file.

How It Works

fs-write-safe guarantees atomicity through a proven sequence:

1. Write data to temp file (.tmp_file_XXXXXX)  → Original untouched
2. flush() + sync()                             → Data on disk
3. rename()                                     → Atomic operation (POSIX guarantee)

The rename() system call is atomic on POSIX systems. The target file instantly switches from old content to new content with no intermediate state visible to other processes.
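
The three steps above can be sketched with core modules only. This is a minimal illustration, not the module's actual implementation: the function name `atomic_write_sketch` is hypothetical, and the temp-file template simply mirrors the `.tmp_file_XXXXXX` pattern shown above.

```perl
use strict;
use warnings;
use File::Basename qw(dirname);
use File::Temp ();
use IO::Handle;

# Hypothetical helper illustrating the sequence; not the FSWriteSafe API.
sub atomic_write_sketch {
    my ($target, $data) = @_;

    # 1. Create the temp file next to the target so rename() stays
    #    on a single filesystem. The original file is untouched.
    my $tmp = File::Temp->new(
        TEMPLATE => '.tmp_file_XXXXXX',
        DIR      => dirname($target),
        UNLINK   => 0,
    );
    my $tmp_path = $tmp->filename;
    binmode($tmp);

    my $ok = eval {
        print {$tmp} $data or die "write failed: $!\n";
        # 2. Flush Perl's buffer, then fsync the file to storage.
        $tmp->flush or die "flush failed: $!\n";
        $tmp->sync  or die "sync failed: $!\n";
        close($tmp) or die "close failed: $!\n";
        # 3. Atomically replace the target with the finished temp file.
        rename($tmp_path, $target) or die "rename failed: $!\n";
        1;
    };
    unless ($ok) {
        my $err = $@;
        unlink $tmp_path;    # never leave a stray temp file behind
        die $err;
    }
    return length $data;
}
```

File::Temp creates the temp file with mode 0600, so a real tool would still need to apply the requested permissions before the rename.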

Installation

# No external dependencies beyond core Perl
cd fs-write-safe
perl -Ilib bin/fs-write-safe --help

Usage

Command Line

# Write from stdin
echo '{"key": "value"}' | fs-write-safe /etc/app/config.json

# Write from file with verification
fs-write-safe --verify -i data.txt /path/to/output.txt

# Dry run - see what would happen without making changes
cat large_data.txt | fs-write-safe --dry-run --verify output.txt

# Create backup before overwriting
fs-write-safe --backup -i new_config.json /etc/app/config.json

# Verbose mode shows each step
fs-write-safe -v --verify /var/log/export.csv < report.csv

Perl API

use FSWriteSafe;

# Object-oriented interface
my $writer = FSWriteSafe->new(
    verify  => 1,        # SHA-256 verification after write
    backup  => 1,        # Keep .bak of original
    verbose => 1,        # Log each step
    mode    => 0600,     # File permissions
);

my $result = $writer->write_atomic('/path/to/file', $data);

if ($result->{success}) {
    print "Written: $result->{bytes} bytes\n";
    print "SHA-256: $result->{checksum}\n";
} else {
    die "Failed: $result->{error}\n";
}

# Verify existing file
my $check = $writer->verify_checksum('/path/to/file', $expected_sha256);
die "Corrupted!" unless $check->{valid};
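
For readers curious what a checksum comparison like `verify_checksum` involves, here is a minimal sketch using the core Digest::SHA module. The helper name `file_matches_sha256` is hypothetical and the module's real implementation may differ.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha256_hex);

# Hypothetical helper: hash a file's bytes and compare against an
# expected hex digest. Dies if the file cannot be opened.
sub file_matches_sha256 {
    my ($path, $expected_hex) = @_;
    my $sha = Digest::SHA->new(256);
    $sha->addfile($path, 'b');    # 'b' reads the file in binary mode
    return lc($sha->hexdigest) eq lc($expected_hex);
}
```

Comparing in lower case lets the caller pass a digest in either case.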

Options

| Option | Description |
| --- | --- |
| -V, --verify | Compute SHA-256 before write, verify after rename |
| -n, --dry-run | Simulate without making changes |
| -b, --backup | Create .bak of existing file |
| --backup-ext=EXT | Custom backup extension (default: .bak) |
| -m, --mode=MODE | File permissions in octal (default: 0644) |
| -i, --input=FILE | Read data from FILE (use - for stdin) |
| -v, --verbose | Show detailed progress |
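
Two details in this table are easy to get wrong when parsing: `-V` and `-v` differ only by case, and the mode is given in octal. The sketch below shows one way to handle both with the core Getopt::Long module. It is an illustration only: `parse_options_sketch` is a hypothetical name, and treating stdin as the default input is an assumption based on the usage examples.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Keep -V (verify) and -v (verbose) distinct and allow short-option bundling.
Getopt::Long::Configure(qw(no_ignore_case bundling));

# Hypothetical parser mirroring the options table; not the tool's own code.
sub parse_options_sketch {
    my @args = @_;
    # Defaults from the table; input '-' (stdin) is an assumption.
    my %opt = ('backup-ext' => '.bak', mode => '0644', input => '-');
    GetOptionsFromArray(
        \@args, \%opt,
        'verify|V', 'dry-run|n', 'backup|b', 'backup-ext=s',
        'mode|m=s', 'input|i=s', 'verbose|v',
    ) or die "invalid options\n";
    # The mode arrives as a string such as "0600"; oct() converts it
    # to the number that chmod expects.
    $opt{mode}   = oct $opt{mode};
    $opt{target} = shift @args;    # first remaining argument
    return \%opt;
}
```

Passing the mode through `oct` matters because the string "0644" used as a plain number would be read as decimal 644.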

Exit Codes

| Code | Meaning |
| --- | --- |
| 0 | Success |
| 1 | Write failed |
| 2 | Sync failed |
| 3 | Atomic rename failed |
| 4 | Verification failed (checksum mismatch) |
| 5 | Target directory does not exist |
| 6 | Permission denied |
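
A calling script can turn these codes into messages. The sketch below decodes the wait status that Perl's `system()` leaves in `$?`. The helper name `describe_exit` is hypothetical, and the mapping is copied from the table above.

```perl
use strict;
use warnings;

# Hypothetical helper: translate the wait status in $? into one of
# the documented exit-code meanings.
sub describe_exit {
    my ($wait_status) = @_;
    my %meaning = (
        0 => 'Success',
        1 => 'Write failed',
        2 => 'Sync failed',
        3 => 'Atomic rename failed',
        4 => 'Verification failed (checksum mismatch)',
        5 => 'Target directory does not exist',
        6 => 'Permission denied',
    );
    return 'Command could not be started' if $wait_status == -1;
    my $signal = $wait_status & 127;    # low 7 bits hold a terminating signal
    return "Killed by signal $signal" if $signal;
    my $code = $wait_status >> 8;       # high byte holds the exit code
    return $meaning{$code} // "Unknown exit code $code";
}

# Typical use (not run here, since it needs the CLI on PATH):
#   system('fs-write-safe', '--verify', '-i', 'data.txt', 'out.txt');
#   warn describe_exit($?), "\n" if $? != 0;
```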

Why This Matters

Durability Guarantees

The sync() call asks the operating system to flush the temp file's data to physical storage before the rename. Without it, the data might still sit in OS buffers and be lost in a crash.

Crash Safety

At no point can a reader see partial content:

  • Before rename: They see the old file
  • After rename: They see the complete new file
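
The reader's view can be demonstrated in a few lines. This sketch assumes a POSIX filesystem; the helper names are hypothetical, and the writes here skip flush and sync because only visibility is being shown. A reader that opened the file before the rename keeps reading the old content, while a reader that opens it afterwards gets the new content.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Small helpers for the demonstration (hypothetical names).
sub write_file_plain {
    my ($path, $data) = @_;
    open(my $fh, '>', $path) or die "open $path: $!";
    print {$fh} $data or die "write $path: $!";
    close($fh) or die "close $path: $!";
}

sub slurp_handle {
    my ($fh) = @_;
    local $/;                 # read the whole file at once
    return scalar <$fh>;
}

sub demo_reader_view {
    my $dir    = tempdir(CLEANUP => 1);
    my $target = "$dir/config.json";
    my $tmp    = "$dir/.tmp_file_demo";

    write_file_plain($target, "old\n");
    open(my $early, '<', $target) or die "open: $!";   # opened before the swap

    write_file_plain($tmp, "new\n");
    rename($tmp, $target) or die "rename: $!";         # the atomic switch

    my $seen_early = slurp_handle($early);   # handle still refers to the old file
    close($early);

    open(my $late, '<', $target) or die "open: $!";    # opened after the swap
    my $seen_late = slurp_handle($late);
    close($late);

    return ($seen_early, $seen_late);
}
```

Neither reader ever sees a mix of old and new bytes, which is the property the two bullets describe.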

Production-Grade Correctness

This pattern is used in:

  • Database write-ahead logs
  • Package managers
  • Configuration management systems
  • Other systems that must never expose a half-written file

Synthetic Data Generator

Included is fs-generate-data for creating test files of any size:

# Generate 10MB of log-like text
bin/fs-generate-data --size 10M --type text -o test.log

# Generate 1GB of binary data with progress indicator
bin/fs-generate-data --size 1G --type binary --progress -o large.bin

# Generate 1000 lines of JSON
bin/fs-generate-data --lines 1000 --type json -o data.jsonl

# Generate deterministic data (reproducible with same seed)
bin/fs-generate-data --size 1M --seed 42 -o reproducible.txt

# Pipe directly to fs-write-safe
bin/fs-generate-data --size 100M --type apache | bin/fs-write-safe --verify access.log

Data Types

| Type | Description |
| --- | --- |
| text | Log-like text lines (default) |
| json | JSONL format (one JSON object per line) |
| csv | CSV with header row |
| binary | Random binary bytes |
| pattern | Repeating pattern string |
| zero | Null bytes (sparse-file friendly) |
| sequence | Incrementing bytes (corruption detection) |
| apache | Apache access log format |
| syslog | Syslog format |
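
To show why incrementing bytes help with corruption detection, here is a small sketch. It assumes the byte at offset i equals i modulo 256, which may not match the generator's actual layout, and both function names are hypothetical. Under that assumption, any altered byte can be located by position alone.

```perl
use strict;
use warnings;

# Assumed layout: byte at offset i has value i % 256.
sub sequence_bytes {
    my ($count) = @_;
    return pack('C*', map { $_ % 256 } 0 .. $count - 1);
}

# Return the offset of the first byte that breaks the sequence,
# or -1 if every byte matches.
sub first_corrupt_offset {
    my ($bytes) = @_;
    my @values = unpack('C*', $bytes);
    for my $i (0 .. $#values) {
        return $i if $values[$i] != $i % 256;
    }
    return -1;
}
```

This check finds altered bytes but not truncation, so the expected length has to be compared separately.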

Running Tests

prove -l t/

Tests cover:

  • Basic write operations
  • SHA-256 verification
  • Dry-run mode
  • Backup creation
  • Binary data handling
  • Permission enforcement
  • Crash safety scenarios
  • Concurrent reader safety
  • Large file handling
  • Streaming writes

Design Decisions

  1. Temp file in same directory: Required for atomic rename (can't rename across filesystems)
  2. sync() before rename: Ensures durability, not just atomicity
  3. No symlink tricks: Direct rename is simpler and more portable
  4. Explicit error codes: Enables proper error handling in pipelines
  5. Generator support: Allows streaming writes without loading entire content into memory
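
Decisions 1, 2, and 5 can be combined in one sketch: a code reference supplies chunks until it returns undef, each chunk goes straight to a temp file in the target's directory, and the checksum is computed incrementally. The function name `atomic_write_stream_sketch` is hypothetical, and the returned keys merely mirror the `bytes` and `checksum` fields shown in the Perl API section. The module's real streaming interface may look different.

```perl
use strict;
use warnings;
use Digest::SHA;
use File::Basename qw(dirname);
use File::Temp ();
use IO::Handle;

# Hypothetical streaming writer; $next_chunk is a code ref that
# returns the next chunk of data, or undef when the stream ends.
sub atomic_write_stream_sketch {
    my ($target, $next_chunk) = @_;

    # Same directory as the target, so rename() never crosses filesystems.
    # UNLINK => 1 removes the temp file automatically if we die early.
    my $tmp = File::Temp->new(
        TEMPLATE => '.tmp_file_XXXXXX',
        DIR      => dirname($target),
        UNLINK   => 1,
    );
    binmode($tmp);

    my $sha   = Digest::SHA->new(256);
    my $bytes = 0;
    while (defined(my $chunk = $next_chunk->())) {
        print {$tmp} $chunk or die "write failed: $!\n";
        $sha->add($chunk);            # hash as we go, one chunk in memory at a time
        $bytes += length $chunk;
    }

    # sync() before rename: durability, not just atomicity.
    $tmp->flush or die "flush failed: $!\n";
    $tmp->sync  or die "sync failed: $!\n";
    rename($tmp->filename, $target) or die "rename failed: $!\n";
    $tmp->unlink_on_destroy(0);       # the temp name no longer exists

    return { bytes => $bytes, checksum => $sha->hexdigest };
}
```

A caller could pass a closure that reads fixed-size blocks from a socket or another file handle, so only one chunk is held in memory at a time.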

Limitations

  • Atomic rename only works within the same filesystem
  • sync() has performance cost (use judiciously in hot paths)
  • Root bypasses file permission checks, so the permission tests are not meaningful when run as root

Author

Ed Bates — TECHBLIP LLC

License

Licensed under the Apache License, Version 2.0. See the LICENSE file for details.
