WAL: High-Performance Write-Ahead Log in Go

A high-performance, concurrency-safe, and crash-resilient Write-Ahead Log (WAL) library for Go. Designed for building databases, message queues, or any system that requires data durability.

Features

  • 🚀 High Performance: Buffered I/O, optimized locking strategies and non-blocking background sync.
  • 🛡️ Data Integrity: CRC32 checksums (Castagnoli), append-only logic and automatic corruption repair on startup.
  • 🧵 Concurrency Safe: Thread-safe writers and readers using sync.RWMutex and atomic operations.
  • 🔄 Log Rotation: Automatic segment rotation based on configurable size.
  • 💾 Flexible Sync Strategies: Choose between Performance (Background), Safety (Always), or Balance (OSCache).
  • 🔍 Iterator API: Memory-efficient sequential reading of logs.
  • Optimized Startup: Scans the last segment in reverse to recover its state without reading the whole file.
  • 🧹 Retention Policies: Automatic cleanup based on TTL (Time-To-Live) or Total Size.

Roadmap

The following features are planned for future releases:

  • v0.1.1 - Compression Support: Add Snappy/Zstd compression for payloads to reduce disk usage.
  • v0.1.2 - Sparse Indexing: Implement a sidecar .idx file to support O(1) lookup time for Seek(SeqID).
  • v0.1.3 - Metrics & Observability: OpenTelemetry / Prometheus integration for monitoring throughput and latency.
  • v0.2.0 - Replication Hooks: APIs to support streaming WAL entries to other nodes (Raft/Paxos integration).

Architecture

On-Disk Format

Each segment file consists of a sequence of binary encoded entries.

+-------------------+-------------------+-------------------+----------------------+-------------------+
|   CRC32 (4 bytes) |   Size (8 bytes)  |   SeqID (8 bytes) |   Payload (N bytes)  |   Size (8 bytes)  |
+-------------------+-------------------+-------------------+----------------------+-------------------+
| Checksum of Data  | Length of Payload | Monotonic ID      | The actual data      | Backward Pointer  |
+-------------------+-------------------+-------------------+----------------------+-------------------+
  • CRC32: Cyclic redundancy check (Castagnoli) used to verify data integrity.
  • Size: Length of the payload. Enables fast forward reading (skipping payloads).
  • SeqID: Global, monotonically increasing sequence ID.
  • Payload: The actual data.
  • Size (Footer): Backward pointer. Enables fast reverse reading for optimized startup recovery.
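
To make the layout concrete, here is a minimal sketch of how one entry could be encoded, verified, and located from the end of a segment. It is not the library's code. It assumes little-endian integers, a CRC computed over the payload only, and a footer that repeats the payload length; the helper names encodeEntry, decodeEntry, and lastEntryOffset are hypothetical.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"hash/crc32"
)

const (
	headerSize = 4 + 8 + 8 // CRC32 + Size + SeqID
	footerSize = 8         // trailing Size (backward pointer)
)

// castagnoli is the CRC32 polynomial table named in the feature list.
var castagnoli = crc32.MakeTable(crc32.Castagnoli)

// encodeEntry lays out one entry as CRC32 | Size | SeqID | Payload | Size.
// Assumptions: little-endian integers, CRC computed over the payload only.
func encodeEntry(seqID uint64, payload []byte) []byte {
	n := uint64(len(payload))
	buf := make([]byte, headerSize+len(payload)+footerSize)
	binary.LittleEndian.PutUint32(buf[0:4], crc32.Checksum(payload, castagnoli))
	binary.LittleEndian.PutUint64(buf[4:12], n)
	binary.LittleEndian.PutUint64(buf[12:20], seqID)
	copy(buf[headerSize:], payload)
	binary.LittleEndian.PutUint64(buf[headerSize+len(payload):], n)
	return buf
}

// decodeEntry parses the entry starting at buf[0] and verifies its checksum.
// It returns the sequence ID, the payload, and the total encoded length.
func decodeEntry(buf []byte) (seqID uint64, payload []byte, n int, err error) {
	if len(buf) < headerSize+footerSize {
		return 0, nil, 0, errors.New("short entry")
	}
	size := binary.LittleEndian.Uint64(buf[4:12])
	if size > uint64(len(buf)-headerSize-footerSize) {
		return 0, nil, 0, errors.New("truncated payload")
	}
	end := headerSize + int(size)
	payload = buf[headerSize:end]
	if crc32.Checksum(payload, castagnoli) != binary.LittleEndian.Uint32(buf[0:4]) {
		return 0, nil, 0, errors.New("checksum mismatch")
	}
	if binary.LittleEndian.Uint64(buf[end:end+footerSize]) != size {
		return 0, nil, 0, errors.New("footer mismatch")
	}
	return binary.LittleEndian.Uint64(buf[12:20]), payload, end + footerSize, nil
}

// lastEntryOffset reads the footer at the end of a segment to find where the
// final entry starts, without scanning forward from the beginning.
func lastEntryOffset(segment []byte) (int, error) {
	if len(segment) < headerSize+footerSize {
		return 0, errors.New("segment too short")
	}
	size := binary.LittleEndian.Uint64(segment[len(segment)-footerSize:])
	if size > uint64(len(segment)-headerSize-footerSize) {
		return 0, errors.New("corrupt footer")
	}
	return len(segment) - footerSize - int(size) - headerSize, nil
}

func main() {
	var segment []byte
	segment = append(segment, encodeEntry(1, []byte("Hello WAL"))...)
	segment = append(segment, encodeEntry(2, []byte("Another log entry"))...)

	off, err := lastEntryOffset(segment)
	if err != nil {
		panic(err)
	}
	seqID, payload, _, err := decodeEntry(segment[off:])
	if err != nil {
		panic(err)
	}
	fmt.Printf("last entry: seq=%d payload=%q at offset %d\n", seqID, payload, off)
}
```

Under these assumptions each entry carries 28 bytes of overhead, and recovering the last SeqID costs one footer read plus one entry decode, whatever the segment size.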

Installation

go get github.com/hungpdn/wal

Usage

Writing Data

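package main

import (
	"log"

	"github.com/hungpdn/wal"
)

func main() {
	cfg := wal.Config{
		SegmentSize:  10 * 1024 * 1024, // 10MB
		SyncStrategy: wal.SyncStrategyOSCache,
	}

	// Open (or create) the WAL directory; stop if it cannot be opened.
	w, err := wal.Open("./wal_data", &cfg)
	if err != nil {
		log.Fatalf("open wal: %v", err)
	}
	defer w.Close()

	// Write data (return values are ignored here for brevity; check them in real code)
	w.Write([]byte("Hello WAL"))
	w.Write([]byte("Another log entry"))
}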

Reading Data (Replay)

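// Reopen the directory used in the writing example, reusing the same cfg.
w, err := wal.Open("./wal_data", &cfg) // Auto-recovers on open
if err != nil {
	log.Fatalf("open wal: %v", err)
}
defer w.Close()

iter, err := w.NewReader()
if err != nil {
	log.Fatalf("create reader: %v", err)
}
defer iter.Close()

// Replay entries in order until the iterator is exhausted.
for iter.Next() {
	data := iter.Value()
	log.Printf("Log Data: %s", string(data))
}

if err := iter.Err(); err != nil {
	log.Printf("Error reading wal: %v", err)
}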

Config

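+--------------+-------+------------+-------------------------------------------------------------+
| Field        | Type  | Default    | Description                                                 |
+--------------+-------+------------+-------------------------------------------------------------+
| SegmentSize  | int64 | 10MB       | Max size of a single segment file before rotation.          |
| BufferSize   | int   | 4KB        | Size of the in-memory buffer.                               |
| SyncStrategy | int   | Background | 0: Background, 1: Always (fsync), 2: OSCache (recommended). |
| SyncInterval | uint  | 1000ms     | Interval for background sync execution.                     |
| Mode         | int   | 0          | 0: debug, 1: prod.                                          |
+--------------+-------+------------+-------------------------------------------------------------+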

Sync Strategies

  • SyncStrategyBackground (0): Fastest. Writes to the buffer and leaves disk sync to the OS. Risk of data loss on OS crash.
  • SyncStrategyAlways (1): Safest. Calls fsync on every write. Slowest performance.
  • SyncStrategyOSCache (2): Recommended. Flushes to the OS cache immediately and runs fsync in the background every SyncInterval. Safe against app crashes, slight risk on power loss.

Contributing

Contributions are welcome! Please fork the repository and open a pull request.

License

MIT License. See LICENSE file.

References