logfile

Note

I extracted this from an in-house database engine. It's a work in progress, but I wanted to share and improve it in a separate repo. Feedback and contributions are very welcome!

logfile is a concurrent, append-only log file optimized for SSDs and high throughput in Go.

Why

Most log file implementations flush after every write. This one batches writes into a single IO + fsync, which makes a huge difference on SSDs.

The library exposes two entry points so you can pick the right tradeoff between throughput and per-record durability:

  • Write blocks until the appended record is durable. Many concurrent Write calls naturally batch into one fsync (group commit). Use this when each record must be on disk before the caller continues and you have enough concurrent writers to amortize fsync latency.
  • WriteAsync returns as soon as the record is buffered. A single writer can keep submitting records while the background flusher fsyncs the previous batch. Call Flush at a checkpoint to wait for durability. This is the high-throughput path for a small number of writers.

Internally, a single background flusher drains the buffer using a ping-pong pattern, so new appends can be staged while the previous batch is being fsynced. Once the flush completes, all waiting writers are notified and the next batch can be flushed.
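The ping-pong pattern described above can be sketched in a few dozen lines. This is a simplified illustration, not the library's code: `groupLog`, `batch`, and the `sink` callback (standing in for one write + fsync) are hypothetical names, shutdown is omitted, and the sketch allocates a small `batch` per flush where the real library reports zero allocations per write.

```go
package main

import (
	"fmt"
	"sync"
)

// batch is one side of the ping-pong pair: staged bytes plus a channel
// that is closed once those bytes have been handed to the sink.
type batch struct {
	buf  []byte
	err  error
	done chan struct{}
}

// groupLog is a hypothetical stand-in for the real log file.
type groupLog struct {
	mu     sync.Mutex
	active *batch              // batch currently accepting appends
	wake   chan struct{}       // capacity 1: "data is staged" signal
	sink   func([]byte) error  // stands in for one write + fsync
}

func newGroupLog(sink func([]byte) error) *groupLog {
	l := &groupLog{
		active: &batch{done: make(chan struct{})},
		wake:   make(chan struct{}, 1),
		sink:   sink,
	}
	go l.flusher()
	return l
}

// Write stages p in the active batch and blocks until that batch is flushed.
func (l *groupLog) Write(p []byte) error {
	l.mu.Lock()
	b := l.active
	b.buf = append(b.buf, p...)
	l.mu.Unlock()

	select { // non-blocking wake; a pending token is enough
	case l.wake <- struct{}{}:
	default:
	}
	<-b.done
	return b.err
}

// flusher swaps the active batch for the spare buffer, so new appends are
// staged while the previous batch is being written out.
func (l *groupLog) flusher() {
	var spare []byte
	for range l.wake {
		l.mu.Lock()
		b := l.active
		if len(b.buf) == 0 {
			l.mu.Unlock()
			continue
		}
		l.active = &batch{buf: spare[:0], done: make(chan struct{})}
		l.mu.Unlock()

		b.err = l.sink(b.buf) // one IO for every record in the batch
		spare = b.buf         // the flushed buffer becomes the next spare
		close(b.done)         // notify all writers waiting on this batch
	}
}

func main() {
	var flushes, total int
	l := newGroupLog(func(p []byte) error {
		flushes++
		total += len(p)
		return nil
	})

	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if err := l.Write([]byte("record")); err != nil {
				panic(err)
			}
		}()
	}
	wg.Wait()
	fmt.Printf("100 records (%d bytes) in %d flushes\n", total, flushes)
}
```

The number of flushes varies from run to run, but it is typically well below the number of records, which is the point of group commit: the cost of each sink call is shared by every writer whose record landed in that batch.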

Features

  • Group commit: concurrent Write calls are batched into one write + fsync
  • Pipelined async path: a single writer can pipeline records via WriteAsync for very high throughput
  • Optimized for SSDs: minimizes write amplification with 4KB-aligned writes
  • Zero heap allocations per write
  • Safe: no torn writes. When Write returns nil, the data is on disk.
  • CRC32 checksums for data integrity (optional)
  • Small API surface: two write methods plus Flush and Close
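To make the alignment and checksum features more concrete, here is a minimal sketch of how a record could be framed with a CRC32 and how a batch length could be rounded up to a 4 KB boundary. The frame layout (4-byte length, 4-byte CRC32, payload), the IEEE polynomial, and the names `alignUp`, `encodeRecord`, and `decodeRecord` are assumptions made for illustration; the library's actual on-disk format may differ.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"hash/crc32"
)

const blockSize = 4096 // assumed SSD-friendly write granularity

// alignUp rounds n up to the next multiple of blockSize.
func alignUp(n int) int {
	return (n + blockSize - 1) &^ (blockSize - 1)
}

var errCorrupt = errors.New("record: truncated frame or checksum mismatch")

// encodeRecord appends a hypothetical frame to dst:
// [4-byte little-endian length][4-byte CRC32 of payload][payload].
func encodeRecord(dst, payload []byte) []byte {
	var hdr [8]byte
	binary.LittleEndian.PutUint32(hdr[0:4], uint32(len(payload)))
	binary.LittleEndian.PutUint32(hdr[4:8], crc32.ChecksumIEEE(payload))
	dst = append(dst, hdr[:]...)
	return append(dst, payload...)
}

// decodeRecord reads one frame from src and verifies its checksum.
// It returns the payload and the bytes that follow the frame.
func decodeRecord(src []byte) (payload, rest []byte, err error) {
	if len(src) < 8 {
		return nil, nil, errCorrupt
	}
	n := int(binary.LittleEndian.Uint32(src[0:4]))
	sum := binary.LittleEndian.Uint32(src[4:8])
	if n > len(src)-8 {
		return nil, nil, errCorrupt
	}
	payload = src[8 : 8+n]
	if crc32.ChecksumIEEE(payload) != sum {
		return nil, nil, errCorrupt
	}
	return payload, src[8+n:], nil
}

func main() {
	frame := encodeRecord(nil, []byte("hello world"))

	// Pad the batch with zeros so the write covers whole 4 KB blocks.
	padded := make([]byte, alignUp(len(frame)))
	copy(padded, frame)

	payload, _, err := decodeRecord(padded)
	fmt.Printf("frame=%d bytes, padded=%d bytes, payload=%q, err=%v\n",
		len(frame), len(padded), payload, err)

	// Flip one payload byte: the checksum no longer matches.
	padded[8] ^= 0xFF
	_, _, err = decodeRecord(padded)
	fmt.Println("after corruption:", err)
}
```

A reader that hits a checksum mismatch can treat everything from that point on as the unwritten tail of the log, which is one common way for append-only logs to recover from a partially written final batch.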

Install

go get github.com/alialaee/logfile

Benchmarks

See benchmark/README.md.

Real-disk throughput on an M4 MacBook Air at 900-byte records:

| Mode | Throughput |
| --- | --- |
| Single writer, Write (fsync per call) | 0.02 MB/s |
| Single writer, WriteAsync + Flush | 411 MB/s |
| 10 goroutines, WriteAsync | 406 MB/s |
| 10 goroutines, Write | 1.16 MB/s |
| 500 goroutines, Write | 58.7 MB/s |

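Write throughput is bounded by roughly records_per_fsync × record_size / fsync_latency, where records_per_fsync (the batch size) is at most the number of concurrent writers, because each blocked writer contributes one record per batch. If you need high throughput from a small number of writers, use WriteAsync.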
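As a rough worked example of this bound: with at most one record per blocked writer in each batch, throughput cannot exceed writers × record size / fsync latency. The helper `maxWriteThroughput` below is hypothetical, and the 8 ms fsync latency is an assumed figure chosen for illustration, not a measurement from the benchmarks.

```go
package main

import (
	"fmt"
	"time"
)

// maxWriteThroughput returns an upper bound, in bytes per second, for
// blocking writes: each fsync can make durable at most one record per
// concurrent writer.
func maxWriteThroughput(writers, recordBytes int, fsyncLatency time.Duration) float64 {
	return float64(writers*recordBytes) / fsyncLatency.Seconds()
}

func main() {
	const recordBytes = 900              // record size used in the table above
	const fsyncLatency = 8 * time.Millisecond // assumption, not a measurement

	for _, writers := range []int{1, 10, 500} {
		bps := maxWriteThroughput(writers, recordBytes, fsyncLatency)
		fmt.Printf("%3d writers: at most %.3f MB/s\n", writers, bps/1e6)
	}
}
```

Under these assumptions the bound grows linearly with the number of writers, which is why the blocking path needs many concurrent callers before it approaches the throughput of the pipelined path.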

Cross-library comparison (20 concurrent writers, record sizes 512 bytes to 1 MB):

=== Logfile Write (This library) ===
Total bytes written: 477 MB
Time taken: 576.585416ms
Throughput: 828.85 MB/s

=== Direct File Write+Flush ===
Total bytes written: 119 MB
Time taken: 1.022756167s
Throughput: 116.82 MB/s

=== Tidwall WAL Write ===
Total bytes written: 119 MB
Time taken: 1.117973042s
Throughput: 106.87 MB/s

=== Hashicorp Raft Write ===
Total bytes written: 119 MB
Time taken: 1.197531459s
Throughput: 99.77 MB/s

=== Pebble Record Write ===
Total bytes written: 477 MB
Time taken: 678.86125ms
Throughput: 703.98 MB/s

Usage

Blocking writes (each call returns when the record is durable)

// Returned errors are discarded with _ for brevity; check them in real code.
f, _ := os.OpenFile("my.log", os.O_CREATE|os.O_RDWR, 0644)

lf, _ := logfile.New(f, 0, 1024*1024, true) // 1MB buffer, CRC enabled

offset, _ := lf.Write(context.Background(), []byte("hello world"))

// Read back
reader := logfile.NewReader(f, 0)
data, _ := reader.ReadNext(nil)

lf.Close()
f.Close()

Pipelined writes (high throughput from one writer)

for _, rec := range records {
    _, _ = lf.WriteAsync(context.Background(), rec)
}
// Block until everything submitted above is on disk.
_ = lf.Flush(context.Background())

License

MIT

About

Logfile is a reliable append-only log file optimized for SSD and concurrency. Most suitable for implementing WALs.
