wal

package module
v0.1.2 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Feb 11, 2026 License: MIT Imports: 13 Imported by: 0

README

Write-Ahead-Log

WAL is a small, append-only write-ahead log written in Go. It’s built to be simple, predictable, and easy to reason about.

The core idea behind it is — append records to disk safely, split large records across fixed-size blocks, verify integrity with checksums, and read them back sequentially for recovery or replay.

Features

  • Append-only log with segment-based storage.
    • Uses fixed-size blocks internally with chunked records.
  • Supports large records spanning multiple blocks.
  • crc32 checksums for corruption detection.
  • Sequential reads with precise resume positions.
  • Startup traversal with block caching for recovery.
  • Supports concurrent read/write, has thread-safe functions.

Design Overview

wal-logo.png

Format

Format of a single segment file:

       +-----+-------------+--+----+----------+------+-- ... ----+
 File  | r0  |      r1     |P | r2 |    r3    |  r4  |           |
       +-----+-------------+--+----+----------+------+-- ... ----+
       |<---- BlockSize ----->|<---- BlockSize ----->|

  rn = variable size records
  P = Padding
  BlockSize = 32KB

Format of a single record:

+----------+-------------+-----------+--- ... ---+
| CRC (4B) | Length (2B) | Type (1B) |  Payload  |
+----------+-------------+-----------+--- ... ---+

CRC = 32-bit hash computed over the payload using CRC
Length = Length of the payload data
Type = Type of record
       (FullType, FirstType, MiddleType, LastType)
       The type is used to group a bunch of records together to represent
       blocks that are larger than BlockSize
Payload = Byte stream as long as specified by the payload size

Getting Started

func main() {
    waLog, _ := wal.Open(wal.DefaultOptions)
    defer func () {
        _ = waLog.Delete()
    }()

    // writing data to log
    pos1, _ := waLog.Write([]byte("example data one"))
    pos2, _ := waLog.Write([]byte("example data two"))

    // reading data sequentially
    data1, _ := waLog.Read(pos1)
    fmt.Printf("data1: %s\n", data1)

    data2, _ := waLog.Read(pos2)
    fmt.Printf("data1: %s\n", data2)

    fmt.Println()

    _, _ = waLog.Write([]byte("example data 3"))
    _, _ = waLog.Write([]byte("example data 4"))
    _, _ = waLog.Write([]byte("example data 5"))
    _, _ = waLog.Write([]byte("example data 6"))

    reader := waLog.NewReader()
    for {
        data, pos, err := reader.Next()
        if err != nil {
            break
        }
        fmt.Printf("pos: %v, data: %s\n", pos, data)
    }
}

How it works (on a high level)

  • Data is written to the active segment sequentially.
  • Each segment is divided into fixed-size blocks.
  • Records are split into chunks when needed.
  • Each chunk is checksummed independently.
  • Reads walk blocks and chunks to reconstruct records.
  • Startup traversal reuses cached blocks when possible.

Status

This is a learning project. As a result, it focuses on correctness and clarity over production features, and the API may change as the implementation evolves.

Documentation

Index

Constants

View Source
const (
	B  = 1
	KB = 1024 * B
	MB = 1024 * KB
	GB = 1024 * MB
)

Variables

View Source
var (
	ErrClosed       = errors.New("the segment file is closed")
	ErrInvalidCrc32 = errors.New("invalid CRC, the data may be corrupted")
)
View Source
var (
	ErrValueTooLarge       = errors.New("the data size can't be larger than segment size")
	ErrPendingSizeTooLarge = errors.New("the upper bound of pending writes can't be larger than segment size")
)
View Source
var DefaultOptions = Options{
	DirPath:     walTempDir(),
	SegmentSize: 1 * GB, SegmentFileExt: ".SEG",
	Sync:         false,
	BytesPerSync: 0, SyncInterval: 0,
}

Functions

func SegmentFileName

func SegmentFileName(dirPath, extName string, id uint32) string

SegmentFileName will return the filename of a segment file.

Types

type ChunkPosition

type ChunkPosition struct {
	SegmentId   SegmentId
	ChunkOffset int64  // The start offset of the chunk in the segment file.
	ChunkSize   uint32 // Number of bytes the chunk data holds up in the segment file.
	BlockNumber uint32 // The block number of the chunk in the segment file.
}

ChunkPosition represent the position of a chunk in the segment file. It is used to read the data from the segment file.

func DecodeChunkPosition

func DecodeChunkPosition(buf []byte) *ChunkPosition

DecodeChunkPosition decodes the chunk position from a byte slice; returns a pointer to a new ChunkPosition.

func (*ChunkPosition) Encode

func (cp *ChunkPosition) Encode() []byte

Encode encodes the chunk position to a byte slice; returns the slice with the actual occupied elements.

func (*ChunkPosition) EncodeFixedSize

func (cp *ChunkPosition) EncodeFixedSize() []byte

EncodeFixedSize encodes the chunk position to a byte slice; returns a slice of size "maxLen".

type ChunkType

type ChunkType = byte
const (
	ChunkTypeFull ChunkType = iota
	ChunkTypeFirst
	ChunkTypeMiddle
	ChunkTypeLast
)

type Options

type Options struct {
	// DirPath specifies the directory where the WAL segment files will
	// be stored.
	DirPath string

	// SegmentSize specifies the maximum size of each segment file in bytes.
	SegmentSize int64

	// SegmentFileExt specifies the file extension of the segment files.
	// The file extension must start with a dot ".", default value is ".SEG".
	// It is used to identify the different types of files in the directory.
	SegmentFileExt string

	// Sync is whether to synchronize writes through os buffer cache and down
	// onto the actual disk.
	// Setting sync is required for durability of a single write operation,
	// but also results in slower writes.
	//
	// If false, and the machine crashes, then some recent writes may be lost.
	// Note: that if it is just the process that crashes (machine does not),
	// then no writes will be lost.
	//
	// In other words, Sync being false has the same semantics as a 'write'
	// system call. Sync being true means write followed by fsync.
	Sync bool

	// BytesPerSync specifies the number of bytes to write before calling fsync.
	BytesPerSync uint32

	// SyncInterval is the time duration in which explicit synchronization is
	// performed. If SyncInterval is zero, no periodic synchronization is performed.
	SyncInterval time.Duration
}

Options represents the configuration options for the Write-Ahead-Log (WAL).

type Reader

type Reader struct {
	// contains filtered or unexported fields
}

Reader represents a reader for the Wal. It consists of segmentReaders, which is a slice of segmentReader sorted by segmentId; and currentReader, which is the index of the current segment reader in the slice.

The currentReader field is used to iterate over the segmentReader slice.

func (*Reader) CurrentChunkPosition

func (r *Reader) CurrentChunkPosition() *ChunkPosition

CurrentChunkPosition returns the position of the current data chunk.

func (*Reader) CurrentSegmentId

func (r *Reader) CurrentSegmentId() SegmentId

CurrentSegmentId returns the id of the current segment file while reading from the Wal.

func (*Reader) Next

func (r *Reader) Next() ([]byte, *ChunkPosition, error)

Next returns the next chunk data and its ChunkPosition in the Wal. If there is no data io.EOF is returned.

The position can be used to read the data from the segment file.

func (*Reader) SkipCurrentSegment

func (r *Reader) SkipCurrentSegment()

SkipCurrentSegment skips the current segment while reading the Wal.

type SegmentId

type SegmentId = uint32

type Wal

type Wal struct {
	// contains filtered or unexported fields
}

Wal represents a Write-Ahead Log structure that provides durability and fault-tolerance for incoming writes. It consists of an active segment, which is the current segment file used for new incoming writes; and older segments, which are a map of segment files used to read operations.

The options field stores various configuration options for the wal. Wal is safe for concurrent operations (access and modifications).

func Open

func Open(opt Options) (*Wal, error)

Open opens a Wal with the provided options. It opens all the segment files in the directory or creates a directory if not exists. If there is no segment file in the directory, it will create a new one.

func (*Wal) ActiveSegmentId

func (wal *Wal) ActiveSegmentId() SegmentId

ActiveSegmentId returns the id of the active segment file.

func (*Wal) AddPendingWrites

func (wal *Wal) AddPendingWrites(data []byte)

AddPendingWrites appends data to the pending batch. Size limits are enforced when the batch is flushed.

func (*Wal) ClearPendingWrites

func (wal *Wal) ClearPendingWrites()

ClearPendingWrites clears pending writes and resets pending size.

func (*Wal) Close

func (wal *Wal) Close() error

Close closes all the older segment files and the current active segment file.

func (*Wal) Delete

func (wal *Wal) Delete() error

Delete deletes all the older segment files and the current active segment file.

func (*Wal) IsEmpty

func (wal *Wal) IsEmpty() bool

IsEmpty returns whether the Wal is empty. The Wal is considered empty if there are 0 older segments and 1 empty active segment.

func (*Wal) IsFull

func (wal *Wal) IsFull(size int64) bool

IsFull checks if the active segment file can hold the incoming data or not.

func (*Wal) NewReader

func (wal *Wal) NewReader() *Reader

NewReader returns a new reader for the Wal. It will iterate all segment files and read all the data from them.

func (*Wal) NewReaderWithMax

func (wal *Wal) NewReaderWithMax(segId SegmentId) *Reader

NewReaderWithMax returns a new reader for the Wal, and the reader will only read the data from the segment file whose id is less than or equal to the given segId.

func (*Wal) NewReaderWithStart

func (wal *Wal) NewReaderWithStart(startPos *ChunkPosition) (*Reader, error)

NewReaderWithStart returns a new reader for the Wal, and the reader will only read the data from the segment file whose position is greater than or equal to the given position.

func (*Wal) OpenNewActiveSegment

func (wal *Wal) OpenNewActiveSegment() error

OpenNewActiveSegment opens a new segment file and sets it as the active segment. It is used when even though the active segment is not full yet, the user want to create a new segment file.

func (*Wal) Read

func (wal *Wal) Read(pos *ChunkPosition) ([]byte, error)

Read reads the data from the Wal according to the given position.

func (*Wal) RenameFileExt

func (wal *Wal) RenameFileExt(ext string) error

RenameFileExt renames all segment files' extension name and replaces the global Options.SegmentFileExt configuration for this Wal instance.

func (*Wal) SetIsStartupTraversal

func (wal *Wal) SetIsStartupTraversal(val bool)

SetIsStartupTraversal is only used if the Wal is during its startup traversal. When enabled, Wal reads are NOT thread-safe and must be performed by a single reader goroutine only.

func (*Wal) Sync

func (wal *Wal) Sync() error

Sync syncs the active segment file to stable storage like disk.

func (*Wal) Write

func (wal *Wal) Write(data []byte) (*ChunkPosition, error)

Write writes the given data to the active segment of the Wal. It returns the position of the data in the Wal, and an error if any.

func (*Wal) WriteAll

func (wal *Wal) WriteAll() ([]*ChunkPosition, error)

WriteAll write wal.pendingWrites to Wal and clears pending writes. It will not sync the segment files according to wal.options, you'll have to call sync manually.

Directories

Path Synopsis

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL