anystore

package module
v2.0.0-alpha.4
Published: May 22, 2026 License: MIT Imports: 24 Imported by: 0

README

Any Store

A document‑oriented database with a MongoDB‑like query language, built on an embedded Go btree/pager/wal storage engine derived from SQLite (with intentional drifts — not a literal port, and the on-disk format is our own). Any Store brings schema‑less flexibility, rich indexes, ACID transactions, page-level integrity and optional encryption to embedded Go applications — pure Go, no CGO.

⚠️ Status: pre‑1.0 – APIs may change. We actively dog‑food the library in production and welcome early adopters & contributors.

Features

  • Mongo‑style queries – $in, $inc, comparison & logical operators out of the box.
  • Automatic indexes – create, ensure or drop compound & unique indexes at runtime.
  • ACID transactions – explicit read / write transactions plus convenience helpers.
  • Streaming iterators – low‑memory scans with cursor API.
  • Durability – database flush and protection mechanisms against power loss.
  • Integrity & encryption – per-page XXH3-128 checksums by default; optional AES-GCM or ChaCha20-Poly1305 (AEAD doubles as integrity).
  • CLI – quick inspection, import/export and interactive shell.
  • Cross‑platform – pure Go, no CGO, runs anywhere Go runs.

Quick start

Install library
go get github.com/anyproto/any-store/v2
Install CLI (optional)
go install github.com/anyproto/any-store/cmd/any-store-cli2/v2@latest
Hello, Any Store
package main

import (
    "context"
    "fmt"
    "log"

    anystore "github.com/anyproto/any-store/v2"
    "github.com/anyproto/any-store/v2/anyenc"
)

func main() {
    ctx := context.Background()

    db, err := anystore.Open(ctx, "/tmp/demo.db", nil)
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()

    users, _ := db.Collection(ctx, "users")

    _ = users.Insert(ctx,
        anyenc.MustParseJson(`{"id": 1, "name": "John"}`),
        anyenc.MustParseJson(`{"id": 2, "name": "Jane"}`),
    )

    res, _ := users.Find(`{"id": {"$in": [1,2]}}`).Sort("-name").Iter(ctx)
    for res.Next() {
        doc, _ := res.Doc()
        fmt.Println(doc.Value().String())
    }

    // Inspect storage footprint: doc count, sizes, compression and per-index stats.
    st, _ := users.Stats(ctx)
    fmt.Printf("docs=%d total=%d bytes ratio=%.2fx\n",
        st.DocCount, st.TotalSizeBytes, st.CompressionRatio)
}

The full end‑to‑end example lives in example/ and in the API docs.

Documentation

Design highlights

Layer            Responsibility
Query builder    Parses Mongo‑like JSON filters and modifiers
Index engine     Generates composite indexes, picks optimal index via cost estimator
Encoding arena   Efficient AnyEnc value arena to minimise GC churn
Connection pool  Separate read / write engine handles for concurrent workloads

Durability

Any Store automatically performs WAL checkpoints and fsync after idle periods to ensure data durability.

db, _ := anystore.Open(ctx, "data.db", &anystore.Config{
    Durability: anystore.DurabilityConfig{
        AutoFlush: true,
        IdleAfter: 20 * time.Second,  // Flush after 20s of inactivity
        FlushMode: anystore.FlushModeCheckpointPassive, // other options: FlushModeCheckpointFull, FlushModeCheckpointRestart, FlushModeCheckpointTruncate
        Sentinel:  true,  // Track dirty db state for automatic quickCheck on start
    },
})

// Manual flush, e.g. before app suspension. Waits for at least 100ms of write
// inactivity so that pending writes have finished before the checkpoint runs.
db.Flush(ctx, 100*time.Millisecond, anystore.FlushModeCheckpointPassive)

Sentinel: when enabled, Any Store creates a .lock file to detect writes that were not explicitly persisted and runs a quick integrity check on the next open.
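The sentinel idea can be pictured with plain file operations. The following is a minimal sketch using only the standard library, not the library's implementation: the marker location (database path plus ".lock") and the helper names markDirty, markClean, wasDirty and simulateCrashAndReopen are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// sentinelPath returns the hypothetical marker location for a database file.
func sentinelPath(dbPath string) string { return dbPath + ".lock" }

// markDirty creates the marker before writes that are not yet flushed.
func markDirty(dbPath string) error {
	return os.WriteFile(sentinelPath(dbPath), nil, 0o600)
}

// markClean removes the marker after a successful flush or a clean close.
func markClean(dbPath string) error {
	err := os.Remove(sentinelPath(dbPath))
	if errors.Is(err, fs.ErrNotExist) {
		return nil
	}
	return err
}

// wasDirty reports whether a marker is present on disk.
func wasDirty(dbPath string) (bool, error) {
	_, err := os.Stat(sentinelPath(dbPath))
	if errors.Is(err, fs.ErrNotExist) {
		return false, nil
	}
	return err == nil, err
}

// simulateCrashAndReopen writes the marker, skips the clean shutdown, and
// reports the dirty state seen on reopen and again after a flush.
func simulateCrashAndReopen() (dirtyOnOpen, dirtyAfterFlush bool, err error) {
	dir, err := os.MkdirTemp("", "sentinel-demo")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)
	db := filepath.Join(dir, "data.db")

	if err = markDirty(db); err != nil {
		return
	}
	// Simulated crash: markClean never runs before the next open.
	if dirtyOnOpen, err = wasDirty(db); err != nil {
		return
	}
	// A real open would run its quick check here before continuing.
	if err = markClean(db); err != nil {
		return
	}
	dirtyAfterFlush, err = wasDirty(db)
	return
}

func main() {
	before, after, err := simulateCrashAndReopen()
	if err != nil {
		panic(err)
	}
	fmt.Println("dirty on open:", before)
	fmt.Println("dirty after flush:", after)
}
```

A marker file costs one create and one remove per dirty period, which is why it suits an opt-in setting rather than a per-write check.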

Integrity

Every non-encrypted Any Store database carries an XXH3-128 page-trailer checksum (16 bytes/page) by default — corruption is caught on read. There is no opt-out; the cost is <1% on writes and effectively zero on reads. Encrypted databases derive integrity from the cipher's AEAD authentication tag instead. File state is authoritative on reopen — existing plain databases stay plain, existing checksum databases auto-install the codec regardless of caller config.

Conceptually mirrors SQLite's cksumvfs, generalized to also surface AEAD failures via the same API.

db, _ := anystore.Open(ctx, "data.db", &anystore.Config{
    // Wire monitoring at Open time so failures during the first page-1
    // read (which happens inside Open) are observable.
    OnIntegrityError: func(e anystore.IntegrityError) {
        log.Printf("integrity: page %d %v: %v", e.PageNo, e.Kind, e.Inner)
    },
    // Default false: corrupt pages cause reads to fail. Flip to true
    // for forensic dumps where you'd rather read garbage than lose
    // access to the rest of the data.
    ContinueOnIntegrityError: false,
})
// db now has IntegrityChecksum mode automatically.

// Walk every page and report mismatches (works in encrypted mode too).
rep, _ := db.VerifyIntegrity(ctx)
fmt.Printf("scanned %d pages, %d errors\n", rep.Pages, len(rep.Errors))

Page-1 DB header (first 100 bytes) is not covered by the per-page hash; header invariants there are validated separately at open. See docs/btree/specs/integrity.md for the full design.

Contributing

  1. Fork & clone
  2. make test – run unit tests
  3. Create your feature branch
  4. Open a PR and sign the CLA

Please read our Code of Conduct before contributing.

⚖️ License

Any Store is released under the MIT License – see LICENSE for details.

Documentation

Constants

const CipherAES256GCM = btree.CipherAES256GCM

CipherAES256GCM is AES-256 in Galois/Counter Mode. Default choice; hardware-accelerated on modern CPUs. Per-page overhead: 32 bytes.

const CipherChaCha20Poly1305 = btree.CipherChaCha20Poly1305

CipherChaCha20Poly1305 is ChaCha20 with Poly1305 authentication. Constant-time in pure software. Per-page overhead: 32 bytes.

const CipherXChaCha20Poly1305 = btree.CipherXChaCha20Poly1305

CipherXChaCha20Poly1305 is the 24-byte-nonce variant of ChaCha20-Poly1305. Per-page overhead: 48 bytes. Use only for workloads with very high per-key write volumes where 12-byte random-nonce collision is a concern.
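To get a feel for when a 12-byte random nonce becomes a concern, the birthday approximation n² / 2^(bits+1) gives a rough collision probability for n page writes under one key. This is a back-of-the-envelope sketch; collisionProbability is a hypothetical helper, and the approximation only holds while the result is small.

```go
package main

import (
	"fmt"
	"math"
)

// collisionProbability approximates the chance that at least two of n random
// nonces of the given bit width are equal (birthday bound, small results only).
func collisionProbability(n float64, nonceBits int) float64 {
	return n * n / math.Pow(2, float64(nonceBits+1))
}

func main() {
	writes := math.Pow(2, 32) // page writes under a single key
	fmt.Printf("12-byte nonce: %.3g\n", collisionProbability(writes, 96))
	fmt.Printf("24-byte nonce: %.3g\n", collisionProbability(writes, 192))
}
```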

const DefaultKDFIterations = btree.DefaultKDFIterations

DefaultKDFIterations is the PBKDF2 iteration count used when EncryptionConfig.KDFIterations is zero.

Variables

var (
	// ErrDocExists is returned when attempting to insert a document that already exists.
	ErrDocExists = errors.New("any-store: document already exists")

	// ErrDocNotFound is returned when a document cannot be found by its ID.
	ErrDocNotFound = errors.New("any-store: document not found")

	// ErrDocWithoutId is returned when a document is provided without a required ID.
	ErrDocWithoutId = errors.New("any-store: document missing ID")

	// ErrCollectionExists is returned when attempting to create a collection that already exists.
	ErrCollectionExists = errors.New("any-store: collection already exists")

	// ErrCollectionNotFound is returned when a collection cannot be found.
	ErrCollectionNotFound = errors.New("any-store: collection not found")

	// ErrIndexExists is returned when attempting to create an index that already exists.
	ErrIndexExists = errors.New("any-store: index already exists")

	// ErrIndexNotFound is returned when an index cannot be found.
	ErrIndexNotFound = errors.New("any-store: index not found")

	// ErrTxIsReadOnly is returned when a write operation is attempted in a read-only transaction.
	ErrTxIsReadOnly = errors.New("any-store: transaction is read-only")

	// ErrTxIsUsed is returned when an operation is attempted on a transaction that has already been committed or rolled back.
	ErrTxIsUsed = errors.New("any-store: transaction has already been used")

	// ErrTxOtherInstance is returned when an operation is attempted using a transaction from a different database instance.
	ErrTxOtherInstance = errors.New("any-store: transaction belongs to another database instance")

	// ErrUniqueConstraint is returned when a unique constraint violation occurs.
	ErrUniqueConstraint = errors.New("any-store: unique constraint violation")

	// ErrIterClosed is returned when operations are attempted on a closed iterator.
	ErrIterClosed = errors.New("any-store: iterator is closed")

	// ErrQuickCheckFailed is returned when a quick check run on open (e.g. when the database is opened in a dirty state with the sentinel enabled) fails, indicating possible database corruption.
	ErrQuickCheckFailed = errors.New("any-store: quick check failed on db open")

	ErrDBIsClosed = btree.ErrClosed

	ErrDBIsNotOpened       = errors.New("any-store: database is not opened")
	ErrIncompatibleVersion = errors.New("any-store: incompatible version")

	// ErrPageBufferNotInitialized is returned when UseGlobalPageBuffer is set
	// but InitPageBuffer was not called before opening the database.
	ErrPageBufferNotInitialized = errors.New("any-store: global page buffer not initialized (call InitPageBuffer first)")
)

var ErrPageIntegrity = btree.ErrPageIntegrity

ErrPageIntegrity is the umbrella error returned (wrapped) by every codec when a page fails its on-disk integrity check, regardless of mode. Callers match it with errors.Is to handle both the cksum-only and AEAD modes uniformly:

if errors.Is(err, anystore.ErrPageIntegrity) { ... }

The wrapped inner detail carries the mode-specific message ("checksum mismatch" or "AEAD authentication failed"); callers that need a programmatic discriminator should use SweepError.Kind from VerifyIntegrity instead of parsing strings.
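The matching pattern described here is ordinary Go error wrapping. A minimal sketch, assuming a local sentinel errPageIntegrity that stands in for anystore.ErrPageIntegrity and a hypothetical readPage that produces the two mode-specific messages:

```go
package main

import (
	"errors"
	"fmt"
)

// errPageIntegrity stands in for anystore.ErrPageIntegrity in this sketch.
var errPageIntegrity = errors.New("page integrity check failed")

// readPage is a hypothetical reader that always fails, wrapping the shared
// sentinel with a mode-specific detail.
func readPage(pgno int, encrypted bool) error {
	detail := "checksum mismatch"
	if encrypted {
		detail = "AEAD authentication failed"
	}
	return fmt.Errorf("page %d: %s: %w", pgno, detail, errPageIntegrity)
}

// isIntegrityFailure matches both modes without parsing the message.
func isIntegrityFailure(err error) bool {
	return errors.Is(err, errPageIntegrity)
}

func main() {
	for _, encrypted := range []bool{false, true} {
		err := readPage(7, encrypted)
		fmt.Println(isIntegrityFailure(err), err)
	}
}
```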

Functions

func DeriveKey

func DeriveKey(passphrase, salt []byte, iterations int) []byte

DeriveKey stretches a user passphrase to a 32-byte AES key using PBKDF2-HMAC-SHA256 against the supplied salt. Iterations defaults to 256,000 (SQLCipher v4 default) when <= 0. Exposed for users who need to derive keys for BYO codecs; normal usage should set EncryptionConfig.Passphrase directly.
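For readers who want to see what the stretching step computes, here is a sketch of PBKDF2-HMAC-SHA256 restricted to a single 32-byte output block, written with the standard library only. deriveKeySketch is a hypothetical name and is not the library's code; it only borrows the documented default of 256,000 iterations. Production code should call DeriveKey or a vetted PBKDF2 package.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// deriveKeySketch computes the first (and only) 32-byte block of
// PBKDF2-HMAC-SHA256: T1 = U1 xor U2 xor ... xor Uc.
func deriveKeySketch(passphrase, salt []byte, iterations int) []byte {
	if iterations <= 0 {
		iterations = 256000 // documented default
	}
	mac := hmac.New(sha256.New, passphrase)
	mac.Write(salt)
	mac.Write([]byte{0, 0, 0, 1}) // big-endian block index 1
	u := mac.Sum(nil)             // U1
	key := append([]byte(nil), u...)
	for i := 1; i < iterations; i++ {
		mac.Reset()
		mac.Write(u)
		u = mac.Sum(nil) // U(i+1) = HMAC(passphrase, U(i))
		for j := range key {
			key[j] ^= u[j]
		}
	}
	return key
}

func main() {
	salt := []byte("0123456789abcdef") // 16-byte demo salt; real salts must be random
	key := deriveKeySketch([]byte("my-pass"), salt, 1000)
	fmt.Println(len(key), hex.EncodeToString(key))
}
```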

func EnablePipelinePerfCounters

func EnablePipelinePerfCounters(enabled bool)

EnablePipelinePerfCounters toggles pipeline profiling counters.

func InitPageBuffer

func InitPageBuffer(pageSize, nPages int)

InitPageBuffer pre-allocates a global pool of nPages page-sized buffers. Must be called before opening any databases that use UseGlobalPageBuffer. Mirrors sqlite3_config(SQLITE_CONFIG_PAGECACHE). Call once at process startup.

Example: InitPageBuffer(4096, 5000) pre-allocates ~20MB of page buffers.
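The memory cost is simply pageSize × nPages. The sketch below checks that arithmetic and shows one way a fixed, pre-allocated pool can be built in Go; pagePool and poolBytes are hypothetical names for illustration, and the library's global buffer may be organised differently.

```go
package main

import "fmt"

// poolBytes returns the memory reserved by nPages buffers of pageSize bytes.
func poolBytes(pageSize, nPages int) int { return pageSize * nPages }

// pagePool hands out fixed-size buffers carved from one backing allocation.
type pagePool struct{ free chan []byte }

func newPagePool(pageSize, nPages int) *pagePool {
	p := &pagePool{free: make(chan []byte, nPages)}
	backing := make([]byte, pageSize*nPages)
	for i := 0; i < nPages; i++ {
		// The three-index slice caps each buffer so it cannot grow into its neighbour.
		p.free <- backing[i*pageSize : (i+1)*pageSize : (i+1)*pageSize]
	}
	return p
}

// get returns a buffer, or false when the pool is exhausted.
func (p *pagePool) get() ([]byte, bool) {
	select {
	case b := <-p.free:
		return b, true
	default:
		return nil, false
	}
}

// put returns a buffer to the pool; surplus buffers are left to the GC.
func (p *pagePool) put(b []byte) {
	select {
	case p.free <- b:
	default:
	}
}

func main() {
	b := poolBytes(4096, 5000)
	fmt.Printf("%d bytes = %.1f MB = %.1f MiB\n", b, float64(b)/1e6, float64(b)/(1<<20))

	pool := newPagePool(4096, 2)
	page, ok := pool.get()
	fmt.Println(len(page), ok)
	pool.put(page)
}
```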

func ResetOpenRegistry

func ResetOpenRegistry()

ResetOpenRegistry clears the process-global registry of open databases. Tests that simulate process crashes (where Close is intentionally skipped) call this to allow a subsequent Open of the same file to succeed. Panics unless built with -tags vfs or GOOS=js GOARCH=wasm.

func ResetPipelinePerfCounters

func ResetPipelinePerfCounters()

ResetPipelinePerfCounters clears internal iterator/doc profiling counters.

func ResetVFS

func ResetVFS()

ResetVFS restores defaults. Panics unless built with -tags vfs or GOOS=js GOARCH=wasm.

func SetVFS

func SetVFS(vfs VFS)

SetVFS replaces OS-level operations. Panics unless built with -tags vfs or GOOS=js GOARCH=wasm.

func StampPageChecksumForTest

func StampPageChecksumForTest(page []byte)

StampPageChecksumForTest writes a valid XXH3-128 trailer into the last 16 bytes of `page`. Test/migration helper; production code never calls this directly — codec.Encrypt does it during normal page writes.

func VerifyPageChecksum

func VerifyPageChecksum(page []byte) bool

VerifyPageChecksum returns true iff the trailing 16 bytes of `page` are the XXH3-128 of page[:len(page)-16]. Returns false for invalid sizes. Public; mirrors SQLite's verify_checksum() SQL function. Computes over the full page; for the codec-equivalent (which excludes the page-1 DB header for page 1), use (DB).VerifyIntegrity instead.
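The trailer scheme is easy to picture in code. The standard library has no XXH3, so this sketch swaps in the first 16 bytes of SHA-256 as a stand-in digest; stampTrailer and verifyTrailer are hypothetical names, and the resulting bytes are not compatible with Any Store's on-disk format.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

const trailerLen = 16

// digest16 is a stand-in for XXH3-128: the first 16 bytes of SHA-256.
func digest16(body []byte) []byte {
	sum := sha256.Sum256(body)
	return sum[:trailerLen]
}

// stampTrailer writes the digest of page[:len(page)-16] into the last 16 bytes.
func stampTrailer(page []byte) {
	if len(page) <= trailerLen {
		return
	}
	body := page[:len(page)-trailerLen]
	copy(page[len(body):], digest16(body))
}

// verifyTrailer reports whether the trailer matches the page body.
// Pages too small to hold a trailer are rejected.
func verifyTrailer(page []byte) bool {
	if len(page) <= trailerLen {
		return false
	}
	body := page[:len(page)-trailerLen]
	return bytes.Equal(page[len(body):], digest16(body))
}

func main() {
	page := make([]byte, 4096)
	copy(page, "hello")
	stampTrailer(page)
	fmt.Println("fresh page valid:", verifyTrailer(page))

	page[100] ^= 0x01 // simulate a single flipped bit
	fmt.Println("after bit flip valid:", verifyTrailer(page))
}
```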

Types

type CipherType

type CipherType = btree.CipherType

CipherType selects a built-in AEAD. See the individual constants for trade-offs.

type Codec

type Codec = btree.Codec

Codec is the pluggable page-encryption interface. Implement this to provide a custom AEAD (HSM-backed, external key management, or any authenticated cipher not offered by the built-in CipherType values).

The pager installs one Codec per DB and routes every file / WAL I/O site through it. Implementations must be safe for concurrent use: a single Codec serves the writer and all reader goroutines.

Overhead() bytes are reserved at the end of every page for codec metadata (nonce, tag, any padding). The btree cell layout automatically respects this via (page_size - reserve_size). Must be constant for the codec's lifetime and a multiple of 16 (AES block size) for alignment.

Encrypt/Decrypt operate on full-page buffers. src and dst are both exactly pageSize; dst must not alias src. The 1-based page number pgno is bound into the authentication tag as associated data so that moving a page to a different file offset invalidates its MAC.
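The effect of binding the page number into the tag can be demonstrated with the standard library's AES-GCM. The sketch below is not an implementation of the Codec interface (its method set is not shown here); sealPage, openPage, roundTrip and the tail layout (tag, then nonce, then padding up to 32 bytes) are assumptions chosen for illustration.

```go
package main

import (
	"bytes"
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"encoding/binary"
	"errors"
	"fmt"
)

const (
	pageSize = 4096
	reserve  = 32 // 16-byte tag + 12-byte nonce, padded to a multiple of 16
)

var errPageSize = errors.New("page buffers must be exactly pageSize bytes")

// pageAAD encodes the 1-based page number as associated data.
func pageAAD(pgno uint32) []byte {
	aad := make([]byte, 4)
	binary.BigEndian.PutUint32(aad, pgno)
	return aad
}

// sealPage encrypts the usable part of src into dst and stores the tag and
// a fresh random nonce in the reserved tail. dst must not alias src.
func sealPage(aead cipher.AEAD, pgno uint32, dst, src []byte) error {
	if len(dst) != pageSize || len(src) != pageSize {
		return errPageSize
	}
	nonce := make([]byte, aead.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return err
	}
	sealed := aead.Seal(dst[:0], nonce, src[:pageSize-reserve], pageAAD(pgno))
	n := copy(dst[len(sealed):], nonce)
	for i := len(sealed) + n; i < pageSize; i++ {
		dst[i] = 0 // zero the padding bytes
	}
	return nil
}

// openPage authenticates and decrypts src into dst for the given page number.
func openPage(aead cipher.AEAD, pgno uint32, dst, src []byte) error {
	if len(dst) != pageSize || len(src) != pageSize {
		return errPageSize
	}
	end := pageSize - reserve + aead.Overhead()
	nonce := src[end : end+aead.NonceSize()]
	_, err := aead.Open(dst[:0], nonce, src[:end], pageAAD(pgno))
	return err
}

// roundTrip seals a page as sealPgno and tries to open it as openPgno.
func roundTrip(sealPgno, openPgno uint32) error {
	key := make([]byte, 32)
	if _, err := rand.Read(key); err != nil {
		return err
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return err
	}
	aead, err := cipher.NewGCM(block)
	if err != nil {
		return err
	}
	plain, stored, out := make([]byte, pageSize), make([]byte, pageSize), make([]byte, pageSize)
	copy(plain, "page payload")
	if err := sealPage(aead, sealPgno, stored, plain); err != nil {
		return err
	}
	if err := openPage(aead, openPgno, out, stored); err != nil {
		return err
	}
	if !bytes.Equal(out[:pageSize-reserve], plain[:pageSize-reserve]) {
		return errors.New("plaintext mismatch")
	}
	return nil
}

func main() {
	fmt.Println("same page number:", roundTrip(5, 5))
	fmt.Println("moved to another page:", roundTrip(5, 6))
}
```

Opening with a different page number fails authentication even though the key and ciphertext are unchanged, which is the property the associated data is there to provide.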

func NewAESCodec

func NewAESCodec(key []byte) (Codec, error)

NewAESCodec constructs a codec using AES-256-GCM with a raw 32-byte key. Prefer EncryptionConfig.Passphrase for the common passphrase-based case; use this directly when managing key material externally (HSM, KMS, etc).

func NewChaCha20Poly1305Codec

func NewChaCha20Poly1305Codec(key []byte) (Codec, error)

NewChaCha20Poly1305Codec constructs a ChaCha20-Poly1305 codec with a raw 32-byte key.

func NewXChaCha20Poly1305Codec

func NewXChaCha20Poly1305Codec(key []byte) (Codec, error)

NewXChaCha20Poly1305Codec constructs an XChaCha20-Poly1305 codec with a raw 32-byte key. 24-byte nonce.

type Collection

type Collection interface {
	// Name returns the name of the collection.
	Name() string

	// FindId finds a document by its ID.
	// Returns the document or an error if the document is not found.
	FindId(ctx context.Context, id any) (Doc, error)

	// FindIdWithParser finds a document by its ID. Uses provided anyenc parser.
	// Returns the document or an error if the document is not found.
	FindIdWithParser(ctx context.Context, p *anyenc.Parser, id any) (Doc, error)

	// Find returns a new Query object with given filter
	Find(filter any) Query

	// Insert inserts one or more documents into the collection.
	// Returns an error if the insertion fails.
	Insert(ctx context.Context, docs ...*anyenc.Value) (err error)

	// UpdateOne updates a single document in the collection.
	// Provided document must contain an id field
	// Returns an error if the update fails.
	UpdateOne(ctx context.Context, doc *anyenc.Value) (err error)

	// UpdateId updates a single document in the collection with provided modifier
	// Returns a modify result or error.
	UpdateId(ctx context.Context, id any, mod query.Modifier) (res ModifyResult, err error)

	// UpsertOne inserts a document if it does not exist, or updates it if it does.
	// Returns an error if the operation fails.
	UpsertOne(ctx context.Context, doc *anyenc.Value) (err error)

	// UpsertId updates a single document with the provided modifier, or creates a new one if it does not exist.
	// Returns a modify result or error.
	UpsertId(ctx context.Context, id any, mod query.Modifier) (res ModifyResult, err error)

	// DeleteId deletes a single document by its ID.
	// Returns an error if the deletion fails.
	DeleteId(ctx context.Context, id any) (err error)

	// Count returns the number of documents in the collection.
	// Returns the count of documents or an error if the operation fails.
	Count(ctx context.Context) (count int, err error)

	// CreateIndex creates a new index.
	// Returns an error if index exists or the operation fails.
	CreateIndex(ctx context.Context, info ...IndexInfo) (err error)

	// EnsureIndex ensures an index exists on the specified fields.
	// Returns an error if the operation fails.
	EnsureIndex(ctx context.Context, info ...IndexInfo) (err error)

	// DropIndex drops an index by its name.
	// Returns an error if the operation fails.
	DropIndex(ctx context.Context, indexName string) (err error)

	// GetIndexes returns a list of indexes on the collection.
	GetIndexes() (indexes []Index)

	// Stats returns the storage footprint of the collection: document count,
	// stored and uncompressed sizes, compression ratio and per-index sizes.
	// It scans the whole collection and is intended for diagnostics.
	Stats(ctx context.Context) (CollectionStats, error)

	// Rename renames the collection.
	// Returns an error if the operation fails.
	Rename(ctx context.Context, newName string) (err error)

	// Drop drops the collection.
	// Returns an error if the operation fails.
	Drop(ctx context.Context) (err error)

	// ReadTx starts a new read-only transaction. It's just a proxy to db object.
	// Returns a ReadTx or an error if there is an issue starting the transaction.
	ReadTx(ctx context.Context) (ReadTx, error)

	// WriteTx starts a new read-write transaction. It's just a proxy to db object.
	// Returns a WriteTx or an error if there is an issue starting the transaction.
	WriteTx(ctx context.Context) (WriteTx, error)

	// Close closes the collection.
	// Returns an error if the operation fails.
	Close() error
}

Collection represents a collection of documents.

type CollectionOptions

type CollectionOptions struct {
	// Compression overrides the database-wide compression setting for this collection.
	// Zero value inherits the database default.
	Compression Compression
}

CollectionOptions configures per-collection settings at creation time.

type CollectionStats

type CollectionStats struct {
	// Name is the collection name.
	Name string

	// DocCount is the number of documents in the collection.
	DocCount int

	// StoredDocsBytes is the sum of stored document value bytes. When
	// compression is enabled this counts the compressed form.
	StoredDocsBytes int

	// UncompressedDocsBytes is the sum of document value bytes after
	// decompression. It equals StoredDocsBytes when nothing is compressed.
	UncompressedDocsBytes int

	// CompressionEnabled reports whether compression is active for this
	// collection. Individual documents below anyenc.CompressMinSize are stored
	// uncompressed even when it is enabled.
	CompressionEnabled bool

	// CompressionRatio is UncompressedDocsBytes/StoredDocsBytes. It is 1.0 when
	// nothing is compressed and grows above 1.0 as compression saves space.
	CompressionRatio float64

	// DocsSizeBytes is the physical on-disk size of the collection's document
	// B-tree (page count including overflow pages times the page size).
	DocsSizeBytes int

	// IndexesSizeBytes is the sum of SizeBytes across all indexes.
	IndexesSizeBytes int

	// TotalSizeBytes is DocsSizeBytes plus IndexesSizeBytes.
	TotalSizeBytes int

	// Indexes holds per-index statistics.
	Indexes []IndexStats
}

CollectionStats describes the storage footprint of a collection: its documents, compression effectiveness and per-index sizes.
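The derived fields follow directly from the raw ones. A small sketch of the two relationships stated in the field comments; the footprint type is a hypothetical mirror of a few CollectionStats fields, the numbers are invented, and returning 1.0 for an empty collection is an assumption.

```go
package main

import "fmt"

// footprint mirrors a few CollectionStats fields for illustration only.
type footprint struct {
	StoredDocsBytes       int
	UncompressedDocsBytes int
	DocsSizeBytes         int
	IndexesSizeBytes      int
}

// compressionRatio is UncompressedDocsBytes/StoredDocsBytes.
func (f footprint) compressionRatio() float64 {
	if f.StoredDocsBytes == 0 {
		return 1.0 // assumption: an empty collection reports no savings
	}
	return float64(f.UncompressedDocsBytes) / float64(f.StoredDocsBytes)
}

// totalSizeBytes is DocsSizeBytes plus IndexesSizeBytes.
func (f footprint) totalSizeBytes() int {
	return f.DocsSizeBytes + f.IndexesSizeBytes
}

func main() {
	f := footprint{
		StoredDocsBytes:       40000,
		UncompressedDocsBytes: 100000,
		DocsSizeBytes:         65536,
		IndexesSizeBytes:      16384,
	}
	fmt.Printf("ratio=%.2fx total=%d bytes\n", f.compressionRatio(), f.totalSizeBytes())
}
```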

type Compression

type Compression int

Compression specifies the compression algorithm for document values.

const (
	// S2 enables S2 compression for objects larger than 256 bytes (default).
	S2 Compression = 1
	// NoCompression disables compression entirely.
	NoCompression Compression = 2
)

type Config

type Config struct {
	// SyncPoolElementMaxSize defines maximum size of buffer that can be returned to the syncpool
	// default value is 2MiB
	SyncPoolElementMaxSize int

	// CommitSync forces fsync on every WAL commit (like SQLite synchronous=FULL in WAL mode).
	// When false (default), fsync is deferred to checkpoint time, which reduces write latency
	// at the cost of losing the last committed transaction(s) on power loss.
	CommitSync bool

	// DisableAutoCheckpoint disables WAL auto-checkpoint entirely.
	// When true, checkpoint must be triggered manually or via durability auto-flush.
	DisableAutoCheckpoint bool

	// AutoCheckpointAfter overrides the default WAL auto-checkpoint threshold (10000 frames).
	// Only used when DisableAutoCheckpoint is false. 0 means use default.
	AutoCheckpointAfter int

	// CacheSize overrides the default per-DB page cache size (in pages,
	// default 5000). Primarily for tests that need to force pagerStress /
	// cache-spill behavior at low data volumes. Zero means use the default.
	CacheSize int

	// InMemory keeps the entire database in memory with no files on disk.
	// The database does not survive process crashes. When true, InProcess
	// and CommitSync=false are forced on automatically.
	// The path argument to Open is ignored and can be any string (e.g. ":memory:").
	InMemory bool

	// DisableCompression disables S2 compression for document values.
	// By default, objects larger than 256 bytes are compressed with S2.
	DisableCompression bool

	// UseGlobalPageBuffer opts this DB into the global pre-allocated page
	// buffer pool. The pool must be initialized beforehand via InitPageBuffer.
	// When false (default), page buffers use sync.Pool (GC-managed, like
	// SQLite's default malloc mode).
	UseGlobalPageBuffer bool

	// MmapSize enables mmap-backed reads of the database file up to the
	// given byte limit. Zero (default) disables mmap — reads use pread
	// via ReadAt, same behavior as before this feature existed.
	// Matches SQLite's PRAGMA mmap_size. Linux/darwin + amd64/arm64
	// only; no-op on other platforms.
	//
	// SAFETY WARNINGS — read before enabling:
	//
	//   - SIGBUS on unreachable storage: if the DB file becomes
	//     truncated, unlinked, or the storage device is removed while
	//     a mapping is live (USB unplug, iCloud eviction of a
	//     "dataless" file, NFS server timeout, external process
	//     truncate), accessing the mapped bytes generates SIGBUS. Go's
	//     runtime translates SIGBUS to an uncatchable "unexpected
	//     fault address" crash — the whole process dies. The pread
	//     path (MmapSize=0) returns a normal error for all of these
	//     scenarios. Only enable on paths you control: local disk,
	//     stable filesystem, single-device. Do NOT enable for DBs
	//     stored in iCloud Drive, OneDrive, Dropbox, or any network
	//     mount.
	//
	//   - Mobile / iOS: mmap'd bytes count against the per-process
	//     memory limit and can trip iOS's "jetsam" reaper on older
	//     devices. iOS may also silently purge mmap pages under
	//     memory pressure; re-faulting succeeds for local files but
	//     fails (SIGBUS) if the backing file has become unavailable
	//     (permission revocation, iCloud eviction). Recommend leaving
	//     at zero on iOS/iPadOS unless you have a measured need.
	//
	//   - Network filesystems: mmap coherence over NFS/SMB is weaker
	//     than over local disks; concurrent writes from peers may be
	//     invisible to the mapping, and reads may observe torn state.
	//     pread does not have this issue.
	//
	// Recommended when it IS safe: 64-512 MiB on workloads with
	// frequent large-blob or repeat-page reads, on stable local
	// storage, desktop/server deployments.
	MmapSize int64

	// DurabilityConfig provides configuration for crash recovery and idle auto-flush
	Durability DurabilityConfig

	// Encryption, when non-empty, enables page-level AEAD encryption
	// (AES-256-GCM by default) of the on-disk database file. Zero value
	// means no encryption.
	Encryption EncryptionConfig

	// OnIntegrityError, when non-nil, is invoked from the read path on
	// every per-page integrity failure (XXH3 trailer mismatch in cksum
	// mode, AEAD auth-tag failure in encryption mode). Plain databases
	// never fire it.
	//
	// The callback runs synchronously on the I/O goroutine and must not
	// block. Push to a buffered channel + drain elsewhere if you need
	// retention or cross-thread delivery.
	//
	// Set this at Open time so failures discovered during the first
	// page-1 read (which happens inside Open) are observable. There is
	// no post-Open setter — the codebase favors config-at-Open over
	// runtime mutation.
	OnIntegrityError func(IntegrityError)

	// ContinueOnIntegrityError, when true, lets reads of corrupt pages
	// return their (potentially garbage) bytes instead of erroring with
	// ErrPageIntegrity. The OnIntegrityError callback still fires —
	// only the error-return is suppressed. Mirror of cksumvfs's
	// `PRAGMA checksum_verification = OFF`.
	//
	// Default (false) is the safe choice: corrupt pages cause reads to
	// fail, callers see the error, app halts or recovers. Enable only
	// for forensic dumps where you'd rather read garbage than not be
	// able to read at all.
	//
	// Honored only in checksum mode. AEAD-encrypted databases ignore
	// this flag (disabling AEAD verification would return attacker-
	// controlled plaintext, defeating the cipher).
	ContinueOnIntegrityError bool
}

Config provides the configuration options for the database.
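Because the OnIntegrityError callback must not block, one common shape is a non-blocking send into a buffered channel that another goroutine drains. A minimal sketch: integrityEvent is a hypothetical stand-in for the value passed to the callback, and eventSink is an illustrative helper, not part of the package.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// integrityEvent is a hypothetical stand-in for the callback argument.
type integrityEvent struct {
	PageNo uint32
	Detail string
}

// eventSink buffers events and counts the ones it had to drop.
type eventSink struct {
	dropped int64 // kept first so atomic access stays 64-bit aligned
	ch      chan integrityEvent
}

func newEventSink(buffer int) *eventSink {
	return &eventSink{ch: make(chan integrityEvent, buffer)}
}

// handle never blocks: when the buffer is full the event is dropped and counted.
func (s *eventSink) handle(e integrityEvent) {
	select {
	case s.ch <- e:
	default:
		atomic.AddInt64(&s.dropped, 1)
	}
}

func (s *eventSink) droppedCount() int64 { return atomic.LoadInt64(&s.dropped) }

func main() {
	sink := newEventSink(2)
	for pg := uint32(1); pg <= 3; pg++ {
		sink.handle(integrityEvent{PageNo: pg, Detail: "checksum mismatch"})
	}
	close(sink.ch)
	for e := range sink.ch {
		fmt.Printf("page %d: %s\n", e.PageNo, e.Detail)
	}
	fmt.Println("dropped:", sink.droppedCount())
}
```

Dropping on overflow trades completeness for a bounded callback time; a larger buffer or a summary counter may suit monitoring better than retaining every event.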

type DB

type DB interface {
	// CreateCollection creates a new collection with the specified name.
	// Returns the created Collection or an error if the collection already exists.
	// Possible errors:
	// - ErrCollectionExists: if the collection already exists.
	CreateCollection(ctx context.Context, collectionName string, opts ...CollectionOptions) (Collection, error)

	// OpenCollection opens an existing collection with the specified name.
	// Returns the opened Collection or an error if the collection does not exist.
	// Possible errors:
	// - ErrCollectionNotFound: if the collection does not exist.
	OpenCollection(ctx context.Context, collectionName string) (Collection, error)

	// Collection is a convenience method to get or create a collection.
	// It first attempts to open the collection, and if it does not exist, it creates the collection.
	// Returns the Collection or an error if there is an issue creating or opening the collection.
	Collection(ctx context.Context, collectionName string, opts ...CollectionOptions) (Collection, error)

	// GetCollectionNames returns a list of all collection names in the database.
	// Returns a slice of collection names or an error if there is an issue retrieving the names.
	GetCollectionNames(ctx context.Context) ([]string, error)

	// Stats returns the statistics of the database.
	// Returns a DBStats struct containing the database statistics or an error if there is an issue retrieving the stats.
	Stats(ctx context.Context) (DBStats, error)

	// QuickCheck performs a quick integrity check and returns an error if the check fails.
	QuickCheck(ctx context.Context) (err error)

	// IntegrityCheck runs the full structural btree integrity check:
	// reachable-page coverage, orphan detection, freelist consistency,
	// overflow-chain validation, key ordering, and master-page consistency.
	// Returns nil if the database is structurally consistent, or an error
	// aggregating up to 100 issues found. More expensive than QuickCheck —
	// intended for stress tests and offline diagnostics, not normal opens.
	IntegrityCheck(ctx context.Context) (err error)

	// Flush performs a checkpoint on the btree database.
	// When waitIdleDuration > 0, it first waits until waitIdleDuration has passed since the last write tx was released.
	Flush(ctx context.Context, waitIdleDuration time.Duration, mode FlushMode) error

	// Backup creates a backup of the database at the specified file path.
	// Returns an error if the operation fails.
	Backup(ctx context.Context, path string) (err error)

	// ReadTx starts a new read-only transaction.
	// Returns a ReadTx or an error if there is an issue starting the transaction.
	ReadTx(ctx context.Context) (ReadTx, error)

	// WriteTx starts a new read-write transaction.
	// Returns a WriteTx or an error if there is an issue starting the transaction.
	WriteTx(ctx context.Context) (WriteTx, error)

	// Close closes the database connection.
	// Returns an error if there is an issue closing the connection.
	Close() error

	// IntegrityMode reports the page-level integrity mode of this database
	// (none / checksum / AEAD).
	IntegrityMode() IntegrityMode

	// VerifyIntegrity walks every page and verifies its per-page integrity
	// tag (XXH3-128 trailer for cksum mode, AEAD auth tag for encrypted mode).
	// Plain DBs return IntegrityNone with zero pages scanned. Mismatches are
	// returned in IntegrityReport.Errors; the function only errors on I/O or
	// context cancellation. See IntegrityConfig.
	VerifyIntegrity(ctx context.Context) (IntegrityReport, error)
}

DB represents a document-oriented database.

func Open

func Open(ctx context.Context, path string, config *Config) (DB, error)

Open opens a database at the specified path with the given configuration. The config parameter can be nil for default settings. Returns a DB instance or an error.

type DBStats

type DBStats struct {
	// CollectionsCount is the total number of collections in the database.
	CollectionsCount int

	// IndexesCount is the total number of indexes across all collections in the database.
	IndexesCount int

	// TotalSizeBytes is the total size of the database in bytes.
	TotalSizeBytes int

	// DataSizeBytes is the total size of the data stored in the database in bytes, excluding free space.
	DataSizeBytes int

	DirtyOnOpen             bool          // indicates we have sentinel file on open
	DirtyQuickCheckDuration time.Duration // time spent in quickcheck if dirty
}

DBStats represents the statistics of the database.

type Doc

type Doc interface {
	// Value returns the document as a *anyenc.Value.
	// Important: When used in an iterator, the returned value is valid only until the next call to Next.
	Value() *anyenc.Value
}

Doc represents a document in the collection.
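The lifetime rule on Value is typical of iterators that reuse one decode buffer. The sketch below shows the failure mode with a hypothetical reusingIter over byte slices; it is an assumption about why such a rule exists, not a description of Any Store's internals.

```go
package main

import "fmt"

// reusingIter is a hypothetical iterator that decodes every row into one
// shared buffer, so an earlier value changes once Next is called again.
type reusingIter struct {
	rows []string
	pos  int
	buf  []byte
}

// Next loads the next row into the shared buffer.
func (it *reusingIter) Next() bool {
	if it.pos >= len(it.rows) {
		return false
	}
	it.buf = append(it.buf[:0], it.rows[it.pos]...)
	it.pos++
	return true
}

// Value returns the shared buffer; it is valid only until the next call to Next.
func (it *reusingIter) Value() []byte { return it.buf }

// collect keeps every value, optionally copying it before the next step.
func collect(rows []string, copyValues bool) []string {
	it := &reusingIter{rows: rows}
	var kept [][]byte
	for it.Next() {
		v := it.Value()
		if copyValues {
			v = append([]byte(nil), v...) // detach from the shared buffer
		}
		kept = append(kept, v)
	}
	out := make([]string, len(kept))
	for i, v := range kept {
		out[i] = string(v)
	}
	return out
}

func main() {
	rows := []string{"aaa", "bbb"}
	fmt.Println("retained without copy:", collect(rows, false))
	fmt.Println("retained with copy:   ", collect(rows, true))
}
```

Without the copy, both retained slices alias the same buffer and show the last row; copying what you need before calling Next avoids that.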

type DurabilityConfig

type DurabilityConfig struct {
	// AutoFlush enables auto-flush according to IdleAfter and FlushMode.
	AutoFlush bool

	// IdleAfter is the duration to wait after the last write before performing autoflush
	// Default: 20s
	IdleAfter time.Duration

	// FlushMode specifies how to autoflush data during idle periods
	// Default: FlushModeCheckpointPassive
	FlushMode FlushMode

	// Sentinel enables the sentinel file (.lock) that tracks database dirty state
	// When true (default is false), the sentinel file is used to detect unclean shutdowns and perform QuickCheck on open
	Sentinel bool
}

type EncryptionConfig

type EncryptionConfig struct {
	// Passphrase is the user-supplied secret. Nil = no encryption.
	Passphrase []byte

	// KDFIterations overrides the PBKDF2 cost. Zero means 256,000 (the
	// SQLCipher v4 default). Ignored when Passphrase is exactly 32 bytes
	// (raw-key path). Use lower values only in tests.
	KDFIterations int

	// CipherType selects the built-in AEAD when Passphrase is set.
	// Zero value (CipherAES256GCM) is the default.
	CipherType CipherType

	// Codec, when non-nil, bypasses Passphrase / KDFIterations / CipherType
	// and is used verbatim. Use this for HSM-backed or other externally-
	// keyed codecs.
	Codec Codec
}

EncryptionConfig enables page-level encryption of the database file.

Three ways to enable encryption, in increasing order of control:

  1. Passphrase (common case):

         cfg.Encryption.Passphrase = []byte("my-pass")

     Default cipher is AES-256-GCM. Set CipherType to pick a different built-in (ChaCha20-Poly1305 / XChaCha20-Poly1305).

  2. Passphrase + CipherType (pick cipher, keep passphrase KDF):

         cfg.Encryption.Passphrase = []byte("my-pass")
         cfg.Encryption.CipherType = anystore.CipherChaCha20Poly1305

  3. Bring-your-own Codec (HSM, external KMS, custom AEAD):

         cfg.Encryption.Codec = myCodec // must implement anystore.Codec

     Passphrase / CipherType / KDFIterations are ignored in this mode.

Passphrase treatment (modes 1 and 2):

  • Exactly 32 bytes → raw key, no KDF.
  • Any other length → PBKDF2-HMAC-SHA256 stretches it to 32 bytes against the 16-byte salt stored in the database header.

An empty (length 0) non-nil Passphrase is rejected with an error; use nil Passphrase to disable encryption.
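The passphrase rules above amount to a four-way decision. A minimal sketch of that decision, assuming the hypothetical names keyPath and classifyPassphrase; it only classifies the input and does not derive any key.

```go
package main

import (
	"errors"
	"fmt"
)

// keyPath names the treatment a passphrase would receive.
type keyPath int

const (
	noEncryption keyPath = iota // nil passphrase
	rawKey                      // exactly 32 bytes, used as the key directly
	stretchedKey                // any other length, stretched with PBKDF2
)

var errEmptyPassphrase = errors.New("empty non-nil passphrase")

// classifyPassphrase applies the documented rules in order.
func classifyPassphrase(p []byte) (keyPath, error) {
	switch {
	case p == nil:
		return noEncryption, nil
	case len(p) == 0:
		return noEncryption, errEmptyPassphrase
	case len(p) == 32:
		return rawKey, nil
	default:
		return stretchedKey, nil
	}
}

func main() {
	inputs := [][]byte{nil, {}, make([]byte, 32), []byte("my-pass")}
	for _, in := range inputs {
		path, err := classifyPassphrase(in)
		fmt.Printf("len=%d nil=%t path=%d err=%v\n", len(in), in == nil, path, err)
	}
}
```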

Once a database is created with encryption, it must always be opened with matching Encryption config. Reopening without the passphrase (or with the wrong one) returns an error. There is no in-place rekey in v1; changing a passphrase requires exporting to a freshly-opened database.

InMemory databases ignore Encryption (there is no at-rest artifact to protect).
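
The passphrase rules above can be sketched in standalone Go. This is an illustration of the documented behavior, not the library's implementation: deriveKey and defaultKDFIterations are hypothetical names, the all-zero salt stands in for the 16-byte salt from the database header, and PBKDF2 is written out by hand with the standard library so the sketch has no external dependencies.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"errors"
	"fmt"
)

// defaultKDFIterations mirrors the documented default PBKDF2 cost.
const defaultKDFIterations = 256000

// deriveKey is a hypothetical helper, not part of anystore. It follows the
// documented passphrase rules: zero-length input is rejected, exactly 32
// bytes is used as a raw key, and any other length is stretched to 32 bytes
// with PBKDF2-HMAC-SHA256. Deciding that a nil passphrase means "encryption
// disabled" is left to the caller.
func deriveKey(passphrase, salt []byte, iterations int) ([]byte, error) {
	if len(passphrase) == 0 {
		return nil, errors.New("empty passphrase")
	}
	if len(passphrase) == 32 {
		// Raw-key path: copy so the caller's slice is not aliased.
		return append([]byte(nil), passphrase...), nil
	}
	if iterations <= 0 {
		iterations = defaultKDFIterations
	}
	// A 32-byte output needs exactly one SHA-256-sized PBKDF2 block:
	// U1 = HMAC(P, salt || 0x00000001), Ui = HMAC(P, U(i-1)),
	// key = U1 xor U2 xor ... xor Un.
	mac := hmac.New(sha256.New, passphrase)
	mac.Write(salt)
	mac.Write([]byte{0, 0, 0, 1})
	u := mac.Sum(nil)
	key := append([]byte(nil), u...)
	for i := 1; i < iterations; i++ {
		mac.Reset()
		mac.Write(u)
		u = mac.Sum(nil)
		for j := range key {
			key[j] ^= u[j]
		}
	}
	return key, nil
}

func main() {
	salt := make([]byte, 16) // all-zero salt, for illustration only
	key, err := deriveKey([]byte("my-pass"), salt, 1000)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(key)) // prints 32
}
```

The low iteration count in main keeps the example fast; as the KDFIterations comment notes, values below the default are meant for tests only.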

func (EncryptionConfig) Enabled

func (e EncryptionConfig) Enabled() bool

Enabled reports whether this config will install a codec on Open.

type Explain

type Explain struct {
	Sql string

	// Rich explain output: multi-line plan with cost breakdown and candidates
	Plan string

	Indexes []IndexExplain
}

type File

type File = btree.File

File and VFS are always available for type-checking.

type FlushMode

type FlushMode string

FlushMode controls checkpoint behavior during flush, matching SQLite's SQLITE_CHECKPOINT_* modes.

const (
	// FlushModeCheckpointPassive checkpoints as many WAL frames as possible
	// without waiting for any readers or writers to finish. Might leave the
	// checkpoint unfinished if there are concurrent readers or writers.
	FlushModeCheckpointPassive FlushMode = "CHECKPOINT_PASSIVE"

	// FlushModeCheckpointFull waits until there is no writer and all readers
	// are reading from the most recent snapshot, then checkpoints all frames.
	// Blocks new writers while pending, but new readers continue unimpeded.
	// The WAL is preserved (not reset).
	FlushModeCheckpointFull FlushMode = "CHECKPOINT_FULL"

	// FlushModeCheckpointRestart is like Full but after checkpointing also
	// waits until all readers are reading from the database file only, then
	// resets the WAL so new writes start from the beginning. Blocks new
	// writers while pending, but does not impede readers.
	FlushModeCheckpointRestart FlushMode = "CHECKPOINT_RESTART"

	// FlushModeCheckpointTruncate is like Restart but also truncates the WAL
	// file to zero bytes.
	FlushModeCheckpointTruncate FlushMode = "CHECKPOINT_TRUNCATE"
)

type Index

type Index interface {
	// Info returns the IndexInfo for this index.
	Info() IndexInfo

	// Len returns the length of the index.
	Len(ctx context.Context) (int, error)
}

Index represents an index on a collection.

type IndexExplain

type IndexExplain struct {
	Name string
	Cost float64 // CBO computed cost
	Used bool
}

type IndexHint

type IndexHint struct {
	IndexName string
	Boost     int
}

type IndexInfo

type IndexInfo struct {
	// Name is the name of the index. If empty, it will be generated
	// based on the fields (e.g., "name,-createdDate").
	Name string `json:"name"`

	// Fields are the fields included in the index. Each field can specify
	// ascending (e.g., "name") or descending (e.g., "-createdDate") order.
	Fields []string `json:"fields"`

	// Unique indicates whether the index enforces a unique constraint.
	Unique bool `json:"unique"`

	// Sparse indicates whether the index is sparse, indexing only documents
	// with the specified fields.
	Sparse bool `json:"sparse"`
}

IndexInfo provides information about an index.

type IndexSketchInfo

type IndexSketchInfo struct {
	Size     int      // bucket count; normally qplanner.DefaultSketchSize
	DocCount uint64   // total documents tracked by this index
	Buckets  []uint64 // per-bucket frequency counts, len == Size
}

IndexSketchInfo is a read-only snapshot of an index's persisted sketch.

type IndexSketchInspector

type IndexSketchInspector interface {
	InspectIndexSketch(ctx context.Context, collName, indexName string) (IndexSketchInfo, error)
}

IndexSketchInspector exposes decoded sketch state for diagnostic and test use. It intentionally lives outside the DB interface — consumers opt in via type assertion so the main public API stays focused.

type IndexStats

type IndexStats struct {
	// Name is the index name.
	Name string

	// Fields are the indexed fields, with leading '-' marking descending order.
	Fields []string

	// Unique reports whether the index enforces a unique constraint.
	Unique bool

	// Sparse reports whether the index skips documents missing the fields.
	Sparse bool

	// EntryCount is the number of entries (key/value pairs) in the index B-tree.
	EntryCount int

	// PayloadBytes is the sum of key+value byte lengths across all entries.
	PayloadBytes int

	// SizeBytes is the physical on-disk size of the index B-tree, computed as
	// its page count (including overflow pages) times the database page size.
	SizeBytes int

	// SketchDocCount is the document count tracked by the index sketch.
	SketchDocCount uint64

	// SketchSize is the number of buckets in the index sketch.
	SketchSize int

	// SketchDistribution summarizes the sketch's bucket frequency distribution.
	SketchDistribution qplanner.SketchDistribution
}

IndexStats describes the storage footprint and sketch state of a single index.

type IntegrityError

type IntegrityError struct {
	PageNo uint32
	Kind   IntegrityErrorKind
	Inner  error
}

IntegrityError describes a single per-page verification failure.

type IntegrityErrorKind

type IntegrityErrorKind int

IntegrityErrorKind discriminates per-page failure types.

const (
	// IntegrityKindUnknown is reserved.
	IntegrityKindUnknown IntegrityErrorKind = IntegrityErrorKind(btree.IntegrityKindUnknown)
	// IntegrityChecksumMismatch indicates an XXH3-128 trailer mismatch
	// (cksum mode).
	IntegrityChecksumMismatch IntegrityErrorKind = IntegrityErrorKind(btree.IntegrityChecksumMismatch)
	// IntegrityAEADAuthFail indicates an AEAD auth-tag failure (AES-GCM
	// or ChaCha20-Poly1305 mode).
	IntegrityAEADAuthFail IntegrityErrorKind = IntegrityErrorKind(btree.IntegrityAEADAuthFail)
)

type IntegrityMode

type IntegrityMode int

IntegrityMode is the codec mode of a DB.

type IntegrityReport

type IntegrityReport struct {
	Mode   IntegrityMode
	Pages  int
	Errors []IntegrityError
}

IntegrityReport is the outcome of VerifyIntegrity.

type Iterator

type Iterator interface {
	// Next advances the iterator to the next document.
	Next() bool

	// Doc returns the current document.
	Doc() (Doc, error)

	// Err returns any error encountered during the lifetime of the iterator.
	Err() error

	// Close closes the iterator and releases any associated resources.
	Close() error
}

Iterator represents an iterator over query results.
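
A minimal sketch of the usual consumption loop for this method set follows. It does not import the library: Doc is reduced to a string stand-in, Iterator is redeclared locally to mirror the interface above, and sliceIter and collect are hypothetical names used only for illustration. In real code the iterator would come from Query.Iter.

```go
package main

import "fmt"

// Doc is a stand-in for the library's Doc type, reduced to a string here.
type Doc string

// Iterator mirrors the documented method set.
type Iterator interface {
	Next() bool
	Doc() (Doc, error)
	Err() error
	Close() error
}

// sliceIter is a hypothetical in-memory Iterator used only for this sketch.
type sliceIter struct {
	docs []Doc
	pos  int
}

func (s *sliceIter) Next() bool {
	if s.pos >= len(s.docs) {
		return false
	}
	s.pos++
	return true
}

func (s *sliceIter) Doc() (Doc, error) {
	if s.pos == 0 {
		return "", fmt.Errorf("Doc called before Next")
	}
	return s.docs[s.pos-1], nil
}

func (s *sliceIter) Err() error   { return nil }
func (s *sliceIter) Close() error { return nil }

// collect drains the iterator, closes it on every path, and returns the
// first error seen from Doc, Err, or Close.
func collect(it Iterator) (docs []Doc, err error) {
	defer func() {
		if cerr := it.Close(); err == nil {
			err = cerr
		}
	}()
	for it.Next() {
		d, derr := it.Doc()
		if derr != nil {
			return nil, derr
		}
		docs = append(docs, d)
	}
	// Next returning false can mean exhaustion or failure, so check Err.
	return docs, it.Err()
}

func main() {
	docs, err := collect(&sliceIter{docs: []Doc{"a", "b"}})
	if err != nil {
		panic(err)
	}
	fmt.Println(len(docs)) // prints 2
}
```

Checking Err after the loop matters because a false return from Next does not by itself distinguish "no more documents" from "the scan failed".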

type ModifyResult

type ModifyResult struct {
	// Matched is the number of documents matched by the query.
	Matched int

	// Modified is the number of documents that were actually modified.
	Modified int
}

ModifyResult represents the result of a modification operation.

type PipelinePerfCounters

type PipelinePerfCounters struct {
	Planner qplanner.PerfCounters

	DocCalls           uint64
	DocParsedHits      uint64
	DocFallbacks       uint64
	DocFallbackSeekNs  uint64
	DocFallbackParseNs uint64
}

PipelinePerfCounters aggregates iterator and Doc() fallback profiling counters.

func SnapshotPipelinePerfCounters

func SnapshotPipelinePerfCounters() PipelinePerfCounters

SnapshotPipelinePerfCounters returns current profiling counters.

type Query

type Query interface {

	// Limit sets the maximum number of documents to return.
	Limit(limit uint) Query

	// Offset sets the number of documents to skip before starting to return results.
	Offset(offset uint) Query

	// Sort sets the sort order for the query results.
	Sort(sort ...any) Query

	// IndexHint adds or removes a boost for the named indexes.
	IndexHint(hints ...IndexHint) Query

	// Iter executes the query and returns an Iterator for the results.
	Iter(ctx context.Context) (Iterator, error)

	// Count returns the number of documents matching the query.
	Count(ctx context.Context) (count int, err error)

	// Update modifies documents matching the query.
	Update(ctx context.Context, modifier any) (res ModifyResult, err error)

	// Delete removes documents matching the query.
	Delete(ctx context.Context) (res ModifyResult, err error)

	// Explain provides the query execution plan.
	Explain(ctx context.Context) (explain Explain, err error)
}

Query represents a query on a collection.

type ReadTx

type ReadTx interface {
	// Context returns the context associated with the transaction.
	Context() context.Context

	// Commit commits the transaction.
	// Returns an error if the commit fails.
	Commit() error

	// Done returns true if the transaction is completed (committed or rolled back).
	Done() bool
	// contains filtered or unexported methods
}

ReadTx represents a read-only transaction.

type VFS

type VFS = btree.VFS

type WriteTx

type WriteTx interface {
	// ReadTx is embedded to provide read-only transaction methods.
	ReadTx
	// Rollback rolls back the transaction.
	// Returns an error if the rollback fails.
	Rollback() error

	// SetModified marks the transaction as having made modifications.
	// It is used internally by the sentinel mechanism.
	SetModified()
}

WriteTx represents a read-write transaction.
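
One common way to use a Commit / Rollback / Done method set is a wrapper that commits on success and rolls back otherwise. The sketch below is an assumption about usage, not the library's own helper: WriteTx is redeclared locally with only the methods it needs, and fakeTx, runInTx and errBoom are hypothetical names. How a transaction is started is outside this section and is not shown.

```go
package main

import (
	"errors"
	"fmt"
)

// WriteTx mirrors the subset of the documented method set used here.
type WriteTx interface {
	Commit() error
	Rollback() error
	Done() bool
}

// fakeTx is a hypothetical stand-in that records what happened to it.
type fakeTx struct{ committed, rolledBack bool }

func (t *fakeTx) Commit() error   { t.committed = true; return nil }
func (t *fakeTx) Rollback() error { t.rolledBack = true; return nil }
func (t *fakeTx) Done() bool      { return t.committed || t.rolledBack }

// errBoom is a sample failure used by the example.
var errBoom = errors.New("boom")

// runInTx calls fn, commits if it succeeds, and rolls back if it fails.
// The deferred check also rolls back when fn panics or when the
// transaction is still open after a failed Commit.
func runInTx(tx WriteTx, fn func() error) (err error) {
	defer func() {
		if !tx.Done() {
			if rerr := tx.Rollback(); err == nil {
				err = rerr
			}
		}
	}()
	if err = fn(); err != nil {
		return err
	}
	return tx.Commit()
}

func main() {
	ok := &fakeTx{}
	fmt.Println(runInTx(ok, func() error { return nil }), ok.committed)

	bad := &fakeTx{}
	fmt.Println(runInTx(bad, func() error { return errBoom }), bad.rolledBack)
}
```

Guarding the rollback with Done keeps the wrapper from rolling back a transaction that has already been committed.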

Directories

Path Synopsis
docs
btree/mappings/scripts/build_mappings command
Rebuilds the derived mapping artifacts from go_to_sqlite.json.
btree/mappings/scripts/extract_codec_blocks command
Extracts every SQLCipher codec hook block from ../sqlcipher/src and writes docs/btree/mappings/sqlcipher_codec_blocks.gen.json so mappings_diff can surface any block that has no corresponding row in the hand-edited sqlcipher_codec.json.
btree/mappings/scripts/extract_funcs command
Extracts function lists from two sources and writes them to JSON files:
btree/mappings/scripts/mappings_diff command
Reports drift between the freshly-extracted allowlists (*.gen.json) and the hand-edited mapping inputs (go_to_sqlite.json, sqlite_skip.json, sqlcipher_codec.json).
internal
btree
Package btree implements an embedded, crash-safe, ordered key-value database engine in pure Go, closely following SQLite's design for the B-tree, pager, WAL, and transaction subsystems.
