storage

package
v0.5.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Jun 16, 2026 License: MIT Imports: 26 Imported by: 0

Documentation

Overview

Package storage is graphdb's core graph store: a multi-tenant, in-memory property graph with durable snapshot + write-ahead-log (WAL) persistence.

The entry type is GraphStorage, created with NewGraphStorage. Nodes and edges are addressed by uint64 IDs and carry typed Value properties (construct them with StringValue, IntValue, JSONValue, etc.).

Tenant scoping

Every operation has a tenant-blind form (e.g. GraphStorage.CreateNode, GraphStorage.GetNode) and a tenant-strict *ForTenant form (e.g. [GraphStorage.CreateNodeForTenant], GraphStorage.GetNodeForTenant). New code should use the *ForTenant methods. A cross-tenant lookup returns ErrNodeNotFound / ErrEdgeNotFound — the same error as a genuinely missing entity — so a tenant cannot probe another tenant's data via error shape.

Internals

Nodes and edges live in 256-way partitioned shard maps with per-shard read locks plus a global write lock for the cross-cutting indexes (label/type, per-tenant enumeration, property, and per-tenant HNSW vector indexes). State is persisted as a flat snapshot plus a WAL that is replayed on open; structured property values round-trip through ValueFromJSON / ValueToJSON.

See the runnable examples for the basic create/traverse flow and the tenant-isolation contract.

Transaction support for GraphDB storage.

This file is the package documentation for transaction support. The implementation is split across:

  • transaction_types.go: Transaction struct and error definitions
  • transaction_ops.go: CRUD operations within a transaction
  • transaction_commit.go: Commit and rollback logic

Index

Examples

Constants

View Source
const (
	KeyPrefixNode     = 'n' // n:<nodeID> -> Node
	KeyPrefixEdge     = 'e' // e:<edgeID> -> Edge
	KeyPrefixOutEdge  = 'o' // o:<fromID>:<toID> -> edgeID
	KeyPrefixInEdge   = 'i' // i:<toID>:<fromID> -> edgeID
	KeyPrefixLabel    = 'l' // l:<label>:<nodeID> -> empty
	KeyPrefixProperty = 'p' // p:<key>:<value>:<nodeID> -> empty
	KeyPrefixMeta     = 'm' // m:<key> -> value (metadata like counters)
)

Key prefixes for LSM storage

View Source
const DefaultTenantID = "default"

DefaultTenantID is used when no tenant is specified (backward compatibility). String form for use with public APIs that still take "tenantID string" parameters; mirrors tenantid.Default for the typed code paths.

View Source
const (
	// MaxPropertyValueLength is the maximum allowed length for string property values
	MaxPropertyValueLength = 10000
)

Variables

View Source
var (
	ErrNodeNotFound              = errors.New("node not found")
	ErrEdgeNotFound              = errors.New("edge not found")
	ErrStorageClosed             = errors.New("storage is closed")
	ErrInvalidID                 = errors.New("invalid ID")
	ErrWALAppendFailed           = errors.New("WAL append failed")
	ErrMarshalFailed             = errors.New("marshal failed")
	ErrIndexFailed               = errors.New("index operation failed")
	ErrUniqueConstraintViolation = errors.New("unique constraint violation")
	// ErrInvalidEdgeWeight is returned by edge create/update when the weight is
	// non-finite (±Inf or NaN). The WAL JSON-encodes the edge and cannot marshal
	// non-finite floats, so such an edge would silently fail to persist and be
	// lost on crash — reject it at the boundary instead (#328).
	ErrInvalidEdgeWeight = errors.New("edge weight must be a finite number")
)

Common sentinel errors

View Source
var (
	ErrTransactionNotActive    = errors.New("transaction is not active")
	ErrTransactionAlreadyEnded = errors.New("transaction has already been committed or rolled back")
	ErrNestedTransaction       = errors.New("nested transactions are not supported")
)
View Source
var ErrCompositeIndexKeyTypeMismatch = fmt.Errorf("propertyKeys and indexTypes must have the same length")

ErrCompositeIndexKeyTypeMismatch is returned when propertyKeys and indexTypes have different lengths

View Source
var ErrCompositeIndexMinKeys = fmt.Errorf("composite index requires at least 2 property keys")

ErrCompositeIndexMinKeys is returned when fewer than 2 property keys are provided

Functions

func EdgeNotFoundError

func EdgeNotFoundError(edgeID uint64) error

EdgeNotFoundError creates an edge not found error.

func IndexError

func IndexError(op, field string, cause error) error

IndexError creates an index operation error.

func IsClosed

func IsClosed(err error) bool

IsClosed returns true if the error indicates the storage is closed.

func IsNotFound

func IsNotFound(err error) bool

IsNotFound returns true if the error is a not found error.

func MarshalError

func MarshalError(entity string, id uint64, cause error) error

MarshalError creates a marshal error for the given entity.

func NodeNotFoundError

func NodeNotFoundError(nodeID uint64) error

NodeNotFoundError creates a node not found error.

func PropertiesToJSON added in v0.5.0

func PropertiesToJSON(props map[string]Value) map[string]any

PropertiesToJSON converts a property map to a JSON-ready map by running each value through ValueToJSON. Returns a non-nil empty map for a nil/empty input so json.Marshal emits {} rather than null.

func SanitizePropertyMap

func SanitizePropertyMap(properties map[string]any) map[string]any

SanitizePropertyMap sanitizes all string values in a property map This should be called before storing properties to prevent XSS attacks

func SanitizePropertyValue

func SanitizePropertyValue(value any) (any, bool)

SanitizePropertyValue sanitizes a property value based on its type Returns the sanitized value and a boolean indicating if it was modified

func SanitizeStringValue

func SanitizeStringValue(s string) string

SanitizeStringValue sanitizes a string value to prevent XSS attacks It performs the following: - HTML escaping to neutralize script tags, event handlers, etc. - Removal of null bytes - Length limiting to prevent DoS

func ValueToJSON added in v0.5.0

func ValueToJSON(v Value) any

ValueToJSON converts a storage Value back to a JSON-ready Go value — the documented inverse of ValueFromJSON. Scalars and arrays decode to their Go types; timestamps render as RFC3339 strings; bytes stay []byte (encoding/json base64-encodes them); TypeJSON unmarshals to its original shape (#224).

On a decode error (corruption / malformed Data) it falls back to the raw bytes rather than a sentinel string, preserving the value's presence in the response. Centralising this here lets the REST handlers and the GraphQL resolvers share one converter instead of each hand-rolling a type switch (the divergence that caused #224's GraphQL "null" / {"Type","Data"} bugs).

func WALError

func WALError(op string, cause error) error

WALError creates a WAL operation error.

Types

type BTreeGraphStorage

type BTreeGraphStorage struct {
	// contains filtered or unexported fields
}

BTreeGraphStorage is a graph store backed by pkg/btree's persistent B+Tree. Keys follow a typed-prefix layout:

n:{tenant}:{nodeID}              -> JSON(Node)
e:{tenant}:{fromID}:{type}:{toID} -> JSON(Edge)
ei:{tenant}:{edgeID}             -> primary edge key (lookup index)
i:{tenant}:{toID}:{type}:{fromID} -> sentinel (incoming-edge index)
l:{tenant}:{label}:{nodeID}      -> sentinel (label index)
meta:nextNodeID / meta:nextEdgeID -> 8-byte big-endian counters

Counters are persisted on Close() and Snapshot(); IDs are issued via atomic.AddUint64 within the running process.

func NewBTreeGraphStorage

func NewBTreeGraphStorage(dataDir string) (*BTreeGraphStorage, error)

NewBTreeGraphStorage opens (or initializes) a B+Tree-backed graph store rooted at dataDir/graph.db.

func (*BTreeGraphStorage) AddObserver

func (gs *BTreeGraphStorage) AddObserver(obs NodeObserver)

AddObserver is a no-op on the BTree backend: the underlying observer notify mechanism lives on *GraphStorage. Observers attached to a BTreeGraphStorage simply never fire. This is intentional — the BTree backend is a C2-stage experimental write surface, not an observer dispatch host.

func (*BTreeGraphStorage) BeginBatch

func (gs *BTreeGraphStorage) BeginBatch() *Batch

BeginBatch's signature returns *Batch — there is no error channel, so a silent nil return would nil-deref on the first operation. Panic is the least-surprising failure mode here; production code must select *GraphStorage if batches are needed.

func (*BTreeGraphStorage) Close

func (gs *BTreeGraphStorage) Close() error

Close persists ID counters and closes the underlying B+Tree.

func (*BTreeGraphStorage) CountEdgesForTenant

func (gs *BTreeGraphStorage) CountEdgesForTenant(tenantID string) uint64

func (*BTreeGraphStorage) CountNodesForTenant

func (gs *BTreeGraphStorage) CountNodesForTenant(tenantID string) uint64

func (*BTreeGraphStorage) CreateEdge

func (gs *BTreeGraphStorage) CreateEdge(fromID, toID uint64, edgeType string, properties map[string]Value, weight float64) (*Edge, error)

func (*BTreeGraphStorage) CreateEdgeWithTenant

func (gs *BTreeGraphStorage) CreateEdgeWithTenant(tenantID string, fromID, toID uint64, edgeType string, properties map[string]Value, weight float64) (*Edge, error)

func (*BTreeGraphStorage) CreateNode

func (gs *BTreeGraphStorage) CreateNode(labels []string, properties map[string]Value) (*Node, error)

func (*BTreeGraphStorage) CreateNodeWithTenant

func (gs *BTreeGraphStorage) CreateNodeWithTenant(tenantID string, labels []string, properties map[string]Value) (*Node, error)

func (*BTreeGraphStorage) CreateNodeWithUniquePropertyForTenant

func (gs *BTreeGraphStorage) CreateNodeWithUniquePropertyForTenant(tenantID string, labels []string, properties map[string]Value, uniqueLabel string, uniquePropertyKey string) (*Node, error)

CreateNodeWithUniquePropertyForTenant is the B-lite atomic uniqueness primitive used by graphdb-coord. Returning a typed error (not silent (nil, nil)) keeps coord state from being silently corrupted if anyone wires this backend in.

func (*BTreeGraphStorage) CreateVectorIndex

func (gs *BTreeGraphStorage) CreateVectorIndex(propertyName string, dimensions int, m int, efConstruction int, metric vector.DistanceMetric) error

func (*BTreeGraphStorage) CreateVectorIndexForTenant

func (gs *BTreeGraphStorage) CreateVectorIndexForTenant(tenantID string, propertyName string, dimensions int, m int, efConstruction int, metric vector.DistanceMetric) error

func (*BTreeGraphStorage) DeleteEdge

func (gs *BTreeGraphStorage) DeleteEdge(edgeID uint64) error

func (*BTreeGraphStorage) DeleteEdgeForTenant

func (gs *BTreeGraphStorage) DeleteEdgeForTenant(edgeID uint64, tenantID string) error

func (*BTreeGraphStorage) DeleteNode

func (gs *BTreeGraphStorage) DeleteNode(nodeID uint64) error

func (*BTreeGraphStorage) DeleteNodeForTenant

func (gs *BTreeGraphStorage) DeleteNodeForTenant(nodeID uint64, tenantID string) error

DeleteNodeForTenant removes the node record and its label-index entries.

TODO(C2.1): unlike (*GraphStorage).DeleteNodeForTenant, this does NOT cascade outgoing/incoming edges. Tracked as a behavioral gap; callers that depend on cascade delete must use the in-memory backend until C2.1.

func (*BTreeGraphStorage) DropVectorIndex

func (gs *BTreeGraphStorage) DropVectorIndex(propertyName string) error

func (*BTreeGraphStorage) DropVectorIndexForTenant

func (gs *BTreeGraphStorage) DropVectorIndexForTenant(tenantID string, propertyName string) error

func (*BTreeGraphStorage) FindNodesByLabelAcrossTenants

func (gs *BTreeGraphStorage) FindNodesByLabelAcrossTenants(label string) ([]*Node, error)

func (*BTreeGraphStorage) FindNodesByPropertyForTenant

func (gs *BTreeGraphStorage) FindNodesByPropertyForTenant(key string, value Value, tenantID string) ([]*Node, error)

func (*BTreeGraphStorage) FindNodesByPropertyIndexedForTenant

func (gs *BTreeGraphStorage) FindNodesByPropertyIndexedForTenant(key string, value Value, tenantID string) ([]*Node, error)

func (*BTreeGraphStorage) GetAllEdgesAcrossTenants

func (gs *BTreeGraphStorage) GetAllEdgesAcrossTenants() []*Edge

func (*BTreeGraphStorage) GetAllEdgesForTenant

func (gs *BTreeGraphStorage) GetAllEdgesForTenant(tenantID string) []*Edge

func (*BTreeGraphStorage) GetAllLabels

func (gs *BTreeGraphStorage) GetAllLabels() []string

func (*BTreeGraphStorage) GetAllNodesAcrossTenants

func (gs *BTreeGraphStorage) GetAllNodesAcrossTenants() []*Node

func (*BTreeGraphStorage) GetAllNodesForTenant

func (gs *BTreeGraphStorage) GetAllNodesForTenant(tenantID string) []*Node

func (*BTreeGraphStorage) GetEdge

func (gs *BTreeGraphStorage) GetEdge(edgeID uint64) (*Edge, error)

func (*BTreeGraphStorage) GetEdgeForTenant

func (gs *BTreeGraphStorage) GetEdgeForTenant(edgeID uint64, tenantID string) (*Edge, error)

func (*BTreeGraphStorage) GetEdgeTypesForTenant

func (gs *BTreeGraphStorage) GetEdgeTypesForTenant(tenantID string) []string

func (*BTreeGraphStorage) GetEdgesByTypeForTenant

func (gs *BTreeGraphStorage) GetEdgesByTypeForTenant(tenantID string, edgeType string) []*Edge

func (*BTreeGraphStorage) GetIncomingEdges

func (gs *BTreeGraphStorage) GetIncomingEdges(nodeID uint64) ([]*Edge, error)

func (*BTreeGraphStorage) GetIncomingEdgesForTenant

func (gs *BTreeGraphStorage) GetIncomingEdgesForTenant(nodeID uint64, tenantID string) ([]*Edge, error)

func (*BTreeGraphStorage) GetLabelsForTenant

func (gs *BTreeGraphStorage) GetLabelsForTenant(tenantID string) []string

func (*BTreeGraphStorage) GetNode

func (gs *BTreeGraphStorage) GetNode(nodeID uint64) (*Node, error)

func (*BTreeGraphStorage) GetNodeForTenant

func (gs *BTreeGraphStorage) GetNodeForTenant(nodeID uint64, tenantID string) (*Node, error)

func (*BTreeGraphStorage) GetNodesByLabelForTenant

func (gs *BTreeGraphStorage) GetNodesByLabelForTenant(tenantID string, label string) []*Node

func (*BTreeGraphStorage) GetOutgoingEdges

func (gs *BTreeGraphStorage) GetOutgoingEdges(nodeID uint64) ([]*Edge, error)

func (*BTreeGraphStorage) GetOutgoingEdgesForTenant

func (gs *BTreeGraphStorage) GetOutgoingEdgesForTenant(nodeID uint64, tenantID string) ([]*Edge, error)

func (*BTreeGraphStorage) GetStatistics

func (gs *BTreeGraphStorage) GetStatistics() Statistics

func (*BTreeGraphStorage) GetVectorIndexMetric

func (gs *BTreeGraphStorage) GetVectorIndexMetric(propertyName string) (vector.DistanceMetric, error)

func (*BTreeGraphStorage) GetVectorIndexMetricForTenant

func (gs *BTreeGraphStorage) GetVectorIndexMetricForTenant(tenantID string, propertyName string) (vector.DistanceMetric, error)

func (*BTreeGraphStorage) HasPropertyIndex

func (gs *BTreeGraphStorage) HasPropertyIndex(key string) bool

func (*BTreeGraphStorage) HasVectorIndex

func (gs *BTreeGraphStorage) HasVectorIndex(propertyName string) bool

func (*BTreeGraphStorage) HasVectorIndexForTenant

func (gs *BTreeGraphStorage) HasVectorIndexForTenant(tenantID string, propertyName string) bool

func (*BTreeGraphStorage) ListVectorIndexes

func (gs *BTreeGraphStorage) ListVectorIndexes() []string

func (*BTreeGraphStorage) ListVectorIndexesForTenant

func (gs *BTreeGraphStorage) ListVectorIndexesForTenant(tenantID string) []string

func (*BTreeGraphStorage) RemoveNodeFromVectorIndexes

func (gs *BTreeGraphStorage) RemoveNodeFromVectorIndexes(nodeID uint64, tenantID string) error

func (*BTreeGraphStorage) RemoveNodeProperties

func (gs *BTreeGraphStorage) RemoveNodeProperties(nodeID uint64, keys []string) error

func (*BTreeGraphStorage) RemoveNodePropertiesForTenant

func (gs *BTreeGraphStorage) RemoveNodePropertiesForTenant(nodeID uint64, keys []string, tenantID string) error

func (*BTreeGraphStorage) SetEncryption

func (gs *BTreeGraphStorage) SetEncryption(engine encryption.EncryptDecrypter, keyManager encryption.KeyProvider)

SetEncryption is a no-op: at-rest encryption for the BTree backend is out of scope for C2 (the in-memory backend's encryption integration is the canonical path).

func (*BTreeGraphStorage) Snapshot

func (gs *BTreeGraphStorage) Snapshot() error

Snapshot persists ID counters and flushes the underlying B+Tree.

Signature note: matches S1 (`Snapshot() error`, no ctx). The archive parent had `Snapshot(ctx)` for cancellability, but R3 (this S1 closure) deliberately kept the no-ctx shape — see interface.go's header comment for the rationale. A future cancelable/streaming snapshot would be a new method (e.g., SnapshotStream), not a signature change to Snapshot.

func (*BTreeGraphStorage) UpdateEdge

func (gs *BTreeGraphStorage) UpdateEdge(edgeID uint64, properties map[string]Value, weight *float64) error

func (*BTreeGraphStorage) UpdateEdgeForTenant

func (gs *BTreeGraphStorage) UpdateEdgeForTenant(edgeID uint64, properties map[string]Value, weight *float64, tenantID string) error

func (*BTreeGraphStorage) UpdateNode

func (gs *BTreeGraphStorage) UpdateNode(nodeID uint64, properties map[string]Value) error

func (*BTreeGraphStorage) UpdateNodeForTenant

func (gs *BTreeGraphStorage) UpdateNodeForTenant(nodeID uint64, properties map[string]Value, tenantID string) error

func (*BTreeGraphStorage) UpdateNodeVectorIndexes

func (gs *BTreeGraphStorage) UpdateNodeVectorIndexes(node *Node) error

UpdateNodeVectorIndexes / RemoveNodeFromVectorIndexes are best-effort maintenance hooks; returning nil keeps callers like CreateNodeWithTenant quiet. Real maintenance is the in-memory backend's domain (R1.x).

func (*BTreeGraphStorage) UpsertEdgeWithTenant

func (gs *BTreeGraphStorage) UpsertEdgeWithTenant(tenantID string, fromID, toID uint64, edgeType string, properties map[string]Value, weight float64) (*Edge, bool, error)

func (*BTreeGraphStorage) VectorSearch

func (gs *BTreeGraphStorage) VectorSearch(propertyName string, query []float32, k int, ef int) ([]vector.SearchResult, error)

func (*BTreeGraphStorage) VectorSearchForTenant

func (gs *BTreeGraphStorage) VectorSearchForTenant(tenantID string, propertyName string, query []float32, k int, ef int) ([]vector.SearchResult, error)

Tenant-scoped vector index stubs (R3 / S1 closure). Same shapes as the tenant-blind variants above — the BTree backend does not implement vector indexes (mutating methods return errBTreeBackendUnsupported, probing methods return false/empty/error per the F4 spike's unified- response convention).

func (*BTreeGraphStorage) WithNodeRefForTenant

func (gs *BTreeGraphStorage) WithNodeRefForTenant(nodeID uint64, tenantID string, fn func(*Node) error) error

type Batch

type Batch struct {
	// contains filtered or unexported fields
}

Batch represents a batch of write operations

func (*Batch) AddEdge

func (b *Batch) AddEdge(fromNodeID, toNodeID uint64, edgeType string, properties map[string]Value, weight float64) (uint64, error)

AddEdge queues an edge creation in the batch

func (*Batch) AddNode

func (b *Batch) AddNode(labels []string, properties map[string]Value) (uint64, error)

AddNode queues a node creation in the batch

func (*Batch) Clear

func (b *Batch) Clear()

Clear removes all operations from the batch

func (*Batch) Commit

func (b *Batch) Commit() error

Commit executes all batched operations atomically.

Mirrors Transaction.Commit's structure: persist + maintain every index under gs.mu while COLLECTING the off-lock work (HNSW vector inserts, observer snapshots), then release the lock and apply that work off-lock (Track P H2 — HNSW inserts must not run under gs.mu; observers see committed state). The per-op WAL writes stay under the lock, unchanged.

func (*Batch) DeleteEdge

func (b *Batch) DeleteEdge(edgeID uint64)

DeleteEdge queues an edge deletion in the batch

func (*Batch) DeleteNode

func (b *Batch) DeleteNode(nodeID uint64)

DeleteNode queues a node deletion in the batch

func (*Batch) Size

func (b *Batch) Size() int

Size returns the number of operations in the batch

func (*Batch) UpdateNode

func (b *Batch) UpdateNode(nodeID uint64, properties map[string]Value)

UpdateNode queues a node update in the batch

type CompositeIndex

type CompositeIndex struct {
	// contains filtered or unexported fields
}

CompositeIndex maintains an index on multiple node properties. This enables efficient lookups when filtering by multiple properties simultaneously. For example, a composite index on (domain, status) allows O(1) lookups for "all concepts in domain 'math' with status 'active'".

func NewCompositeIndex

func NewCompositeIndex(propertyKeys []string, indexTypes []ValueType) (*CompositeIndex, error)

NewCompositeIndex creates a new composite index on multiple properties. The order of keys matters - lookups must provide values in the same order. Returns an error if propertyKeys and indexTypes have different lengths, or if fewer than 2 property keys are provided.

func (*CompositeIndex) GetStatistics

func (idx *CompositeIndex) GetStatistics() CompositeIndexStatistics

GetStatistics returns statistics about the composite index

func (*CompositeIndex) Insert

func (idx *CompositeIndex) Insert(nodeID uint64, values []Value) error

Insert adds a node to the composite index

func (*CompositeIndex) Lookup

func (idx *CompositeIndex) Lookup(values []Value) ([]uint64, error)

Lookup finds all nodes matching all property values (exact match)

func (*CompositeIndex) PrefixLookup

func (idx *CompositeIndex) PrefixLookup(values []Value) ([]uint64, error)

PrefixLookup finds all nodes matching a prefix of the composite key. This allows lookups like "all nodes with domain='math'" without specifying the second property. The values slice must be a prefix of the full key.

func (*CompositeIndex) PropertyKeys

func (idx *CompositeIndex) PropertyKeys() []string

PropertyKeys returns the property keys this index covers

func (*CompositeIndex) Remove

func (idx *CompositeIndex) Remove(nodeID uint64, values []Value) error

Remove removes a node from the composite index

type CompositeIndexStatistics

type CompositeIndexStatistics struct {
	PropertyKeys   []string
	UniqueKeys     int
	TotalNodes     int
	AvgNodesPerKey float64
}

CompositeIndexStatistics holds statistics about a composite index

type CompressedEdgeList

type CompressedEdgeList struct {
	BaseNodeID uint64 // First node ID in the list (exported for gob encoding)
	Deltas     []byte // Varint-encoded deltas between consecutive sorted node IDs (exported for gob encoding)
	EdgeCount  int    // Number of edges in the list (exported for gob encoding)
}

CompressedEdgeList stores a list of edge target node IDs in compressed format Uses delta encoding + varint compression for memory efficiency

func NewCompressedEdgeList

func NewCompressedEdgeList(nodeIDs []uint64) (*CompressedEdgeList, error)

NewCompressedEdgeList creates a compressed edge list from node IDs Uses buffer pooling to reduce GC pressure. Returns error if data corruption is detected (should never happen in normal operation).

func (*CompressedEdgeList) Add

Add adds a new node ID to the compressed list Note: This is less efficient than batch creation and should be used sparingly

func (*CompressedEdgeList) CompressionRatio

func (c *CompressedEdgeList) CompressionRatio() float64

CompressionRatio returns the compression ratio (uncompressed / compressed)

func (*CompressedEdgeList) Contains

func (c *CompressedEdgeList) Contains(nodeID uint64) bool

Contains checks if a node ID exists in the compressed list Walks through deltas sequentially with early termination (avoids full decompression)

func (*CompressedEdgeList) Count

func (c *CompressedEdgeList) Count() int

Count returns the number of edges in the compressed list

func (*CompressedEdgeList) Decompress

func (c *CompressedEdgeList) Decompress() []uint64

Decompress returns the original list of node IDs Uses buffer pooling to reduce GC pressure

func (*CompressedEdgeList) Remove

func (c *CompressedEdgeList) Remove(nodeID uint64) (*CompressedEdgeList, error)

Remove removes a node ID from the compressed list Note: This is less efficient than batch creation and should be used sparingly

func (*CompressedEdgeList) Size

func (c *CompressedEdgeList) Size() int

Size returns the memory size in bytes

func (*CompressedEdgeList) UncompressedSize

func (c *CompressedEdgeList) UncompressedSize() int

UncompressedSize returns the size if this list was uncompressed

type CompressionStats

type CompressionStats struct {
	TotalLists        int
	TotalEdges        int64
	UncompressedBytes int64
	CompressedBytes   int64
	AverageRatio      float64
}

CompressionStats tracks compression statistics across all edge lists

func CalculateCompressionStats

func CalculateCompressionStats(lists []*CompressedEdgeList) CompressionStats

CalculateCompressionStats calculates statistics for multiple compressed lists

type Edge

type Edge struct {
	ID         uint64
	TenantID   string // Multi-tenancy: empty string defaults to "default" tenant
	FromNodeID uint64
	ToNodeID   uint64
	Type       string
	Properties map[string]Value
	Weight     float64
	CreatedAt  int64
}

Edge represents a relationship between nodes

func (*Edge) Clone

func (e *Edge) Clone() *Edge

Clone creates a deep copy of an edge

func (*Edge) GetProperty

func (e *Edge) GetProperty(key string) (Value, bool)

GetProperty gets a property value

type EdgeCache

type EdgeCache struct {
	// contains filtered or unexported fields
}

EdgeCache implements an LRU cache for CompressedEdgeList

func NewEdgeCache

func NewEdgeCache(maxSize int) *EdgeCache

NewEdgeCache creates a new LRU cache with the specified maximum size

func (*EdgeCache) Clear

func (ec *EdgeCache) Clear()

Clear removes all entries from the cache

func (*EdgeCache) Get

func (ec *EdgeCache) Get(key string) *CompressedEdgeList

Get retrieves a value from the cache Returns nil if not found

func (*EdgeCache) HitRate

func (ec *EdgeCache) HitRate() float64

HitRate returns the cache hit rate (0.0 - 1.0)

func (*EdgeCache) Put

func (ec *EdgeCache) Put(key string, value *CompressedEdgeList)

Put adds or updates a value in the cache

func (*EdgeCache) ResetStats

func (ec *EdgeCache) ResetStats()

ResetStats resets hit/miss counters

func (*EdgeCache) Size

func (ec *EdgeCache) Size() int

Size returns the current number of entries in the cache

func (*EdgeCache) Stats

func (ec *EdgeCache) Stats() (hits, misses uint64, hitRate float64)

Stats returns cache statistics

type EdgeSpec added in v0.5.0

type EdgeSpec struct {
	FromID, ToID uint64
	Type         string
	Properties   map[string]Value
	Weight       float64
}

EdgeSpec describes one edge for bulk creation.

type EdgeStore

type EdgeStore struct {
	// contains filtered or unexported fields
}

EdgeStore provides disk-backed storage for adjacency lists using LSM

func NewEdgeStore

func NewEdgeStore(dataDir string, cacheSize int) (*EdgeStore, error)

NewEdgeStore creates a new disk-backed edge storage

func (*EdgeStore) Close

func (es *EdgeStore) Close() error

Close closes the edge store and flushes all data

func (*EdgeStore) GetIncomingEdges

func (es *EdgeStore) GetIncomingEdges(nodeID uint64) ([]uint64, error)

GetIncomingEdges retrieves the incoming edge list for a node

func (*EdgeStore) GetOutgoingEdges

func (es *EdgeStore) GetOutgoingEdges(nodeID uint64) ([]uint64, error)

GetOutgoingEdges retrieves the outgoing edge list for a node

func (*EdgeStore) StoreIncomingEdges

func (es *EdgeStore) StoreIncomingEdges(nodeID uint64, edges []uint64) error

StoreIncomingEdges stores the incoming edge list for a node

func (*EdgeStore) StoreOutgoingEdges

func (es *EdgeStore) StoreOutgoingEdges(nodeID uint64, edges []uint64) error

StoreOutgoingEdges stores the outgoing edge list for a node

func (*EdgeStore) Sync

func (es *EdgeStore) Sync() error

Sync forces a flush of pending writes to disk

type ErrorBuilder

type ErrorBuilder struct {
	// contains filtered or unexported fields
}

ErrorBuilder provides a fluent interface for building StorageErrors.

func NewError

func NewError(op string) *ErrorBuilder

NewError creates a new error builder with the given operation.

func (*ErrorBuilder) Build

func (b *ErrorBuilder) Build() *StorageError

Build returns the constructed StorageError.

func (*ErrorBuilder) Cause

func (b *ErrorBuilder) Cause(err error) *ErrorBuilder

Cause sets the underlying error cause.

func (*ErrorBuilder) Context

func (b *ErrorBuilder) Context(ctx string) *ErrorBuilder

Context sets additional context information.

func (*ErrorBuilder) Edge

func (b *ErrorBuilder) Edge(id uint64) *ErrorBuilder

Edge sets the entity to "edge" with the given ID.

func (*ErrorBuilder) Err

func (b *ErrorBuilder) Err() error

Err returns the error as an error interface.

func (*ErrorBuilder) Field

func (b *ErrorBuilder) Field(name string) *ErrorBuilder

Field sets the field name for property operations.

func (*ErrorBuilder) Index

func (b *ErrorBuilder) Index(field string) *ErrorBuilder

Index sets the entity to "index" with the given field name.

func (*ErrorBuilder) Node

func (b *ErrorBuilder) Node(id uint64) *ErrorBuilder

Node sets the entity to "node" with the given ID.

func (*ErrorBuilder) WAL

func (b *ErrorBuilder) WAL() *ErrorBuilder

WAL sets the entity to "WAL".

type GraphSnapshot

type GraphSnapshot struct {
	// contains filtered or unexported fields
}

TimeTravel returns a snapshot of the graph at a specific time

func NewGraphSnapshot

func NewGraphSnapshot(graph *GraphStorage, timestamp int64) *GraphSnapshot

NewGraphSnapshot creates a point-in-time view of the graph

func (*GraphSnapshot) GetOutgoingEdges

func (gs *GraphSnapshot) GetOutgoingEdges(nodeID uint64) ([]*Edge, error)

GetOutgoingEdges returns edges valid at snapshot time

type GraphStorage

type GraphStorage struct {
	// contains filtered or unexported fields
}

GraphStorage is the core in-memory graph storage engine

Example

ExampleGraphStorage shows the basic embedding flow: open a store, create two nodes and an edge between them, then read an edge back. Node IDs are assigned sequentially from 1.

package main

import (
	"fmt"
	"log"
	"os"

	"github.com/dd0wney/graphdb/pkg/storage"
)

func main() {
	dir, err := os.MkdirTemp("", "graphdb-example")
	if err != nil {
		log.Fatal(err)
	}
	defer os.RemoveAll(dir)

	gs, err := storage.NewGraphStorage(dir)
	if err != nil {
		log.Fatal(err)
	}
	defer func() { _ = gs.Close() }()

	alice, err := gs.CreateNode([]string{"Person"}, map[string]storage.Value{
		"name": storage.StringValue("Alice"),
	})
	if err != nil {
		log.Fatal(err)
	}
	bob, err := gs.CreateNode([]string{"Person"}, map[string]storage.Value{
		"name": storage.StringValue("Bob"),
	})
	if err != nil {
		log.Fatal(err)
	}
	if _, err := gs.CreateEdge(alice.ID, bob.ID, "KNOWS", nil, 1.0); err != nil {
		log.Fatal(err)
	}

	edges, err := gs.GetOutgoingEdges(alice.ID)
	if err != nil {
		log.Fatal(err)
	}
	name, _ := alice.Properties["name"].AsString()
	fmt.Printf("%s -%s-> node %d\n", name, edges[0].Type, edges[0].ToNodeID)
}
Output:
Alice -KNOWS-> node 2
Example (TenantIsolation)

ExampleGraphStorage_tenantIsolation shows the *ForTenant convention: new code scopes every operation to a tenant, and a cross-tenant read returns ErrNodeNotFound (the same error as a genuinely missing node) so existence can't leak across tenants.

package main

import (
	"errors"
	"fmt"
	"log"
	"os"

	"github.com/dd0wney/graphdb/pkg/storage"
)

func main() {
	dir, err := os.MkdirTemp("", "graphdb-example")
	if err != nil {
		log.Fatal(err)
	}
	defer os.RemoveAll(dir)

	gs, err := storage.NewGraphStorage(dir)
	if err != nil {
		log.Fatal(err)
	}
	defer func() { _ = gs.Close() }()

	doc, err := gs.CreateNodeWithTenant("acme", []string{"Doc"}, nil)
	if err != nil {
		log.Fatal(err)
	}

	// The owning tenant sees the node.
	_, err = gs.GetNodeForTenant(doc.ID, "acme")
	fmt.Println("acme can read:", err == nil)

	// A different tenant cannot — and gets ErrNodeNotFound, not a distinct
	// "forbidden" error, so it can't tell the node exists at all.
	_, err = gs.GetNodeForTenant(doc.ID, "globex")
	fmt.Println("globex blocked:", errors.Is(err, storage.ErrNodeNotFound))
}
Output:
acme can read: true
globex blocked: true

func NewGraphStorage

func NewGraphStorage(dataDir string) (*GraphStorage, error)

NewGraphStorage creates a new graph storage engine with default config

func NewGraphStorageWithConfig

func NewGraphStorageWithConfig(config StorageConfig) (*GraphStorage, error)

NewGraphStorageWithConfig creates a new graph storage engine with custom config

func (*GraphStorage) AddObserver

func (gs *GraphStorage) AddObserver(obs NodeObserver)

AddObserver registers obs to receive node-lifecycle notifications.

Observers are called in registration order. AddObserver is safe to call concurrently with storage operations, but the typical usage is at startup before serving requests (see R2.5's server_init.go wiring). Observers cannot be removed once registered — for tests, construct a fresh GraphStorage rather than detaching observers.

func (*GraphStorage) BeginBatch

func (g *GraphStorage) BeginBatch() *Batch

BeginBatch starts a new batch operation

func (*GraphStorage) BeginTransaction

func (gs *GraphStorage) BeginTransaction() (*Transaction, error)

BeginTransaction starts a new transaction in the default tenant. Equivalent to BeginTransactionForTenant("") — kept for backward compatibility.

func (*GraphStorage) BeginTransactionForTenant

func (gs *GraphStorage) BeginTransactionForTenant(tenantID string) (*Transaction, error)

BeginTransactionForTenant starts a new transaction bound to tenantID. All nodes/edges created in the transaction are stamped with this tenant, and Commit enforces tenant ownership of edge endpoints / update targets.

func (*GraphStorage) Close

func (gs *GraphStorage) Close() error

Close performs cleanup

func (*GraphStorage) CompactWAL added in v0.5.0

func (gs *GraphStorage) CompactWAL() error

CompactWAL checkpoints the WAL: it writes a snapshot of the current state, capturing the boundary LSN under the snapshot's lock, then drops every WAL entry the snapshot already covers (LSN ≤ boundary) while keeping concurrent writers' entries (LSN > boundary). The replay model is unchanged: snapshot (state ≤ boundary) + remaining WAL on top.

M-1 (AUDIT_security_2026-06-10 / DESIGN_m1_wal_remanence_2026-06-10 Option A): called after a tenant delete so the deleted tenant's OpCreate* records — its PII — leave the WAL immediately instead of lingering until the next Close. Safe under live traffic; a naive Snapshot+Truncate loses any write that lands between the two.

func (*GraphStorage) CompressEdgeLists

func (gs *GraphStorage) CompressEdgeLists() error

CompressEdgeLists compresses all uncompressed edge lists This can be called periodically to reduce memory usage

func (*GraphStorage) CountEdgesForTenant

func (gs *GraphStorage) CountEdgesForTenant(tenantID string) uint64

CountEdgesForTenant returns the number of edges for a tenant.

func (*GraphStorage) CountNodesByLabelForTenant

func (gs *GraphStorage) CountNodesByLabelForTenant(tenantID, label string) int

CountNodesByLabelForTenant returns how many nodes a tenant has with the given label, reading len(index) directly instead of cloning the whole bucket the way len(GetNodesByLabelForTenant(...)) does. The label index is O(1) to size; the previous count path materialized and deep-cloned every node in the bucket — for a 50k-node label that is 50k Clone() calls under gs.mu.RLock just to discard them and take a length (audit M1).

func (*GraphStorage) CountNodesForTenant

func (gs *GraphStorage) CountNodesForTenant(tenantID string) uint64

CountNodesForTenant returns the number of nodes for a tenant.

func (*GraphStorage) CreateEdge

func (gs *GraphStorage) CreateEdge(fromID, toID uint64, edgeType string, properties map[string]Value, weight float64) (*Edge, error)

CreateEdge creates a new edge between two nodes in the default tenant. Tenant-blind on node verification — used by replication (which lands replicated writes in the default tenant; see audit task A8) and by other intentionally tenant-blind callers (temporal snapshots, CLI, examples). Multi-tenant API callers must use CreateEdgeWithTenant.

Existence (not tenancy) of the from/to nodes is still validated.

func (*GraphStorage) CreateEdgeWithTenant

func (gs *GraphStorage) CreateEdgeWithTenant(tenantID string, fromID, toID uint64, edgeType string, properties map[string]Value, weight float64) (*Edge, error)

CreateEdgeWithTenant creates a new edge between two nodes for a specific tenant. From/to nodes must belong to the same tenant — cross-tenant or missing surfaces as ErrNodeNotFound (unified to avoid existence-leak side channel).

Audit A6a follow-up (2026-05-08): closes the residual gap from A6a where this method accepted node IDs owned by other tenants. The /vector-search, /traverse and /shortest-path tenant scoping now rests on this guarantee — see the updated comments in pkg/api/handlers_edges.go and pkg/algorithms/shortest_path.go.

func (*GraphStorage) CreateEdgesWithTenant added in v0.5.0

func (gs *GraphStorage) CreateEdgesWithTenant(tenantID string, specs []EdgeSpec) ([]uint64, error)

CreateEdgesWithTenant creates many edges under one acquisition of gs.mu, mirroring CreateEdgeWithTenant (verify endpoints, then createEdgeWithTenantNoVerify) but amortizing the lock. WAL waits run once after unlock. Returns the new edge IDs in input order.

On error mid-batch the lock is released, the WAL waits for already-created edges still run, and the IDs created so far are returned with the error — matching the single-create contract (a successful createEdgeWithTenantNoVerify is durable and not rolled back).

func (*GraphStorage) CreateNode

func (gs *GraphStorage) CreateNode(labels []string, properties map[string]Value) (*Node, error)

CreateNode creates a new node in the default tenant. For multi-tenant operations, use CreateNodeWithTenant instead.

func (*GraphStorage) CreateNodeWithTenant

func (gs *GraphStorage) CreateNodeWithTenant(tenantID string, labels []string, properties map[string]Value) (*Node, error)

CreateNodeWithTenant creates a new node for a specific tenant.

Lock discipline (R2.1, S11 spike §7.4): the gs.mu.Lock is released before notifyNodeCreated runs so observer code never executes under any storage lock. Direct unlock + nil-check on err mirrors what `defer gs.mu.Unlock()` would do but lets the notify call land after the lock release.

func (*GraphStorage) CreateNodeWithUniquePropertyForTenant

func (gs *GraphStorage) CreateNodeWithUniquePropertyForTenant(
	tenantID string,
	labels []string,
	properties map[string]Value,
	uniqueLabel string,
	uniquePropertyKey string,
) (*Node, error)

CreateNodeWithUniquePropertyForTenant creates a node only if no other node in the same tenant already has the same value for (uniqueLabel, uniquePropertyKey). The check + create runs under a single gs.mu.Lock acquisition, so two concurrent calls cannot both observe "no conflict" and both create. On conflict, returns a *UniqueConstraintError (errors.Is matches ErrUniqueConstraintViolation).

uniqueLabel must be present in labels and the new properties must contain uniquePropertyKey. The constraint is enforced for the named label only, mirroring the per-label scope in pkg/constraints — nodes with other labels can hold the same property value.

Introduced for B-lite (docs/COORD_DEPLOY_SPIKE_2026-05-10.md §5.2 / §10 PR 1) so the GraphQL :Claim resolver can enforce one active claim per for_task. Generic enough to reuse if other coord types need uniqueness; a fully general HTTP/GraphQL constraint API (B-full) is still deferred per the spike.

func (*GraphStorage) CreateNodesWithTenant added in v0.5.0

func (gs *GraphStorage) CreateNodesWithTenant(tenantID string, specs []NodeSpec) ([]uint64, error)

CreateNodesWithTenant creates many nodes under one acquisition of gs.mu, mirroring CreateNodeWithTenant's per-node logic (createNodeLocked) but amortizing the global write lock across the whole batch. Post-lock steps (vector inserts, WAL durability wait, observer notify) run once after unlock, exactly as the single-node path does. Returns the new IDs in input order.

On error mid-batch the lock is released and the IDs created so far are returned alongside the error — the nodes already published before the failure stay durable (their WAL entries were enqueued under the lock), matching the single-create contract where a successful createNodeLocked is not rolled back.

func (*GraphStorage) CreatePropertyIndex

func (gs *GraphStorage) CreatePropertyIndex(propertyKey string, valueType ValueType) error

CreatePropertyIndex creates an index on a node property

func (*GraphStorage) CreateVectorIndex

func (gs *GraphStorage) CreateVectorIndex(
	propertyName string,
	dimensions int,
	m int,
	efConstruction int,
	metric vector.DistanceMetric,
) error

CreateVectorIndex creates a vector index for a node property under the default tenant. Tenant-blind; new callers should prefer CreateVectorIndexForTenant.

func (*GraphStorage) CreateVectorIndexForTenant

func (gs *GraphStorage) CreateVectorIndexForTenant(
	tenantID string,
	propertyName string,
	dimensions int,
	m int,
	efConstruction int,
	metric vector.DistanceMetric,
) error

CreateVectorIndexForTenant creates a vector index for a property under the given tenant. Returns an error for empty tenantID.

func (*GraphStorage) DeleteAllNodes added in v0.5.0

func (gs *GraphStorage) DeleteAllNodes() error

DeleteAllNodes removes every node, edge, and index from the graph and truncates the WAL, then writes an empty snapshot so a restart doesn't replay the old data. Intended for full-reload scenarios where the caller will repopulate the graph immediately after.

func (*GraphStorage) DeleteEdge

func (gs *GraphStorage) DeleteEdge(edgeID uint64) error

DeleteEdge deletes an edge by ID.

Tenant-blind. New callers should prefer DeleteEdgeForTenant.

func (*GraphStorage) DeleteEdgeBetweenAcrossTenants

func (gs *GraphStorage) DeleteEdgeBetweenAcrossTenants(fromID, toID uint64, edgeType string) (bool, error)

DeleteEdgeBetweenAcrossTenants deletes an edge between two nodes by type. Returns true if an edge was deleted, false if no matching edge existed.

Cross-tenant (see FindEdgeBetweenAcrossTenants): can delete an edge owned by any tenant; no request path uses it. Add a *ForTenant variant for scoped callers before exposing edge deletion by endpoint.

func (*GraphStorage) DeleteEdgeForTenant

func (gs *GraphStorage) DeleteEdgeForTenant(edgeID uint64, tenantID string) error

DeleteEdgeForTenant deletes an edge by ID, scoped to the given tenant. Returns ErrEdgeNotFound on missing or cross-tenant.

See GetNodeForTenant in node_operations.go for the rationale on the unified missing-vs-cross-tenant error.

func (*GraphStorage) DeleteNode

func (gs *GraphStorage) DeleteNode(nodeID uint64) error

DeleteNode deletes a node and all its edges.

Tenant-blind. New callers should prefer DeleteNodeForTenant.

Lock discipline (R2.1, S11 spike §7.4): defer-unlock was replaced with explicit unlock at every return path so notifyNodeDeleted can dispatch strictly after gs.mu.Lock is released. The deleted node's TenantID is captured under lock (from the lookup at line 514) and passed to the notify call after unlock — the node's data is not accessible by then.

func (*GraphStorage) DeleteNodeForTenant

func (gs *GraphStorage) DeleteNodeForTenant(nodeID uint64, tenantID string) error

DeleteNodeForTenant deletes a node and all its edges, scoped to the given tenant. Returns ErrNodeNotFound on missing or cross-tenant (same rationale as GetNodeForTenant).

func (*GraphStorage) DeleteTenant added in v0.5.0

func (gs *GraphStorage) DeleteTenant(tenantID string) (nodesDeleted, edgesDeleted int, err error)

DeleteTenant cascade-deletes all of a tenant's graph data — every node and edge it owns, plus the per-tenant indexes/counts those deletes maintain, plus the tenant's vector-index definitions. Returns how many nodes and edges were removed. The default tenant cannot be deleted.

Mechanism: loop DeleteNodeForTenant over the tenant's nodes (reusing the tested cascade — each node delete removes its incident edges and maintains global+per-tenant indexes and the WAL), then a defensive pass over any edges not already cascaded (edges are intra-tenant, so the node pass usually clears them). Per-node ErrNodeNotFound / per-edge ErrEdgeNotFound are tolerated so the cascade is idempotent and re-runnable. Vector index definitions are dropped via the WAL-logged DropVectorIndexForTenant (#320) so the drop survives a crash, mirroring the WAL-durable node/edge deletes.

Search indexes (on-disk LSA, in-memory FTS) are owned by the API server, not GraphStorage — the caller drops those (see handleDeleteTenant).

Cost is O(nodes+edges) individually-locked deletes; acceptable for a correctness fix. A bulk in-storage purge is a future optimization.

func (*GraphStorage) DropPropertyIndex

func (gs *GraphStorage) DropPropertyIndex(propertyKey string) error

DropPropertyIndex removes an index on a node property

func (*GraphStorage) DropVectorIndex

func (gs *GraphStorage) DropVectorIndex(propertyName string) error

DropVectorIndex removes a vector index from the default tenant. Tenant-blind; new callers should prefer DropVectorIndexForTenant.

func (*GraphStorage) DropVectorIndexForTenant

func (gs *GraphStorage) DropVectorIndexForTenant(tenantID string, propertyName string) error

DropVectorIndexForTenant removes the given tenant's vector index for propertyName. Returns an error for empty tenantID.

func (*GraphStorage) EdgesByTypePageForTenant added in v0.5.0

func (gs *GraphStorage) EdgesByTypePageForTenant(tenantID, edgeType string, afterID uint64, limit int) ([]*Edge, uint64)

EdgesByTypePageForTenant returns up to `limit` edges belonging to `tenantID` with the given type and ID > afterID, in ascending-ID order, plus the next cursor (the last returned edge's ID, or 0 if this is the last page). afterID=0 starts from the beginning. Returns (nil, 0) when the tenant or edge type is not found, mirroring GetEdgesByTypeForTenant.

Lock pattern: collect sorted IDs from the type index under gs.mu.RLock, release, then clone each edge under its per-shard RLock. Same non-atomic- snapshot tradeoff as EdgesPageForTenant and GetAllEdgesForTenant.

func (*GraphStorage) EdgesPageForTenant added in v0.5.0

func (gs *GraphStorage) EdgesPageForTenant(tenantID string, afterID uint64, limit int) ([]*Edge, uint64)

EdgesPageForTenant returns up to `limit` edges belonging to `tenantID` with ID > afterID, in ascending-ID order, plus the next cursor (the last returned edge's ID, or 0 if this is the last page). afterID=0 starts from the beginning of the tenant's edge set.

Lock pattern mirrors GetAllEdgesForTenant and NodesPageForTenant: collect sorted IDs under gs.mu.RLock, release, then clone under per-shard RLocks.

func (*GraphStorage) FindAllEdgesBetweenAcrossTenants

func (gs *GraphStorage) FindAllEdgesBetweenAcrossTenants(fromID, toID uint64) ([]*Edge, error)

FindAllEdgesBetweenAcrossTenants finds all edges between two nodes (any type). Useful for checking relationship existence regardless of edge type.

Cross-tenant (see FindEdgeBetweenAcrossTenants): returns edges owned by any tenant; no request path uses it. Add a *ForTenant variant for scoped callers.

func (*GraphStorage) FindEdgeBetweenAcrossTenants

func (gs *GraphStorage) FindEdgeBetweenAcrossTenants(fromID, toID uint64, edgeType string) (*Edge, error)

FindEdgeBetweenAcrossTenants finds an existing edge between two nodes with a specific type. Returns nil if no such edge exists. This is useful for implementing upsert semantics.

Cross-tenant: matches an edge owned by ANY tenant. The explicit *AcrossTenants name (audit A3b convention) makes that scope visible at the call site; a tenant-scoped caller must add/use a *ForTenant variant. No request path uses this today — renamed by the tenant-isolation sweep so it can't be wired in as a silent cross-tenant read.

func (*GraphStorage) FindEdgesByTypeAcrossTenants

func (gs *GraphStorage) FindEdgesByTypeAcrossTenants(edgeType string) ([]*Edge, error)

FindEdgesByTypeAcrossTenants returns every edge of the given type from every tenant. The explicit name makes the cross-tenant scope visible (audit A3b convention); tenant-scoped callers must use GetEdgesByTypeForTenant instead. Results are sorted by edge ID.

func (*GraphStorage) FindNodesByLabelAcrossTenants

func (gs *GraphStorage) FindNodesByLabelAcrossTenants(label string) ([]*Node, error)

FindNodesByLabelAcrossTenants returns every node with the given label from every tenant. The explicit name makes the cross-tenant scope visible (audit A3b convention, cf. GetAllNodesAcrossTenants); tenant-scoped callers must use GetNodesByLabelForTenant instead. Results are sorted by node ID.

func (*GraphStorage) FindNodesByProperty

func (gs *GraphStorage) FindNodesByProperty(key string, value Value) ([]*Node, error)

FindNodesByProperty finds nodes with a specific property value.

Tenant-blind. New callers in tenant-scoped code paths should prefer FindNodesByPropertyForTenant.

func (*GraphStorage) FindNodesByPropertyForTenant

func (gs *GraphStorage) FindNodesByPropertyForTenant(key string, value Value, tenantID string) ([]*Node, error)

FindNodesByPropertyForTenant returns the subset of nodes with the given property value that are owned by the given tenant. Audit A6c-storage (2026-05-08).

Performs a single full scan filtered by both property and tenant — avoids the wasted work of scanning all-tenants then filtering.

func (*GraphStorage) FindNodesByPropertyIndexed

func (gs *GraphStorage) FindNodesByPropertyIndexed(key string, value Value) ([]*Node, error)

FindNodesByPropertyIndexed uses an index to find nodes (O(1) lookup).

Tenant-blind. The property index is global (not per-tenant), so this returns matches across every tenant. New callers in tenant-scoped code paths should prefer FindNodesByPropertyIndexedForTenant.

func (*GraphStorage) FindNodesByPropertyIndexedForTenant

func (gs *GraphStorage) FindNodesByPropertyIndexedForTenant(key string, value Value, tenantID string) ([]*Node, error)

FindNodesByPropertyIndexedForTenant uses an index for the property lookup, then post-filters by tenant ownership. Audit A6c-storage (2026-05-08).

The property index is global — keyed by property value, not (value, tenant). Per-tenant indexes would halve the scan work in the common case but are deferred: changes the on-disk index format (the audit's scope is correctness first, perf later). Until then, the post-index filter pays one extra pass per result set, which is acceptable because the index already collapses to O(matches), not O(nodes).

func (*GraphStorage) FindNodesByPropertyPrefixAcrossTenants

func (gs *GraphStorage) FindNodesByPropertyPrefixAcrossTenants(key string, prefix string) ([]*Node, error)

FindNodesByPropertyPrefixAcrossTenants uses an index to find nodes by string prefix.

Cross-tenant (see FindNodesByPropertyRangeAcrossTenants): returns prefix matches from all tenants; only caller is cmd/benchmark-index. Add a post-filtered *ForTenant variant before exposing prefix queries on a request path.

func (*GraphStorage) FindNodesByPropertyRangeAcrossTenants

func (gs *GraphStorage) FindNodesByPropertyRangeAcrossTenants(key string, start, end Value) ([]*Node, error)

FindNodesByPropertyRangeAcrossTenants uses an index to find nodes in a range.

Cross-tenant: returns matching nodes from ALL tenants (the property index is value-keyed, not tenant-partitioned). The explicit *AcrossTenants name (audit A3b) surfaces that — unlike the exact-match path there is no *ForTenant variant yet, and the only caller is cmd/benchmark-index. A request path must add a post-filtered *ForTenant variant (cf. FindNodesByPropertyIndexedForTenant) before exposing range queries.

func (*GraphStorage) ForEachNode

func (gs *GraphStorage) ForEachNode(fn func(*Node) bool)

ForEachNode iterates over all nodes, calling the provided function for each. The function receives a cloned node to prevent modification. Iteration stops early if the function returns false.

func (*GraphStorage) GetAllEdgesForTenant

func (gs *GraphStorage) GetAllEdgesForTenant(tenantID string) []*Edge

GetAllEdgesForTenant returns all edges belonging to a specific tenant.

Edge analogue of GetAllNodesForTenant: it enumerates the tenant's own edge IDs from tenantEdgeIDs (O(tenant size)) instead of scanning every shard across all tenants and filtering (the former O(total-DB) cross-tenant amplification that also stalled writers — Track P / H4). IDs are collected under gs.mu.RLock, sorted ascending for deterministic pagination, then cloned under per-shard RLocks after release. Same non-atomic-snapshot tradeoff as GetAllNodesForTenant: an edge deleted between collection and clone is skipped.

func (*GraphStorage) GetAllLabels

func (gs *GraphStorage) GetAllLabels() []string

GetAllLabels returns all unique node labels in the graph

func (*GraphStorage) GetAllNodeIDs

func (gs *GraphStorage) GetAllNodeIDs() []uint64

GetAllNodeIDs returns all node IDs in the storage. This is the preferred way to iterate over all nodes, as it handles deleted nodes correctly (unlike iterating from 1 to NodeCount).

func (*GraphStorage) GetAllNodesAcrossTenants

func (gs *GraphStorage) GetAllNodesAcrossTenants() []*Node

GetAllNodesAcrossTenants returns every node from every tenant.

Use ONLY for legitimate cross-tenant operations: replication, full backup, admin reports. The name is deliberately verbose so that reaching for it is a deliberate decision; calling it from any HTTP handler is almost certainly the wrong choice — use GetAllNodesForTenant(getTenantFromContext(r)) instead.

Replaced the previous GetAllNodes (removed 2026-05-08, audit A3b) which had identical semantics under a name that made it easy to call accidentally from request-scoped code paths. That accidental misuse in pkg/api/handlers_nodes.go was Security CRIT #2 in the 2026-05-06 audit.

func (*GraphStorage) GetAllNodesForTenant

func (gs *GraphStorage) GetAllNodesForTenant(tenantID string) []*Node

GetAllNodesForTenant returns all nodes belonging to a specific tenant.

It enumerates the tenant's own node IDs from tenantNodeIDs (O(tenant size)) rather than scanning every shard across all tenants and filtering (the former O(total-DB) cross-tenant amplification — Track P / H4). IDs are collected under gs.mu.RLock then cloned under per-shard RLocks after release, so the global lock is not held across the clone loop and writers are not stalled (the A4 read pattern, see GetNodeForTenant).

Consequence of the lock split: the result is no longer an atomic snapshot of the tenant at a single instant — a node deleted between the ID collection and its clone is simply skipped (same tradeoff A4 accepted for GetNode). IDs are returned in ascending order for deterministic pagination.

func (*GraphStorage) GetCompressionStats

func (gs *GraphStorage) GetCompressionStats() CompressionStats

GetCompressionStats returns compression statistics

func (*GraphStorage) GetCurrentLSN

func (gs *GraphStorage) GetCurrentLSN() uint64

GetCurrentLSN returns the current LSN (Log Sequence Number) from the WAL This is used by replication to track the latest position in the write-ahead log

func (*GraphStorage) GetEdge

func (gs *GraphStorage) GetEdge(edgeID uint64) (*Edge, error)

GetEdge retrieves an edge by ID.

Tenant-blind. New callers should prefer GetEdgeForTenant.

Reader takes the per-shard read lock; writers (CreateEdge, UpdateEdge, DeleteEdge, UpsertEdge) hold gs.mu.Lock plus lockShard(edgeID) for the edgeShards mutation, so readers and writers exclude correctly on shardLocks[edgeID & shardMask]. Audit task A4-edges (2026-05-10) closed the prior shared-map race by partitioning gs.edges into [256]map[uint64]*Edge.

func (*GraphStorage) GetEdgeByID

func (gs *GraphStorage) GetEdgeByID(edgeID uint64) (*Edge, error)

GetEdgeByID gets an edge by ID (alias for GetEdge for compatibility)

func (*GraphStorage) GetEdgeForTenant

func (gs *GraphStorage) GetEdgeForTenant(edgeID uint64, tenantID string) (*Edge, error)

GetEdgeForTenant retrieves an edge by ID, scoped to the given tenant. Returns ErrEdgeNotFound on missing or cross-tenant.

func (*GraphStorage) GetEdgeTypesForTenant

func (gs *GraphStorage) GetEdgeTypesForTenant(tenantID string) []string

GetEdgeTypesForTenant returns all unique edge types used by a tenant.

func (*GraphStorage) GetEdgesByTypeForTenant

func (gs *GraphStorage) GetEdgesByTypeForTenant(tenantID, edgeType string) []*Edge

GetEdgesByTypeForTenant returns all edges with the given type for a specific tenant.

func (*GraphStorage) GetIncomingEdges

func (gs *GraphStorage) GetIncomingEdges(nodeID uint64) ([]*Edge, error)

GetIncomingEdges gets all incoming edges to a node.

Tenant-blind. New callers in tenant-scoped code paths should prefer GetIncomingEdgesForTenant.

func (*GraphStorage) GetIncomingEdgesForTenant

func (gs *GraphStorage) GetIncomingEdgesForTenant(nodeID uint64, tenantID string) ([]*Edge, error)

GetIncomingEdgesForTenant returns the subset of incoming edges to nodeID that are owned by the given tenant. Audit A6c-storage (2026-05-08); mirror of A6b's GetOutgoingEdgesForTenant.

As of the A6a follow-up, cross-tenant edges cannot be created via the API path (CreateEdgeWithTenant refuses foreign-tenant from/to references), so the edge-tenant filter is sufficient for callers in /query and /algorithms hot paths.

func (*GraphStorage) GetIndexStatistics

func (gs *GraphStorage) GetIndexStatistics() map[string]IndexStatistics

GetIndexStatistics returns statistics for all property indexes

func (*GraphStorage) GetLabelsForTenant

func (gs *GraphStorage) GetLabelsForTenant(tenantID string) []string

GetLabelsForTenant returns all unique labels used by a tenant.

func (*GraphStorage) GetNode

func (gs *GraphStorage) GetNode(nodeID uint64) (*Node, error)

GetNode retrieves a node by ID.

Tenant-blind. New callers should prefer GetNodeForTenant. Legacy callers retain this entry point until the audit-driven migration completes — see docs/AUDIT_fixes_plan_2026-05-06.md task A3.

func (*GraphStorage) GetNodeByID

func (gs *GraphStorage) GetNodeByID(nodeID uint64) (*Node, error)

GetNodeByID gets a node by ID (alias for GetNode for compatibility)

func (*GraphStorage) GetNodeForTenant

func (gs *GraphStorage) GetNodeForTenant(nodeID uint64, tenantID string) (*Node, error)

GetNodeForTenant retrieves a node by ID, scoped to the given tenant.

Returns ErrNodeNotFound if the node does not exist OR belongs to a different tenant. The same error in both cases is intentional — a distinct ErrCrossTenant would let an attacker enumerate "this ID exists in *some* tenant" via response-shape inference.

Empty tenantID defaults to tenantid.Default ("default"). This matches CreateNode's default-tenant fallback so single-tenant deployments and tests that don't supply a tenant continue to work transparently.

Closes Security CRIT #1 from docs/AUDIT_security_2026-05-06.md.

func (*GraphStorage) GetNodesByLabelForTenant

func (gs *GraphStorage) GetNodesByLabelForTenant(tenantID, label string) []*Node

GetNodesByLabelForTenant returns all nodes with the given label for a specific tenant.

func (*GraphStorage) GetOutgoingEdges

func (gs *GraphStorage) GetOutgoingEdges(nodeID uint64) ([]*Edge, error)

GetOutgoingEdges gets all outgoing edges from a node.

Tenant-blind. Returns edges across all tenants — used by replication, snapshotting, and legitimately tenant-blind callers (CLI, examples). New callers in tenant-scoped code paths should prefer GetOutgoingEdgesForTenant.

func (*GraphStorage) GetOutgoingEdgesForTenant

func (gs *GraphStorage) GetOutgoingEdgesForTenant(nodeID uint64, tenantID string) ([]*Edge, error)

GetOutgoingEdgesForTenant returns the subset of outgoing edges from nodeID that are owned by the given tenant. Audit A6b (2026-05-08).

Filters at the storage layer rather than at the handler so the algorithm package can scope traversals without re-implementing the filter at every call site. The filter is on edge.TenantID; node-side scoping is the caller's responsibility (see GetNodeForTenant).

Tenant-strict guarantee: as of the A6a follow-up, CreateEdgeWithTenant refuses cross-tenant from/to node references, so a tenant-stamped edge cannot point at a foreign-tenant node. Pure edge-side filtering here is sufficient.

func (*GraphStorage) GetStatistics

func (gs *GraphStorage) GetStatistics() Statistics

GetStatistics returns current database statistics

func (*GraphStorage) GetTenantStats

func (gs *GraphStorage) GetTenantStats(tenantID string) *TenantStats

GetTenantStats returns usage statistics for a tenant.

func (*GraphStorage) GetVectorIndexMetric

func (gs *GraphStorage) GetVectorIndexMetric(propertyName string) (vector.DistanceMetric, error)

GetVectorIndexMetric returns the distance metric for the default tenant's vector index on propertyName. Tenant-blind; new callers should prefer GetVectorIndexMetricForTenant.

func (*GraphStorage) GetVectorIndexMetricForTenant

func (gs *GraphStorage) GetVectorIndexMetricForTenant(
	tenantID string,
	propertyName string,
) (vector.DistanceMetric, error)

GetVectorIndexMetricForTenant returns the distance metric for the given tenant's vector index on propertyName. Returns ErrNodeNotFound for empty tenantID or when the tenant has no index for propertyName.

func (*GraphStorage) HasPropertyIndex

func (gs *GraphStorage) HasPropertyIndex(key string) bool

HasPropertyIndex checks if an index exists for a given property key

func (*GraphStorage) HasVectorIndex

func (gs *GraphStorage) HasVectorIndex(propertyName string) bool

HasVectorIndex checks if a vector index exists in the default tenant. Tenant-blind; new callers should prefer HasVectorIndexForTenant.

func (*GraphStorage) HasVectorIndexForTenant

func (gs *GraphStorage) HasVectorIndexForTenant(tenantID string, propertyName string) bool

HasVectorIndexForTenant reports whether the given tenant has a vector index for propertyName. Returns false (not an error) for empty tenantID or unknown tenants — the unified-false response prevents tenant-existence probing.

func (*GraphStorage) ListTenants

func (gs *GraphStorage) ListTenants() []string

ListTenants returns a list of all tenant IDs that have data.

func (*GraphStorage) ListVectorIndexes

func (gs *GraphStorage) ListVectorIndexes() []string

ListVectorIndexes returns the default tenant's vector-indexed property names. Tenant-blind; new callers should prefer ListVectorIndexesForTenant.

func (*GraphStorage) ListVectorIndexesForTenant

func (gs *GraphStorage) ListVectorIndexesForTenant(tenantID string) []string

ListVectorIndexesForTenant returns the given tenant's vector-indexed property names. Returns []string{} (not nil) for empty tenantID or tenants with no indexes.

func (*GraphStorage) NodesByLabelPageForTenant added in v0.5.0

func (gs *GraphStorage) NodesByLabelPageForTenant(tenantID, label string, afterID uint64, limit int) ([]*Node, uint64)

NodesByLabelPageForTenant returns up to `limit` nodes belonging to `tenantID` with the given label and ID > afterID, in ascending-ID order, plus the next cursor (the last returned node's ID, or 0 if this is the last page). afterID=0 starts from the beginning. Returns (nil, 0) when the tenant or label is not found, mirroring GetNodesByLabelForTenant.

Lock pattern: collect sorted IDs from the label index under gs.mu.RLock, release, then clone each node under its per-shard RLock. Same non-atomic- snapshot tradeoff as NodesPageForTenant and GetAllNodesForTenant.

func (*GraphStorage) NodesPageForTenant added in v0.5.0

func (gs *GraphStorage) NodesPageForTenant(tenantID string, afterID uint64, limit int) ([]*Node, uint64)

NodesPageForTenant returns up to `limit` nodes belonging to `tenantID` with ID > afterID, in ascending-ID order, plus the next cursor (the last returned node's ID, or 0 if this is the last page). afterID=0 starts from the beginning of the tenant's node set.

Lock pattern mirrors GetAllNodesForTenant: collect the sorted ID slice under gs.mu.RLock, release, then clone each node under its per-shard RLock. This means the result is not an atomic snapshot — a node deleted between ID collection and clone is skipped — but the global lock is not held across the clone loop, so writers are not stalled.

func (*GraphStorage) RemoveNodeFromVectorIndexes

func (gs *GraphStorage) RemoveNodeFromVectorIndexes(nodeID uint64, tenantID string) error

RemoveNodeFromVectorIndexes removes a node's vectors from all of the given tenant's vector indexes. Called from DeleteNode (passing the looked-up node's TenantID) and from cleanup paths.

Empty tenantID falls back to tenantid.Default so tenant-blind callers (legacy DeleteNode path, tests that don't set TenantID) continue to clean up the default-tenant namespace transparently.

Signature change relative to pre-R1.2 (this PR): now takes tenantID as a second parameter. Callers that previously passed only nodeID must supply the node's tenant (or "" for tenant-blind / default semantics).

func (*GraphStorage) RemoveNodeProperties

func (gs *GraphStorage) RemoveNodeProperties(nodeID uint64, keys []string) error

RemoveNodeProperties removes specified properties from a node. Unlike UpdateNode (which merges), this deletes keys from the properties map. Tenant-blind — new callers in tenant-scoped code paths should prefer RemoveNodePropertiesForTenant.

func (*GraphStorage) RemoveNodePropertiesForTenant

func (gs *GraphStorage) RemoveNodePropertiesForTenant(nodeID uint64, keys []string, tenantID string) error

RemoveNodePropertiesForTenant removes specified properties from a node, scoped to the given tenant. Returns ErrNodeNotFound on missing or cross-tenant. Audit A6c-query (2026-05-08).

Mirrors UpdateNodeForTenant's lock-then-delegate pattern: tenant validation under read lock, brief lock-drop window before RemoveNodeProperties acquires the write lock. Race window is benign — tenant IDs are immutable after node creation and node IDs don't recycle, so the only race is "node deleted by another goroutine before ours" which RemoveNodeProperties handles via ErrNodeNotFound.

func (*GraphStorage) SetEncryption

func (gs *GraphStorage) SetEncryption(engine encryption.EncryptDecrypter, keyManager encryption.KeyProvider)

SetEncryption sets the encryption engine and key manager for the storage. Uses typed interfaces for compile-time safety.

func (*GraphStorage) Snapshot

func (gs *GraphStorage) Snapshot() error

Snapshot saves the current state to disk

func (*GraphStorage) UpdateEdge

func (gs *GraphStorage) UpdateEdge(edgeID uint64, properties map[string]Value, weight *float64) error

UpdateEdge updates an edge's properties and/or weight.

Tenant-blind. New callers should prefer UpdateEdgeForTenant.

func (*GraphStorage) UpdateEdgeForTenant

func (gs *GraphStorage) UpdateEdgeForTenant(edgeID uint64, properties map[string]Value, weight *float64, tenantID string) error

UpdateEdgeForTenant updates an edge's properties and/or weight, scoped to the given tenant. Returns ErrEdgeNotFound on missing or cross-tenant.

func (*GraphStorage) UpdateNode

func (gs *GraphStorage) UpdateNode(nodeID uint64, properties map[string]Value) error

UpdateNode updates a node's properties.

Tenant-blind. New callers should prefer UpdateNodeForTenant.

Lock discipline (R2.1, S11 spike §7.4): notifyNodeUpdated dispatches after gs.mu.Lock is released. The oldNode / newNode clones are taken inside the lock window (when the live shard pointer is safe) and are only allocated when observers are registered — observerless callers pay zero clone cost.

func (*GraphStorage) UpdateNodeForTenant

func (gs *GraphStorage) UpdateNodeForTenant(nodeID uint64, properties map[string]Value, tenantID string) error

UpdateNodeForTenant updates a node's properties, scoped to the given tenant. Returns ErrNodeNotFound on missing or cross-tenant (same rationale as GetNodeForTenant).

func (*GraphStorage) UpdateNodeVectorIndexes

func (gs *GraphStorage) UpdateNodeVectorIndexes(node *Node) error

UpdateNodeVectorIndexes updates vector indexes when a node is added or updated. Routes by node.TenantID; empty TenantID falls back to tenantid.Default so tenant-blind callers (legacy CreateNode path, tests that don't set TenantID) continue to land vectors in the default-tenant namespace transparently.

Behavior change relative to pre-R1.2 (this PR): a node with an explicit non-default TenantID now lands its vectors under its tenant's index, not the global namespace. Previously all vectors landed globally; the existence-leak prevention happened post-search in handlers_vectors.go via WithNodeRefForTenant filtering. After this PR, isolation is structural (per-tenant indexes via R1.1), and the handler post-filter becomes defense-in-depth + lock discipline for label/property filtering.

func (*GraphStorage) UpsertEdge

func (gs *GraphStorage) UpsertEdge(fromID, toID uint64, edgeType string, properties map[string]Value, weight float64) (*Edge, bool, error)

UpsertEdge creates a new edge or updates an existing one between two nodes in the default tenant. Tenant-blind on node verification; see CreateEdge for the rationale. Existence is still validated.

func (*GraphStorage) UpsertEdgeWithTenant

func (gs *GraphStorage) UpsertEdgeWithTenant(tenantID string, fromID, toID uint64, edgeType string, properties map[string]Value, weight float64) (*Edge, bool, error)

UpsertEdgeWithTenant creates a new edge or updates an existing one between two nodes for a specific tenant. From/to nodes must belong to the same tenant — cross-tenant or missing surfaces as ErrNodeNotFound (audit A6a follow-up; see CreateEdgeWithTenant).

func (*GraphStorage) VectorSearch

func (gs *GraphStorage) VectorSearch(
	propertyName string,
	query []float32,
	k int,
	ef int,
) ([]vector.SearchResult, error)

VectorSearch performs k-NN search on a vector-indexed property in the default tenant. Tenant-blind; new callers should prefer VectorSearchForTenant.

func (*GraphStorage) VectorSearchForTenant

func (gs *GraphStorage) VectorSearchForTenant(
	tenantID string,
	propertyName string,
	query []float32,
	k int,
	ef int,
) ([]vector.SearchResult, error)

VectorSearchForTenant performs k-NN search on a vector-indexed property, scoped to tenantID. Returns ErrNodeNotFound for empty tenantID or when the tenant has no index for propertyName — the unified error prevents existence-leak via error shape.

func (*GraphStorage) WithNodeRefForTenant

func (gs *GraphStorage) WithNodeRefForTenant(nodeID uint64, tenantID string, fn func(*Node) error) error

WithNodeRefForTenant invokes fn with the live node pointer for the given (id, tenantID), holding the per-shard read lock for the duration of fn. Returns ErrNodeNotFound if the node does not exist or belongs to a different tenant (same unified-error rationale as GetNodeForTenant).

Caller contract: fn MUST NOT escape the *Node pointer past its own return — the shard lock is released as soon as fn finishes, and a concurrent writer may mutate the node's fields immediately after. To retain a snapshot beyond fn, call node.Clone() before returning.

Designed for hot-path filter loops (vector search post-filter, etc.) where most candidates are rejected before any escape is needed; the per-iteration clone of GetNodeForTenant is wasted work for rejected candidates. Audit task A4 clone-elision (2026-05-10) — closes Performance HIGH-1 from docs/AUDIT_performance_2026-05-06.md.

type IndexStatistics

type IndexStatistics struct {
	PropertyKey    string
	UniqueValues   int
	TotalNodes     int
	AvgNodesPerKey float64
}

IndexStatistics holds statistics about an index

type LSMGraphStorage

type LSMGraphStorage struct {
	// contains filtered or unexported fields
}

LSMGraphStorage is a disk-backed graph storage using LSM trees

func NewLSMGraphStorage

func NewLSMGraphStorage(dataDir string) (*LSMGraphStorage, error)

NewLSMGraphStorage creates a new LSM-backed graph storage

func (*LSMGraphStorage) Close

func (gs *LSMGraphStorage) Close() error

Close closes the LSM storage

func (*LSMGraphStorage) CreateEdge

func (gs *LSMGraphStorage) CreateEdge(fromID, toID uint64, relType string, properties map[string]Value, weight float64) (*Edge, error)

CreateEdge creates a new edge in LSM storage

func (*LSMGraphStorage) CreateNode

func (gs *LSMGraphStorage) CreateNode(labels []string, properties map[string]Value) (*Node, error)

CreateNode creates a new node in LSM storage

func (*LSMGraphStorage) DeleteEdge

func (gs *LSMGraphStorage) DeleteEdge(edgeID uint64) error

DeleteEdge deletes an edge from LSM storage

func (*LSMGraphStorage) DeleteNode

func (gs *LSMGraphStorage) DeleteNode(nodeID uint64) error

DeleteNode deletes a node and its edges from LSM storage

func (*LSMGraphStorage) FindNodesByLabelAcrossTenants

func (gs *LSMGraphStorage) FindNodesByLabelAcrossTenants(label string) ([]*Node, error)

FindNodesByLabelAcrossTenants returns all nodes with a given label from LSM storage

func (*LSMGraphStorage) GetEdge

func (gs *LSMGraphStorage) GetEdge(edgeID uint64) (*Edge, error)

GetEdge retrieves an edge by ID from LSM storage

func (*LSMGraphStorage) GetIncomingEdges

func (gs *LSMGraphStorage) GetIncomingEdges(nodeID uint64) ([]*Edge, error)

GetIncomingEdges returns all incoming edges to a node in LSM storage

func (*LSMGraphStorage) GetNode

func (gs *LSMGraphStorage) GetNode(nodeID uint64) (*Node, error)

GetNode retrieves a node by ID from LSM storage

func (*LSMGraphStorage) GetOutgoingEdges

func (gs *LSMGraphStorage) GetOutgoingEdges(nodeID uint64) ([]*Edge, error)

GetOutgoingEdges returns all outgoing edges from a node in LSM storage

func (*LSMGraphStorage) GetStatistics

func (gs *LSMGraphStorage) GetStatistics() Statistics

GetStatistics returns graph statistics

func (*LSMGraphStorage) UpdateNode

func (gs *LSMGraphStorage) UpdateNode(nodeID uint64, properties map[string]Value) error

UpdateNode updates a node's properties in LSM storage

type Node

type Node struct {
	ID         uint64
	TenantID   string // Multi-tenancy: empty string defaults to "default" tenant
	Labels     []string
	Properties map[string]Value
	CreatedAt  int64
	UpdatedAt  int64
}

Node represents a vertex in the graph

func TemporalTraversal

func TemporalTraversal(graph *GraphStorage, startID uint64, timestamp int64, maxDepth int) ([]*Node, error)

TemporalTraversal performs BFS with time constraints

func (*Node) Clone

func (n *Node) Clone() *Node

Clone creates a deep copy of a node

func (*Node) GetProperty

func (n *Node) GetProperty(key string) (Value, bool)

GetProperty gets a property value

func (*Node) HasLabel

func (n *Node) HasLabel(label string) bool

HasLabel checks if node has a specific label

type NodeObserver

type NodeObserver interface {
	// OnNodeCreated fires after a successful CreateNode* call, after all
	// storage locks have been released. node is a freshly-cloned snapshot
	// — safe to read without locking and safe to retain past return.
	OnNodeCreated(ctx context.Context, node *Node)

	// OnNodeUpdated fires after a successful UpdateNode* or
	// RemoveNodeProperties* call. node is the post-mutation snapshot;
	// oldNode is the pre-mutation snapshot. Both are clones; safe to read
	// and retain.
	//
	// Implementations that themselves write back to the same node (e.g.,
	// an auto-embedder that writes an embedding property) must guard
	// against re-triggering notification — the canonical pattern is a
	// sentinel property key or context value set before the internal
	// write. R2.5's AutoEmbedObserver implements this guard.
	OnNodeUpdated(ctx context.Context, node *Node, oldNode *Node)

	// OnNodeDeleted fires after a successful DeleteNode* call. The node's
	// data is no longer accessible by the time dispatch runs; only
	// nodeID and tenantID are provided. tenantID is the deleted node's
	// TenantID captured before deletion.
	OnNodeDeleted(ctx context.Context, nodeID uint64, tenantID string)
}

NodeObserver receives node-lifecycle notifications. Implementations must be safe for concurrent calls from multiple goroutines: the same observer may be invoked from many storage operations in parallel.

All three methods are passed context.Background() in the current implementation — the storage API does not carry context yet. Migration to *Ctx-passing variants is its own track and does not block R2.x.

None of the lifecycle methods return an error. Errors encountered by the observer must be handled internally (log, meter, dead-letter queue). A panicking observer will crash the server; implementations must recover from panics or be verified panic-free.

type NodeSpec added in v0.5.0

type NodeSpec struct {
	Labels     []string
	Properties map[string]Value
}

NodeSpec describes one node for bulk creation.

type PropertyIndex

type PropertyIndex struct {
	// contains filtered or unexported fields
}

PropertyIndex maintains an index on a specific node property

func NewPropertyIndex

func NewPropertyIndex(propertyKey string, indexType ValueType) *PropertyIndex

NewPropertyIndex creates a new property index

func (*PropertyIndex) GetAllKeys

func (idx *PropertyIndex) GetAllKeys() []string

GetAllKeys returns all indexed keys (useful for debugging)

func (*PropertyIndex) GetStatistics

func (idx *PropertyIndex) GetStatistics() IndexStatistics

GetStatistics returns index statistics

func (*PropertyIndex) Insert

func (idx *PropertyIndex) Insert(nodeID uint64, value Value) error

Insert adds a node to the index

func (*PropertyIndex) Lookup

func (idx *PropertyIndex) Lookup(value Value) ([]uint64, error)

Lookup finds all nodes with a specific property value

func (*PropertyIndex) PrefixLookup

func (idx *PropertyIndex) PrefixLookup(prefix string) ([]uint64, error)

PrefixLookup finds all nodes with string properties starting with a prefix

func (*PropertyIndex) RangeLookup

func (idx *PropertyIndex) RangeLookup(start, end Value) ([]uint64, error)

RangeLookup finds all nodes with property values in a range [start, end] This is useful for numeric and timestamp ranges

func (*PropertyIndex) Remove

func (idx *PropertyIndex) Remove(nodeID uint64, value Value) error

Remove removes a node from the index

type PropertyIndexSnapshot

type PropertyIndexSnapshot struct {
	PropertyKey string
	IndexType   ValueType
	Index       map[string][]uint64
}

PropertyIndexSnapshot is a serializable representation of a PropertyIndex

type Statistics

type Statistics struct {
	NodeCount    uint64
	EdgeCount    uint64
	LastSnapshot time.Time
	TotalQueries uint64
	AvgQueryTime float64
}

Statistics tracks database statistics

type Storage

type Storage interface {
	StorageReader
	StorageWriter

	// Global/Admin operations
	GetAllLabels() []string
	GetAllNodesAcrossTenants() []*Node
	SetEncryption(engine encryption.EncryptDecrypter, keyManager encryption.KeyProvider)
	Snapshot() error

	// Observer registration (S11 / R2.1). AddObserver is typically called
	// once at startup before serving requests. Backends without
	// notification support implement this as a no-op (e.g.,
	// BTreeGraphStorage); observers attached to a no-op backend simply
	// never fire.
	AddObserver(obs NodeObserver)

	Close() error
}

Storage is the unified interface for GraphDB storage engines.

type StorageConfig

type StorageConfig struct {
	DataDir               string
	EnableBatching        bool
	EnableCompression     bool
	EnableEdgeCompression bool
	BatchSize             int
	FlushInterval         time.Duration
	UseDiskBackedEdges    bool // Enable disk-backed adjacency lists (Milestone 2)
	EdgeCacheSize         int  // LRU cache size for hot edge lists (default: 10000)
	BulkImportMode        bool // Disable WAL and use fast path for bulk loading

	// EncryptionEngine/KeyManager wire at-rest encryption at CONSTRUCTION
	// time, so the constructor's loadFromDisk can decrypt an encrypted
	// snapshot. SetEncryption after construction is too late for that
	// path — before M-14 an encrypted snapshot could never load at
	// restart (the server exited with "encryption is not enabled").
	EncryptionEngine encryption.EncryptDecrypter
	KeyManager       encryption.KeyProvider
}

StorageConfig holds configuration for GraphStorage

func DefaultStorageConfig added in v0.5.0

func DefaultStorageConfig(dataDir string) StorageConfig

DefaultStorageConfig returns the config NewGraphStorage uses. Callers that need to override a single field (e.g. wire EncryptionEngine at construction time) start from this instead of duplicating defaults.

type StorageError

type StorageError struct {
	Op      string // Operation that failed (e.g., "CreateNode", "DeleteEdge")
	Entity  string // Entity type (e.g., "node", "edge", "index")
	ID      uint64 // Entity ID (if applicable)
	Field   string // Field name (for property operations)
	Cause   error  // Underlying error
	Context string // Additional context
}

StorageError provides structured error information for storage operations.

func (*StorageError) Error

func (e *StorageError) Error() string

Error implements the error interface.

func (*StorageError) Is

func (e *StorageError) Is(target error) bool

Is reports whether the target error matches this error or its cause.

func (*StorageError) Unwrap

func (e *StorageError) Unwrap() error

Unwrap returns the underlying cause for error chain support.

type StorageReader

type StorageReader interface {
	// Node retrieval
	GetNodeForTenant(nodeID uint64, tenantID string) (*Node, error)
	GetNodesByLabelForTenant(tenantID string, label string) []*Node
	GetAllNodesForTenant(tenantID string) []*Node
	CountNodesForTenant(tenantID string) uint64

	// Edge retrieval
	GetEdgeForTenant(edgeID uint64, tenantID string) (*Edge, error)
	GetEdgesByTypeForTenant(tenantID string, edgeType string) []*Edge
	GetAllEdgesForTenant(tenantID string) []*Edge
	CountEdgesForTenant(tenantID string) uint64

	// Adjacency
	GetOutgoingEdgesForTenant(nodeID uint64, tenantID string) ([]*Edge, error)
	GetIncomingEdgesForTenant(nodeID uint64, tenantID string) ([]*Edge, error)

	// Metadata
	GetLabelsForTenant(tenantID string) []string
	GetEdgeTypesForTenant(tenantID string) []string
	HasPropertyIndex(key string) bool
	GetVectorIndexMetric(propertyName string) (vector.DistanceMetric, error)
	FindNodesByLabelAcrossTenants(label string) ([]*Node, error)
	GetNode(nodeID uint64) (*Node, error)
	GetOutgoingEdges(nodeID uint64) ([]*Edge, error)
	GetIncomingEdges(nodeID uint64) ([]*Edge, error)
	GetEdge(edgeID uint64) (*Edge, error)

	// Search and Indexing
	FindNodesByPropertyForTenant(key string, value Value, tenantID string) ([]*Node, error)
	FindNodesByPropertyIndexedForTenant(key string, value Value, tenantID string) ([]*Node, error)

	// Vector index — tenant-blind variants (legacy / single-tenant / test).
	// New callers should prefer the *ForTenant variants below.
	VectorSearch(propertyName string, query []float32, k int, ef int) ([]vector.SearchResult, error)
	ListVectorIndexes() []string
	HasVectorIndex(propertyName string) bool
	CreateVectorIndex(propertyName string, dimensions int, m int, efConstruction int, metric vector.DistanceMetric) error
	DropVectorIndex(propertyName string) error

	// Vector index — tenant-scoped variants (R1.x / F4 spike). Per F4
	// spike §1.3, these reject empty tenantID at the public layer and
	// return ErrNodeNotFound (unified error) for cross-tenant lookups
	// to prevent existence-leak via response shape.
	VectorSearchForTenant(tenantID string, propertyName string, query []float32, k int, ef int) ([]vector.SearchResult, error)
	ListVectorIndexesForTenant(tenantID string) []string
	HasVectorIndexForTenant(tenantID string, propertyName string) bool
	CreateVectorIndexForTenant(tenantID string, propertyName string, dimensions int, m int, efConstruction int, metric vector.DistanceMetric) error
	DropVectorIndexForTenant(tenantID string, propertyName string) error
	GetVectorIndexMetricForTenant(tenantID string, propertyName string) (vector.DistanceMetric, error)

	// Performance optimization: callback with live reference (no clone)
	WithNodeRefForTenant(nodeID uint64, tenantID string, fn func(*Node) error) error

	// Statistics
	GetStatistics() Statistics
}

StorageReader defines the read-only surface of the graph database. All *ForTenant methods are tenant-scoped per the repo's tenant-strict convention (cross-tenant lookups return ErrNodeNotFound, not a distinct error, to avoid existence-leak side channels).

type StorageWriter

type StorageWriter interface {
	// Node mutations
	CreateNodeWithTenant(tenantID string, labels []string, properties map[string]Value) (*Node, error)
	CreateNodeWithUniquePropertyForTenant(tenantID string, labels []string, properties map[string]Value, uniqueLabel string, uniquePropertyKey string) (*Node, error)
	UpdateNodeForTenant(nodeID uint64, properties map[string]Value, tenantID string) error
	DeleteNodeForTenant(nodeID uint64, tenantID string) error
	RemoveNodePropertiesForTenant(nodeID uint64, keys []string, tenantID string) error

	// Edge mutations
	CreateEdgeWithTenant(tenantID string, fromID, toID uint64, edgeType string, properties map[string]Value, weight float64) (*Edge, error)
	UpdateEdgeForTenant(edgeID uint64, properties map[string]Value, weight *float64, tenantID string) error
	DeleteEdgeForTenant(edgeID uint64, tenantID string) error
	UpsertEdgeWithTenant(tenantID string, fromID, toID uint64, edgeType string, properties map[string]Value, weight float64) (*Edge, bool, error)

	// Tenant-blind mutations (Admin/Test)
	CreateNode(labels []string, properties map[string]Value) (*Node, error)
	CreateEdge(fromID, toID uint64, edgeType string, properties map[string]Value, weight float64) (*Edge, error)
	UpdateNode(nodeID uint64, properties map[string]Value) error
	DeleteNode(nodeID uint64) error
	RemoveNodeProperties(nodeID uint64, keys []string) error
	UpdateEdge(edgeID uint64, properties map[string]Value, weight *float64) error
	DeleteEdge(edgeID uint64) error

	// Batching
	BeginBatch() *Batch

	// Vector index maintenance. RemoveNodeFromVectorIndexes takes a tenantID
	// as of R1.2 so deletes route to the per-tenant index; empty tenantID
	// falls back to tenantid.Default for legacy tenant-blind callers.
	UpdateNodeVectorIndexes(node *Node) error
	RemoveNodeFromVectorIndexes(nodeID uint64, tenantID string) error
}

StorageWriter defines the mutative surface of the graph database.

type TemporalEdge

type TemporalEdge struct {
	*Edge
	ValidFrom int64 // Unix timestamp
	ValidTo   int64 // Unix timestamp (0 = infinity)
}

TemporalEdge is an edge with time validity

type TemporalMetrics

type TemporalMetrics struct {
	AverageEdgeLifetime float64
	ActiveEdgesAtTime   map[int64]int
	EdgeCreationRate    float64
	EdgeDeletionRate    float64
}

TemporalMetrics computes temporal graph metrics

func ComputeTemporalMetrics

func ComputeTemporalMetrics(graph *GraphStorage, startTime, endTime int64) (*TemporalMetrics, error)

ComputeTemporalMetrics analyzes temporal patterns

type TemporalQuery

type TemporalQuery struct {
	// contains filtered or unexported fields
}

TemporalQuery operations for time-based graph queries

func NewTemporalQuery

func NewTemporalQuery(graph *GraphStorage) *TemporalQuery

NewTemporalQuery creates a temporal query interface

func (*TemporalQuery) CreateTemporalEdge

func (tq *TemporalQuery) CreateTemporalEdge(
	fromID, toID uint64,
	edgeType string,
	properties map[string]Value,
	weight float64,
	validFrom, validTo int64,
) (*Edge, error)

CreateTemporalEdge creates an edge with time validity

func (*TemporalQuery) GetEdgesAtTime

func (tq *TemporalQuery) GetEdgesAtTime(nodeID uint64, timestamp int64) ([]*TemporalEdge, error)

GetEdgesAtTime returns all edges valid at a specific timestamp

func (*TemporalQuery) GetEdgesInTimeRange

func (tq *TemporalQuery) GetEdgesInTimeRange(nodeID uint64, start, end int64) ([]*TemporalEdge, error)

GetEdgesInTimeRange returns edges valid in a time range

type TenantStats

type TenantStats struct {
	NodeCount    uint64 // Number of nodes belonging to this tenant
	EdgeCount    uint64 // Number of edges belonging to this tenant
	StorageBytes uint64 // Estimated storage used by this tenant
	LastUpdated  int64  // Unix timestamp of last update
}

TenantStats tracks per-tenant usage statistics for multi-tenancy

type Transaction

type Transaction struct {
	// contains filtered or unexported fields
}

Transaction represents a database transaction. It is tenant-bound: every node/edge it creates is stamped with tenantID, and Commit validates that any edge endpoint / update target belongs to that tenant (the *ForTenant convention). Commit is atomic (single-fsync, all-or-none) and serializes on gs.mu (last-writer-wins; no conflict detection). Creates + updates only — deletes are deferred (the deletedNodes/deletedEdges fields below are dead scaffolding for that follow-up; nothing populates them today).

func (*Transaction) Commit

func (tx *Transaction) Commit() error

Commit applies all buffered changes atomically and durably.

  1. Validate the whole buffer (edge endpoints + update targets resolve to this tenant) BEFORE mutating anything — a bad reference aborts with nothing applied (all-or-none for reference errors).
  2. Under gs.mu, persist created nodes → created edges → property updates via the SHARED persist*Locked helpers (the same shard/global/tenant/vector/ property indexes + stats the direct write paths maintain), collecting the WAL entries and vector plans.
  3. Release gs.mu, then make the whole batch durable with a SINGLE fsync (all-or-none) via appendWALBatch. Durability is always atomic: appendWAL- Batch is the last step, so any earlier error writes no WAL and nothing becomes durable.
  4. Off-lock: apply the HNSW vector inserts, then dispatch observer notifications (so auto-embed observers see committed nodes).

Commit serializes on gs.mu (last-writer-wins; no conflict detection between concurrent transactions). Limitation: a malformed-vector error (wrong dimension) during step 2 aborts the commit and writes no WAL — so nothing is durable — but may leave a transient, non-durable in-memory partial (the same behaviour the direct create/update paths have on a mid-apply error). It is cleaned up by the absence of a WAL record on the next restart.

func (*Transaction) CreateEdge

func (tx *Transaction) CreateEdge(fromID, toID uint64, edgeType string, properties map[string]Value, weight float64) (*Edge, error)

CreateEdge creates an edge within the transaction

func (*Transaction) CreateNode

func (tx *Transaction) CreateNode(labels []string, properties map[string]Value) (*Node, error)

CreateNode creates a node within the transaction

func (*Transaction) GetEdgeByID

func (tx *Transaction) GetEdgeByID(edgeID uint64) (*Edge, error)

GetEdgeByID gets an edge by ID within the transaction context

func (*Transaction) GetNodeByID

func (tx *Transaction) GetNodeByID(nodeID uint64) (*Node, error)

GetNodeByID gets a node by ID within the transaction context

func (*Transaction) Rollback

func (tx *Transaction) Rollback() error

Rollback rolls back the transaction

func (*Transaction) UpdateNode

func (tx *Transaction) UpdateNode(nodeID uint64, properties map[string]Value) error

UpdateNode updates a node within the transaction

type UniqueConstraintError

type UniqueConstraintError struct {
	Label             string
	PropertyKey       string
	ConflictingNodeID uint64
	TenantID          string
}

UniqueConstraintError is returned by CreateNodeWithUniquePropertyForTenant when a node with the same (label, propertyKey, propertyValue) already exists in the tenant. Use errors.Is(err, ErrUniqueConstraintViolation) to detect this class. TenantID is preserved on the struct for logs and audit records; it is intentionally omitted from Error() so the message can be safely forwarded to HTTP/GraphQL response bodies without leaking the operating tenant.

func (*UniqueConstraintError) Error

func (e *UniqueConstraintError) Error() string

Error implements the error interface.

The message is response-body-safe: callers can forward Error() to an HTTP or GraphQL response without leaking the tenant identifier. Internal observers (logs, audit) that need the tenant should pull it from the TenantID struct field via errors.As.

func (*UniqueConstraintError) Unwrap

func (e *UniqueConstraintError) Unwrap() error

Unwrap allows errors.Is(err, ErrUniqueConstraintViolation).

type Value

type Value struct {
	Type ValueType
	Data []byte
}

Value represents a typed property value

func BoolArrayValue

func BoolArrayValue(arr []bool) Value

BoolArrayValue creates a bool array value Encoding: [4 bytes count][1 byte per bool]

func BoolValue

func BoolValue(b bool) Value

func BytesValue

func BytesValue(b []byte) Value

func FloatArrayValue

func FloatArrayValue(arr []float64) Value

FloatArrayValue creates a float64 array value Encoding: [4 bytes count][8 bytes per float64]

func FloatValue

func FloatValue(f float64) Value

func IntArrayValue

func IntArrayValue(arr []int64) Value

IntArrayValue creates an int64 array value Encoding: [4 bytes count][8 bytes per int64]

func IntValue

func IntValue(i int64) Value

func JSONValue added in v0.5.0

func JSONValue(v any) (Value, error)

JSONValue stores v as its JSON encoding (#224). Used for value shapes the typed constructors can't represent — null, objects, nested structures, and mixed/empty arrays. The bytes round-trip back to the original shape via AsJSON, so reads return proper JSON instead of a %v-stringified blob.

Returns an error only if v isn't JSON-marshallable; callers that must not drop the property (ValueFromJSON) fall back to the legacy string path on error.

func StringArrayValue

func StringArrayValue(arr []string) Value

StringArrayValue creates a string array value Encoding: [4 bytes count][for each: 4 bytes len + string bytes]

func StringValue

func StringValue(s string) Value

Helper functions to create typed values

func TimestampValue

func TimestampValue(t time.Time) Value

func ValueFromJSON

func ValueFromJSON(v any) Value

ValueFromJSON converts a JSON-decoded Go value (the result of json.Unmarshal into `any` / `map[string]any` / `[]any`) into a typed storage.Value.

Dispatches on Go type:

  • string → TypeString
  • int, int64 → TypeInt
  • float64 → TypeInt if whole-number, TypeFloat otherwise (JSON numbers always arrive as float64; we collapse whole numbers to TypeInt for better downstream compatibility).
  • bool → TypeBool
  • []any → dispatches on element type for the all-same-type cases (float64, string, int/int64, bool); mixed or empty arrays store as TypeJSON.
  • nil / map / any other shape → TypeJSON (the JSON encoding), so null/objects/nested structures round-trip instead of being stringified to "<nil>" / "map[]" (#224).

Never errors. Inverse of valueToInterface (in pkg/api/server_helpers.go).

This function is the single canonical converter shared between the REST handlers (pkg/api/server_helpers.go convertToValue) and the GraphQL resolvers (pkg/graphql/mutations_resolvers.go convertToStorageValue). Two duplicates with diverging behaviour had caused real bugs (2026-05-14 silent-failure shape #7 — REST fix alone left coord's GraphQL path broken); consolidation prevents the next "fix one, miss the other" incident.

func VectorValue

func VectorValue(vec []float32) Value

func (Value) ArrayContains

func (v Value) ArrayContains(element Value) (bool, error)

ArrayContains checks if an array value contains a specific element. Works with TypeStringArray, TypeIntArray, TypeFloatArray, TypeBoolArray.

func (Value) ArrayLen

func (v Value) ArrayLen() (int, error)

ArrayLen returns the length of an array value

func (Value) AsBool

func (v Value) AsBool() (bool, error)

func (Value) AsBoolArray

func (v Value) AsBoolArray() ([]bool, error)

AsBoolArray decodes a bool array value

func (Value) AsFloat

func (v Value) AsFloat() (float64, error)

func (Value) AsFloatArray

func (v Value) AsFloatArray() ([]float64, error)

AsFloatArray decodes a float64 array value

func (Value) AsInt

func (v Value) AsInt() (int64, error)

func (Value) AsIntArray

func (v Value) AsIntArray() ([]int64, error)

AsIntArray decodes an int64 array value

func (Value) AsJSON added in v0.5.0

func (v Value) AsJSON() (any, error)

AsJSON unmarshals a TypeJSON value back to its original Go shape (nil for JSON null, map[string]any for objects, []any for arrays). Returns an error for non-TypeJSON values so callers can't accidentally treat a typed value as JSON.

func (Value) AsString

func (v Value) AsString() (string, error)

Decode methods

func (Value) AsStringArray

func (v Value) AsStringArray() ([]string, error)

AsStringArray decodes a string array value

func (Value) AsTimestamp

func (v Value) AsTimestamp() (time.Time, error)

func (Value) AsVector

func (v Value) AsVector() ([]float32, error)

func (Value) String

func (v Value) String() string

String returns a string representation of the Value for comparison and display purposes. This method implements the fmt.Stringer interface.

type ValueType

type ValueType uint8

ValueType represents the type of a property value

const (
	TypeString ValueType = iota
	TypeInt
	TypeFloat
	TypeBool
	TypeBytes
	TypeTimestamp
	TypeVector // Vector of float32 for embeddings/vector search

	// Array types
	TypeStringArray
	TypeIntArray
	TypeFloatArray
	TypeBoolArray

	// TypeJSON stores the marshalled JSON encoding of a value the typed
	// enum above can't represent: null, objects, nested structures, and
	// mixed/empty arrays (#224). Data holds the JSON bytes; AsJSON
	// unmarshals them back to the original shape so the value round-trips
	// instead of being stringified via %v ("<nil>", "map[]").
	//
	// MUST stay last in the enum: ValueType is persisted as its numeric
	// iota, so appending here leaves every existing on-disk value's tag
	// unchanged. Do not insert a type above this line without a snapshot
	// version bump.
	TypeJSON
)

type VectorIndex

type VectorIndex struct {
	// contains filtered or unexported fields
}

VectorIndex manages vector search indexes for node properties, partitioned by tenant.

The outer map key is tenantid.TenantID; the inner key is propertyName. Per-tenant outer partitioning means a tenant's index cannot be observed (searched, listed, has-checked) by a request scoped to a different tenant — the existence-leak channel that motivated the F4 redesign. See docs/internals/design/F4_VECTOR_TENANT_REDESIGN.md.

The tenant-blind public methods (CreateIndex, Search, etc.) are preserved for backwards compatibility; each delegates to its *ForTenant counterpart with tenantid.Default. The strict *ForTenant methods reject the empty tenantID — they are the path R1.2's GraphStorage.*VectorIndexForTenant wrappers route through.

func NewVectorIndex

func NewVectorIndex() *VectorIndex

NewVectorIndex creates a new vector index manager.

func (*VectorIndex) AddVector

func (vi *VectorIndex) AddVector(propertyName string, nodeID uint64, vec []float32) error

AddVector adds a vector to the default-tenant index. Equivalent to AddVectorForTenant(tenantid.Default, ...).

func (*VectorIndex) AddVectorForTenant

func (vi *VectorIndex) AddVectorForTenant(
	tenantID tenantid.TenantID,
	propertyName string,
	nodeID uint64,
	vec []float32,
) error

AddVectorForTenant adds a vector to (tenantID, propertyName)'s index.

func (*VectorIndex) CreateIndex

func (vi *VectorIndex) CreateIndex(
	propertyName string,
	dimensions int,
	m int,
	efConstruction int,
	metric vector.DistanceMetric,
) error

CreateIndex creates a new HNSW index for a property under the default tenant. Equivalent to CreateIndexForTenant(tenantid.Default, ...).

func (*VectorIndex) CreateIndexForTenant

func (vi *VectorIndex) CreateIndexForTenant(
	tenantID tenantid.TenantID,
	propertyName string,
	dimensions int,
	m int,
	efConstruction int,
	metric vector.DistanceMetric,
) error

CreateIndexForTenant creates a new HNSW index for a property under the given tenant. Returns an error if tenantID is empty or an index already exists for (tenantID, propertyName).

func (*VectorIndex) DimensionsForTenant

func (vi *VectorIndex) DimensionsForTenant(tenantID tenantid.TenantID, propertyName string) (int, bool)

DimensionsForTenant returns the configured vector dimension of (tenantID, propertyName)'s index and a presence flag. Used to validate an incoming vector's length cheaply (no graph traversal) before the expensive off-lock HNSW insert (Track P item 3 / H2): the storage write path decodes + dimension-checks under gs.mu, then runs Insert after releasing it.

func (*VectorIndex) DropIndex

func (vi *VectorIndex) DropIndex(propertyName string) error

DropIndex removes the default-tenant's index for propertyName. Equivalent to DropIndexForTenant(tenantid.Default, ...).

func (*VectorIndex) DropIndexForTenant

func (vi *VectorIndex) DropIndexForTenant(tenantID tenantid.TenantID, propertyName string) error

DropIndexForTenant removes (tenantID, propertyName)'s index.

func (*VectorIndex) GetIndexMetric

func (vi *VectorIndex) GetIndexMetric(propertyName string) (vector.DistanceMetric, error)

GetIndexMetric returns the distance metric for the default-tenant's index on propertyName. Equivalent to GetIndexMetricForTenant(tenantid.Default, ...).

func (*VectorIndex) GetIndexMetricForTenant

func (vi *VectorIndex) GetIndexMetricForTenant(
	tenantID tenantid.TenantID,
	propertyName string,
) (vector.DistanceMetric, error)

GetIndexMetricForTenant returns the distance metric for (tenantID, propertyName)'s index.

func (*VectorIndex) HasAnyIndex

func (vi *VectorIndex) HasAnyIndex() bool

HasAnyIndex reports whether any vector index exists across all tenants. A cheap O(1) guard (one map-length read) so write paths can skip vector bookkeeping entirely on graphs that use no vector search.

func (*VectorIndex) HasIndex

func (vi *VectorIndex) HasIndex(propertyName string) bool

HasIndex reports whether the default-tenant has an index for propertyName. Equivalent to HasIndexForTenant(tenantid.Default, ...).

func (*VectorIndex) HasIndexForTenant

func (vi *VectorIndex) HasIndexForTenant(tenantID tenantid.TenantID, propertyName string) bool

HasIndexForTenant reports whether (tenantID, propertyName) has an index. Returns false (not an error) for both "no such tenant" and "no index on this property" — the unified false response prevents tenant-existence probing.

func (*VectorIndex) IndexDefinitions

func (vi *VectorIndex) IndexDefinitions() []VectorIndexDef

IndexDefinitions returns the definition of every vector index across all tenants, for snapshot persistence. Order is unspecified.

func (*VectorIndex) ListIndexes

func (vi *VectorIndex) ListIndexes() []string

ListIndexes returns the default-tenant's indexed property names. Equivalent to ListIndexesForTenant(tenantid.Default).

func (*VectorIndex) ListIndexesForTenant

func (vi *VectorIndex) ListIndexesForTenant(tenantID tenantid.TenantID) []string

ListIndexesForTenant returns tenantID's indexed property names. Returns an empty slice (not nil) for tenants with no indexes.

func (*VectorIndex) RemoveVector

func (vi *VectorIndex) RemoveVector(propertyName string, nodeID uint64) error

RemoveVector removes a vector from the default-tenant index. Equivalent to RemoveVectorForTenant(tenantid.Default, ...).

func (*VectorIndex) RemoveVectorForTenant

func (vi *VectorIndex) RemoveVectorForTenant(
	tenantID tenantid.TenantID,
	propertyName string,
	nodeID uint64,
) error

RemoveVectorForTenant removes nodeID's vector from (tenantID, propertyName)'s index.

func (*VectorIndex) Search

func (vi *VectorIndex) Search(
	propertyName string,
	query []float32,
	k int,
	ef int,
) ([]vector.SearchResult, error)

Search performs k-NN search on the default-tenant index. Equivalent to SearchForTenant(tenantid.Default, ...).

func (*VectorIndex) SearchForTenant

func (vi *VectorIndex) SearchForTenant(
	tenantID tenantid.TenantID,
	propertyName string,
	query []float32,
	k int,
	ef int,
) ([]vector.SearchResult, error)

SearchForTenant performs k-NN search on (tenantID, propertyName)'s index.

type VectorIndexDef

type VectorIndexDef struct {
	TenantID       string
	PropertyName   string
	Dimensions     int
	M              int
	EfConstruction int
	Metric         vector.DistanceMetric
}

VectorIndexDef is a serializable description of one vector index — the construction parameters needed to recreate it on restart. The HNSW graph itself is NOT persisted; it is rebuilt from the node set on load (see GraphStorage.rebuildVectorIndexesFromNodes).

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL