config

package
v0.1.7
This package is not in the latest version of its module.
Published: May 20, 2026 License: MIT Imports: 9 Imported by: 0

Documentation

Overview

Package config defines Orca's YAML configuration shape and loading helpers.

The schema is an intentional subset of the full Orca configuration surface; extending it later is a matter of adding fields and keeping zero-values backward-compatible.

Constants

This section is empty.

Variables

This section is empty.

Functions

func ParseLogLevel

func ParseLogLevel(s string) (slog.Level, error)

ParseLogLevel maps an Orca log-level string to a slog.Level and returns an error for unknown values. The empty string is treated as the default ("info"). It is used both by config.validate at YAML parse time and by the cmd/orca entrypoint to honour the ORCA_LOG_LEVEL environment override.
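The mapping described above can be sketched with the standard library alone. This is a minimal illustration, not the package's implementation: parseLevel is a hypothetical stand-in, and exact-case matching is an assumption because the documentation does not say how case is handled.

```go
package main

import (
	"fmt"
	"log/slog"
)

// parseLevel is a hypothetical stand-in for ParseLogLevel. It maps the four
// documented level names to slog levels and treats "" as "info".
// Assumption: names are matched exactly (case-sensitive).
func parseLevel(s string) (slog.Level, error) {
	switch s {
	case "", "info":
		return slog.LevelInfo, nil
	case "debug":
		return slog.LevelDebug, nil
	case "warn":
		return slog.LevelWarn, nil
	case "error":
		return slog.LevelError, nil
	default:
		return slog.LevelInfo, fmt.Errorf("unknown log level %q", s)
	}
}

func main() {
	for _, s := range []string{"", "debug", "verbose"} {
		lvl, err := parseLevel(s)
		fmt.Printf("%q -> level=%v err=%v\n", s, lvl, err)
	}
}
```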

Types

type AWSS3

type AWSS3 struct {
	Endpoint     string `yaml:"endpoint"` // empty for real AWS S3
	Region       string `yaml:"region"`
	Bucket       string `yaml:"bucket"`
	AccessKey    string `yaml:"access_key"`
	SecretKey    string `yaml:"secret_key"`
	UsePathStyle bool   `yaml:"use_path_style"` // true for LocalStack
}

AWSS3 is the awss3 origin adapter configuration. In dev this points at LocalStack alongside the cachestore (different bucket); in production it points at real AWS S3 with no Endpoint override.

type Azureblob

type Azureblob struct {
	Account    string `yaml:"account"`
	AccountKey string `yaml:"account_key"`
	Container  string `yaml:"container"`

	// Endpoint, when set, overrides the default Azure Blob service URL
	// (https://<account>.blob.core.windows.net/). Used in dev to point
	// at Azurite (http://azurite:10000/devstoreaccount1) so the
	// azureblob driver path can be exercised without a real Azure
	// account.
	Endpoint string `yaml:"endpoint"`
}

Azureblob is the azureblob origin adapter configuration.

Page and Append blobs are unconditionally rejected at Head: their random-access mutation model is incompatible with the chunked, immutable cache contract Orca relies on. There is no configuration switch for this behaviour.

type ByteSize

type ByteSize int64

ByteSize is an int64 byte count with a YAML unmarshal hook that accepts either a numeric scalar (legacy form: `size: 8388608`) or a human-readable string scalar (`size: 8 MiB`, `size: 1.5 GiB`, `size: 128MiB`, `size: 1 GB`).

SI suffixes (KB, MB, GB, TB, PB) are decimal multipliers (powers of ten); IEC suffixes (KiB, MiB, GiB, TiB, PiB) are binary multipliers (powers of two). This matches the convention used by Kubernetes resource quantities, most container tooling, and the IEC standard. Operators who mean exactly 1 048 576 bytes should write "1 MiB"; "1 MB" is 1 000 000.

Fractional values are allowed and truncated by the underlying parser ("1.5 GiB" -> 1 610 612 736). Negative values, NaN, overflow above int64 max, and empty / whitespace-only scalars are rejected at unmarshal time with a message tagged with the YAML line number for ease of locating the offending entry.

The zero value is 0, which applyDefaults treats as "field omitted" for fields that have a default fallback (e.g. Chunking.Size).
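To make the SI-versus-IEC rules concrete, here is a standard-library sketch of the parsing behaviour described above. It is an assumption-laden illustration: parseByteSize and the units table are hypothetical, the real hook delegates to humanize.ParseBytes and tags errors with the YAML line number, and this sketch matches suffixes only as they are spelled in this documentation.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
	"strings"
)

// units maps a suffix to its multiplier: SI suffixes are powers of ten,
// IEC suffixes are powers of two. A bare number is a byte count.
var units = map[string]float64{
	"": 1, "B": 1,
	"KB": 1e3, "MB": 1e6, "GB": 1e9, "TB": 1e12, "PB": 1e15,
	"KiB": 1 << 10, "MiB": 1 << 20, "GiB": 1 << 30, "TiB": 1 << 40, "PiB": 1 << 50,
}

// parseByteSize is a hypothetical sketch of the accepted forms.
func parseByteSize(s string) (int64, error) {
	s = strings.TrimSpace(s)
	if s == "" {
		return 0, fmt.Errorf("bytesize: empty value")
	}
	// Split at the first character that cannot belong to a decimal number.
	i := strings.IndexFunc(s, func(r rune) bool {
		return !(r >= '0' && r <= '9') && r != '.' && r != '-' && r != '+'
	})
	num, unit := s, ""
	if i >= 0 {
		num, unit = s[:i], strings.TrimSpace(s[i:])
	}
	f, err := strconv.ParseFloat(num, 64)
	if err != nil || math.IsNaN(f) {
		return 0, fmt.Errorf("bytesize: invalid number in %q", s)
	}
	if f < 0 {
		return 0, fmt.Errorf("bytesize: negative value %q", s)
	}
	mult, ok := units[unit]
	if !ok {
		return 0, fmt.Errorf("bytesize: unknown unit %q", unit)
	}
	total := f * mult
	if total >= math.MaxInt64 {
		return 0, fmt.Errorf("bytesize: %q overflows int64", s)
	}
	return int64(total), nil // drops any fractional byte
}

func main() {
	for _, s := range []string{"8388608", "8 MiB", "1.5 GiB", "1 MB", "-1 MiB"} {
		n, err := parseByteSize(s)
		fmt.Printf("%q -> %d (err: %v)\n", s, n, err)
	}
}
```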

func (ByteSize) Int64

func (b ByteSize) Int64() int64

Int64 returns the raw byte count as an int64. Provided as an explicit accessor so callsites that hand the value to int64-typed APIs (chunk.SizeFor, chunk.Tier.ChunkSize) read naturally without scattered int64(...) casts.

func (ByteSize) String

func (b ByteSize) String() string

String renders the byte count using IEC units (e.g. "8.0 MiB", "1.5 GiB"). Used in validation error messages so operators see the offending value in human-friendly units regardless of how it was written in YAML.
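IEC rendering of this kind is usually a short loop over powers of 1024. The sketch below is one common way to produce strings such as "8.0 MiB"; formatIEC is a hypothetical helper, and how the real method renders values below 1 KiB is an assumption.

```go
package main

import "fmt"

// formatIEC renders n using binary (IEC) units with one decimal place.
// Assumption: values below 1024 are printed as plain bytes.
func formatIEC(n int64) string {
	const unit = 1024
	if n < unit {
		return fmt.Sprintf("%d B", n)
	}
	div, exp := int64(unit), 0
	for v := n / unit; v >= unit; v /= unit {
		div *= unit
		exp++
	}
	return fmt.Sprintf("%.1f %ciB", float64(n)/float64(div), "KMGTPE"[exp])
}

func main() {
	fmt.Println(formatIEC(8388608))    // 8.0 MiB
	fmt.Println(formatIEC(1610612736)) // 1.5 GiB
}
```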

func (*ByteSize) UnmarshalYAML

func (b *ByteSize) UnmarshalYAML(value *yaml.Node) error

UnmarshalYAML implements yaml.Unmarshaler. The accepted forms are described on ByteSize. The function trims surrounding whitespace and rejects negatives up front so the operator sees a bytesize-flavored error rather than humanize.ParseBytes's less-specific "unhandled size name" surface.

type Cachestore

type Cachestore struct {
	Driver string       `yaml:"driver"` // "s3" in v1
	S3     CachestoreS3 `yaml:"s3"`
}

Cachestore is the in-DC chunk store configuration.

type CachestoreS3

type CachestoreS3 struct {
	Endpoint     string `yaml:"endpoint"`
	Bucket       string `yaml:"bucket"`
	Region       string `yaml:"region"`
	AccessKey    string `yaml:"access_key"`
	SecretKey    string `yaml:"secret_key"`
	UsePathStyle bool   `yaml:"use_path_style"` // true for LocalStack
}

CachestoreS3 is the s3 driver configuration. In dev this points at LocalStack; in production at VAST or another in-DC S3-compatible store.

Bucket versioning is unconditionally validated at startup: a versioned bucket silently breaks the no-clobber atomic-commit primitive (PutObject + If-None-Match: *) the driver depends on. There is no configuration switch for this gate.

type ChunkCatalog

type ChunkCatalog struct {
	MaxEntries int `yaml:"max_entries"`
}

ChunkCatalog is the in-memory chunk-presence cache configuration.

type ChunkTier

type ChunkTier struct {
	MinObjectSize ByteSize `yaml:"min_object_size"`
	ChunkSize     ByteSize `yaml:"chunk_size"`
}

ChunkTier is one entry in the Chunking.Tiers ladder. Objects whose size is at or above MinObjectSize use ChunkSize, unless a higher-threshold tier also matches (in which case the higher tier wins). Both fields must be > 0; ChunkSize must be >= 1 MiB (the floor that applies to Chunking.Size as well). Both fields accept the same numeric-or-human-readable forms as Chunking.Size; see ByteSize.
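The per-tier constraints, together with the strictly-ascending rule documented on Chunking, could be checked along the following lines. This is a sketch under stated assumptions: the tier type, validateTiers, and the error wording are hypothetical, and plain int64 stands in for ByteSize.

```go
package main

import "fmt"

// tier mirrors ChunkTier with plain int64 fields for illustration.
type tier struct {
	MinObjectSize int64
	ChunkSize     int64
}

const minChunkSize = 1 << 20 // the documented 1 MiB floor

// validateTiers is a hypothetical check of the documented rules: both fields
// positive, ChunkSize >= 1 MiB, and MinObjectSize strictly ascending.
func validateTiers(tiers []tier) error {
	for i, t := range tiers {
		if t.MinObjectSize <= 0 || t.ChunkSize <= 0 {
			return fmt.Errorf("tiers[%d]: min_object_size and chunk_size must be > 0", i)
		}
		if t.ChunkSize < minChunkSize {
			return fmt.Errorf("tiers[%d]: chunk_size must be >= 1 MiB", i)
		}
		if i > 0 && t.MinObjectSize <= tiers[i-1].MinObjectSize {
			return fmt.Errorf("tiers[%d]: min_object_size must be strictly ascending", i)
		}
	}
	return nil
}

func main() {
	ok := []tier{{1 << 30, 32 << 20}, {16 << 30, 128 << 20}}
	bad := []tier{{16 << 30, 128 << 20}, {1 << 30, 32 << 20}}
	fmt.Println(validateTiers(ok))
	fmt.Println(validateTiers(bad))
}
```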

type Chunking

type Chunking struct {
	// Size is the base chunk size used for objects smaller than the
	// smallest tier threshold. Accepts a numeric byte count
	// (`size: 8388608`) or a human-readable string
	// (`size: 8 MiB`, `size: 1 GB`); see ByteSize for the accepted
	// units and SI-vs-IEC semantics.
	Size      ByteSize    `yaml:"size"` // default 8 MiB
	Tiers     []ChunkTier `yaml:"tiers"`
	Readahead *int        `yaml:"readahead"`
}

Chunking governs chunk size and read-ahead for client GETs.

Size is the base chunk size used for objects smaller than the smallest Tier threshold. Tiers, if non-empty, override Size for objects at or above each tier's MinObjectSize: the tier with the largest threshold <= the object's size wins. Tiers must be strictly ascending by MinObjectSize; the loader enforces this at validate time so the runtime selection path can assume sorted input.
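The selection rule ("the tier with the largest threshold <= the object's size wins") reduces to a single pass over a sorted ladder. The following is an illustrative sketch only: chunkSizeFor and the tier values are hypothetical and are not the signature of chunk.SizeFor.

```go
package main

import "fmt"

// tier mirrors ChunkTier with plain int64 fields for illustration.
type tier struct {
	MinObjectSize int64
	ChunkSize     int64
}

const (
	mib = 1 << 20
	gib = 1 << 30
)

// chunkSizeFor returns base for objects below the smallest threshold,
// otherwise the ChunkSize of the highest tier whose threshold is <= objectSize.
// It assumes tiers is sorted strictly ascending by MinObjectSize.
func chunkSizeFor(objectSize, base int64, tiers []tier) int64 {
	size := base
	for _, t := range tiers {
		if objectSize < t.MinObjectSize {
			break // sorted input: no later tier can match
		}
		size = t.ChunkSize
	}
	return size
}

func main() {
	// Hypothetical ladder; these are not Orca's defaults.
	tiers := []tier{{1 * gib, 32 * mib}, {16 * gib, 128 * mib}}
	for _, obj := range []int64{512 * mib, 1 * gib, 20 * gib} {
		fmt.Printf("object=%d chunk=%d\n", obj, chunkSizeFor(obj, 8*mib, tiers))
	}
}
```

Because the ladder is validated at load time, the loop can stop at the first threshold the object does not reach.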

Readahead is the number of chunks the client-edge GET handler prefetches while streaming the current chunk to the client. It is a pointer so the loader can distinguish an omitted YAML field (defaults to 8) from an explicit "readahead: 0" (disables read-ahead and restores the strictly-sequential chunk-fetch behavior). The cost is bounded by readahead * effective_chunk_size of extra in-flight cachestore body buffers per concurrent GET; cold-fill speculation is additionally bounded by the per-replica origin semaphore (target_per_replica), so peak per-replica cold-buffer memory is at most:

target_per_replica * max(Size, max ChunkSize across Tiers)

With the defaults (Size=8 MiB, Tiers up to 128 MiB, 4 replicas at target_global=64), the per-replica ceiling is 16 * 128 MiB = 2 GiB. Operators with tighter memory budgets should lower the highest tier's ChunkSize or drop the largest-object tier entirely.
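The arithmetic in the worked example above can be reproduced with a few lines. This sketch assumes target_per_replica is the integer quotient of target_global and target_replicas; coldBufferCeiling is a hypothetical helper, and the 32 MiB middle tier is an illustrative value.

```go
package main

import "fmt"

// coldBufferCeiling computes target_per_replica * max(Size, max tier ChunkSize).
// Assumption: target_per_replica = targetGlobal / targetReplicas (integer division).
func coldBufferCeiling(targetGlobal, targetReplicas int, size int64, tierChunkSizes []int64) int64 {
	if targetReplicas <= 0 {
		return 0 // guard for the sketch; real validation is out of scope here
	}
	perReplica := targetGlobal / targetReplicas
	largest := size
	for _, c := range tierChunkSizes {
		if c > largest {
			largest = c
		}
	}
	return int64(perReplica) * largest
}

func main() {
	const mib = 1 << 20
	// Values from the text: Size=8 MiB, largest tier 128 MiB, 4 replicas, target_global=64.
	ceiling := coldBufferCeiling(64, 4, 8*mib, []int64{32 * mib, 128 * mib})
	fmt.Println(ceiling, "bytes =", ceiling/(1<<30), "GiB") // 2147483648 bytes = 2 GiB
}
```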

func (Chunking) AsChunkTiers

func (c Chunking) AsChunkTiers() []chunk.Tier

AsChunkTiers returns the configured tier ladder as a []chunk.Tier slice suitable for chunk.SizeFor. Returns nil for an empty list. The slice is in the validated ascending-MinObjectSize order.

func (Chunking) ReadaheadDepth

func (c Chunking) ReadaheadDepth() int

ReadaheadDepth returns the configured read-ahead depth. A nil pointer (YAML omitted) returns 0; applyDefaults populates the default-on value so configurations that loaded through Load always have a non-nil pointer. Callers that bypass Load (e.g. hand-constructed test configs) get 0 for nil, which matches the "feature disabled" semantics.
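The pointer-based distinction between "omitted" and "explicit zero" can be illustrated as follows. The type and method names here are hypothetical lowercase stand-ins for Chunking, applyDefaults, and ReadaheadDepth; only the default of 8 and the nil-returns-0 behaviour come from the text.

```go
package main

import "fmt"

// chunking keeps only the field relevant to this sketch.
type chunking struct {
	Readahead *int
}

const defaultReadahead = 8 // documented default when the YAML field is omitted

// applyDefaults fills in the default only when the field was omitted (nil).
// An explicit 0 is a non-nil pointer and is left alone.
func (c *chunking) applyDefaults() {
	if c.Readahead == nil {
		d := defaultReadahead
		c.Readahead = &d
	}
}

// readaheadDepth returns 0 for nil, matching the "feature disabled" semantics.
func (c chunking) readaheadDepth() int {
	if c.Readahead == nil {
		return 0
	}
	return *c.Readahead
}

func main() {
	var omitted chunking
	fmt.Println(omitted.readaheadDepth()) // 0: hand-built config, defaults not applied
	omitted.applyDefaults()
	fmt.Println(omitted.readaheadDepth()) // 8: default populated

	zero := 0
	explicit := chunking{Readahead: &zero}
	explicit.applyDefaults()
	fmt.Println(explicit.readaheadDepth()) // 0: explicit "readahead: 0" is preserved
}
```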

type Cluster

type Cluster struct {
	Service           string        `yaml:"service"`            // headless Service FQDN
	MembershipRefresh time.Duration `yaml:"membership_refresh"` // DNS poll interval
	InternalListen    string        `yaml:"internal_listen"`
	InternalTLS       InternalTLS   `yaml:"internal_tls"`
	TargetReplicas    int           `yaml:"target_replicas"`
	SelfPodIP         string        `yaml:"self_pod_ip"` // resolved from POD_IP env
}

Cluster captures peer discovery + internal-listener configuration.

type Config

type Config struct {
	Server       Server       `yaml:"server"`
	Origin       Origin       `yaml:"origin"`
	Cachestore   Cachestore   `yaml:"cachestore"`
	Cluster      Cluster      `yaml:"cluster"`
	ChunkCatalog ChunkCatalog `yaml:"chunk_catalog"`
	Metadata     Metadata     `yaml:"metadata"`
	Chunking     Chunking     `yaml:"chunking"`
	Logging      Logging      `yaml:"logging"`
}

Config is the top-level Orca configuration.

func Load

func Load(path string) (*Config, error)

Load reads the YAML config file at path and returns a populated Config. Defaults are applied to fields left at their zero value.

func (*Config) TargetPerReplica

func (c *Config) TargetPerReplica() int

TargetPerReplica returns the per-replica origin concurrency cap derived from origin.target_global divided by cluster.target_replicas. This bounds the number of concurrent in-flight origin requests this replica will issue.

type InternalTLS

type InternalTLS struct {
	Enabled    bool   `yaml:"enabled"`
	CertFile   string `yaml:"cert_file"`
	KeyFile    string `yaml:"key_file"`
	CAFile     string `yaml:"ca_file"`
	ServerName string `yaml:"server_name"`
}

InternalTLS governs the internal-listener mTLS posture.

Production: enabled=true (mTLS required). Dev: enabled=false (plain HTTP/2), in which case the binary logs a WARN at startup.

type Logging

type Logging struct {
	// Level is one of "debug", "info", "warn", "error". Empty
	// defaults to "info".
	Level string `yaml:"level"`
}

Logging governs structured-log output. The level controls slog emission filtering; debug surfaces per-request and per-chunk tracing through the fetch coordinator, metadata cache, chunk catalog, cluster, cachestore, and origin drivers.

The ORCA_LOG_LEVEL environment variable, if set and non-empty, overrides the YAML-configured Level at process startup. Useful for one-shot debug sessions without re-rendering the configmap.
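The precedence between the environment variable and the YAML value might be resolved roughly as below. This is a sketch, not the cmd/orca code: effectiveLevel is a hypothetical helper, and the lookup function is injected so the logic can be exercised without touching the real environment.

```go
package main

import (
	"fmt"
	"os"
)

// effectiveLevel returns the level name to use at startup. A set, non-empty
// ORCA_LOG_LEVEL wins; otherwise the YAML value is used, with "" meaning "info".
func effectiveLevel(yamlLevel string, lookup func(string) (string, bool)) string {
	if v, ok := lookup("ORCA_LOG_LEVEL"); ok && v != "" {
		return v
	}
	if yamlLevel == "" {
		return "info"
	}
	return yamlLevel
}

func main() {
	// os.LookupEnv reports whether the variable is set at all.
	fmt.Println(effectiveLevel("warn", os.LookupEnv))
}
```

The returned string would still need to pass through ParseLogLevel, so an invalid override is rejected the same way as an invalid YAML value.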

type Metadata

type Metadata struct {
	TTL         time.Duration `yaml:"ttl"`
	NegativeTTL time.Duration `yaml:"negative_ttl"`
	MaxEntries  int           `yaml:"max_entries"`
}

Metadata is the object-metadata cache configuration.

type Origin

type Origin struct {
	ID           string        `yaml:"id"`
	Driver       string        `yaml:"driver"` // "azureblob" or "awss3"
	TargetGlobal int           `yaml:"target_global"`
	QueueTimeout time.Duration `yaml:"queue_timeout"`
	Retry        OriginRetry   `yaml:"retry"`
	Azureblob    Azureblob     `yaml:"azureblob"`
	AWSS3        AWSS3         `yaml:"awss3"`
}

Origin describes the upstream origin (Azure Blob or AWS S3 in v1).

type OriginRetry

type OriginRetry struct {
	Attempts         int           `yaml:"attempts"`
	BackoffInitial   time.Duration `yaml:"backoff_initial"`
	BackoffMax       time.Duration `yaml:"backoff_max"`
	MaxTotalDuration time.Duration `yaml:"max_total_duration"`
}

OriginRetry captures the leader-side pre-header retry budget.
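The documentation does not state how the four fields interact, so the following is purely an assumed interpretation: Attempts counts total tries, delays double from BackoffInitial and are capped at BackoffMax, and retries stop once cumulative waiting would exceed MaxTotalDuration. backoffSchedule is a hypothetical helper that shows one way such a budget could be laid out.

```go
package main

import (
	"fmt"
	"time"
)

// backoffSchedule returns the waits between attempts under the assumptions
// above: exponential doubling, a per-wait cap, and a cumulative-wait budget.
func backoffSchedule(attempts int, initial, maxDelay, total time.Duration) []time.Duration {
	var waits []time.Duration
	var elapsed time.Duration
	delay := initial
	for i := 1; i < attempts; i++ { // attempts tries means attempts-1 waits
		if delay > maxDelay {
			delay = maxDelay
		}
		if elapsed+delay > total {
			break // the next wait would exceed the total budget
		}
		waits = append(waits, delay)
		elapsed += delay
		delay *= 2
	}
	return waits
}

func main() {
	// Hypothetical values; these are not Orca's defaults.
	fmt.Println(backoffSchedule(5, 100*time.Millisecond, 500*time.Millisecond, 2*time.Second))
}
```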

type Server

type Server struct {
	Listen string     `yaml:"listen"`
	Auth   ServerAuth `yaml:"auth"`

	// OpsListen is the bind address for the operations endpoint
	// hosting /healthz and /readyz. Plain HTTP, no auth. Kubelet
	// liveness and readiness probes target this address; production
	// Service objects do not forward this port externally.
	OpsListen string `yaml:"ops_listen"`
}

Server holds the client-edge listener configuration plus the ops listener used for kubelet probes (/healthz and /readyz).

type ServerAuth

type ServerAuth struct {
	Enabled          bool   `yaml:"enabled"`
	Mode             string `yaml:"mode"`
	BearerSecretFile string `yaml:"bearer_secret_file"`
}

ServerAuth governs the client-edge authentication path.

Production: enabled=true with mode=bearer or mode=mtls. Dev: enabled=false disables authentication entirely (no token or client cert required). This is a single security knob, not a dev_mode flag.
