swarm

package
v0.7.1 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: May 2, 2026 License: Apache-2.0 Imports: 16 Imported by: 0

Documentation

Overview

Package swarm holds the per-slot session record schema and the JetStream-KV-backed Manager that bones swarm verbs use to track active swarm sessions in a workspace.

State lives in a single KV bucket (DefaultBucketName) keyed by slot name. ADR 0028 §"Lifecycle and state" explains why session state went to KV instead of a per-slot state.json on disk: every field in this struct is either already authoritative somewhere else in JetStream KV (task_id in bones-tasks, agent_id in bones-presence, the claim hold in bones-holds) or describes the host-local OS process owning the leaf (host, leaf_pid). Putting them in one bucket gives cross-host visibility and a single source of truth for `bones swarm status` and `bones doctor` without re-deriving state from N substrate buckets per call.

The Manager is intentionally simpler than internal/presence: there is no background heartbeat goroutine. Heartbeats are driven by agent-process calls to `bones swarm commit`, which extends TTL via CAS. A slot whose agent has crashed stops heartbeating and the bucket TTL evicts the entry — exactly the surface `bones doctor` is meant to see.

Index

Constants

View Source
const DefaultBucketName = "bones-swarm-sessions"

DefaultBucketName is the JetStream KV bucket holding swarm sessions. The name is part of the substrate contract (parallel to bones-tasks, bones-holds, bones-presence) and SHOULD NOT change without a substrate-evolution ADR.

View Source
const DefaultCaps = "oih"

DefaultCaps is the fossil capability string granted to a slot user when the caller doesn't override. Matches cli/swarm_join.go.

View Source
const DefaultHubFossilURL = "http://127.0.0.1:8765"

DefaultHubFossilURL is the hub fossil HTTP URL bones writes when `bones up` brings a hub to default ports. Mirrors the constant in cli/swarm_join.go so Acquire callers don't have to plumb the URL through the CLI flag layer.

View Source
const DefaultTTL = 5 * time.Minute

DefaultTTL is the JetStream KV bucket TTL. Sessions are heartbeated via `bones swarm commit`; a slot whose agent has crashed stops renewing and the bucket evicts the entry after this interval. Five minutes mirrors ADR 0028's "stale after 5min no renewal" rule.

View Source
const EventsFile = ".bones/swarm-events.jsonl"

EventsFile is the workspace-relative path of the structured swarm-lifecycle event log. Living next to the workspace leaf log keeps the audit-shaped data in one place; the dedicated filename makes the purpose obvious to operators tailing it.

Path is `<workspace>/.bones/swarm-events.jsonl`. JSONL (one JSON object per line) is the right shape for `tail -f`, `grep`, and streaming-friendly tooling. Each line is small (~150 bytes) so POSIX-atomic appends from concurrent slot processes don't tear.

Variables

View Source
var ErrCASConflict = errors.New("swarm: CAS conflict")

ErrCASConflict reports that a CAS-gated update or delete saw a revision mismatch — another writer raced ours. Mirrors internal/tasks.ErrCASConflict so callers can react identically.

View Source
var ErrCloseRequiresArtifact = errors.New(
	"swarm: close --result=success refused — no commit landed since join; " +
		"either commit the slot's artifact first or pass --no-artifact=<reason>",
)

ErrCloseRequiresArtifact is returned by ResumedLease.Close when the caller asked for CloseTaskOnSuccess but no commit has landed on the slot since join (LastRenewed == StartedAt on the session record). The substrate refuses the silent-bypass shape — closing success without producing an artifact severs the audit trail bones is supposed to preserve. Callers that have a legitimate reason to close success without a commit (e.g. a research subagent that returned findings inline) must pass CloseOpts.NoArtifact with an explicit reason; the reason is recorded so the post-mortem question "why did this slot leave no commit?" has a documented answer.

View Source
var ErrClosed = errors.New("swarm: sessions handle is closed")

ErrClosed reports that a public method was called on a Sessions whose Close has returned. Parallel to internal/presence.ErrClosed.

View Source
var ErrCrossHostOperation = errors.New(
	"swarm: cross-host operation refused — run the verb on the slot's owning host",
)

ErrCrossHostOperation is returned by Resume when the session record's Host field doesn't match this machine's hostname. Cross-host commit/close/status operations would manipulate a leaf process bones can't reach; the right answer is for the operator to run the verb on the owning host.

View Source
var ErrNotFound = errors.New("swarm: session not found")

ErrNotFound reports that the requested slot has no live session record in the bucket. Distinct from a substrate error so callers (swarm status, swarm commit) can distinguish "no session yet" from "NATS unreachable."

View Source
var ErrSessionAlreadyLive = errors.New(
	"swarm: live session already on this slot — pass --force to take over",
)

ErrSessionAlreadyLive is returned by Acquire when a live session record exists for the slot on this host and ForceTakeover is false. Callers that want to overwrite must set AcquireOpts.ForceTakeover.

View Source
var ErrSessionForeignHost = errors.New(
	"swarm: session owned by another host — refusing cross-host takeover",
)

ErrSessionForeignHost is returned by Acquire when the existing session record was written by a different host. Cross-host takeover is refused unconditionally; the operator must run the takeover from the owning host.

View Source
var ErrSessionGone = errors.New(
	"swarm: session record was deleted concurrently",
)

ErrSessionGone is returned by ResumedLease.Commit and ResumedLease.Close when the session record was deleted between Resume and the operation — usually because a concurrent close on another agent ran. Distinct from ErrSessionNotFound so callers can distinguish "never existed" from "deleted under us mid-flight".

View Source
var ErrSessionNotFound = errors.New(
	"swarm: session record not found — run `bones swarm join` first",
)

ErrSessionNotFound is returned by Resume when no session record exists for the slot. Distinct from swarm.ErrNotFound so callers can disambiguate "no record yet" from "Sessions handle closed".

View Source
var ErrWorkspaceNotBootstrapped = errors.New(
	"swarm: workspace not bootstrapped — the orchestrator must " +
		"bring up the hub before leaves can join; refusing to " +
		"bootstrap from a leaf context",
)

ErrWorkspaceNotBootstrapped is returned by Acquire when `<workspace>/.bones/hub.fossil` is missing. The role-leak guard from PR #54 lives here: a leaf must NEVER read the error as "run `bones up`" — bootstrap is the orchestrator's job. The error string deliberately omits the `bones up` phrase so a subagent reading stderr can't be misled into bootstrapping its own context.

Functions

func SlotDir

func SlotDir(workspaceDir, slot string) string

SlotDir returns the on-disk directory for the slot's per-leaf state under the workspace root. Layout:

<workspace>/.bones/swarm/<slot>/
├── leaf.fossil          libfossil repo (cloned from hub)
├── leaf.pid             host-local PID tracker
└── wt/                  worktree (Leaf.WT() result)

Pure path derivation; no KV lookup. Callers can compute this without opening a Sessions handle — `bones swarm cwd` exploits exactly that to avoid a NATS round-trip for a path query.

func SlotPidFile

func SlotPidFile(workspaceDir, slot string) string

SlotPidFile returns the slot's host-local PID-tracker path. Used by swarm join to write and swarm close to remove.

func SlotWorktree

func SlotWorktree(workspaceDir, slot string) string

SlotWorktree returns the slot's worktree path. Equivalent to filepath.Join(SlotDir(workspaceDir, slot), "wt"). Pulled out as a distinct helper because `bones swarm cwd` always wants this — the repo file and pid file are not user-facing.

Types

type AcquireOpts added in v0.1.3

type AcquireOpts struct {
	// HubURL overrides the hub fossil HTTP URL stamped into a fresh
	// session record. Resume reads HubURL from the existing record
	// and ignores this field. Empty → DefaultHubFossilURL.
	HubURL string

	// Hub is an in-process *coord.Hub. When set, the lease's
	// underlying coord.Leaf opens against this Hub directly
	// (LeafConfig.Hub) rather than via HubAddrs. Tests use this to
	// avoid spinning up a separate hub process; the CLI never sets
	// it.
	Hub *coord.Hub

	// Caps overrides fossil capabilities for the slot user on
	// Acquire. Resume ignores this. Empty → DefaultCaps.
	Caps string

	// ForceTakeover lets Acquire CAS-delete an existing same-host
	// session record on the slot. Recovery only — the typical path
	// is an explicit operator decision after `bones doctor`.
	ForceTakeover bool

	// NATSConn is a pre-connected NATS connection. If nil, the lease
	// dials info.NATSURL itself and the resulting connection is
	// closed when the lease is released. When set, the caller owns
	// the connection's lifetime.
	NATSConn *nats.Conn

	// Now is the time source for StartedAt and LastRenewed. nil →
	// time.Now().UTC. Tests inject a fixed clock.
	Now func() time.Time

	// NoAutosync disables the pre-commit hub pull on this lease's
	// underlying coord.Leaf. Default (false) keeps autosync ON,
	// which is the production default per ADR 0023's trunk-linearity
	// promise: every commit pulls from the hub before resolving the
	// trunk tip so the new commit's parent is the hub's-latest-tip,
	// producing one linear chain instead of N parallel forks.
	//
	// Set to true (CLI flag --no-autosync on swarm join / swarm
	// commit) when the caller has an explicit reason to operate in
	// branch-per-slot mode: offline tolerance, single-slot work
	// where no peer commits will race, or testing fan-in semantics.
	// Cost: one less hub HTTP round-trip per commit.
	NoAutosync bool
}

AcquireOpts tunes Acquire and Resume. Zero-value defaults are production-correct: HubURL → DefaultHubFossilURL, Caps → DefaultCaps, NATSConn dialed from info.NATSURL, Now → time.Now().UTC.

type CloseOpts added in v0.1.3

type CloseOpts struct {
	// CloseTaskOnSuccess, when true, calls coord.Leaf.Close on the
	// re-claimed task — closing the task in the bones-tasks bucket
	// in addition to releasing the claim hold. False just releases
	// the claim, leaving the task open for retry by the parent
	// dispatch. swarm close --result=success sets this true; fail
	// and fork leave it false.
	CloseTaskOnSuccess bool

	// NoArtifact, when non-empty, bypasses the
	// "success requires a commit since join" precondition. The
	// string is the operator's reason (e.g. "inline-only research
	// findings") and is recorded in the lifecycle event log so the
	// audit trail has a documented explanation for the missing
	// artifact. Ignored when CloseTaskOnSuccess is false (fail and
	// fork results have no artifact requirement).
	NoArtifact string

	// Reaped, when true, tags the emitted lifecycle event as
	// EventSlotReap rather than EventSlotClose. Set by the
	// `bones swarm reap` verb so post-mortem analysis can tell
	// substrate-driven cleanup ("agent went silent") apart from
	// operator-driven close ("subagent reported done"). No effect
	// on the cleanup mechanics.
	Reaped bool

	// KeepWT, when true, retains the per-slot worktree directory
	// even on a successful close. Default behavior (KeepWT=false
	// + CloseTaskOnSuccess=true) removes the wt so it does not
	// accumulate across cycles. Failed and forked closes
	// (CloseTaskOnSuccess=false) always retain wt regardless of
	// this flag — the operator may need to inspect what the slot
	// left behind.
	KeepWT bool
}

CloseOpts tunes ResumedLease.Close. Zero value is the conservative release-and-cleanup behavior; CloseTaskOnSuccess transitions the underlying task to closed in the bones-tasks bucket.

type CommitResult added in v0.1.3

type CommitResult struct {
	UUID       string
	PushResult *libfossil.SyncResult
	PushErr    error
	RenewErr   error
}

CommitResult is the outcome of ResumedLease.Commit. UUID is set on every successful local commit. PushResult is set when the post-commit HTTP push to the hub succeeded; PushErr is set when it failed (the local commit lands either way). RenewErr is set when the session-record CAS bump exhausted retries or saw an unrecoverable error. Callers should print warnings for the soft errors but should not roll back the local commit.

type Config

type Config struct {
	// NATSConn is the live NATS connection. Sessions does not dial; the
	// caller owns connect/disconnect lifecycle. Required.
	NATSConn *nats.Conn

	// BucketName overrides DefaultBucketName. Empty string falls back
	// to the package default.
	BucketName string

	// TTL overrides DefaultTTL on bucket creation. Zero falls back to
	// the package default. Ignored if the bucket already exists with
	// a different TTL — JetStream KV is not destructive on Update.
	TTL time.Duration
}

Config configures a Sessions handle. Only NATSConn is required; bucket name and TTL fall back to the package defaults if zero. Defaults match production; tests sometimes pass shorter TTLs to exercise expiry.

func (Config) Validate

func (c Config) Validate() error

Validate returns an error describing the first invalid field, or nil if the config is acceptable. Open calls Validate before any substrate work so callers see config issues with no NATS round-trip.

type Event added in v0.5.0

type Event struct {
	TS         time.Time `json:"ts"`
	Kind       EventKind `json:"event"`
	Slot       string    `json:"slot"`
	TaskID     string    `json:"task_id,omitempty"`
	AgentID    string    `json:"agent_id,omitempty"`
	Host       string    `json:"host,omitempty"`
	Result     string    `json:"result,omitempty"`
	NoArtifact string    `json:"no_artifact,omitempty"`
	CommitUUID string    `json:"commit_uuid,omitempty"`
}

Event is the on-disk shape written to swarm-events.jsonl. Fields are minimal-but-sufficient for an operator to reconstruct who did what when. Optional fields use omitempty so close events don't carry stale CommitUUIDs and join events don't carry spurious Result strings.

type EventKind added in v0.5.0

type EventKind string

EventKind enumerates the slot-lifecycle event types written to the JSONL log. Each kind corresponds to a distinct verb at the CLI surface; consumers (operator dashboards, replays) can filter by kind without parsing the rest of the line.

const (
	// EventSlotJoin is emitted when Acquire successfully writes a
	// session record. The slot is bound to a task and a leaf is open.
	EventSlotJoin EventKind = "slot_join"

	// EventSlotCommit is emitted when ResumedLease.Commit returns
	// without a leaf-side error. PushErr / RenewErr surface in the
	// returned CommitResult but the structured event records the
	// successful local commit (audit-trail-relevant).
	EventSlotCommit EventKind = "slot_commit"

	// EventSlotClose is emitted when ResumedLease.Close completes
	// the cleanup path. Always carries Result so post-mortem
	// analysis sees success/fail/fork distinction.
	EventSlotClose EventKind = "slot_close"

	// EventSlotReap is emitted by the swarm reap verb instead of
	// EventSlotClose so operators can distinguish operator-driven
	// cleanup ("we shut this slot down") from substrate-driven
	// reaping ("the agent went silent and we cleaned up after it").
	EventSlotReap EventKind = "slot_reap"
)

type FreshLease added in v0.2.0

type FreshLease struct {
	// contains filtered or unexported fields
}

FreshLease is a slot session created by Acquire. It owns an active claim taken at acquisition time. FreshLease.Release is the graceful exit (record persists for later Resume); FreshLease.Abort is the rollback path (record deleted).

FreshLease has neither Commit nor Close — those operate on a resumed lease with the latest record revision. Callers that need to commit after acquiring should Release the FreshLease and Resume in a separate operation.

func Acquire added in v0.2.0

func Acquire(
	ctx context.Context, info workspace.Info,
	slot, taskID string, opts AcquireOpts,
) (*FreshLease, error)

Acquire opens a FreshLease for a slot that has no live session record yet. Used by `bones swarm join`. Steps, in order, with any partial work cleaned up on error:

  1. Verify `<workspace>/.bones/hub.fossil` exists. If not, return ErrWorkspaceNotBootstrapped — the role-leak guard from PR #54.
  2. Create the slot's fossil user on the hub repo if missing.
  3. Open the swarm.Sessions handle (dial NATS or reuse opts.NATSConn).
  4. Read any existing session record. Active on this host without --force → ErrSessionAlreadyLive. Different host → ErrSessionForeignHost. Stale or --force → CAS-delete and continue.
  5. Open coord.Leaf with Autosync=true and claim the task.
  6. Put the session record (CAS create) and write the pid file.

The returned FreshLease holds the claim. Callers MUST call Release (record persists) or Abort (record deleted as rollback) exactly once.

func (*FreshLease) Abort added in v0.2.0

func (l *FreshLease) Abort(ctx context.Context) error

Abort rolls back the fresh acquisition: releases the claim, stops the leaf, CAS-deletes the session record, removes the host-local pid file, and tears down the Sessions handle and NATS connection. Used when join's downstream work fails after Acquire succeeded — the caller wants to leave the slot in the same state Acquire saw. Idempotent.

func (*FreshLease) FossilUser added in v0.2.0

func (b *FreshLease) FossilUser() string

FossilUser returns the slot's fossil user (e.g. "slot-rendering").

func (*FreshLease) HubURL added in v0.2.0

func (b *FreshLease) HubURL() string

HubURL returns the hub fossil HTTP URL the lease's leaf is peered against.

func (*FreshLease) Release added in v0.2.0

func (l *FreshLease) Release(ctx context.Context) error

Release closes the FreshLease without deleting the session record. Releases the claim, stops the leaf, closes the swarm.Sessions handle, and (if the lease owns the NATS connection) closes that too. Idempotent.

`bones swarm join` calls Release after writing the session record so subsequent verbs can Resume against it.

func (*FreshLease) SessionRevision added in v0.2.0

func (b *FreshLease) SessionRevision() uint64

SessionRevision returns the JetStream KV revision the lease last observed. Bumped internally by Commit on each successful CAS. Test-only utility — production verbs go through Commit / Close.

func (*FreshLease) Slot added in v0.2.0

func (b *FreshLease) Slot() string

Slot returns the slot name this lease is bound to.

func (*FreshLease) TaskID added in v0.2.0

func (b *FreshLease) TaskID() string

TaskID returns the task ID stamped into the lease's session record.

func (*FreshLease) WT added in v0.2.0

func (b *FreshLease) WT() string

WT returns the slot's worktree path. Returns the empty string after the leaf has been stopped (e.g. post-Commit).

type ResumedLease added in v0.2.0

type ResumedLease struct {
	// contains filtered or unexported fields
}

ResumedLease is a slot session reconstructed from an existing record by Resume. It does NOT hold a claim; Commit and Close re-acquire as needed. CAS revision on the session record is tracked internally; Commit advances it on every successful LastRenewed bump.

ResumedLease has Commit, Close, and Release. No Abort — fresh records are the only thing that gets rolled back.

func Resume added in v0.1.3

func Resume(
	ctx context.Context, info workspace.Info,
	slot string, opts AcquireOpts,
) (*ResumedLease, error)

Resume opens a ResumedLease for a slot whose session record already exists in KV. Used by every swarm verb other than join.

Resume does NOT re-take the claim — verbs that need a claim (Commit, Close) re-acquire it themselves. This mirrors today's CLI behavior: claim holds have a TTL that outlives a single verb only when bumped, and the record's LastRenewed is the durable liveness signal.

Resume refuses cross-host operations: if the session record's Host field doesn't match this machine, returns ErrCrossHostOperation. Returns ErrSessionNotFound if the slot has no record.

func (*ResumedLease) Close added in v0.2.0

func (l *ResumedLease) Close(ctx context.Context, opts CloseOpts) error

Close terminates the ResumedLease, removes the host-local pid file, and CAS-deletes the session record. When CloseOpts. CloseTaskOnSuccess is true the underlying task is also transitioned to closed in the bones-tasks bucket via Leaf.Close. Idempotent.

Steps, in order:

  1. Re-claim the lease's task (Resume did not claim it).
  2. If CloseTaskOnSuccess: Leaf.Close — closes the task and releases the claim hold in one operation. Otherwise just release the claim.
  3. Stop the leaf.
  4. Remove the host-local pid file.
  5. CAS-delete the session record, retrying on rev advance from a concurrent commit.
  6. Tear down the swarm.Sessions handle + NATS connection.

CAS conflict on step 5 (sibling commit bumped LastRenewed) re-reads the rev and retries — the operator wants to close, regardless of transient renewals. ErrSessionGone surfaces if the record is missing on first read; the close is otherwise idempotent.

func (*ResumedLease) Commit added in v0.2.0

func (l *ResumedLease) Commit(
	ctx context.Context, message string, files []coord.File,
) (CommitResult, error)

Commit re-claims the lease's task, announces holds for the file paths, commits the bytes via the underlying coord.Leaf, releases the claim, stops the leaf, HTTP-pushes the slot's leaf.fossil to the hub via /xfer, and CAS-bumps LastRenewed on the session record.

Returns CommitResult with the local-commit UUID always set on success. PushResult / PushErr report the hub-push outcome; the hub may be unreachable without rolling back the local commit. RenewErr reports the session-record CAS bump; transient CAS conflicts retry up to jskv.MaxRetries times. ErrSessionGone surfaces as RenewErr if the session record was deleted between Resume and now.

After Commit returns, the lease's underlying coord.Leaf has been stopped — the push path needs an exclusive libfossil.Repo handle on leaf.fossil. The lease can still be Released or Closed; doing other verb work on the lease's leaf after Commit is undefined and would have to re-Resume.

func (*ResumedLease) FossilUser added in v0.2.0

func (b *ResumedLease) FossilUser() string

FossilUser returns the slot's fossil user (e.g. "slot-rendering").

func (*ResumedLease) HubURL added in v0.2.0

func (b *ResumedLease) HubURL() string

HubURL returns the hub fossil HTTP URL the lease's leaf is peered against.

func (*ResumedLease) Release added in v0.2.0

func (l *ResumedLease) Release(ctx context.Context) error

Release closes the ResumedLease without deleting the session record. Stops the leaf, closes the Sessions handle, and (if the lease owns the NATS connection) closes that too. Idempotent.

`bones swarm commit` calls Release after Commit so the record stays live for later verbs.

func (*ResumedLease) SessionRevision added in v0.2.0

func (b *ResumedLease) SessionRevision() uint64

SessionRevision returns the JetStream KV revision the lease last observed. Bumped internally by Commit on each successful CAS. Test-only utility — production verbs go through Commit / Close.

func (*ResumedLease) Slot added in v0.2.0

func (b *ResumedLease) Slot() string

Slot returns the slot name this lease is bound to.

func (*ResumedLease) TaskID added in v0.2.0

func (b *ResumedLease) TaskID() string

TaskID returns the task ID stamped into the lease's session record.

func (*ResumedLease) WT added in v0.2.0

func (b *ResumedLease) WT() string

WT returns the slot's worktree path. Returns the empty string after the leaf has been stopped (e.g. post-Commit).

type Session

type Session struct {
	// Slot is the slot name (matches the plan's `[slot: X]` annotation).
	// Doubles as the KV bucket key — slot is the primary identifier.
	Slot string `json:"slot"`

	// TaskID identifies the task this slot is currently working on.
	// Lookup against bones-tasks for full task state; this field is a
	// denormalized pointer used by `bones swarm status` and the close
	// verb so they don't need a second lookup.
	TaskID string `json:"task_id"`

	// AgentID is "slot-<name>" — the slot's fossil + coord identity.
	// Matches the agent_id used for the holds and presence buckets so
	// claim ownership and liveness join cleanly across substrates.
	AgentID string `json:"agent_id"`

	// Host is os.Hostname() of the machine running the leaf. Different
	// from the workspace host means another machine owns this slot;
	// `bones swarm` verbs abort with a clear cross-host error rather
	// than try to manage a remote leaf process.
	Host string `json:"host"`

	// LeafPID is the host-local PID of the bones CLI process that
	// wrote this record at join time. Informational only — every
	// swarm verb is its own short-lived CLI invocation, so this PID
	// is already-dead by the time any later verb reads it. Kept as a
	// debugging crumb (a pid that was alive at join means the join
	// did happen on this host) but NOT a liveness signal; use
	// LastRenewed for active/stale classification instead.
	LeafPID int `json:"leaf_pid"`

	// HubURL is the hub fossil HTTP URL recorded at join time. Later
	// verbs read this so they can post-commit push the slot's
	// leaf.fossil to the hub via /xfer without re-deriving the URL
	// from a flag. Empty on records written by older clients;
	// callers fall back to the join-flag default.
	HubURL string `json:"hub_url,omitempty"`

	// StartedAt is when this session record was first written.
	// Immutable once set; later renewals only touch LastRenewed.
	StartedAt time.Time `json:"started_at"`

	// LastRenewed is updated on every `swarm commit` (which heartbeats
	// the session) and used by `bones doctor` to flag stale entries.
	// This is the canonical active/stale signal — see LeafPID for the
	// "why we don't use the pid" discussion.
	LastRenewed time.Time `json:"last_renewed"`
}

Session is the per-slot record persisted in the bones-swarm-sessions JetStream KV bucket. One record per active slot in a workspace. Slot names are workspace-scoped because the bucket itself is per- NATS-deployment.

Marshaled as JSON for human-readable debugging via `nats kv get`. All time fields are RFC3339-encoded UTC. New fields must default safely on a missing-key read so older clients can read newer records (additive evolution only).

type Sessions added in v0.2.0

type Sessions struct {
	// contains filtered or unexported fields
}

Sessions owns the JetStream KV bucket holding swarm session records. Reads (Get, List) are public — `bones swarm status`, `bones doctor`, and slot-resolution helpers in the CLI consume them directly. Mutations (put, update, delete) are unexported; the only legal mutator is `swarm.Lease`, which lives in the same package.

The narrow public surface enforces a seam: the lifecycle of a session record is owned end-to-end by Lease, so outside callers cannot bypass Lease's invariants (host match, CAS revision tracking, claim-bound writes).

Every public method is safe to call concurrently. Close is idempotent. Unlike presence.Manager there is no heartbeat goroutine — callers (the bones swarm verbs, via Lease) drive renewal via CAS update.

func Open

func Open(ctx context.Context, cfg Config) (*Sessions, error)

Open creates (or reattaches to) the swarm sessions KV bucket and returns a Sessions handle. Caller must Close at shutdown. Open does not dial NATS: the connection comes pre-wired so reconnect policy stays a single-source concern (same shape as presence.Open).

func (*Sessions) BucketName added in v0.2.0

func (s *Sessions) BucketName() string

BucketName returns the bucket name in use. Useful for diagnostics and for `bones doctor` to mention the right bucket in its messages.

func (*Sessions) Close added in v0.2.0

func (s *Sessions) Close() error

Close marks the Sessions handle closed so subsequent public calls return ErrClosed. Safe to call more than once. Does not delete bucket contents; sessions outlive the process that wrote them (TTL eventually evicts).

func (*Sessions) Get added in v0.2.0

func (s *Sessions) Get(
	ctx context.Context, slot string,
) (Session, uint64, error)

Get reads the session record for slot. Returns ErrNotFound if no record exists. The returned revision is the JetStream KV sequence number suitable for a follow-up CAS update or delete (callable only from inside the swarm package — Sessions's mutators are unexported).

func (*Sessions) List added in v0.2.0

func (s *Sessions) List(ctx context.Context) ([]Session, error)

List returns every live session in the bucket. Order matches the underlying JetStream KV list order (key-name lexicographic in practice; callers that need a specific order should sort).

List is the canonical read-across-slots seam — `bones swarm status`, `bones doctor`, and CLI slot-resolution helpers all consume it. This is one of the read methods that justify keeping Sessions as a public type at all (versus folding into Lease).

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL