longrun

package
v0.20.3
Published: Mar 31, 2026 License: Apache-2.0 Imports: 15 Imported by: 0

Documentation

Constants

const (
	// UnlimitedRetries disables the retry limit — the task retries forever
	// (until a permanent error or context cancellation).
	// Use with caution: set this explicitly to opt in.
	UnlimitedRetries = -1

	// DefaultMaxRetries is used when MaxRetries is 0 (zero-value).
	DefaultMaxRetries = 3
)

Variables

This section is empty.

Functions

This section is empty.

Types

type AttemptStore added in v0.20.3

type AttemptStore interface {
	// Increment advances the counter for key and returns the value
	// BEFORE the increment (0-based attempt index).
	Increment(key string) int

	// Get returns the current counter for key. Returns 0 for unknown keys.
	Get(key string) int

	// Reset sets all counters to zero.
	Reset()
}

AttemptStore tracks retry attempt counters.

The default implementation is in-memory (MemoryStore). Users can provide a persistent implementation (Redis, SQLite) via WithAttemptStore so that backoff state survives process restarts.

Keys are opaque strings formed by the caller — e.g. "rule:fetch issues" or "baseline:node". The store does not interpret them.

Rule keys are derived from the error message (sentinels) or from the explicit TransientRule.Key field (typed nil pointers). This makes keys stable across deployments — reordering rules does not break persistent state.

AttemptStore is NOT required to be safe for concurrent use. Each Task owns its own store instance.
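
As a rough illustration of the contract above, here is a minimal sketch of a custom store. The interface is redeclared locally so the snippet compiles on its own, and mapStore is a hypothetical type, not the package's MemoryStore. It shows that Increment returns the value before the increment and that unknown keys read as 0.

```go
package main

import "fmt"

// AttemptStore mirrors the interface documented above. It is redeclared
// here only so the sketch is self-contained.
type AttemptStore interface {
	Increment(key string) int
	Get(key string) int
	Reset()
}

// mapStore is a hypothetical map-backed store used for illustration.
type mapStore struct {
	counts map[string]int
}

func newMapStore() *mapStore { return &mapStore{counts: make(map[string]int)} }

// Increment advances the counter and returns the value before the
// increment, i.e. the 0-based attempt index.
func (s *mapStore) Increment(key string) int {
	prev := s.counts[key]
	s.counts[key] = prev + 1
	return prev
}

// Get returns 0 for unknown keys because that is the map's zero value.
func (s *mapStore) Get(key string) int { return s.counts[key] }

// Reset drops every counter, which is equivalent to zeroing them all.
func (s *mapStore) Reset() { s.counts = make(map[string]int) }

func main() {
	var store AttemptStore = newMapStore()
	fmt.Println(store.Increment("rule:fetch issues")) // 0
	fmt.Println(store.Increment("rule:fetch issues")) // 1
	fmt.Println(store.Get("rule:fetch issues"))       // 2
	store.Reset()
	fmt.Println(store.Get("rule:fetch issues")) // 0
}
```

A persistent variant would keep the same three methods and swap the map for a database or Redis client. No mutex is needed here because, per the note above, each Task owns its own store.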

type BackoffFunc added in v0.20.3

type BackoffFunc func(attempt int) time.Duration

BackoffFunc computes the delay before the next retry given a 0-based attempt index. Pure function — no side effects, no context, no IO.

The package provides common constructors (Exponential, Constant), but any function with this signature works — jitter, decorrelated, adaptive, etc.

Example (custom jittered backoff):

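// Requires imports: math/rand, time.
func jittered(attempt int) time.Duration {
    if attempt > 10 {
        attempt = 10 // clamp: keeps the shift from overflowing, so rand.Int63n never receives n <= 0
    }
    base := time.Second * time.Duration(1<<attempt)
    return base + time.Duration(rand.Int63n(int64(base/2)))
}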

func Constant added in v0.20.3

func Constant(d time.Duration) BackoffFunc

Constant returns a BackoffFunc that always returns the same delay. Useful for retry-after scenarios or testing.

Example:

longrun.Constant(5*time.Second)

func DefaultBackoff added in v0.5.0

func DefaultBackoff() BackoffFunc

DefaultBackoff returns a sensible default BackoffFunc.

Configured as Exponential(1s, 30s): classic doubling, capped at 30s. Five retries produce delays of 1s, 2s, 4s, 8s and 16s, all below the cap.

func Exponential added in v0.20.3

func Exponential(initial, maxCap time.Duration) BackoffFunc

Exponential returns a BackoffFunc with classic exponential growth. Multiplier is 2.0. Delay is capped at maxCap.

Formula: delay = initial * 2^attempt, capped at maxCap.

Example:

longrun.Exponential(2*time.Second, 2*time.Minute)
// attempt 0: 2s, attempt 1: 4s, attempt 2: 8s, ..., capped at 2m
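
To make the formula concrete, the following is a minimal sketch of a doubling backoff with a cap. The function exponential is a hypothetical stand-in written from the formula above, not the package's implementation; the overflow guard is an added assumption.

```go
package main

import (
	"fmt"
	"time"
)

// exponential is a hypothetical stand-in for the documented constructor:
// delay = initial * 2^attempt, capped at maxCap.
func exponential(initial, maxCap time.Duration) func(attempt int) time.Duration {
	return func(attempt int) time.Duration {
		d := initial
		for i := 0; i < attempt; i++ {
			d *= 2
			// Stop as soon as the cap is reached; d < 0 catches int64 overflow.
			if d >= maxCap || d < 0 {
				return maxCap
			}
		}
		if d > maxCap {
			return maxCap
		}
		return d
	}
}

func main() {
	backoff := exponential(2*time.Second, 2*time.Minute)
	for attempt := 0; attempt < 8; attempt++ {
		fmt.Printf("attempt %d: %v\n", attempt, backoff(attempt))
	}
	// Prints 2s, 4s, 8s, 16s, 32s, 1m4s, then 2m0s for attempts 6 and 7.
}
```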

func ExponentialWith added in v0.20.3

func ExponentialWith(initial, maxCap time.Duration, multiplier float64) BackoffFunc

ExponentialWith returns a BackoffFunc with configurable multiplier.

Formula: delay = initial * multiplier^attempt, capped at maxCap.

Example:

longrun.ExponentialWith(1*time.Second, 30*time.Second, 1.5)
// attempt 0: 1s, attempt 1: 1.5s, attempt 2: 2.25s, ...
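
With a fractional multiplier the delay can no longer be computed by integer doubling. One possible approach, sketched here under the assumption that floating-point math is acceptable, uses math.Pow. The name exponentialWith is a hypothetical local function, not the package's code.

```go
package main

import (
	"fmt"
	"math"
	"time"
)

// exponentialWith is a hypothetical sketch of the formula
// delay = initial * multiplier^attempt, capped at maxCap.
func exponentialWith(initial, maxCap time.Duration, multiplier float64) func(attempt int) time.Duration {
	return func(attempt int) time.Duration {
		d := float64(initial) * math.Pow(multiplier, float64(attempt))
		// The comparison is done in float64, so +Inf also lands on the cap.
		if d >= float64(maxCap) {
			return maxCap
		}
		return time.Duration(d)
	}
}

func main() {
	backoff := exponentialWith(1*time.Second, 30*time.Second, 1.5)
	for attempt := 0; attempt < 4; attempt++ {
		fmt.Printf("attempt %d: %v\n", attempt, backoff(attempt))
	}
}
```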

type Baseline added in v0.14.0

type Baseline struct {
	// Policies maps error categories to their retry policies.
	// Use predefined categories (CategoryNode, CategoryService) or define custom ones.
	Policies map[ErrorCategory]Policy

	// Default policy for errors not matching any category in Policies.
	// nil → unknown errors are permanent (crash). Use for preflights.
	// non-nil → retry with loud ERROR logging. Use for workers.
	Default *Policy

	// Classify is the application-level classifier.
	// Called after built-in transport classification.
	// nil = no application classification, only transport + default.
	Classify ClassifierFunc
}

Baseline is a set of policies that Runner silently applies to every task. Tasks don't know about baseline — it's configured once on Runner.

Policies maps error categories to retry policies. Use predefined categories (CategoryNode, CategoryService) or define your own.

Classification pipeline in handleFailure:

[1] Built-in transport classify (net.OpError, timeout → Node)
[2] User classifier via Classify (apierr interfaces → Service)
[3] Not classified → Unknown:
    Unknown + Default != nil → retry with Default policy (LOUD log)
    Unknown + Default == nil → permanent error
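
The three steps above can be sketched roughly as follows. Every type here is a simplified local stand-in (the real Policy carries Retries and Backoff), and classify, pickPolicy, transportStub and userStub are hypothetical names. handleFailure itself is not shown in this documentation, so this is only one reading of the described order.

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

type ErrorCategory int

const (
	CategoryUnknown ErrorCategory = iota
	CategoryNode
	CategoryService
)

type ErrorClass struct{ Category ErrorCategory }
type ClassifierFunc func(err error) *ErrorClass
type Policy struct{ Name string } // stand-in: the documented Policy has Retries and Backoff

var (
	errRateLimited        = errors.New("rate limited") // hypothetical application error
	errUnrecognized       = errors.New("something else")
	errDial         error = &net.OpError{Op: "dial", Err: errors.New("connection refused")}
)

// transportStub stands in for step [1] and only recognizes *net.OpError.
func transportStub(err error) *ErrorClass {
	var opErr *net.OpError
	if errors.As(err, &opErr) {
		return &ErrorClass{Category: CategoryNode}
	}
	return nil
}

// userStub stands in for step [2], the application-level classifier.
func userStub(err error) *ErrorClass {
	if errors.Is(err, errRateLimited) {
		return &ErrorClass{Category: CategoryService}
	}
	return nil
}

// classify runs transport first, then the user classifier, then falls
// back to Unknown when neither recognizes the error.
func classify(err error, transport, user ClassifierFunc) ErrorClass {
	if c := transport(err); c != nil {
		return *c
	}
	if user != nil {
		if c := user(err); c != nil {
			return *c
		}
	}
	return ErrorClass{Category: CategoryUnknown}
}

// pickPolicy returns the policy to retry with, or false for a permanent error.
func pickPolicy(class ErrorClass, policies map[ErrorCategory]Policy, def *Policy) (Policy, bool) {
	if p, ok := policies[class.Category]; ok {
		return p, true
	}
	if def != nil {
		return *def, true
	}
	return Policy{}, false
}

func main() {
	policies := map[ErrorCategory]Policy{
		CategoryNode:    {Name: "node"},
		CategoryService: {Name: "service"},
	}
	for _, err := range []error{errDial, errRateLimited, errUnrecognized} {
		class := classify(err, transportStub, userStub)
		p, retry := pickPolicy(class, policies, nil)
		fmt.Printf("%v: retry=%v policy=%q\n", err, retry, p.Name)
	}
}
```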

func NewBaseline added in v0.20.2

func NewBaseline(node, service Policy, classify ClassifierFunc) Baseline

NewBaseline creates a Baseline with Node and Service policies. Default is nil — unknown errors are permanent.

Example:

longrun.NewBaseline(
    longrun.Policy{Backoff: longrun.Exponential(2*time.Second, 2*time.Minute)},
    longrun.Policy{Backoff: longrun.Exponential(5*time.Second, 5*time.Minute)},
    myClassifier,
)

func NewBaselineDegraded added in v0.20.2

func NewBaselineDegraded(node, service, defaultPolicy Policy, classify ClassifierFunc) Baseline

NewBaselineDegraded creates a Baseline with Node, Service, and Default policies. Unknown errors retry with Default policy instead of crashing.

Example:

longrun.NewBaselineDegraded(
    longrun.Policy{Backoff: longrun.Exponential(2*time.Second, 2*time.Minute)},
    longrun.Policy{Backoff: longrun.Exponential(5*time.Second, 5*time.Minute)},
    longrun.Policy{Backoff: longrun.Exponential(30*time.Second, 5*time.Minute)},
    myClassifier,
)

type ClassifierFunc added in v0.14.0

type ClassifierFunc func(err error) *ErrorClass

ClassifierFunc inspects an error and returns its classification. Return nil if the error is not recognized — the next classification step will handle it.

ClassifierFunc must be safe to call concurrently from multiple goroutines.
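
For illustration, a classifier along these lines might map an application error type to a category. In this sketch apiError and classifyAPI are hypothetical, and ErrorClass and the categories are redeclared locally so the snippet is self-contained. The function holds no shared state, which is one simple way to meet the concurrency requirement.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

type ErrorCategory int

const (
	CategoryUnknown ErrorCategory = iota
	CategoryNode
	CategoryService
)

type ErrorClass struct {
	Category     ErrorCategory
	WaitDuration time.Duration
}

// apiError is a hypothetical application error carrying an HTTP status
// and, for 429 responses, the parsed Retry-After value.
type apiError struct {
	Status     int
	RetryAfter time.Duration
}

func (e *apiError) Error() string { return fmt.Sprintf("api: status %d", e.Status) }

// classifyAPI returns nil for anything it does not recognize so that the
// next classification step can handle it.
func classifyAPI(err error) *ErrorClass {
	var ae *apiError
	if !errors.As(err, &ae) {
		return nil
	}
	switch {
	case ae.Status == 429:
		return &ErrorClass{Category: CategoryService, WaitDuration: ae.RetryAfter}
	case ae.Status >= 500:
		return &ErrorClass{Category: CategoryService}
	default:
		return nil
	}
}

func main() {
	err := fmt.Errorf("fetch issues: %w", &apiError{Status: 429, RetryAfter: 7 * time.Second})
	c := classifyAPI(err)
	fmt.Println(c.Category == CategoryService, c.WaitDuration) // true 7s
	fmt.Println(classifyAPI(errors.New("boom")) == nil)        // true
}
```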

type ErrorCategory added in v0.14.0

type ErrorCategory int

ErrorCategory classifies an error for baseline policy selection. Predefined categories cover common network integration scenarios. Users can define custom categories for domain-specific classification:

const CategoryDatabase longrun.ErrorCategory = 10

Predefined categories:

const (
	// CategoryUnknown means the error was not recognized by any classifier.
	// If Baseline.Default is set — retry with default policy.
	// If Baseline.Default is nil — permanent error.
	CategoryUnknown ErrorCategory = iota

	// CategoryNode indicates a transport-level failure (TCP, DNS, TLS, timeout).
	// The request never reached the server or the connection was interrupted.
	// Retry aggressively — the network will recover.
	CategoryNode

	// CategoryService indicates the remote service is under pressure
	// (rate limit, 5xx, maintenance). Retry gently — don't kick them
	// while they're down.
	CategoryService
)

type ErrorClass added in v0.14.0

type ErrorClass struct {
	// Category determines which baseline policy to use.
	Category ErrorCategory

	// WaitDuration, when > 0, overrides the backoff calculation.
	// The task sleeps exactly this duration instead of calling
	// policy.Backoff(attempt).
	// Typical source: Retry-After header on HTTP 429.
	WaitDuration time.Duration
}

ErrorClass is the result of error classification. Returned by ClassifierFunc to tell handleFailure which category the error belongs to and optionally how long to wait before retrying.
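
The WaitDuration override described in the struct comment can be sketched as a small selection function. nextDelay and constant are hypothetical local helpers, and ErrorClass is reduced to the one field that matters here.

```go
package main

import (
	"fmt"
	"time"
)

type BackoffFunc func(attempt int) time.Duration

// ErrorClass is trimmed to WaitDuration for this sketch.
type ErrorClass struct {
	WaitDuration time.Duration
}

// constant is a local stand-in for a fixed-delay BackoffFunc.
func constant(d time.Duration) BackoffFunc {
	return func(int) time.Duration { return d }
}

// nextDelay prefers an explicit WaitDuration (for example from a
// Retry-After header) and otherwise falls back to the backoff curve.
func nextDelay(class ErrorClass, backoff BackoffFunc, attempt int) time.Duration {
	if class.WaitDuration > 0 {
		return class.WaitDuration
	}
	return backoff(attempt)
}

func main() {
	backoff := constant(5 * time.Second)
	fmt.Println(nextDelay(ErrorClass{WaitDuration: 7 * time.Second}, backoff, 0)) // 7s
	fmt.Println(nextDelay(ErrorClass{}, backoff, 3))                              // 5s
}
```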

func ClassifyTransport added in v0.14.0

func ClassifyTransport(err error) *ErrorClass

ClassifyTransport checks whether err is a transport-level failure. Returns CategoryNode for network and timeout errors, nil otherwise.

This is a built-in classifier that depends only on stdlib. It runs before any user-provided ClassifierFunc in the handleFailure pipeline.

Exported for testability and for use in custom ClassifierFunc implementations that want to extend (not replace) the built-in transport classification.

Classification rules:

  • url.Error with Timeout() → Node (check before net.OpError because url.Error often wraps it)
  • context.DeadlineExceeded → Node
  • net.OpError → Node
  • net.DNSError → Node
  • io.EOF, io.ErrUnexpectedEOF → Node (connection dropped mid-response)
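
The rules listed above can be approximated with stdlib calls only. The sketch below is a re-creation written from that list, not the package's source; isTransport is a hypothetical name and returns a bool rather than an *ErrorClass to keep the example short.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"io"
	"net"
	"net/url"
)

// isTransport applies the listed rules in the listed order.
func isTransport(err error) bool {
	// url.Error is checked first because it often wraps net.OpError.
	var urlErr *url.Error
	if errors.As(err, &urlErr) && urlErr.Timeout() {
		return true
	}
	if errors.Is(err, context.DeadlineExceeded) {
		return true
	}
	var opErr *net.OpError
	if errors.As(err, &opErr) {
		return true
	}
	var dnsErr *net.DNSError
	if errors.As(err, &dnsErr) {
		return true
	}
	// A connection dropped mid-response surfaces as EOF.
	return errors.Is(err, io.EOF) || errors.Is(err, io.ErrUnexpectedEOF)
}

// Sample errors used by main below.
var (
	errDNS      error = &net.DNSError{Err: "no such host", Name: "example.invalid"}
	errEOF            = fmt.Errorf("read body: %w", io.ErrUnexpectedEOF)
	errDeadline       = fmt.Errorf("call: %w", context.DeadlineExceeded)
	errApp            = errors.New("validation failed")
)

func main() {
	for _, err := range []error{errDNS, errEOF, errDeadline, errApp} {
		fmt.Printf("%v -> transport=%v\n", err, isTransport(err))
	}
}
```

A custom ClassifierFunc that wants to extend the built-in behaviour would typically call ClassifyTransport first and add its own checks only when that returns nil.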

type Matcher added in v0.6.0

type Matcher struct {
	// contains filtered or unexported fields
}

Matcher checks whether an error matches a given pattern.

Two forms are supported:

  • error value (sentinel): matched via errors.Is
  • *T where T implements error: matched via errors.As

Examples:

NewMatcher(ErrTimeout)          // sentinel → errors.Is
NewMatcher((*net.OpError)(nil)) // pointer-to-type → errors.As

func NewMatcher added in v0.6.0

func NewMatcher(errVal any) Matcher

NewMatcher compiles an error pattern into a Matcher.

The errVal argument must be one of:

  • an error value (for errors.Is matching)
  • a pointer to an error type, i.e. *T where T implements error (for errors.As matching)

Panics if errVal is nil or an unsupported type.
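
How the two forms might be told apart is sketched below using reflection. The matcher type and newMatcher are hypothetical stand-ins (the real Matcher has unexported fields that are not shown here). The key detail is that errors.As needs a non-nil pointer to the target type, so a fresh one is allocated per call with reflect.New.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"reflect"
)

var errorType = reflect.TypeOf((*error)(nil)).Elem()

// matcher is a hypothetical stand-in for the package's Matcher.
type matcher struct {
	sentinel error        // set for errors.Is matching
	target   reflect.Type // set for errors.As matching, e.g. *net.OpError
}

func newMatcher(errVal any) matcher {
	if errVal == nil {
		panic("matcher: nil pattern")
	}
	v := reflect.ValueOf(errVal)
	// A typed nil pointer selects type-based matching.
	if v.Kind() == reflect.Pointer && v.IsNil() {
		if !v.Type().Implements(errorType) {
			panic("matcher: pointer type does not implement error")
		}
		return matcher{target: v.Type()}
	}
	if e, ok := errVal.(error); ok {
		return matcher{sentinel: e}
	}
	panic("matcher: unsupported pattern type")
}

func (m matcher) match(err error) bool {
	if m.sentinel != nil {
		return errors.Is(err, m.sentinel)
	}
	// reflect.New yields a pointer to the target type, e.g. **net.OpError.
	return errors.As(err, reflect.New(m.target).Interface())
}

// Sample values used by main below.
var (
	errTimeout              = errors.New("timeout")
	errWrappedTimeout       = fmt.Errorf("poll: %w", errTimeout)
	errDial           error = &net.OpError{Op: "dial", Err: errors.New("connection refused")}
	opErrPattern            = (*net.OpError)(nil)
)

func main() {
	fmt.Println(newMatcher(errTimeout).match(errWrappedTimeout)) // true
	fmt.Println(newMatcher(opErrPattern).match(errDial))         // true
	fmt.Println(newMatcher(opErrPattern).match(errTimeout))      // false
}
```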

func (Matcher) Match added in v0.6.0

func (m Matcher) Match(err error) bool

Match reports whether err matches the pattern.

type MemoryStore added in v0.20.3

type MemoryStore struct {
	// contains filtered or unexported fields
}

MemoryStore is the default in-memory AttemptStore. Exported so users can wrap it with decorators (logging, metrics, persistence fallback).

func NewMemoryStore added in v0.20.3

func NewMemoryStore() *MemoryStore

NewMemoryStore creates an in-memory AttemptStore.

func (*MemoryStore) Get added in v0.20.3

func (m *MemoryStore) Get(key string) int

func (*MemoryStore) Increment added in v0.20.3

func (m *MemoryStore) Increment(key string) int

func (*MemoryStore) Reset added in v0.20.3

func (m *MemoryStore) Reset()

type Option added in v0.6.0

type Option func(*taskConfig)

Option configures a Task. Use With* functions to create options.
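
Option follows the functional-options pattern. Since taskConfig is unexported and its fields are not shown, the sketch below uses a hypothetical config with two fields purely to show how such options compose; later options overwrite earlier ones.

```go
package main

import (
	"fmt"
	"time"
)

// taskConfig here is a hypothetical stand-in with made-up fields.
type taskConfig struct {
	delay   time.Duration
	timeout time.Duration
}

type option func(*taskConfig)

func withDelay(d time.Duration) option   { return func(c *taskConfig) { c.delay = d } }
func withTimeout(d time.Duration) option { return func(c *taskConfig) { c.timeout = d } }

// newConfig applies options in order on top of the zero-value config.
func newConfig(opts ...option) taskConfig {
	var c taskConfig
	for _, apply := range opts {
		apply(&c)
	}
	return c
}

func main() {
	cfg := newConfig(withDelay(2*time.Second), withTimeout(10*time.Second))
	fmt.Printf("%+v\n", cfg) // {delay:2s timeout:10s}
}
```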

func WithAttemptStore added in v0.20.3

func WithAttemptStore(store AttemptStore) Option

WithAttemptStore sets a custom AttemptStore for retry state persistence. Default: in-memory store (state lost on process restart). Use a persistent implementation (Redis, SQLite) to survive restarts without losing backoff progress.

func WithDelay added in v0.6.0

func WithDelay(d time.Duration) Option

WithDelay delays the first execution by the given duration. For interval tasks: first tick fires after delay, then every interval. For one-shot tasks: execution starts after delay. Delay is independent of interval.

func WithLogger added in v0.6.0

func WithLogger(l *slog.Logger) Option

WithLogger sets a custom logger for the task. Defaults to slog.Default().

func WithShutdown added in v0.6.0

func WithShutdown(fn ShutdownFunc) Option

WithShutdown registers a graceful shutdown hook for the task. The hook is called by Runner after all task goroutines have stopped.

func WithTimeout added in v0.6.0

func WithTimeout(d time.Duration) Option

WithTimeout sets a per-invocation timeout for the work function. Each call to work gets its own context with this deadline.

type Policy added in v0.14.0

type Policy struct {
	// Retries limits consecutive retry attempts.
	//   0 (zero-value) → unlimited retries (baseline default).
	//  >0 → exact retry count.
	Retries int

	// Backoff computes the delay before the next retry.
	// Use Exponential, Constant, or any custom BackoffFunc.
	Backoff BackoffFunc
}

Policy defines retry behavior for a single error category.

type RuleTracker added in v0.6.0

type RuleTracker struct {
	// contains filtered or unexported fields
}

RuleTracker tracks retry attempts for a single TransientRule.

Each rule has its own independent budget. The tracker is created internally by Task from TransientRule.MaxRetries.

func NewRuleTracker added in v0.6.0

func NewRuleTracker(maxRetries int) *RuleTracker

NewRuleTracker creates a tracker with the given max retries.

MaxRetries semantics:

0 (zero-value) → DefaultMaxRetries (3).
-1 (UnlimitedRetries) → no limit.
>0 → exact limit.

func (*RuleTracker) Attempt added in v0.6.0

func (rt *RuleTracker) Attempt() int

Attempt returns the current attempt count.

func (*RuleTracker) Max added in v0.6.0

func (rt *RuleTracker) Max() int

Max returns the resolved max retries.

func (*RuleTracker) OnFailure added in v0.6.0

func (rt *RuleTracker) OnFailure() (int, bool)

OnFailure records a failure and returns the 0-based attempt index and whether the caller is allowed to retry.

Example with max=3:

1st call: attempt=0, ok=true
2nd call: attempt=1, ok=true
3rd call: attempt=2, ok=true
4th call: attempt=3, ok=false (budget exhausted)
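
The table above, together with the MaxRetries semantics from NewRuleTracker, can be reproduced with a small counter. ruleTracker below is a hypothetical local type written from that description, not the package's RuleTracker.

```go
package main

import "fmt"

const (
	unlimitedRetries  = -1
	defaultMaxRetries = 3
)

// ruleTracker is a hypothetical re-creation of the documented behaviour.
type ruleTracker struct {
	max     int
	attempt int
}

// newRuleTracker resolves the zero value to the default budget.
func newRuleTracker(maxRetries int) *ruleTracker {
	if maxRetries == 0 {
		maxRetries = defaultMaxRetries
	}
	return &ruleTracker{max: maxRetries}
}

// onFailure returns the 0-based attempt index and whether a retry is allowed.
func (rt *ruleTracker) onFailure() (int, bool) {
	attempt := rt.attempt
	rt.attempt++
	if rt.max == unlimitedRetries {
		return attempt, true
	}
	return attempt, attempt < rt.max
}

// reset clears the counter, e.g. after a successful tick.
func (rt *ruleTracker) reset() { rt.attempt = 0 }

func main() {
	rt := newRuleTracker(3)
	for call := 1; call <= 4; call++ {
		attempt, ok := rt.onFailure()
		fmt.Printf("call %d: attempt=%d, ok=%v\n", call, attempt, ok)
	}
}
```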

func (*RuleTracker) Reset added in v0.6.0

func (rt *RuleTracker) Reset()

Reset sets the attempt counter back to zero (e.g. after healthy progress).

type Runner

type Runner struct {
	// contains filtered or unexported fields
}

Runner orchestrates N tasks. When any task returns a permanent error the runner cancels all remaining tasks and performs graceful shutdown.

Runner does NOT handle OS signals — pass a cancellable context (e.g. via signal.NotifyContext).

func NewRunner

func NewRunner(opts RunnerOptions) *Runner

NewRunner creates a Runner with the given options.

func (*Runner) Add

func (r *Runner) Add(task *Task)

Add registers a task for concurrent execution. Panics if the same task is added twice. If Runner has a Baseline configured, it is appended as a failureHandler after the task's own TransientRule handlers.

func (*Runner) Wait

func (r *Runner) Wait(ctx context.Context) error

Wait starts all tasks concurrently and blocks until they all finish. When any task returns an error, all other tasks are cancelled via ctx. After all goroutines finish, shutdown hooks are called in LIFO order (reverse of Add). The ctx passed in controls the lifetime — the runner does NOT listen for OS signals; use signal.NotifyContext in the caller.
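
The LIFO shutdown order can be illustrated with a short sketch. runShutdown and recorder are hypothetical helpers; sharing a single timeout context across all hooks is an assumption of this example, since the documentation only states the order and the ShutdownTimeout default.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

type shutdownFunc func(ctx context.Context) error

// runShutdown calls hooks in reverse registration order and collects errors.
// Assumption: all hooks share one timeout context.
func runShutdown(hooks []shutdownFunc, timeout time.Duration) []error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	var errs []error
	for i := len(hooks) - 1; i >= 0; i-- {
		if err := hooks[i](ctx); err != nil {
			errs = append(errs, err)
		}
	}
	return errs
}

// recorder returns a hook that appends its name to order when called.
func recorder(name string, order *[]string) shutdownFunc {
	return func(ctx context.Context) error {
		*order = append(*order, name)
		return ctx.Err() // non-nil only if the shared deadline already passed
	}
}

func main() {
	var order []string
	hooks := []shutdownFunc{
		recorder("db", &order),    // registered first, closed last
		recorder("cache", &order),
		recorder("http", &order),  // registered last, closed first
	}
	errs := runShutdown(hooks, time.Second)
	fmt.Println(order, len(errs)) // [http cache db] 0
}
```

Reverse order is useful when later tasks depend on resources owned by earlier ones, for example an HTTP server that should stop before the database it uses is closed.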

type RunnerOptions added in v0.5.0

type RunnerOptions struct {
	ShutdownTimeout time.Duration // default 30s
	Logger          *slog.Logger  // nil = slog.Default()

	// Baseline is a set of policies silently applied to every task.
	// When set, Runner passes it to each Task at Add time.
	// Zero value means no baseline — tasks rely solely on their own TransientRules.
	Baseline Baseline
}

RunnerOptions configures a Runner.

type ShutdownFunc added in v0.5.0

type ShutdownFunc func(ctx context.Context) error

ShutdownFunc is called during graceful shutdown.

type Task added in v0.5.0

type Task struct {
	// contains filtered or unexported fields
}

Task is a self-contained unit of work with interval, retry and backoff support. It can be used standalone (via Wait) or managed by a Runner.

Task is NOT safe for concurrent use — call Wait from a single goroutine. Runner handles this automatically (one goroutine per task).

func NewIntervalTask added in v0.6.0

func NewIntervalTask(name string, interval time.Duration, work WorkFunc, rules []TransientRule, opts ...Option) *Task

NewIntervalTask creates a task that runs on a ticker loop. If rules is nil — any error kills the task. If rules is provided — transient errors are retried per their configuration, permanent errors (no matching rule) kill the task.

Each TransientRule binds an error to its own retry budget and backoff curve. TransientRule.MaxRetries limits consecutive failures for that rule. When a tick completes successfully, all rule trackers reset — so intermittent failures separated by successful ticks never accumulate toward MaxRetries.

Panics if work is nil or interval <= 0. Panics if any rule has nil Err, unsupported Err type, or nil Backoff.

func NewOneShotTask added in v0.6.0

func NewOneShotTask(name string, work WorkFunc, rules []TransientRule, opts ...Option) *Task

NewOneShotTask creates a task that executes once. If rules is nil — no retries, any error is fatal. If rules is provided — transient errors are retried per their configuration.

Each TransientRule binds an error to its own retry budget and backoff curve. TransientRule.MaxRetries limits consecutive failures for that rule — the budget is never reset mid-execution for one-shot tasks.

Panics if work is nil. Panics if any rule has nil Err, unsupported Err type, or nil Backoff.

func (*Task) Wait added in v0.5.0

func (t *Task) Wait(ctx context.Context) error

Wait runs the task to completion, respecting the configured retry policy, backoff and interval. It blocks until the task finishes or ctx is cancelled.

type TransientRule added in v0.6.0

type TransientRule struct {
	// Err is the error to match.
	// Must be an error value (for errors.Is) or a pointer to an error type (for errors.As).
	// Passing nil or an unsupported type panics at construction time.
	// Examples:
	//
	//	{Err: ErrTimeout}           // sentinel → errors.Is
	//	{Err: (*net.OpError)(nil)}  // pointer-to-type → errors.As
	Err error

	// MaxRetries limits consecutive retry attempts for this rule.
	//   0 (zero-value) → DefaultMaxRetries (3) — safe default.
	//  -1 (UnlimitedRetries) → no limit — explicit opt-in.
	//  >0 → exact retry count.
	MaxRetries int

	Backoff BackoffFunc

	// Key is the stable identifier for AttemptStore persistence.
	//
	// For sentinel errors (errors.New), Key is auto-derived from the error
	// message — safe to leave empty.
	//
	// For typed nil pointers (*net.OpError)(nil), Key MUST be set explicitly —
	// the error message is "<nil>" which is not a stable identifier.
	// Construction panics if Key is empty for a typed nil pointer.
	//
	// When using a persistent AttemptStore (Redis, SQLite), Key must be
	// stable across deployments. Reordering rules is safe as long as
	// each rule keeps its Key.
	Key string
}

TransientRule binds an error to its retry settings. Different errors can have different retry budgets and backoff curves.

The Err field accepts two forms:

  • error value (sentinel): matched via errors.Is
  • *T where T implements error: matched via errors.As

Examples:

{Err: ErrTimeout}           // sentinel → errors.Is
{Err: (*net.OpError)(nil)}  // pointer-to-type → errors.As
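
Why a typed nil pointer needs an explicit Key can be seen in a small sketch of key derivation. ruleKey is a hypothetical helper, and the "rule:" prefix is an assumption borrowed from the example key shown under AttemptStore; the package's actual derivation is not shown in this documentation.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"reflect"
)

// Sample patterns: one sentinel and one typed nil pointer.
var (
	errFetchIssues       = errors.New("fetch issues")
	opErrPattern   error = (*net.OpError)(nil)
)

// ruleKey prefers an explicit key and otherwise derives one from the
// error message. A typed nil pointer has no meaningful message, so an
// empty explicit key is rejected for it.
func ruleKey(err error, explicit string) string {
	if explicit != "" {
		return "rule:" + explicit
	}
	v := reflect.ValueOf(err)
	if v.Kind() == reflect.Pointer && v.IsNil() {
		panic("ruleKey: Key is required for a typed nil pointer")
	}
	return "rule:" + err.Error()
}

func main() {
	fmt.Println(ruleKey(errFetchIssues, ""))     // rule:fetch issues
	fmt.Println(ruleKey(opErrPattern, "net-op")) // rule:net-op
}
```

Because the key depends only on the message or the explicit Key, reordering rules leaves previously persisted counters attached to the right rule.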

func TransientGroup added in v0.7.0

func TransientGroup(maxRetries int, backoff BackoffFunc, errs ...error) []TransientRule

TransientGroup creates N rules with identical MaxRetries and BackoffFunc. Each rule gets its own independent retry budget — failures of one error do not count toward the budget of another.

Each error in errs must be a valid Err value (sentinel or typed nil pointer). See TransientRule.Err for details.

Note: typed nil pointers (*net.OpError)(nil) require explicit Key on each rule. TransientGroup does not set Key — use it only with sentinel errors.

Example:

longrun.TransientGroup(longrun.UnlimitedRetries, longrun.DefaultBackoff(),
    ErrFetchIssues,
    ErrStoreIssues,
)

type WorkFunc added in v0.5.0

type WorkFunc func(ctx context.Context) error

WorkFunc is the function that performs the actual work of a task.
