Documentation
Overview ¶
Package retry provides exponential backoff retry logic for handling transient errors.
This package implements retry mechanisms with exponential backoff and jitter, designed specifically for handling rate limit and quota errors from external APIs. It integrates with the workqueue package to enable graceful backoff when retries are exhausted.
Features ¶
- Exponential backoff with configurable base and maximum delays
- Random jitter to prevent thundering herd problems
- Context-aware cancellation during retry loops
- Integration with workqueue for deferred retry after exhaustion
- Generic retry function supporting any return type
- Configurable retry predicates to distinguish retryable from permanent errors
Usage ¶
The primary entry point is RetryWithBackoff, which executes a function with automatic retry on transient errors:
cfg := retry.DefaultRetryConfig()
result, err := retry.RetryWithBackoff(
    ctx,
    cfg,
    "fetch_data",
    isRetryable,
    func() (string, error) {
        return api.FetchData()
    },
)
For workqueue integration, use RequeueIfRetryable to convert exhausted retries into a delayed requeue:
err := doWorkWithRetries(ctx)
if requeueErr := retry.RequeueIfRetryable(ctx, err, isRetryable, "OpenAI"); requeueErr != nil {
    return requeueErr
}
return err
Configuration ¶
RetryConfig controls retry behavior:
- MaxRetries: Maximum number of retry attempts (0 disables retries)
- BaseBackoff: Initial backoff duration (doubled each attempt)
- MaxBackoff: Maximum backoff duration (caps exponential growth)
- MaxJitter: Maximum random jitter added to each backoff
Use DefaultRetryConfig() for sensible defaults tuned for quota-based rate limits, or construct a custom RetryConfig for specific requirements.
Integration Patterns ¶
Retry with workqueue integration:
func (w *Worker) Process(ctx context.Context, item string) error {
    result, err := retry.RetryWithBackoff(
        ctx,
        retry.DefaultRetryConfig(),
        "process_item",
        isQuotaError,
        func() (string, error) {
            return w.api.Process(item)
        },
    )
    if err != nil {
        if requeueErr := retry.RequeueIfRetryable(ctx, err, isQuotaError, "API"); requeueErr != nil {
            return requeueErr
        }
        return err
    }
    return w.store.Save(result)
}
Custom retry predicate:
func isRetryable(err error) bool {
    var apiErr *APIError
    if errors.As(err, &apiErr) {
        return apiErr.Code == 429 || apiErr.Code >= 500
    }
    return false
}
Constants ¶
const LLMBackoffDelay = 5 * time.Minute
LLMBackoffDelay is the base delay for requeueing work after LLM API errors exhaust inner retries. This prevents the workqueue from immediately retrying and contributing to API overload.
Variables ¶
This section is empty.
Functions ¶
func RequeueIfRetryable ¶ added in v0.2.0
func RequeueIfRetryable(ctx context.Context, err error, isRetryable func(error) bool, provider string) error
RequeueIfRetryable checks whether err is a retryable LLM API error and, if so, returns a workqueue.RequeueAfter to signal the workqueue to back off instead of immediately retrying. If the error is not retryable, it returns nil and the caller should handle the error normally.
Example ¶
ExampleRequeueIfRetryable demonstrates using RequeueIfRetryable to convert exhausted retries into a workqueue requeue.
package main

import (
    "context"
    "errors"
    "fmt"

    "chainguard.dev/driftlessaf/agents/executor/retry"
)

func main() {
    ctx := context.Background()

    // Simulate an error after retries are exhausted
    apiErr := errors.New("429 rate limit exceeded")
    isRetryable := func(err error) bool {
        return errors.Is(err, apiErr)
    }

    // Check if the error should trigger a requeue
    requeueErr := retry.RequeueIfRetryable(ctx, apiErr, isRetryable, "OpenAI")
    if requeueErr != nil {
        fmt.Println("requeue requested")
        return
    }
    fmt.Println("no requeue needed")
}
Output: requeue requested
Example (NonRetryable) ¶
ExampleRequeueIfRetryable_nonRetryable demonstrates that non-retryable errors do not trigger a requeue.
package main

import (
    "context"
    "errors"
    "fmt"

    "chainguard.dev/driftlessaf/agents/executor/retry"
)

func main() {
    ctx := context.Background()

    // Permanent error that should not be retried
    permErr := errors.New("permission denied")
    isRetryable := func(err error) bool {
        return false // Not retryable
    }

    requeueErr := retry.RequeueIfRetryable(ctx, permErr, isRetryable, "API")
    if requeueErr != nil {
        fmt.Println("requeue requested")
        return
    }
    fmt.Println("no requeue needed")
}
Output: no requeue needed
func RetryWithBackoff ¶
func RetryWithBackoff[T any](ctx context.Context, cfg RetryConfig, operation string, isRetryable func(error) bool, fn func() (T, error)) (T, error)
RetryWithBackoff executes the given function with exponential backoff retry. It only retries on errors that are classified as retryable by the provided isRetryable function.
Example ¶
ExampleRetryWithBackoff demonstrates using RetryWithBackoff to handle transient API errors with exponential backoff.
package main

import (
    "context"
    "errors"
    "fmt"

    "chainguard.dev/driftlessaf/agents/executor/retry"
)

func main() {
    ctx := context.Background()
    cfg := retry.DefaultRetryConfig()

    // Define what errors are retryable
    isRetryable := func(err error) bool {
        return errors.Is(err, errRateLimited)
    }

    // Simulate an API call that may fail transiently
    result, err := retry.RetryWithBackoff(
        ctx,
        cfg,
        "fetch_user",
        isRetryable,
        func() (string, error) {
            // Your API call here
            return "user-data", nil
        },
    )
    if err != nil {
        fmt.Printf("failed: %v\n", err)
        return
    }
    fmt.Printf("success: %s\n", result)
}
// errRateLimited is a sentinel error for examples.
var errRateLimited = errors.New("rate limited")
Output: success: user-data
Example (CustomConfig) ¶
ExampleRetryWithBackoff_customConfig demonstrates using a custom RetryConfig for specific retry requirements.
package main

import (
    "context"
    "errors"
    "fmt"
    "time"

    "chainguard.dev/driftlessaf/agents/executor/retry"
)

func main() {
    ctx := context.Background()

    // Custom configuration for aggressive retries
    cfg := retry.RetryConfig{
        MaxRetries:  3,
        BaseBackoff: 100 * time.Millisecond,
        MaxBackoff:  5 * time.Second,
        MaxJitter:   100 * time.Millisecond,
    }

    isRetryable := func(err error) bool {
        return errors.Is(err, errRateLimited)
    }

    result, err := retry.RetryWithBackoff(
        ctx,
        cfg,
        "quick_operation",
        isRetryable,
        func() (int, error) {
            return 42, nil
        },
    )
    if err != nil {
        fmt.Printf("failed: %v\n", err)
        return
    }
    fmt.Printf("result: %d\n", result)
}
// errRateLimited is a sentinel error for examples.
var errRateLimited = errors.New("rate limited")
Output: result: 42
Types ¶
type RetryConfig ¶
type RetryConfig struct {
    // MaxRetries is the maximum number of retry attempts (default: 5)
    // 0 means do not retry at all.
    MaxRetries int

    // BaseBackoff is the initial backoff duration (default: 1s, higher than typical due to quota nature)
    BaseBackoff time.Duration

    // MaxBackoff is the maximum backoff duration (default: 60s)
    MaxBackoff time.Duration

    // MaxJitter is the maximum random jitter added to backoff (default: 500ms)
    MaxJitter time.Duration
}
RetryConfig configures retry behavior for API calls. This is particularly useful for handling rate limit and transient server errors.
func DefaultRetryConfig ¶
func DefaultRetryConfig() RetryConfig
DefaultRetryConfig returns a retry configuration suitable for quota and rate limit errors. Uses longer backoffs than typical retry configs because quota-based rate limits often require more time to recover.
Example ¶
ExampleDefaultRetryConfig demonstrates getting the default retry configuration.
package main

import (
    "fmt"

    "chainguard.dev/driftlessaf/agents/executor/retry"
)

func main() {
    cfg := retry.DefaultRetryConfig()
    fmt.Printf("MaxRetries: %d\n", cfg.MaxRetries)
    fmt.Printf("BaseBackoff: %v\n", cfg.BaseBackoff)
    fmt.Printf("MaxBackoff: %v\n", cfg.MaxBackoff)
    fmt.Printf("MaxJitter: %v\n", cfg.MaxJitter)
}
Output:

MaxRetries: 5
BaseBackoff: 1s
MaxBackoff: 1m0s
MaxJitter: 500ms
func (RetryConfig) Validate ¶
func (c RetryConfig) Validate() error
Validate checks that the retry configuration has valid values.
Example ¶
ExampleRetryConfig_Validate demonstrates validating a retry configuration.
package main

import (
    "fmt"
    "time"

    "chainguard.dev/driftlessaf/agents/executor/retry"
)

func main() {
    // Valid configuration
    validCfg := retry.RetryConfig{
        MaxRetries:  5,
        BaseBackoff: time.Second,
        MaxBackoff:  time.Minute,
        MaxJitter:   500 * time.Millisecond,
    }
    if err := validCfg.Validate(); err != nil {
        fmt.Printf("validation failed: %v\n", err)
    } else {
        fmt.Println("valid configuration")
    }

    // Invalid configuration with negative values
    invalidCfg := retry.RetryConfig{
        MaxRetries:  -1,
        BaseBackoff: time.Second,
        MaxBackoff:  time.Minute,
        MaxJitter:   500 * time.Millisecond,
    }
    if err := invalidCfg.Validate(); err != nil {
        fmt.Printf("validation failed: %v\n", err)
    }
}
Output:

valid configuration
validation failed: max retries cannot be negative