retry

package
v2.7.2 Latest
Published: Nov 24, 2025 License: MIT Imports: 8 Imported by: 0

README

Retry Package

Production-ready retry mechanism with exponential backoff, built on cenkalti/backoff/v4. The package provides a small, reusable API for retrying operations, so callers do not have to hand-write retry loops.

Features

  • Exponential Backoff: Configurable backoff strategy with jitter
  • Context Support: Respects context cancellation and timeouts
  • OpenTelemetry Integration: Automatic tracing and logging
  • Permanent Errors: Stop retrying for non-transient errors
  • Flexible Configuration: Fluent API with sensible defaults
  • Type-Safe: Operations share one typed signature, func(ctx context.Context) error

Installation

go get github.com/jasoet/pkg/v2/retry

Quick Start

Basic Usage
package main

import (
    "context"
    "fmt"

    "github.com/jasoet/pkg/v2/retry"
)

func main() {
    ctx := context.Background()

    // Start from the defaults (5 retries, 500ms initial interval, 2x multiplier),
    // then override the operation name and the retry limit.
    cfg := retry.DefaultConfig().
        WithName("database.connect").
        WithMaxRetries(3)

    err := retry.Do(ctx, cfg, func(ctx context.Context) error {
        // Your operation that might fail; connectToDatabase is a placeholder.
        return connectToDatabase()
    })

    if err != nil {
        fmt.Printf("Failed after retries: %v\n", err)
    }
}
With OpenTelemetry
import (
    "context"
    "log"
    "time"

    "github.com/jasoet/pkg/v2/otel"
    "github.com/jasoet/pkg/v2/retry"
)

func main() {
    ctx := context.Background()

    // Setup OTel; tracerProvider and meterProvider are assumed to be
    // created elsewhere by your OpenTelemetry SDK setup.
    otelConfig := otel.NewConfig("my-service").
        WithTracerProvider(tracerProvider).
        WithMeterProvider(meterProvider)

    // Retry with OTel instrumentation
    cfg := retry.DefaultConfig().
        WithName("api.fetch").
        WithMaxRetries(5).
        WithInitialInterval(1 * time.Second).
        WithOTel(otelConfig)

    err := retry.Do(ctx, cfg, func(ctx context.Context) error {
        return fetchFromAPI(ctx)
    })
    if err != nil {
        log.Printf("fetch failed after retries: %v", err)
    }
}
Custom Backoff Strategy
cfg := retry.DefaultConfig().
    WithName("s3.upload").
    WithMaxRetries(10).
    WithInitialInterval(100 * time.Millisecond).
    WithMaxInterval(30 * time.Second).
    WithMultiplier(1.5)

err := retry.Do(ctx, cfg, func(ctx context.Context) error {
    return uploadToS3(data)
})
Permanent Errors (No Retry)
import (
    "fmt"

    "github.com/jasoet/pkg/v2/retry"
)

func validateAndProcess(data string) error {
    if len(data) == 0 {
        // This error should not be retried
        return retry.Permanent(fmt.Errorf("invalid data: empty string"))
    }

    // This error will be retried
    return processData(data)
}

err := retry.Do(ctx, cfg, func(ctx context.Context) error {
    return validateAndProcess(data)
})
With Custom Notifications
err := retry.DoWithNotify(ctx, cfg,
    func(ctx context.Context) error {
        return riskyOperation()
    },
    func(err error, backoff time.Duration) {
        log.Printf("Retry after %v: %v", backoff, err)
        // Send metrics, alerts, etc.
    },
)
Unlimited Retries (Use with Timeout)
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Minute)
defer cancel()

cfg := retry.DefaultConfig().
    WithName("poll.status").
    WithMaxRetries(0). // Unlimited retries
    WithInitialInterval(1 * time.Second).
    WithMaxInterval(10 * time.Second)

err := retry.Do(ctx, cfg, func(ctx context.Context) error {
    status, err := checkJobStatus()
    if err != nil {
        return err
    }
    if status != "completed" {
        return fmt.Errorf("job not ready: %s", status)
    }
    return nil
})

Configuration

Config Fields
type Config struct {
    // MaxRetries is the maximum number of retry attempts (0 means unlimited)
    // Default: 5
    MaxRetries uint64

    // InitialInterval is the initial retry interval
    // Default: 500ms
    InitialInterval time.Duration

    // MaxInterval caps the maximum retry interval
    // Default: 60s
    MaxInterval time.Duration

    // Multiplier is the exponential backoff multiplier
    // Default: 2.0 (each retry waits 2x longer)
    Multiplier float64

    // OperationName is used for logging and tracing
    // Default: "retry.operation"
    OperationName string

    // OTelConfig enables OpenTelemetry tracing and logging
    // Optional: if nil, no OTel instrumentation
    OTelConfig *otel.Config
}
Fluent API Methods
  • WithName(name string) - Set operation name for logging/tracing
  • WithMaxRetries(n uint64) - Set maximum retry attempts (0 = unlimited)
  • WithInitialInterval(d time.Duration) - Set initial retry interval
  • WithMaxInterval(d time.Duration) - Set maximum retry interval cap
  • WithMultiplier(m float64) - Set exponential backoff multiplier
  • WithOTel(cfg *otel.Config) - Enable OpenTelemetry instrumentation
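
The godoc signatures for these methods use a value receiver and return a Config, which suggests that chaining works on copies rather than mutating the original. The sketch below illustrates that pattern with a hypothetical stand-in type, miniConfig, that carries only two of the documented fields; it is not the package's own code.

```go
package main

import (
	"fmt"
	"time"
)

// miniConfig is a hypothetical stand-in with two of the documented fields.
type miniConfig struct {
	MaxRetries      uint64
	InitialInterval time.Duration
}

// WithMaxRetries has a value receiver, so c is a copy of the caller's config.
func (c miniConfig) WithMaxRetries(n uint64) miniConfig {
	c.MaxRetries = n // changes the copy only
	return c
}

// WithInitialInterval follows the same copy-and-return pattern.
func (c miniConfig) WithInitialInterval(d time.Duration) miniConfig {
	c.InitialInterval = d
	return c
}

func main() {
	base := miniConfig{MaxRetries: 5, InitialInterval: 500 * time.Millisecond}
	fast := base.WithMaxRetries(3).WithInitialInterval(100 * time.Millisecond)

	fmt.Println(base.MaxRetries, base.InitialInterval) // 5 500ms
	fmt.Println(fast.MaxRetries, fast.InitialInterval) // 3 100ms
}
```

Under this pattern a shared base configuration can be specialized per call site without one caller affecting another. Pointer fields such as OTelConfig would still be shared between copies.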

How It Works

  1. Exponential Backoff: Each retry waits longer than the previous one

    • Interval = InitialInterval × Multiplier^(attempt-1)
    • Capped at MaxInterval
  2. Jitter: Built-in randomization to prevent thundering herd

  3. Context Awareness:

    • Respects context cancellation
    • Respects context deadlines/timeouts
    • Returns context error when cancelled
  4. Permanent Errors:

    • Wrap errors with retry.Permanent() to stop retrying
    • Useful for validation errors, 4xx HTTP errors, etc.
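
The interval formula above can be checked with a few lines of standard-library Go. This is a sketch of the arithmetic only, not the package's or cenkalti/backoff's implementation; nominalInterval and jitterBounds are hypothetical helpers, and the 0.5 randomization factor is an assumption chosen for illustration.

```go
package main

import (
	"fmt"
	"math"
	"time"
)

// nominalInterval returns the un-jittered wait before a retry attempt (1-based):
// initial × multiplier^(attempt-1), capped at maxInterval.
func nominalInterval(initial, maxInterval time.Duration, multiplier float64, attempt int) time.Duration {
	if attempt < 1 {
		attempt = 1
	}
	raw := float64(initial) * math.Pow(multiplier, float64(attempt-1))
	if raw >= float64(maxInterval) {
		return maxInterval
	}
	return time.Duration(raw)
}

// jitterBounds returns the range a randomized interval would fall in,
// assuming symmetric jitter of ±factor around the nominal interval.
func jitterBounds(interval time.Duration, factor float64) (lo, hi time.Duration) {
	delta := time.Duration(factor * float64(interval))
	return interval - delta, interval + delta
}

func main() {
	// Documented defaults: 500ms initial interval, 2.0 multiplier, 60s cap.
	for attempt := 1; attempt <= 8; attempt++ {
		d := nominalInterval(500*time.Millisecond, 60*time.Second, 2.0, attempt)
		lo, hi := jitterBounds(d, 0.5) // 0.5 is an assumed factor
		fmt.Printf("attempt %d: nominal %v, jittered between %v and %v\n", attempt, d, lo, hi)
	}
}
```

With the default values the nominal intervals double from 500ms up to 32s, and the eighth would be 64s, so it is capped at the 60s MaxInterval.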

Examples

See examples/retry for complete working examples:

  • Basic retry with defaults
  • Custom backoff configuration
  • OpenTelemetry integration
  • Permanent error handling
  • Context cancellation
  • Unlimited retries with timeout

Best Practices

  1. Set Appropriate MaxRetries: Avoid retrying forever. If an operation needs unlimited retries (MaxRetries = 0), bound it with a context timeout.

  2. Use Permanent Errors: Mark non-transient errors as permanent to avoid unnecessary retries

  3. Configure Backoff Based on Service:

    • Fast operations: 100ms initial, 1.5x multiplier
    • Network calls: 500ms-1s initial, 2x multiplier
    • Heavy operations: 1s+ initial, 2-3x multiplier
  4. Add Context Timeout: Always use context with timeout to prevent infinite waiting

  5. Monitor with OTel: Enable OpenTelemetry for production visibility
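
To show why a context timeout is what bounds an otherwise unlimited retry loop, here is a minimal standard-library sketch. It is not the package's implementation: waitOrCancel, pollUntil, and demo are hypothetical names, and a fixed pause stands in for the exponential backoff.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// waitOrCancel blocks for d, or returns ctx.Err() early if ctx ends first.
func waitOrCancel(ctx context.Context, d time.Duration) error {
	timer := time.NewTimer(d)
	defer timer.Stop()
	select {
	case <-timer.C:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// pollUntil retries op with a fixed pause and no attempt limit.
// Only success or the end of ctx stops the loop.
func pollUntil(ctx context.Context, pause time.Duration, op func(context.Context) error) (int, error) {
	attempts := 0
	for {
		attempts++
		if err := op(ctx); err == nil {
			return attempts, nil
		}
		if err := waitOrCancel(ctx, pause); err != nil {
			return attempts, err
		}
	}
}

// demo polls an operation that never succeeds under a 50ms deadline.
func demo() (int, bool) {
	ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel()
	attempts, err := pollUntil(ctx, 10*time.Millisecond, func(context.Context) error {
		return errors.New("job not ready")
	})
	return attempts, errors.Is(err, context.DeadlineExceeded)
}

func main() {
	attempts, timedOut := demo()
	fmt.Printf("attempts=%d timed_out=%t\n", attempts, timedOut)
}
```

Without the deadline, the loop in this sketch would never return for an operation that keeps failing, which is the situation the timeout advice guards against.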

Common Use Cases

Database Connection
cfg := retry.DefaultConfig().
    WithName("database.connect").
    WithMaxRetries(5).
    WithInitialInterval(500 * time.Millisecond)

err := retry.Do(ctx, cfg, func(ctx context.Context) error {
    // Assumes db is a *sql.DB; PingContext honors cancellation of ctx.
    return db.PingContext(ctx)
})
HTTP API Call
cfg := retry.DefaultConfig().
    WithName("api.call").
    WithMaxRetries(3).
    WithInitialInterval(1 * time.Second)

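var response *http.Response
err := retry.Do(ctx, cfg, func(ctx context.Context) error {
    // Build the request with ctx so cancellation also aborts an in-flight call.
    req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
    if err != nil {
        return retry.Permanent(err) // a malformed request will not fix itself
    }
    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        return err
    }
    if resp.StatusCode >= 500 {
        resp.Body.Close()
        return fmt.Errorf("server error: %d", resp.StatusCode)
    }
    if resp.StatusCode >= 400 {
        resp.Body.Close()
        return retry.Permanent(fmt.Errorf("client error: %d", resp.StatusCode))
    }
    response = resp // the caller is responsible for closing response.Body
    return nil
})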
File Upload with S3
cfg := retry.DefaultConfig().
    WithName("s3.upload").
    WithMaxRetries(10).
    WithInitialInterval(200 * time.Millisecond).
    WithMaxInterval(30 * time.Second)

err := retry.Do(ctx, cfg, func(ctx context.Context) error {
    return s3Client.Upload(ctx, bucket, key, data)
})

Testing

The package includes comprehensive tests covering:

  • Success scenarios (first attempt, after retries)
  • Failure scenarios (all retries exhausted)
  • Context cancellation and timeout
  • Exponential backoff behavior
  • Permanent errors
  • Unlimited retries
  • Custom notifications

Run tests:

go test -v -race ./retry

Performance Considerations

  • Low Overhead: Minimal allocations, uses cenkalti/backoff efficiently
  • Context-Aware: Respects cancellation immediately
  • No Goroutine Leaks: Properly cleans up on context cancellation

Related Packages

  • cenkalti/backoff - Underlying backoff implementation
  • otel - OpenTelemetry integration for observability

License

MIT License - see LICENSE for details

Documentation

Functions

func Do

func Do(ctx context.Context, cfg Config, operation Operation) error

Do executes the operation with retry logic using exponential backoff. It returns nil if the operation succeeds, or the last error if all retries are exhausted.

Example:

cfg := retry.DefaultConfig().
	WithName("database.connect").
	WithMaxRetries(3).
	WithOTel(otelConfig)

err := retry.Do(ctx, cfg, func(ctx context.Context) error {
	return db.Ping()
})

func DoWithNotify

func DoWithNotify(
	ctx context.Context,
	cfg Config,
	operation Operation,
	notifyFunc func(error, time.Duration),
) error

DoWithNotify is like Do but calls notifyFunc on each retry with the error and backoff duration. This is useful for custom logging or metrics.

Example:

err := retry.DoWithNotify(ctx, cfg, operation, func(err error, backoff time.Duration) {
	log.Printf("Retry after error: %v (waiting %v)", err, backoff)
})
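
A notify callback can also accumulate simple statistics instead of logging. The sketch below assumes only the documented callback shape, func(error, time.Duration); retryStats is a hypothetical type, and the callback is invoked by hand here so the example runs without the package.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// retryStats records what a notify callback observes across retries.
// It is not safe for concurrent use; this sketch assumes sequential calls.
type retryStats struct {
	Retries   int
	TotalWait time.Duration
	LastErr   error
}

// Notify matches the func(error, time.Duration) shape of notifyFunc.
func (s *retryStats) Notify(err error, backoff time.Duration) {
	s.Retries++
	s.TotalWait += backoff
	s.LastErr = err
}

func main() {
	stats := &retryStats{}

	// In real code this would be passed along:
	//   retry.DoWithNotify(ctx, cfg, operation, stats.Notify)
	// Here the callback is called directly to show what it records.
	var notify func(error, time.Duration) = stats.Notify
	notify(errors.New("timeout"), 500*time.Millisecond)
	notify(errors.New("timeout"), 1*time.Second)

	fmt.Printf("retries=%d total_wait=%v last_err=%v\n",
		stats.Retries, stats.TotalWait, stats.LastErr)
	// prints: retries=2 total_wait=1.5s last_err=timeout
}
```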

func Permanent

func Permanent(err error) error

Permanent wraps an error to indicate it should not be retried. Use this when an error is not transient and retrying would be pointless.

Example:

return retry.Permanent(fmt.Errorf("invalid configuration"))
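
For readers curious how a retry loop can tell permanent errors from transient ones, one common approach is a marker type detected with errors.As. The sketch below is an illustration under that assumption, not the package's actual mechanism; permanentError, markPermanent, and shouldRetry are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// permanentError is a hypothetical marker wrapper for non-retryable errors.
type permanentError struct{ err error }

func (p *permanentError) Error() string { return p.err.Error() }

// Unwrap keeps the original error reachable through errors.Is and errors.As.
func (p *permanentError) Unwrap() error { return p.err }

// markPermanent wraps err so that shouldRetry reports false for it.
func markPermanent(err error) error { return &permanentError{err: err} }

// shouldRetry reports whether a retry loop should try again after err.
func shouldRetry(err error) bool {
	if err == nil {
		return false // success, nothing to retry
	}
	var p *permanentError
	return !errors.As(err, &p)
}

var errEmpty = errors.New("invalid data: empty string")

func main() {
	transient := errors.New("connection reset")
	permanent := markPermanent(errEmpty)
	wrapped := fmt.Errorf("validate: %w", permanent)

	fmt.Println(shouldRetry(transient))       // true
	fmt.Println(shouldRetry(permanent))       // false
	fmt.Println(shouldRetry(wrapped))         // false, the marker is found through %w
	fmt.Println(errors.Is(wrapped, errEmpty)) // true, the cause stays inspectable
}
```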

Types

type Config

type Config struct {
	// MaxRetries is the maximum number of retry attempts (0 means unlimited).
	// Default: 5
	MaxRetries uint64

	// InitialInterval is the initial retry interval.
	// Default: 500ms
	InitialInterval time.Duration

	// MaxInterval caps the maximum retry interval.
	// Default: 60s
	MaxInterval time.Duration

	// Multiplier is the exponential backoff multiplier.
	// Default: 2.0 (each retry waits 2x longer)
	Multiplier float64

	// OperationName is used for logging and tracing.
	// Default: "retry.operation"
	OperationName string

	// OTelConfig enables OpenTelemetry tracing and logging.
	// Optional: if nil, no OTel instrumentation
	OTelConfig *pkgotel.Config
}

Config holds retry configuration with exponential backoff.

func DefaultConfig

func DefaultConfig() Config

DefaultConfig returns a Config with sensible defaults.

func (Config) WithInitialInterval

func (c Config) WithInitialInterval(interval time.Duration) Config

WithInitialInterval sets the initial retry interval.

func (Config) WithMaxInterval

func (c Config) WithMaxInterval(interval time.Duration) Config

WithMaxInterval sets the maximum retry interval.

func (Config) WithMaxRetries

func (c Config) WithMaxRetries(maxRetries uint64) Config

WithMaxRetries sets the maximum number of retry attempts.

func (Config) WithMultiplier

func (c Config) WithMultiplier(multiplier float64) Config

WithMultiplier sets the exponential backoff multiplier.

func (Config) WithName

func (c Config) WithName(name string) Config

WithName sets the operation name for logging and tracing.

func (Config) WithOTel

func (c Config) WithOTel(otelConfig *pkgotel.Config) Config

WithOTel adds OpenTelemetry configuration to the retry config.

type Operation

type Operation func(ctx context.Context) error

Operation is a function that may fail and should be retried. Return nil to indicate success, or an error to trigger a retry.
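
Because an Operation returns only an error, results are usually captured through a closure, as the HTTP example in the README does with its response variable. The sketch below shows one way to package that pattern; capture and demo are hypothetical helpers, Operation is redeclared locally so the example stands alone, and a plain loop stands in for retry.Do.

```go
package main

import (
	"context"
	"fmt"
)

// Operation mirrors the documented signature; redeclared so the sketch stands alone.
type Operation func(ctx context.Context) error

// capture adapts a value-returning function to an Operation.
// The value from the successful attempt is stored in *out.
func capture[T any](out *T, fn func(ctx context.Context) (T, error)) Operation {
	return func(ctx context.Context) error {
		v, err := fn(ctx)
		if err != nil {
			return err
		}
		*out = v
		return nil
	}
}

// demo runs an operation that fails twice and then succeeds.
func demo() (int, string) {
	calls := 0
	var status string
	op := capture(&status, func(ctx context.Context) (string, error) {
		calls++
		if calls < 3 {
			return "", fmt.Errorf("attempt %d failed", calls)
		}
		return "completed", nil
	})

	// Stand-in for retry.Do: call the operation until it succeeds.
	for op(context.Background()) != nil {
	}
	return calls, status
}

func main() {
	calls, status := demo()
	fmt.Println(calls, status) // 3 completed
}
```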
