retry

package
v1.8.0
Published: Jun 29, 2025 License: MIT Imports: 10 Imported by: 0

README

Retry

Overview

The Retry component provides functionality for retrying operations with configurable backoff and jitter. It implements an exponential backoff algorithm with jitter to help distribute retry attempts and prevent thundering herd problems in distributed systems.

Features

  • Exponential Backoff: Automatically increase wait time between retry attempts
  • Configurable Jitter: Add randomness to retry intervals to prevent synchronized retries
  • Flexible Error Handling: Customize which errors should trigger retries
  • Context Awareness: Respect context cancellation and timeouts
  • Telemetry Integration: Built-in OpenTelemetry tracing for monitoring retry operations
  • Logging Integration: Comprehensive logging of retry attempts and outcomes
  • Type-Safe API: Generic functions for type-safe operation with Go generics

Installation

go get github.com/abitofhelp/servicelib/retry

Quick Start

package main

import (
    "context"
    "fmt"
    "time"
    
    "github.com/abitofhelp/servicelib/retry"
    "github.com/abitofhelp/servicelib/errors"
)

func main() {
    // Create a retry configuration
    config := retry.DefaultConfig().
        WithMaxRetries(5).
        WithInitialBackoff(100 * time.Millisecond).
        WithMaxBackoff(2 * time.Second)
    
    // Define a function to retry
    operation := func(ctx context.Context) error {
        // Your operation that might fail
        return callExternalService(ctx)
    }
    
    // Define which errors should be retried
    isRetryable := func(err error) bool {
        return errors.IsNetworkError(err) || errors.IsTimeout(err)
    }
    
    // Execute with retry
    ctx := context.Background()
    err := retry.Do(ctx, operation, config, isRetryable)
    if err != nil {
        if errors.IsRetryError(err) {
            fmt.Println("Operation failed after maximum retry attempts")
        } else {
            fmt.Println("Operation failed with non-retryable error:", err)
        }
        return
    }
    
    fmt.Println("Operation succeeded")
}

func callExternalService(ctx context.Context) error {
    // Simulate an external service call
    return nil
}

API Documentation

Core Types
Config

Configuration for retry operations.

type Config struct {
    MaxRetries      int           // Maximum number of retry attempts
    InitialBackoff  time.Duration // Initial backoff duration
    MaxBackoff      time.Duration // Maximum backoff duration
    BackoffFactor   float64       // Factor by which the backoff increases
    JitterFactor    float64       // Factor for random jitter (0-1)
    RetryableErrors []error       // Errors that are considered retryable
}

Options

Additional options for retry operations.

type Options struct {
    // Logger is used for logging retry operations
    Logger *logging.ContextLogger
    // Tracer is used for tracing retry operations
    Tracer telemetry.Tracer
}

RetryableFunc

A function that can be retried.

type RetryableFunc func(ctx context.Context) error

IsRetryableError

A function that determines if an error is retryable.

type IsRetryableError func(err error) bool

Key Methods
DefaultConfig

Returns a default retry configuration.

func DefaultConfig() Config

DefaultOptions

Returns default options for retry operations.

func DefaultOptions() Options

Do

Executes the given function with retry logic using default options.

func Do(ctx context.Context, fn RetryableFunc, config Config, isRetryable IsRetryableError) error

DoWithOptions

Executes the given function with retry logic and custom options.

func DoWithOptions(ctx context.Context, fn RetryableFunc, config Config, isRetryable IsRetryableError, options Options) error

Examples

Currently, there are no dedicated examples for the retry package in the EXAMPLES directory. The following code snippets demonstrate common usage patterns:

Basic Retry with Default Configuration

// Define a function to retry
operation := func(ctx context.Context) error {
    // Your operation that might fail
    return callExternalService(ctx)
}

// Execute with default configuration
err := retry.Do(ctx, operation, retry.DefaultConfig(), nil)
if err != nil {
    // Handle error
}

Custom Retry Configuration

// Create a custom retry configuration
config := retry.DefaultConfig().
    WithMaxRetries(5).                          // Maximum 5 retry attempts
    WithInitialBackoff(200 * time.Millisecond). // Start with 200ms backoff
    WithMaxBackoff(5 * time.Second).            // Cap backoff at 5 seconds
    WithBackoffFactor(2.5).                     // Increase backoff by factor of 2.5
    WithJitterFactor(0.3)                       // Add 30% jitter

// Execute with custom configuration
err := retry.Do(ctx, operation, config, nil)

Custom Error Handling

// Define which errors should be retried
isRetryable := func(err error) bool {
    // Retry network errors, timeouts, and rate limit errors
    return errors.IsNetworkError(err) ||
           errors.IsTimeout(err) ||
           errors.Is(err, errors.New(errors.ResourceExhaustedCode, ""))
}

// Execute with custom error handling
err := retry.Do(ctx, operation, retry.DefaultConfig(), isRetryable)

Retry with Telemetry and Logging

// Create a logger
logger := logging.NewContextLogger(zapLogger)

// Attach an OpenTelemetry tracer. WithOtelTracer accepts a trace.Tracer,
// so otelTracer is passed directly rather than wrapped with NewOtelTracer.
options := retry.DefaultOptions().WithOtelTracer(otelTracer)

// The reference below documents no WithLogger method on Options,
// so the exported Logger field is set directly.
options.Logger = logger

// Execute with custom options
err := retry.DoWithOptions(ctx, operation, retry.DefaultConfig(), isRetryable, options)

Best Practices

  1. Choose Appropriate Retry Limits: Set MaxRetries based on your operation's importance and expected recovery time
  2. Use Exponential Backoff: The default BackoffFactor of 2.0 works well for most cases
  3. Always Add Jitter: Use JitterFactor to prevent synchronized retries in distributed systems
  4. Be Selective About Retrying: Only retry errors that are likely to be transient
  5. Respect Context Cancellation: Always pass a proper context that can be cancelled or timed out

Troubleshooting

Common Issues
Too Many Retries

If your application is performing too many retries, consider:

  • Reducing MaxRetries
  • Increasing InitialBackoff
  • Being more selective in your isRetryable function

Long Recovery Times

If recovery takes too long:

  • Decrease BackoffFactor
  • Decrease MaxBackoff
  • Increase MaxRetries for critical operations

High Latency Spikes

If you're seeing latency spikes:

  • Increase JitterFactor to spread out retry attempts
  • Ensure your isRetryable function is correctly identifying transient errors

Related Components

  • Errors - Error handling and classification for retry operations
  • Circuit - Circuit breaker pattern for preventing cascading failures
  • Telemetry - Telemetry integration for monitoring retry operations

Contributing

Contributions to this component are welcome! Please see the Contributing Guide for more information.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Documentation

Overview

Package retry provides functionality for retrying operations with configurable backoff and jitter.

This package implements a flexible retry mechanism that can be used to retry operations that may fail due to transient errors. It supports:

  • Configurable maximum retry attempts
  • Exponential backoff with configurable initial and maximum durations
  • Jitter to prevent thundering herd problems
  • Customizable retry conditions
  • Integration with OpenTelemetry for tracing
  • Comprehensive logging of retry attempts

The package distinguishes between two types of errors related to retries:

  • RetryError: Used internally by this package to indicate that all retry attempts have been exhausted.
  • RetryableError: Used by external systems to indicate that an error should be retried.

Example usage:

// Define a function to retry
retryableFunc := func(ctx context.Context) error {
    // Some operation that might fail transiently
    return callExternalService(ctx)
}

// Define a function to determine if an error is retryable
isRetryable := func(err error) bool {
    return errors.IsTransientError(err)
}

// Configure retry parameters
config := retry.DefaultConfig().
    WithMaxRetries(5).
    WithInitialBackoff(100 * time.Millisecond).
    WithMaxBackoff(2 * time.Second)

// Execute with retry
err := retry.Do(ctx, retryableFunc, config, isRetryable)
if err != nil {
    // Handle error after all retries have failed
}

This package uses RetryError from the errors/infra package to represent errors that occur during retry operations. This is different from RetryableError in the errors/wrappers package, which is used to wrap errors that should be retried by external systems.

Functions

func Do

func Do(ctx context.Context, fn RetryableFunc, config Config, isRetryable IsRetryableError) error

Do executes the given function with retry logic using default options. This is a convenience wrapper around DoWithOptions that uses DefaultOptions().

Parameters:

  • ctx: The context for the operation. Can be used to cancel the retry operation.
  • fn: The function to execute with retry logic.
  • config: The retry configuration parameters.
  • isRetryable: A function that determines if an error is retryable.

Returns:

  • The error from the last execution of the function, or nil if successful.

func DoWithOptions

func DoWithOptions(ctx context.Context, fn RetryableFunc, config Config, isRetryable IsRetryableError, options Options) error

DoWithOptions executes the given function with retry logic and custom options. It will retry the function according to the provided configuration and retry condition. The function will be retried if it returns an error and the isRetryable function returns true. The retry operation can be canceled by canceling the context.

Parameters:

  • ctx: The context for the operation. Can be used to cancel the retry operation.
  • fn: The function to execute with retry logic.
  • config: The retry configuration parameters.
  • isRetryable: A function that determines if an error is retryable.
  • options: Additional options for the retry operation, such as logging and tracing.

Returns:

  • The error from the last execution of the function, or nil if successful.

func IsNetworkError deprecated

func IsNetworkError(err error) bool

IsNetworkError checks if an error is a network-related error.

Deprecated: Use errors.IsNetworkError instead. This function is maintained for backward compatibility and will be removed in a future version.

Parameters:

  • err: The error to check.

Returns:

  • true if the error is a network-related error, false otherwise.

func IsTimeoutError deprecated

func IsTimeoutError(err error) bool

IsTimeoutError checks if an error is a timeout error.

Deprecated: Use errors.IsTimeout instead. This function is maintained for backward compatibility and will be removed in a future version.

Parameters:

  • err: The error to check.

Returns:

  • true if the error indicates a timeout, false otherwise.

func IsTransientError deprecated

func IsTransientError(err error) bool

IsTransientError checks if an error is a transient error that may be resolved by retrying.

Deprecated: Use errors.IsTransientError instead. This function is maintained for backward compatibility and will be removed in a future version.

Parameters:

  • err: The error to check.

Returns:

  • true if the error is likely transient and may be resolved by retrying, false otherwise.

Types

type Config

type Config struct {
	// MaxRetries is the maximum number of retry attempts that will be made.
	// The total number of attempts will be MaxRetries + 1 (including the initial attempt).
	MaxRetries int

	// InitialBackoff is the duration to wait before the first retry attempt.
	// This value will be multiplied by BackoffFactor for subsequent retries.
	InitialBackoff time.Duration

	// MaxBackoff is the maximum duration to wait between retry attempts.
	// The backoff duration will not exceed this value, regardless of the BackoffFactor.
	MaxBackoff time.Duration

	// BackoffFactor is the factor by which the backoff duration increases after each retry.
	// For example, a BackoffFactor of 2.0 will double the backoff duration after each retry.
	BackoffFactor float64

	// JitterFactor is a factor for adding random jitter to the backoff duration.
	// It should be a value between 0 and 1, where 0 means no jitter and 1 means maximum jitter.
	// Jitter helps prevent multiple retries from occurring simultaneously (thundering herd).
	JitterFactor float64

	// RetryableErrors is a list of specific errors that should be considered retryable.
	// If an error matches one of these errors (using errors.Is), it will be retried.
	RetryableErrors []error
}

Config contains retry configuration parameters. It defines how retry operations should behave, including the number of retries, backoff durations, and jitter factors.

func DefaultConfig

func DefaultConfig() Config

DefaultConfig returns a default retry configuration with reasonable values. The default configuration includes:

  • 3 maximum retry attempts (4 total attempts including the initial one)
  • 100ms initial backoff
  • 2s maximum backoff
  • 2.0 backoff factor (doubles the backoff after each retry)
  • 0.2 jitter factor (adds up to 20% random jitter to the backoff)

Returns:

  • A Config instance with default values.

func (Config) WithBackoffFactor

func (c Config) WithBackoffFactor(backoffFactor float64) Config

WithBackoffFactor sets the factor by which the backoff duration increases after each retry. If a non-positive value is provided, it will be set to 1.0 (no increase).

Parameters:

  • backoffFactor: The factor by which the backoff increases.

Returns:

  • A new Config instance with the updated BackoffFactor value.

func (Config) WithInitialBackoff

func (c Config) WithInitialBackoff(initialBackoff time.Duration) Config

WithInitialBackoff sets the initial backoff duration. If a non-positive value is provided, it will be set to 1ms.

Parameters:

  • initialBackoff: The duration to wait before the first retry attempt.

Returns:

  • A new Config instance with the updated InitialBackoff value.

func (Config) WithJitterFactor

func (c Config) WithJitterFactor(jitterFactor float64) Config

WithJitterFactor sets the factor for adding random jitter to the backoff duration. The value should be between 0 and 1, where 0 means no jitter and 1 means maximum jitter. If a value outside this range is provided, it will be clamped to the valid range.

Parameters:

  • jitterFactor: The factor for random jitter (0-1).

Returns:

  • A new Config instance with the updated JitterFactor value.

func (Config) WithMaxBackoff

func (c Config) WithMaxBackoff(maxBackoff time.Duration) Config

WithMaxBackoff sets the maximum backoff duration. If a non-positive value is provided, it will be set to the initial backoff duration.

Parameters:

  • maxBackoff: The maximum duration to wait between retry attempts.

Returns:

  • A new Config instance with the updated MaxBackoff value.

func (Config) WithMaxRetries

func (c Config) WithMaxRetries(maxRetries int) Config

WithMaxRetries sets the maximum number of retry attempts. If a negative value is provided, it will be set to 0 (no retries).

Parameters:

  • maxRetries: The maximum number of retry attempts.

Returns:

  • A new Config instance with the updated MaxRetries value.
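Because the With methods use a value receiver and return a Config, each call yields a modified copy and leaves the original untouched. The sketch below mirrors that pattern and the documented clamping rules for two of the setters; sketchConfig is a hypothetical type written for illustration, not the package's source.

```go
package main

import "fmt"

// sketchConfig is a hypothetical, reduced version of Config.
type sketchConfig struct {
	MaxRetries   int
	JitterFactor float64
}

// WithMaxRetries returns a copy with MaxRetries set; negative values become 0.
func (c sketchConfig) WithMaxRetries(n int) sketchConfig {
	if n < 0 {
		n = 0
	}
	c.MaxRetries = n // c is a copy, so the caller's value is not modified
	return c
}

// WithJitterFactor returns a copy with JitterFactor clamped to [0, 1].
func (c sketchConfig) WithJitterFactor(j float64) sketchConfig {
	if j < 0 {
		j = 0
	} else if j > 1 {
		j = 1
	}
	c.JitterFactor = j
	return c
}

func main() {
	base := sketchConfig{MaxRetries: 3, JitterFactor: 0.2}
	tuned := base.WithMaxRetries(-1).WithJitterFactor(1.5)

	fmt.Printf("base:  %+v\n", base)  // unchanged
	fmt.Printf("tuned: %+v\n", tuned) // out-of-range inputs were clamped
}
```

One consequence of this design is that a shared base configuration can be specialized per call site without any risk of one caller's changes leaking into another's.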

func (Config) WithRetryableErrors

func (c Config) WithRetryableErrors(retryableErrors []error) Config

WithRetryableErrors sets the list of specific errors that should be considered retryable. If an error matches one of these errors (using errors.Is), it will be retried.

Parameters:

  • retryableErrors: A slice of errors that should be considered retryable.

Returns:

  • A new Config instance with the updated RetryableErrors value.

type IsRetryableError

type IsRetryableError func(err error) bool

IsRetryableError is a function that determines if an error is retryable. It takes an error parameter and returns a boolean indicating whether the error should be retried. Return true to retry the operation, false to stop retrying and return the error.

type Options

type Options struct {
	// Logger is used for logging retry operations.
	// If nil, a no-op logger will be used.
	Logger *logging.ContextLogger

	// Tracer is used for tracing retry operations.
	// It provides integration with OpenTelemetry for distributed tracing.
	Tracer telemetry.Tracer
}

Options contains additional options for the retry operation. These options are not directly related to the retry behavior itself, but provide additional functionality like logging and tracing.

func DefaultOptions

func DefaultOptions() Options

DefaultOptions returns default options for retry operations. The default options include:

  • No logger (a no-op logger will be used)
  • A no-op tracer (no OpenTelemetry integration)

Returns:

  • An Options instance with default values.

func (Options) WithOtelTracer

func (o Options) WithOtelTracer(tracer trace.Tracer) Options

WithOtelTracer returns Options with an OpenTelemetry tracer. This allows users to opt-in to OpenTelemetry tracing if they need it.

Parameters:

  • tracer: An OpenTelemetry trace.Tracer instance.

Returns:

  • A new Options instance with the provided OpenTelemetry tracer.

type RetryableFunc

type RetryableFunc func(ctx context.Context) error

RetryableFunc is a function that can be retried. It takes a context.Context parameter and returns an error. If the function returns nil, it is considered successful. If it returns an error, the retry mechanism will determine whether to retry based on the IsRetryableError function.

type Span

type Span interface {
	// End completes the span
	End()
	// SetAttributes sets attributes on the span
	SetAttributes(attributes ...attribute.KeyValue)
	// RecordError records an error on the span
	RecordError(err error)
}

Span represents a tracing span

type Tracer

type Tracer interface {
	// Start creates a new span
	Start(ctx context.Context, name string) (context.Context, Span)
}

Tracer is an interface for creating spans

func NewNoopTracer

func NewNoopTracer() Tracer

NewNoopTracer creates a new no-op tracer

func NewOtelTracer

func NewOtelTracer(tracer trace.Tracer) Tracer

NewOtelTracer creates a new OpenTelemetry tracer
