guardrails

package
v1.31.2 Latest
Warning

This package is not in the latest version of its module.

Published: Apr 16, 2026 License: MIT Imports: 9 Imported by: 0

Documentation

Overview

Package guardrails provides composable safety controls for Hector agents.

Guardrails can be applied at multiple interception points:

  • Input: Validate and sanitize user input before agent processing
  • Output: Filter and redact LLM responses before returning to users
  • Tool: Authorize and validate tool calls before execution

Architecture

Guardrails integrate with Hector's existing callback system:

  • InputGuardrail -> BeforeAgentCallback
  • OutputGuardrail -> AfterModelCallback
  • ToolGuardrail -> BeforeToolCallback

Usage

Create guardrails and chain them together:

chain := guardrails.NewInputChain(
    input.NewLengthValidator(10, 10000),
    input.NewInjectionDetector(),
    input.NewSanitizer(),
)

agent, err := builder.NewAgent("secure-agent").
    WithLLM(llm).
    WithInputGuardrails(chain.Guardrails()...).
    Build()
if err != nil {
    // handle the build error instead of discarding it
}

Configuration

Guardrails can be configured programmatically or via YAML:

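config, _ := guardrails.LoadConfig("guardrails.yaml") // check the error in real code
// BuildInputChain takes an InputChainBuilders value; set its factory fields as needed.
chain := config.BuildInputChain(guardrails.InputChainBuilders{})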

Functions

func SaveConfig

func SaveConfig(config *Config, path string) error

SaveConfig saves a guardrails configuration to a YAML file.

func ToAfterModelCallback

func ToAfterModelCallback(chain *OutputChain) llmagent.AfterModelCallback

ToAfterModelCallback converts an output guardrail chain to AfterModelCallback. If the chain modifies output, the modified response is returned with intervention metadata.

func ToBeforeAgentCallback

func ToBeforeAgentCallback(chain *InputChain) agent.BeforeAgentCallback

ToBeforeAgentCallback converts an input guardrail chain to BeforeAgentCallback. If the chain blocks, the callback returns a message containing the block reason and sets intervention metadata.

func ToBeforeToolCallback

func ToBeforeToolCallback(chain *ToolChain) llmagent.BeforeToolCallback

ToBeforeToolCallback converts a tool guardrail chain to BeforeToolCallback. If the chain blocks, the callback returns an error result.

Types

type Action

type Action string

Action represents what should happen after a guardrail check.

const (
	// ActionAllow continues execution normally.
	ActionAllow Action = "allow"
	// ActionBlock stops execution and returns an error/message.
	ActionBlock Action = "block"
	// ActionModify continues with a modified value.
	ActionModify Action = "modify"
	// ActionWarn logs a warning but continues execution.
	ActionWarn Action = "warn"
)
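As a rough illustration of how a caller might branch on these values, here is a minimal, self-contained sketch. The Action type is redeclared locally so the program compiles without the Hector module, and the apply helper is hypothetical, not part of the package. Treating unknown actions as blocking is a choice made for this sketch only.

```go
package main

import "fmt"

// Action mirrors the documented type; redeclared locally for this sketch.
type Action string

const (
	ActionAllow  Action = "allow"
	ActionBlock  Action = "block"
	ActionModify Action = "modify"
	ActionWarn   Action = "warn"
)

// apply is a hypothetical helper. It returns the value to pass on and
// whether processing should continue for the given action.
func apply(a Action, original, modified string) (string, bool) {
	switch a {
	case ActionAllow:
		return original, true
	case ActionModify:
		return modified, true
	case ActionWarn:
		fmt.Println("warning: guardrail flagged the value")
		return original, true
	case ActionBlock:
		return "", false
	default:
		// Assumption: unknown actions fail closed in this sketch.
		return "", false
	}
}

func main() {
	out, ok := apply(ActionModify, " hi ", "hi")
	fmt.Printf("%q %v\n", out, ok)
}
```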

type AuthorizationConfig

type AuthorizationConfig struct {
	Enabled      bool     `json:"enabled" yaml:"enabled"`
	AllowedTools []string `json:"allowed_tools,omitempty" yaml:"allowed_tools,omitempty"`
	BlockedTools []string `json:"blocked_tools,omitempty" yaml:"blocked_tools,omitempty"`
	Action       Action   `json:"action" yaml:"action"`
	Severity     Severity `json:"severity" yaml:"severity"`
}

AuthorizationConfig configures tool authorization.

type ChainError

type ChainError struct {
	Errors []*GuardrailError
}

ChainError represents multiple errors from a guardrail chain.

func (*ChainError) Add

func (e *ChainError) Add(err *GuardrailError)

Add adds an error to the chain.

func (*ChainError) Error

func (e *ChainError) Error() string

func (*ChainError) HasErrors

func (e *ChainError) HasErrors() bool

HasErrors returns true if any errors were collected.

type ChainMode

type ChainMode string

ChainMode determines how a chain handles multiple guardrails.

const (
	// ChainModeFailFast stops at the first blocking result.
	ChainModeFailFast ChainMode = "fail_fast"
	// ChainModeCollectAll runs all guardrails and collects all violations.
	ChainModeCollectAll ChainMode = "collect_all"
)

type Config

type Config struct {
	// Input guardrail configurations.
	Input InputConfig `json:"input" yaml:"input"`

	// Output guardrail configurations.
	Output OutputConfig `json:"output" yaml:"output"`

	// Tool guardrail configurations.
	Tool ToolConfig `json:"tool" yaml:"tool"`
}

Config represents the YAML/JSON configuration for guardrails.
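Because the config structs carry json tags as well as yaml tags, the key names can be illustrated with the standard library alone. The sketch below decodes a JSON document into reduced local copies of Config, InputConfig, and LengthConfig; the sample values are illustrative, Action, Severity, and ChainMode are flattened to plain strings, and the parse helper is hypothetical. The package itself documents only YAML loading through LoadConfig.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Reduced local copies of the documented structs (json tags only).
type LengthConfig struct {
	Enabled   bool   `json:"enabled"`
	MinLength int    `json:"min_length"`
	MaxLength int    `json:"max_length"`
	Action    string `json:"action"`
	Severity  string `json:"severity"`
}

type InputConfig struct {
	ChainMode string        `json:"chain_mode"`
	Length    *LengthConfig `json:"length,omitempty"`
}

type Config struct {
	Input InputConfig `json:"input"`
}

// sample uses illustrative values, not package defaults.
const sample = `{
  "input": {
    "chain_mode": "fail_fast",
    "length": {
      "enabled": true,
      "min_length": 10,
      "max_length": 10000,
      "action": "block",
      "severity": "medium"
    }
  }
}`

// parse is a hypothetical helper that decodes a JSON config document.
func parse(data string) (*Config, error) {
	var cfg Config
	if err := json.Unmarshal([]byte(data), &cfg); err != nil {
		return nil, err
	}
	return &cfg, nil
}

func main() {
	cfg, err := parse(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(cfg.Input.ChainMode, cfg.Input.Length.MaxLength)
}
```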

func DefaultConfig

func DefaultConfig() *Config

DefaultConfig returns a sensible default configuration.

func LoadConfig

func LoadConfig(path string) (*Config, error)

LoadConfig loads a guardrails configuration from a YAML file.

func (*Config) BuildInputChain

func (c *Config) BuildInputChain(builders InputChainBuilders) *InputChain

BuildInputChain creates an InputChain from the configuration using the factory functions in builders. Import the input package separately to supply them: import "github.com/verikod/hector/pkg/guardrails/input"

func (*Config) BuildOutputChain

func (c *Config) BuildOutputChain(builders OutputChainBuilders) *OutputChain

BuildOutputChain creates an OutputChain from the configuration.

func (*Config) BuildToolChain

func (c *Config) BuildToolChain(builders ToolChainBuilders) *ToolChain

BuildToolChain creates a ToolChain from the configuration.

type ContentConfig

type ContentConfig struct {
	Enabled         bool     `json:"enabled" yaml:"enabled"`
	BlockedKeywords []string `json:"blocked_keywords,omitempty" yaml:"blocked_keywords,omitempty"`
	BlockedPatterns []string `json:"blocked_patterns,omitempty" yaml:"blocked_patterns,omitempty"`
	Action          Action   `json:"action" yaml:"action"`
	Severity        Severity `json:"severity" yaml:"severity"`
}

ContentConfig configures content filtering.

type GuardrailError

type GuardrailError struct {
	GuardrailName string
	Reason        string
	Severity      Severity
	Details       map[string]any
}

GuardrailError represents an error from a guardrail check.

func NewGuardrailError

func NewGuardrailError(result *Result) *GuardrailError

NewGuardrailError creates a new GuardrailError from a Result.

func (*GuardrailError) Error

func (e *GuardrailError) Error() string

type InjectionConfig

type InjectionConfig struct {
	Enabled       bool     `json:"enabled" yaml:"enabled"`
	Patterns      []string `json:"patterns,omitempty" yaml:"patterns,omitempty"`
	CaseSensitive bool     `json:"case_sensitive" yaml:"case_sensitive"`
	Action        Action   `json:"action" yaml:"action"`
	Severity      Severity `json:"severity" yaml:"severity"`
}

InjectionConfig configures prompt injection detection.

type InputChain

type InputChain struct {
	// contains filtered or unexported fields
}

InputChain runs multiple input guardrails in sequence.

func NewInputChain

func NewInputChain(guardrails ...InputGuardrail) *InputChain

NewInputChain creates a new input guardrail chain.

func (*InputChain) Add

func (c *InputChain) Add(guardrails ...InputGuardrail) *InputChain

Add appends guardrails to the chain.

func (*InputChain) Check

func (c *InputChain) Check(ctx context.Context, input string) (*Result, error)

Check runs the guardrails in the chain and returns the combined result. If the mode is ChainModeFailFast, it stops at the first blocking result. If the mode is ChainModeCollectAll, it runs every guardrail and returns all violations.

func (*InputChain) Guardrails

func (c *InputChain) Guardrails() []InputGuardrail

Guardrails returns the list of guardrails in the chain.

func (*InputChain) WithMode

func (c *InputChain) WithMode(mode ChainMode) *InputChain

WithMode sets the chain mode.

type InputChainBuilders

type InputChainBuilders struct {
	LengthValidator   func(*LengthConfig) InputGuardrail
	InjectionDetector func(*InjectionConfig) InputGuardrail
	Sanitizer         func(*SanitizerConfig) InputGuardrail
}

InputChainBuilders provides factory functions to create input guardrails from config. This decouples config from the input package to avoid circular imports.

type InputConfig

type InputConfig struct {
	// ChainMode for input guardrails.
	ChainMode ChainMode `json:"chain_mode" yaml:"chain_mode"`

	// Length validation settings.
	Length *LengthConfig `json:"length,omitempty" yaml:"length,omitempty"`

	// Injection detection settings.
	Injection *InjectionConfig `json:"injection,omitempty" yaml:"injection,omitempty"`

	// Sanitization settings.
	Sanitizer *SanitizerConfig `json:"sanitizer,omitempty" yaml:"sanitizer,omitempty"`
}

InputConfig contains input guardrail settings.

type InputGuardrail

type InputGuardrail interface {
	// Name returns the guardrail's unique identifier.
	Name() string

	// Check validates the input and returns a result.
	// If the result action is ActionModify, the Modified field contains
	// the transformed input string.
	Check(ctx context.Context, input string) (*Result, error)
}

InputGuardrail validates and potentially transforms user input.

Implementations should be stateless and thread-safe.

type LengthConfig

type LengthConfig struct {
	Enabled   bool     `json:"enabled" yaml:"enabled"`
	MinLength int      `json:"min_length" yaml:"min_length"`
	MaxLength int      `json:"max_length" yaml:"max_length"`
	Action    Action   `json:"action" yaml:"action"`
	Severity  Severity `json:"severity" yaml:"severity"`
}

LengthConfig configures input length validation.

type OutputChain

type OutputChain struct {
	// contains filtered or unexported fields
}

OutputChain runs multiple output guardrails in sequence.

func NewOutputChain

func NewOutputChain(guardrails ...OutputGuardrail) *OutputChain

NewOutputChain creates a new output guardrail chain.

func (*OutputChain) Add

func (c *OutputChain) Add(guardrails ...OutputGuardrail) *OutputChain

Add appends guardrails to the chain.

func (*OutputChain) Check

func (c *OutputChain) Check(ctx context.Context, output string) (*Result, error)

Check runs all guardrails in the chain and returns the combined result.

func (*OutputChain) Guardrails

func (c *OutputChain) Guardrails() []OutputGuardrail

Guardrails returns the list of guardrails in the chain.

func (*OutputChain) WithMode

func (c *OutputChain) WithMode(mode ChainMode) *OutputChain

WithMode sets the chain mode.

type OutputChainBuilders

type OutputChainBuilders struct {
	PIIRedactor   func(*PIIConfig) OutputGuardrail
	ContentFilter func(*ContentConfig) OutputGuardrail
}

OutputChainBuilders provides factory functions to create output guardrails from config.

type OutputConfig

type OutputConfig struct {
	// ChainMode for output guardrails.
	ChainMode ChainMode `json:"chain_mode" yaml:"chain_mode"`

	// PII detection/redaction settings.
	PII *PIIConfig `json:"pii,omitempty" yaml:"pii,omitempty"`

	// Content filtering settings.
	Content *ContentConfig `json:"content,omitempty" yaml:"content,omitempty"`
}

OutputConfig contains output guardrail settings.

type OutputGuardrail

type OutputGuardrail interface {
	// Name returns the guardrail's unique identifier.
	Name() string

	// Check validates the output and returns a result.
	// If the result action is ActionModify, the Modified field contains
	// the transformed output string.
	Check(ctx context.Context, output string) (*Result, error)
}

OutputGuardrail validates and potentially transforms LLM output.

Implementations should be stateless and thread-safe.

type PIIConfig

type PIIConfig struct {
	Enabled          bool       `json:"enabled" yaml:"enabled"`
	DetectEmail      bool       `json:"detect_email" yaml:"detect_email"`
	DetectPhone      bool       `json:"detect_phone" yaml:"detect_phone"`
	DetectSSN        bool       `json:"detect_ssn" yaml:"detect_ssn"`
	DetectCreditCard bool       `json:"detect_credit_card" yaml:"detect_credit_card"`
	RedactMode       RedactMode `json:"redact_mode" yaml:"redact_mode"`
	Action           Action     `json:"action" yaml:"action"`
	Severity         Severity   `json:"severity" yaml:"severity"`
}

PIIConfig configures PII detection and redaction.

type RedactMode

type RedactMode string

RedactMode determines how PII is redacted.

const (
	// RedactModeMask replaces PII with asterisks.
	RedactModeMask RedactMode = "mask"
	// RedactModeRemove removes PII entirely.
	RedactModeRemove RedactMode = "remove"
	// RedactModeHash replaces PII with a hash.
	RedactModeHash RedactMode = "hash"
)

type Result

type Result struct {
	// Action to take based on the check result.
	Action Action `json:"action"`

	// Severity of the issue detected (if any).
	Severity Severity `json:"severity,omitempty"`

	// Reason provides a human-readable explanation.
	Reason string `json:"reason,omitempty"`

	// GuardrailName is the name of the guardrail that produced this result.
	GuardrailName string `json:"guardrail_name"`

	// Modified contains the modified value if Action == ActionModify.
	Modified any `json:"-"`

	// Details contains additional metadata about the check.
	Details map[string]any `json:"details,omitempty"`
}

Result represents the outcome of a guardrail check.

func Allow

func Allow(guardrailName string) *Result

Allow creates an allow result.

func Block

func Block(guardrailName, reason string, severity Severity) *Result

Block creates a blocking result.

func Modify

func Modify(guardrailName string, modified any, reason string) *Result

Modify creates a result that modifies the input/output.

func Warn

func Warn(guardrailName, reason string, severity Severity) *Result

Warn creates a warning result that allows execution to continue.

func (*Result) IsAllowed

func (r *Result) IsAllowed() bool

IsAllowed returns true if execution should continue.

func (*Result) IsBlocking

func (r *Result) IsBlocking() bool

IsBlocking returns true if this result blocks execution.
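The relationship between the constructors and the two predicates can be sketched as below. These are local re-implementations written for illustration. In particular, treating every action other than block as allowed is an assumption inferred from the Action comments (modify and warn both say execution continues), not confirmed behavior of IsAllowed.

```go
package main

import "fmt"

type Action string

const (
	ActionAllow  Action = "allow"
	ActionBlock  Action = "block"
	ActionModify Action = "modify"
	ActionWarn   Action = "warn"
)

type Severity string

const SeverityHigh Severity = "high"

// Result is a reduced local copy of the documented struct.
type Result struct {
	Action        Action
	Severity      Severity
	Reason        string
	GuardrailName string
	Modified      any
}

func Allow(name string) *Result {
	return &Result{Action: ActionAllow, GuardrailName: name}
}

func Block(name, reason string, sev Severity) *Result {
	return &Result{Action: ActionBlock, GuardrailName: name, Reason: reason, Severity: sev}
}

func Modify(name string, modified any, reason string) *Result {
	return &Result{Action: ActionModify, GuardrailName: name, Modified: modified, Reason: reason}
}

func Warn(name, reason string, sev Severity) *Result {
	return &Result{Action: ActionWarn, GuardrailName: name, Reason: reason, Severity: sev}
}

// IsBlocking reports whether the result stops execution.
func (r *Result) IsBlocking() bool { return r != nil && r.Action == ActionBlock }

// IsAllowed is assumed here to be the complement of IsBlocking.
func (r *Result) IsAllowed() bool { return !r.IsBlocking() }

func main() {
	results := []*Result{
		Allow("length"),
		Modify("trim", "hi", "trimmed whitespace"),
		Warn("tone", "informal language", SeverityHigh),
		Block("injection", "pattern matched", SeverityHigh),
	}
	for _, r := range results {
		fmt.Println(r.GuardrailName, r.Action, r.IsAllowed())
	}
}
```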

type SanitizerConfig

type SanitizerConfig struct {
	Enabled          bool `json:"enabled" yaml:"enabled"`
	TrimWhitespace   bool `json:"trim_whitespace" yaml:"trim_whitespace"`
	NormalizeUnicode bool `json:"normalize_unicode" yaml:"normalize_unicode"`
	MaxLength        int  `json:"max_length" yaml:"max_length"`
	StripHTML        bool `json:"strip_html" yaml:"strip_html"`
}

SanitizerConfig configures input sanitization.

type Severity

type Severity string

Severity indicates how critical a guardrail violation is.

const (
	// SeverityLow indicates a minor issue that may be logged.
	SeverityLow Severity = "low"
	// SeverityMedium indicates a notable issue that should be reviewed.
	SeverityMedium Severity = "medium"
	// SeverityHigh indicates a serious issue that may require action.
	SeverityHigh Severity = "high"
	// SeverityCritical indicates an issue that must be blocked.
	SeverityCritical Severity = "critical"
)
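Because Severity is a string type, comparing values with < or > would order them alphabetically ("critical" sorts before "high"), so threshold checks need an explicit ranking. The sketch below shows one way to do that; severityRank and atLeast are hypothetical helpers, not part of the package.

```go
package main

import "fmt"

// Severity mirrors the documented type; redeclared locally for this sketch.
type Severity string

const (
	SeverityLow      Severity = "low"
	SeverityMedium   Severity = "medium"
	SeverityHigh     Severity = "high"
	SeverityCritical Severity = "critical"
)

// severityRank assigns an explicit order to the documented levels.
var severityRank = map[Severity]int{
	SeverityLow:      1,
	SeverityMedium:   2,
	SeverityHigh:     3,
	SeverityCritical: 4,
}

// atLeast reports whether s is at or above threshold.
// Unknown severities never satisfy the check in this sketch.
func atLeast(s, threshold Severity) bool {
	rs, okS := severityRank[s]
	rt, okT := severityRank[threshold]
	return okS && okT && rs >= rt
}

func main() {
	fmt.Println(atLeast(SeverityCritical, SeverityHigh))
	fmt.Println(atLeast(SeverityLow, SeverityHigh))
}
```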

type ToolChain

type ToolChain struct {
	// contains filtered or unexported fields
}

ToolChain runs multiple tool guardrails in sequence.

func NewToolChain

func NewToolChain(guardrails ...ToolGuardrail) *ToolChain

NewToolChain creates a new tool guardrail chain.

func (*ToolChain) Add

func (c *ToolChain) Add(guardrails ...ToolGuardrail) *ToolChain

Add appends guardrails to the chain.

func (*ToolChain) Check

func (c *ToolChain) Check(ctx context.Context, toolName string, args map[string]any) (*Result, error)

Check runs all guardrails in the chain and returns the combined result.

func (*ToolChain) Guardrails

func (c *ToolChain) Guardrails() []ToolGuardrail

Guardrails returns the list of guardrails in the chain.

func (*ToolChain) WithMode

func (c *ToolChain) WithMode(mode ChainMode) *ToolChain

WithMode sets the chain mode.

type ToolChainBuilders

type ToolChainBuilders struct {
	Authorizer func(*AuthorizationConfig) ToolGuardrail
}

ToolChainBuilders provides factory functions to create tool guardrails from config.

type ToolConfig

type ToolConfig struct {
	// ChainMode for tool guardrails.
	ChainMode ChainMode `json:"chain_mode" yaml:"chain_mode"`

	// Authorization settings.
	Authorization *AuthorizationConfig `json:"authorization,omitempty" yaml:"authorization,omitempty"`
}

ToolConfig contains tool guardrail settings.

type ToolGuardrail

type ToolGuardrail interface {
	// Name returns the guardrail's unique identifier.
	Name() string

	// Check validates the tool call and returns a result.
	// If the result action is ActionModify, the Modified field contains
	// the transformed arguments map.
	Check(ctx context.Context, toolName string, args map[string]any) (*Result, error)
}

ToolGuardrail validates tool calls before execution.

Implementations should be stateless and thread-safe.

Directories

Path Synopsis
Package moderation provides content moderation through various providers.