validation package

v1.0.6 Latest

Warning: This package is not in the latest version of its module.
Published: Aug 31, 2025 | License: MIT | Imports: 9 | Imported by: 0

Documentation

Overview

Package validation provides a field validation framework for extracted fields, including support for extended field types.

The validation framework offers:

- Type-specific validators (string, URL, date, image, number)
- Validation pipelines for chaining multiple validators
- Custom validators for domain-specific rules
- Performance optimization with configurable validation modes
- Thread-safe validator registry for dynamic field management
- Extended field types (categories, tags, related articles, sentiment)
- Field transformers for data normalization and format conversion
- Comprehensive configuration and metrics system

Quick Start

Basic validation example:

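// The package sits under internal/, so Go only allows imports from inside the hermes module.
import (
	"log"

	"github.com/BumpyClock/hermes/internal/extractors/validation"
)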

// Create a string validator
validator := validation.NewStringValidator(validation.StringOptions{
	MinLength: 1,
	MaxLength: 100,
	Required:  true,
})

// Validate a value
err := validator.Validate("hello world")
if err != nil {
	log.Printf("Validation failed: %v", err)
}

Validation Pipeline

Chain multiple validators for complex validation:

pipeline := validation.NewValidationPipeline()
pipeline.AddValidator("length", validation.NewStringValidator(validation.StringOptions{
	MinLength: 5,
	MaxLength: 50,
}))
pipeline.AddValidator("url", validation.NewURLValidator(validation.URLOptions{
	RequireHTTPS: true,
}))

err := pipeline.Validate("https://example.com")
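
To illustrate how chaining might work internally, here is a minimal self-contained sketch. The `pipeline` type, `checkFunc`, `minLength`, and `httpsOnly` are hypothetical stand-ins, not this package's implementation, and the `aggregate` flag is only an assumption about what SetErrorAggregation could control.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// checkFunc is a hypothetical stand-in for a validator's Validate method.
type checkFunc func(value interface{}) error

// pipeline runs named checks in the order they were added.
type pipeline struct {
	names     []string
	checks    []checkFunc
	aggregate bool // true: collect every failure; false: stop at the first one
}

func (p *pipeline) add(name string, check checkFunc) {
	p.names = append(p.names, name)
	p.checks = append(p.checks, check)
}

func (p *pipeline) validate(value interface{}) error {
	var errs []error
	for i, check := range p.checks {
		err := check(value)
		if err == nil {
			continue
		}
		wrapped := fmt.Errorf("%s: %w", p.names[i], err)
		if !p.aggregate {
			return wrapped
		}
		errs = append(errs, wrapped)
	}
	return errors.Join(errs...) // nil when no check failed (requires Go 1.20+)
}

// minLength rejects non-strings and strings shorter than n bytes.
func minLength(n int) checkFunc {
	return func(value interface{}) error {
		s, ok := value.(string)
		if !ok || len(s) < n {
			return fmt.Errorf("need a string of at least %d bytes", n)
		}
		return nil
	}
}

// httpsOnly rejects anything that is not a string starting with "https://".
func httpsOnly(value interface{}) error {
	s, ok := value.(string)
	if !ok || !strings.HasPrefix(s, "https://") {
		return errors.New("need an https:// URL")
	}
	return nil
}

func main() {
	p := &pipeline{aggregate: true}
	p.add("length", minLength(5))
	p.add("url", httpsOnly)

	fmt.Println(p.validate("https://example.com")) // passes both checks
	fmt.Println(p.validate("ftp"))                 // reports both failures
}
```

Wrapping each failure with the validator name keeps the origin of an error visible even after several errors are joined.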

Field Registry

Register custom field definitions dynamically:

fieldDef := validation.FieldDefinition{
	Name:        "custom_field",
	Type:        "string",
	Description: "A custom field for articles",
	Required:    true,
	Validators: []validation.ValidatorInterface{
		validation.NewStringValidator(validation.StringOptions{MinLength: 1}),
	},
}

err := validation.RegisterField(fieldDef)
if err != nil {
	log.Printf("Field registration failed: %v", err)
}
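
As a rough picture of what a thread-safe field registry involves, the sketch below guards a map with a `sync.RWMutex`. The `registry` and `fieldDef` types are hypothetical, and rejecting empty or duplicate names is an assumption about why RegisterField returns an error, not documented behavior.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// fieldDef is a trimmed-down, hypothetical stand-in for FieldDefinition.
type fieldDef struct {
	Name     string
	Type     string
	Category string
}

// registry stores field definitions by name and is safe for concurrent use.
type registry struct {
	mu     sync.RWMutex
	fields map[string]fieldDef
}

func newRegistry() *registry {
	return &registry{fields: make(map[string]fieldDef)}
}

// register adds a definition; empty and duplicate names are rejected (assumed policy).
func (r *registry) register(f fieldDef) error {
	if f.Name == "" {
		return fmt.Errorf("field name must not be empty")
	}
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, exists := r.fields[f.Name]; exists {
		return fmt.Errorf("field %q is already registered", f.Name)
	}
	r.fields[f.Name] = f
	return nil
}

// get looks up a definition under a read lock so readers do not block each other.
func (r *registry) get(name string) (fieldDef, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	f, ok := r.fields[name]
	return f, ok
}

// names returns all registered names in sorted order.
func (r *registry) names() []string {
	r.mu.RLock()
	defer r.mu.RUnlock()
	out := make([]string, 0, len(r.fields))
	for name := range r.fields {
		out = append(out, name)
	}
	sort.Strings(out)
	return out
}

func main() {
	r := newRegistry()
	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			// Concurrent registrations with distinct names all succeed.
			_ = r.register(fieldDef{Name: fmt.Sprintf("field_%d", id), Type: "string"})
		}(i)
	}
	wg.Wait()
	fmt.Println(len(r.names())) // 10
}
```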

Extended Field Types

Work with specialized field types:

import "github.com/BumpyClock/hermes/internal/extractors/fields"

// Extract categories
categoryExtractor := fields.NewCategoryExtractor()
result := categoryExtractor.Extract([]string{"technology", "science"})
categoryField := result.(fields.CategoryField)

// Extract and normalize tags
tagsExtractor := fields.NewTagsExtractor()
tags := tagsExtractor.Extract([]string{"Web Development", "Go Programming"})
normalizedTags := tags.([]string) // ["web-development", "go-programming"]

Field Transformers

Transform extracted data to standard formats:

// String normalization
stringTransformer := fields.NewStringTransformer()
normalized := stringTransformer.Transform("  hello world  ") // "hello world"

// URL resolution
urlTransformer := fields.NewURLTransformer("https://example.com")
resolved := urlTransformer.Transform("/path") // "https://example.com/path"

// Date parsing
dateTransformer := fields.NewDateTransformer()
parsed := dateTransformer.Transform("2023-12-25T10:30:00Z") // time.Time
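
For the URL resolution case, the standard library already covers the core logic. The sketch below uses `net/url` directly; `resolveURL` is a hypothetical helper, and nothing here claims to match how the URL transformer in the fields package is implemented.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// resolveURL resolves ref against base following RFC 3986 reference resolution.
func resolveURL(base, ref string) (string, error) {
	b, err := url.Parse(base)
	if err != nil {
		return "", fmt.Errorf("parse base URL: %w", err)
	}
	r, err := url.Parse(strings.TrimSpace(ref))
	if err != nil {
		return "", fmt.Errorf("parse reference: %w", err)
	}
	return b.ResolveReference(r).String(), nil
}

func main() {
	// Root-relative, path-relative, and absolute references.
	for _, ref := range []string{"/path", "img/a.png", "https://cdn.example.org/x"} {
		out, err := resolveURL("https://example.com/articles/", ref)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(out)
	}
}
```

Absolute references pass through unchanged, so the same helper can be applied to every extracted link without checking its form first.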

Configuration and Profiles

Configure validation behavior with profiles:

// Use strict validation profile
err := validation.SetValidationProfile("strict")
if err != nil {
	log.Printf("Failed to set profile: %v", err)
}

// Create custom configuration; the second Build result (field configs) is not used here
builder := validation.NewValidationConfigBuilder()
config, _ := builder.
	WithProfile("custom").
	WithPerformanceMode("thorough").
	WithErrorHandling("collect_all").
	WithTimeout(10 * time.Second).
	Build()

validation.SetGlobalConfig(config)

Performance and Metrics

Monitor validation performance:

// Get global metrics
metrics := validation.GetGlobalMetrics()
log.Printf("Total validations: %d", metrics.TotalValidations)
if metrics.TotalValidations > 0 { // avoid dividing by zero before any validation has run
	log.Printf("Success rate: %.2f%%",
		float64(metrics.SuccessfulValidations)/float64(metrics.TotalValidations)*100)
}
log.Printf("Average time: %v", metrics.AverageValidationTime)
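
A small sketch of how such counters could be maintained may help when reading these numbers. The `metrics` type below is hypothetical; in particular, deriving the average from an accumulated total duration is an assumption, since the package does not document how AverageValidationTime is computed.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// metrics is a hypothetical, mutex-guarded counter set.
type metrics struct {
	mu        sync.Mutex
	total     int64
	succeeded int64
	totalTime time.Duration
}

// record registers one validation attempt and its duration.
func (m *metrics) record(success bool, d time.Duration) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.total++
	if success {
		m.succeeded++
	}
	m.totalTime += d
}

// successRate returns a percentage, or 0 when nothing has been recorded yet.
func (m *metrics) successRate() float64 {
	m.mu.Lock()
	defer m.mu.Unlock()
	if m.total == 0 {
		return 0
	}
	return float64(m.succeeded) / float64(m.total) * 100
}

// average returns the mean duration, or 0 when nothing has been recorded yet.
func (m *metrics) average() time.Duration {
	m.mu.Lock()
	defer m.mu.Unlock()
	if m.total == 0 {
		return 0
	}
	return m.totalTime / time.Duration(m.total)
}

func main() {
	m := &metrics{}
	fmt.Printf("%.2f%%\n", m.successRate()) // 0.00% with no data, not NaN

	m.record(true, 10*time.Millisecond)
	m.record(false, 30*time.Millisecond)
	fmt.Printf("%.2f%% %v\n", m.successRate(), m.average()) // 50.00% 20ms
}
```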

Thread Safety

All validators and the registry are thread-safe:

// Safe to use from multiple goroutines
validator := validation.NewStringValidator(validation.StringOptions{MinLength: 1})

var wg sync.WaitGroup
for i := 0; i < 100; i++ {
	wg.Add(1)
	go func(id int) {
		defer wg.Done()
		err := validator.Validate(fmt.Sprintf("value-%d", id))
		if err != nil {
			log.Printf("Validation %d failed: %v", id, err)
		}
	}(i)
}
wg.Wait()

Custom Validators

Create domain-specific validators:

emailValidator := validation.NewCustomValidator("email", "string",
	func(value interface{}) error {
		str, ok := value.(string)
		if !ok {
			return fmt.Errorf("expected string, got %T", value)
		}
		if !strings.Contains(str, "@") {
			return fmt.Errorf("invalid email format")
		}
		return nil
	})
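
Checking only for "@" accepts many malformed addresses. As one possible stricter validation function with the same `func(interface{}) error` shape, the sketch below leans on the standard library's `net/mail` parser. `validateEmail` is a hypothetical name, and requiring a bare address (no display name) is a policy chosen for this example.

```go
package main

import (
	"fmt"
	"net/mail"
)

// validateEmail accepts only strings that parse as a single bare email address.
func validateEmail(value interface{}) error {
	s, ok := value.(string)
	if !ok {
		return fmt.Errorf("expected string, got %T", value)
	}
	addr, err := mail.ParseAddress(s)
	if err != nil {
		return fmt.Errorf("invalid email format: %w", err)
	}
	// ParseAddress also accepts forms like "Bob <bob@example.com>"; reject those here.
	if addr.Address != s {
		return fmt.Errorf("expected a bare address, got %q", s)
	}
	return nil
}

func main() {
	for _, v := range []interface{}{"user@example.com", "@", "Bob <bob@example.com>", 42} {
		fmt.Printf("%v -> %v\n", v, validateEmail(v))
	}
}
```

A function like this could be passed as the third argument to NewCustomValidator in place of the inline closure.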

Validation Profiles

Built-in profiles:

- "strict": Enables all validations, fails fast on first error
- "lenient": Permissive validation, collects all errors
- "production": Balanced approach, warns on errors but doesn't fail

Custom profiles can be registered:

customProfile := validation.ValidationProfile{
	Name:                 "custom",
	EnableAllValidations: true,
	ErrorHandling:        "warn_only",
	PerformanceMode:      "fast",
}
validation.RegisterValidationProfile("custom", customProfile)
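
Because ErrorHandling and PerformanceMode are plain strings, a typo only shows up at run time. The sketch below shows the kind of check a profile validator could perform, using the values listed in the ValidationProfile field comments ("fail_fast", "collect_all", "warn_only"; "fast", "thorough"). The `profile` type and `validateProfile` function are hypothetical and are not the package's ValidateConfigProfile.

```go
package main

import "fmt"

// profile is a hypothetical, trimmed-down stand-in for ValidationProfile.
type profile struct {
	Name            string
	ErrorHandling   string
	PerformanceMode string
}

// validateProfile rejects empty names and unknown strategy or mode strings.
func validateProfile(p profile) error {
	if p.Name == "" {
		return fmt.Errorf("profile name must not be empty")
	}
	switch p.ErrorHandling {
	case "fail_fast", "collect_all", "warn_only":
	default:
		return fmt.Errorf("profile %q: unknown error handling %q", p.Name, p.ErrorHandling)
	}
	switch p.PerformanceMode {
	case "fast", "thorough":
	default:
		return fmt.Errorf("profile %q: unknown performance mode %q", p.Name, p.PerformanceMode)
	}
	return nil
}

func main() {
	ok := profile{Name: "custom", ErrorHandling: "warn_only", PerformanceMode: "fast"}
	typo := profile{Name: "custom", ErrorHandling: "warn-only", PerformanceMode: "fast"}

	fmt.Println(validateProfile(ok))   // <nil>
	fmt.Println(validateProfile(typo)) // unknown error handling "warn-only"
}
```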

Integration with Parser

The parser applies the validation framework to fields as it extracts them:

import "github.com/BumpyClock/hermes/internal/parser"

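// Validation is automatically applied during field extraction
// Configure validation behavior as needed
if err := validation.SetValidationProfile("production"); err != nil {
	log.Printf("Failed to set profile: %v", err)
}

// Parse content - validation will be applied to extracted fields
p := parser.NewParser()
result, err := p.Parse("https://example.com", parser.ParserOptions{})
if err != nil {
	log.Printf("Parse failed: %v", err)
}
_ = result // use the parsed result here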

Error Handling

Validation errors provide detailed information:

err := validator.Validate(invalidValue)
var validationErr *validation.ValidationError
if errors.As(err, &validationErr) { // also matches wrapped validation errors
	log.Printf("Field: %s", validationErr.Field)
	log.Printf("Message: %s", validationErr.Message)
	for i, subErr := range validationErr.Errors {
		log.Printf("Error %d: %v", i+1, subErr)
	}
}
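
When a caller adds context with `%w`, a plain type assertion no longer finds the validation error, while `errors.As` still does. The self-contained sketch below demonstrates the difference with a local `ValidationError` that mirrors the documented fields; its `Error` message format, `wrapParse`, and `describe` are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// ValidationError mirrors the exported fields documented for this package.
type ValidationError struct {
	Message string
	Errors  []error
	Field   string
}

// Error uses an assumed message format; the real one is not documented.
func (ve *ValidationError) Error() string {
	return fmt.Sprintf("validation failed for %s: %s", ve.Field, ve.Message)
}

// wrapParse adds context the way a caller higher up the stack might.
func wrapParse(err error) error {
	return fmt.Errorf("parse step: %w", err)
}

// describe reports validation details when err is, or wraps, a *ValidationError.
func describe(err error) string {
	var ve *ValidationError
	switch {
	case err == nil:
		return "ok"
	case errors.As(err, &ve):
		return fmt.Sprintf("field %q: %s (%d sub-errors)", ve.Field, ve.Message, len(ve.Errors))
	default:
		return "other error: " + err.Error()
	}
}

func main() {
	base := &ValidationError{
		Field:   "title",
		Message: "too short",
		Errors:  []error{errors.New("min length 5")},
	}
	wrapped := wrapParse(base)

	_, direct := wrapped.(*ValidationError)
	fmt.Println(direct)            // false: the assertion misses the wrapped error
	fmt.Println(describe(wrapped)) // field "title": too short (1 sub-errors)
}
```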

Best Practices

1. **Use appropriate validation profiles** for different environments (dev/staging/prod)
2. **Register custom fields early** in your application initialization
3. **Monitor validation metrics** to identify performance bottlenecks
4. **Use transformers** to normalize data before validation
5. **Chain validators** for complex validation requirements
6. **Handle validation errors gracefully** in production environments
7. **Test validation rules thoroughly** with edge cases
8. **Configure reasonable timeouts** for validation operations

Architecture

The validation framework is designed with the following principles:

- **Modularity**: Each validator type is independent and composable
- **Extensibility**: Easy to add new validator types and field definitions
- **Performance**: Minimal overhead when validation is disabled
- **Thread Safety**: All components are safe for concurrent use
- **Configuration**: Flexible configuration system for different environments
- **Observability**: Comprehensive metrics and monitoring support

Type System

The framework provides type-safe validation for:

- Strings (length, pattern, required constraints)
- URLs (security validation, scheme requirements, domain filtering)
- Dates (format parsing, age constraints, future/past requirements)
- Images (format validation, size constraints)
- Numbers (range validation, integer requirements)
- Custom types (domain-specific validation logic)

All validators implement the ValidatorInterface for consistent behavior.
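
To make the contract concrete, here is a minimal sketch of a type satisfying the same five-method set. The local `Validator` interface copies the documented method signatures; `lengthValidator` is hypothetical, and skipping checks while disabled is an assumption about how the enabled flag is meant to behave.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// Validator copies the method set documented for ValidatorInterface.
type Validator interface {
	Validate(value interface{}) error
	Name() string
	Type() string
	SetEnabled(enabled bool)
	IsEnabled() bool
}

// lengthValidator is a hypothetical validator that bounds string length in runes.
type lengthValidator struct {
	min, max int
	enabled  bool
}

func (v *lengthValidator) Name() string            { return "length" }
func (v *lengthValidator) Type() string            { return "string" }
func (v *lengthValidator) SetEnabled(enabled bool) { v.enabled = enabled }
func (v *lengthValidator) IsEnabled() bool         { return v.enabled }

// Validate skips all checks while the validator is disabled (assumed behavior).
func (v *lengthValidator) Validate(value interface{}) error {
	if !v.enabled {
		return nil
	}
	s, ok := value.(string)
	if !ok {
		return fmt.Errorf("%s: expected string, got %T", v.Name(), value)
	}
	n := utf8.RuneCountInString(s)
	if n < v.min || n > v.max {
		return fmt.Errorf("%s: length %d outside [%d, %d]", v.Name(), n, v.min, v.max)
	}
	return nil
}

func main() {
	var v Validator = &lengthValidator{min: 1, max: 5, enabled: true}
	fmt.Println(v.Validate("hello"))       // <nil>
	fmt.Println(v.Validate("hello world")) // length: length 11 outside [1, 5]
}
```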

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func EnableValidation

func EnableValidation(enabled bool)

EnableValidation enables or disables validation globally

func IsValidationEnabled

func IsValidationEnabled() bool

IsValidationEnabled returns whether validation is globally enabled

func RecordGlobalValidation

func RecordGlobalValidation(validationType string, success bool, duration time.Duration)

RecordGlobalValidation records a validation in global metrics

func RegisterField

func RegisterField(field FieldDefinition) error

RegisterField registers a new field definition

func RegisterValidationProfile

func RegisterValidationProfile(name string, profile ValidationProfile)

RegisterValidationProfile registers a new validation profile

func ResetGlobalMetrics

func ResetGlobalMetrics()

ResetGlobalMetrics resets the global validation metrics

func SetGlobalConfig

func SetGlobalConfig(config *ValidationConfig)

SetGlobalConfig sets the global validation configuration

func SetValidationProfile

func SetValidationProfile(profileName string) error

SetValidationProfile sets the current validation profile

func SetupDefaultValidators

func SetupDefaultValidators(parserValidator *ParserValidator)

SetupDefaultValidators configures default validators for common fields

func ValidateConfigProfile

func ValidateConfigProfile(profile ValidationProfile) error

ValidateConfigProfile validates a configuration profile

Types

type BaseValidator

type BaseValidator struct {
	// contains filtered or unexported fields
}

BaseValidator provides common functionality for all validators

func NewBaseValidator

func NewBaseValidator(name, vType string) BaseValidator

NewBaseValidator creates a new base validator

func (*BaseValidator) IsEnabled

func (bv *BaseValidator) IsEnabled() bool

IsEnabled returns the enabled state

func (*BaseValidator) Name

func (bv *BaseValidator) Name() string

Name returns the validator name

func (*BaseValidator) SetEnabled

func (bv *BaseValidator) SetEnabled(enabled bool)

SetEnabled sets the enabled state

func (*BaseValidator) Type

func (bv *BaseValidator) Type() string

Type returns the validator type

type ContextualValidator

type ContextualValidator interface {
	ValidatorInterface
	ValidateWithContext(ctx *ValidationContext) error
}

ContextualValidator extends ValidatorInterface with context support

type CustomValidator

type CustomValidator struct {
	BaseValidator
	// contains filtered or unexported fields
}

CustomValidator allows for domain-specific validation rules

func NewCustomValidator

func NewCustomValidator(name, vType string, validationFunc func(interface{}) error) *CustomValidator

NewCustomValidator creates a validator with a custom validation function

func (*CustomValidator) Validate

func (cv *CustomValidator) Validate(value interface{}) error

Validate validates using the custom validation function

type DateOptions

type DateOptions struct {
	RequireFuture  bool
	RequirePast    bool
	MinAge         time.Duration
	MaxAge         time.Duration
	AllowedFormats []string
}

type DateValidator

type DateValidator struct {
	BaseValidator
	// contains filtered or unexported fields
}

DateValidator validates date fields

func NewDateValidator

func NewDateValidator(options DateOptions) *DateValidator

NewDateValidator creates a new date validator

func (*DateValidator) Validate

func (dv *DateValidator) Validate(value interface{}) error

Validate validates a date value

type FieldDefinition

type FieldDefinition struct {
	Name        string
	Type        string
	Description string
	Required    bool
	Validators  []ValidatorInterface
	Transformer FieldTransformer
	Examples    []interface{}

	// Metadata for documentation generation
	Category   string
	Version    string
	Deprecated bool
}

FieldDefinition describes a field type with its validation rules

func DiscoverFields

func DiscoverFields() []FieldDefinition

DiscoverFields returns all registered field definitions

func GetFieldDefinition

func GetFieldDefinition(name string) (FieldDefinition, bool)

GetFieldDefinition retrieves a field definition by name

func GetFieldsByCategory

func GetFieldsByCategory(category string) []FieldDefinition

GetFieldsByCategory returns fields filtered by category

type FieldTransformer

type FieldTransformer interface {
	Transform(value interface{}) interface{}
	TargetType() string
}

FieldTransformer converts extracted data to standard formats

type FieldValidationConfig

type FieldValidationConfig struct {
	FieldName    string
	Required     bool
	Rules        []ValidationRuleConfig
	Transformers []string // Names of transformers to apply
	Metadata     map[string]interface{}
}

FieldValidationConfig holds validation configuration for a specific field

type FieldValidator

type FieldValidator struct {
	// contains filtered or unexported fields
}

FieldValidator combines validation and transformation for a specific field

func NewFieldValidator

func NewFieldValidator(fieldName string) *FieldValidator

NewFieldValidator creates a new field validator

func (*FieldValidator) AddValidator

func (fv *FieldValidator) AddValidator(name string, validator ValidatorInterface) *FieldValidator

AddValidator adds a validator to the field's pipeline

func (*FieldValidator) SetRequired

func (fv *FieldValidator) SetRequired(required bool) *FieldValidator

SetRequired sets whether the field is required

func (*FieldValidator) SetTransformer

func (fv *FieldValidator) SetTransformer(transformer FieldTransformer) *FieldValidator

SetTransformer sets the field transformer

func (*FieldValidator) ValidateAndTransform

func (fv *FieldValidator) ValidateAndTransform(value interface{}) (interface{}, ValidationResult)

ValidateAndTransform validates and transforms a field value

type ImageOptions

type ImageOptions struct {
	RequireHTTPS   bool
	AllowedFormats []string
	MaxFileSize    int64
	MinWidth       int
	MinHeight      int
}

type ImageValidator

type ImageValidator struct {
	BaseValidator
	// contains filtered or unexported fields
}

ImageValidator validates image URL fields

func NewImageValidator

func NewImageValidator(options ImageOptions) *ImageValidator

NewImageValidator creates a new image validator

func (*ImageValidator) Validate

func (iv *ImageValidator) Validate(value interface{}) error

Validate validates an image URL

type NumberOptions

type NumberOptions struct {
	Min           float64
	Max           float64
	RequireInt    bool
	AllowNegative bool
}

type NumberValidator

type NumberValidator struct {
	BaseValidator
	// contains filtered or unexported fields
}

NumberValidator validates numeric fields

func NewNumberValidator

func NewNumberValidator(options NumberOptions) *NumberValidator

NewNumberValidator creates a new number validator

func (*NumberValidator) Validate

func (nv *NumberValidator) Validate(value interface{}) error

Validate validates a numeric value

type ParserValidator

type ParserValidator struct {
	// contains filtered or unexported fields
}

ParserValidator integrates validation with the parser system

func NewParserValidator

func NewParserValidator() *ParserValidator

NewParserValidator creates a new parser validator

func (*ParserValidator) RegisterFieldValidator

func (pv *ParserValidator) RegisterFieldValidator(fieldName string, validator *FieldValidator)

RegisterFieldValidator registers a validator for a specific field

func (*ParserValidator) SetEnabled

func (pv *ParserValidator) SetEnabled(enabled bool)

SetEnabled enables or disables validation

func (*ParserValidator) ValidateResult

func (pv *ParserValidator) ValidateResult(result interface{}) *ValidatedResult

ValidateResult validates a complete parser result

type StringOptions

type StringOptions struct {
	MinLength  int
	MaxLength  int
	Required   bool
	Pattern    string // Regex pattern
	AllowEmpty bool
	TrimSpaces bool
}

Common validation options structures

type StringValidator

type StringValidator struct {
	BaseValidator
	// contains filtered or unexported fields
}

StringValidator validates string fields

func NewStringValidator

func NewStringValidator(options StringOptions) *StringValidator

NewStringValidator creates a new string validator

func (*StringValidator) Validate

func (sv *StringValidator) Validate(value interface{}) error

Validate validates a string value

type URLOptions

type URLOptions struct {
	RequireHTTPS   bool
	AllowQuery     bool
	AllowFragment  bool
	AllowedDomains []string
	BlockedDomains []string
}

type URLValidator

type URLValidator struct {
	BaseValidator
	// contains filtered or unexported fields
}

URLValidator validates URL fields

func NewURLValidator

func NewURLValidator(options URLOptions) *URLValidator

NewURLValidator creates a new URL validator

func (*URLValidator) Validate

func (uv *URLValidator) Validate(value interface{}) error

Validate validates a URL value

type ValidatedResult

type ValidatedResult struct {
	// Standard parser result fields
	Title         string     `json:"title,omitempty"`
	Content       string     `json:"content,omitempty"`
	Author        string     `json:"author,omitempty"`
	DatePublished *time.Time `json:"date_published,omitempty"`
	LeadImageURL  string     `json:"lead_image_url,omitempty"`
	Dek           string     `json:"dek,omitempty"`
	URL           string     `json:"url,omitempty"`
	Domain        string     `json:"domain,omitempty"`
	Excerpt       string     `json:"excerpt,omitempty"`
	WordCount     int        `json:"word_count,omitempty"`

	// Extended fields
	ExtendedFields map[string]interface{} `json:"extended_fields,omitempty"`

	// Validation information
	ValidationResults map[string]ValidationResult `json:"validation_results,omitempty"`
	ValidationSummary ValidationSummary           `json:"validation_summary"`
}

ValidatedResult represents a parser result with validation information

func ValidateDocument

func ValidateDocument(doc *goquery.Document, url string) *ValidatedResult

ValidateDocument validates extracted content from a goquery document

type ValidationConfig

type ValidationConfig struct {
	Enabled              bool
	Profile              string
	CustomRules          map[string]interface{}
	PerformanceMode      string
	ErrorHandling        string
	MaxValidationTime    time.Duration
	ConcurrentValidation bool
	// contains filtered or unexported fields
}

ValidationConfig holds the global validation configuration

func DefaultValidationConfig

func DefaultValidationConfig() *ValidationConfig

DefaultValidationConfig returns the default validation configuration

func GetGlobalConfig

func GetGlobalConfig() *ValidationConfig

GetGlobalConfig returns the current global validation configuration

type ValidationConfigBuilder

type ValidationConfigBuilder struct {
	// contains filtered or unexported fields
}

ValidationConfigBuilder helps build complex validation configurations

func NewValidationConfigBuilder

func NewValidationConfigBuilder() *ValidationConfigBuilder

NewValidationConfigBuilder creates a new configuration builder

func (*ValidationConfigBuilder) AddFieldConfig

AddFieldConfig adds configuration for a specific field

func (*ValidationConfigBuilder) Build

Build creates the final validation configuration

func (*ValidationConfigBuilder) WithCustomRule

func (vcb *ValidationConfigBuilder) WithCustomRule(key string, value interface{}) *ValidationConfigBuilder

WithCustomRule adds a custom validation rule

func (*ValidationConfigBuilder) WithErrorHandling

func (vcb *ValidationConfigBuilder) WithErrorHandling(strategy string) *ValidationConfigBuilder

WithErrorHandling sets the error handling strategy

func (*ValidationConfigBuilder) WithPerformanceMode

func (vcb *ValidationConfigBuilder) WithPerformanceMode(mode string) *ValidationConfigBuilder

WithPerformanceMode sets the performance mode

func (*ValidationConfigBuilder) WithProfile

func (vcb *ValidationConfigBuilder) WithProfile(profileName string) *ValidationConfigBuilder

WithProfile sets the validation profile

func (*ValidationConfigBuilder) WithTimeout

func (vcb *ValidationConfigBuilder) WithTimeout(timeout time.Duration) *ValidationConfigBuilder

WithTimeout sets the maximum validation time

type ValidationContext

type ValidationContext struct {
	FieldName       string
	FieldValue      interface{}
	DocumentContext map[string]interface{}
	Config          *ValidationConfig
	Metadata        map[string]interface{}
	StartTime       time.Time
}

ValidationContext provides context for validation operations

func NewValidationContext

func NewValidationContext(fieldName string, fieldValue interface{}) *ValidationContext

NewValidationContext creates a new validation context

func (*ValidationContext) Duration

func (vc *ValidationContext) Duration() time.Duration

Duration returns the time elapsed since validation started

func (*ValidationContext) WithDocumentContext

func (vc *ValidationContext) WithDocumentContext(key string, value interface{}) *ValidationContext

WithDocumentContext adds document context to the validation context

func (*ValidationContext) WithMetadata

func (vc *ValidationContext) WithMetadata(key string, value interface{}) *ValidationContext

WithMetadata adds metadata to the validation context

type ValidationError

type ValidationError struct {
	Message string
	Errors  []error
	Field   string
}

ValidationError represents validation failures

func (*ValidationError) Error

func (ve *ValidationError) Error() string

type ValidationMetrics

type ValidationMetrics struct {
	TotalValidations      int64
	SuccessfulValidations int64
	FailedValidations     int64
	AverageValidationTime time.Duration
	ValidationsByType     map[string]int64
	ErrorsByType          map[string]int64
	// contains filtered or unexported fields
}

ValidationMetrics tracks validation performance and statistics

func GetGlobalMetrics

func GetGlobalMetrics() ValidationMetrics

GetGlobalMetrics returns the global validation metrics

func NewValidationMetrics

func NewValidationMetrics() *ValidationMetrics

NewValidationMetrics creates a new metrics tracker

func (*ValidationMetrics) GetMetrics

func (vm *ValidationMetrics) GetMetrics() ValidationMetrics

GetMetrics returns a copy of the current metrics

func (*ValidationMetrics) RecordValidation

func (vm *ValidationMetrics) RecordValidation(validationType string, success bool, duration time.Duration)

RecordValidation records a validation attempt

func (*ValidationMetrics) Reset

func (vm *ValidationMetrics) Reset()

Reset resets all metrics

type ValidationPipeline

type ValidationPipeline struct {
	// contains filtered or unexported fields
}

ValidationPipeline chains multiple validators together

func NewValidationPipeline

func NewValidationPipeline() *ValidationPipeline

NewValidationPipeline creates a new validation pipeline

func (*ValidationPipeline) AddValidator

func (vp *ValidationPipeline) AddValidator(name string, validator ValidatorInterface)

AddValidator adds a validator to the pipeline

func (*ValidationPipeline) SetErrorAggregation

func (vp *ValidationPipeline) SetErrorAggregation(enabled bool)

SetErrorAggregation enables or disables error aggregation

func (*ValidationPipeline) Validate

func (vp *ValidationPipeline) Validate(value interface{}) error

Validate runs all validators in the pipeline

type ValidationProfile

type ValidationProfile struct {
	Name                 string
	EnableAllValidations bool
	ErrorHandling        string // "fail_fast", "collect_all", "warn_only"
	PerformanceMode      string // "fast", "thorough"
	CustomRules          map[string]interface{}
}

ValidationProfile defines validation behavior settings

func CreateValidationProfileFromConfig

func CreateValidationProfileFromConfig(config *ValidationConfig) ValidationProfile

CreateValidationProfileFromConfig creates a validation profile from a config

func GetCurrentProfile

func GetCurrentProfile() ValidationProfile

GetCurrentProfile returns the current validation profile

func GetValidationProfile

func GetValidationProfile(name string) ValidationProfile

GetValidationProfile retrieves a validation profile by name

type ValidationResult

type ValidationResult struct {
	Valid      bool                   `json:"valid"`
	Errors     []string               `json:"errors,omitempty"`
	Warnings   []string               `json:"warnings,omitempty"`
	Confidence float64                `json:"confidence"`
	Metadata   map[string]interface{} `json:"metadata,omitempty"`
}

ValidationResult represents the validation outcome for a single field

type ValidationRuleConfig

type ValidationRuleConfig struct {
	FieldName     string
	RuleType      string
	Enabled       bool
	Severity      string // "error", "warning", "info"
	Parameters    map[string]interface{}
	CustomMessage string
}

ValidationRuleConfig represents configuration for specific validation rules

type ValidationSummary

type ValidationSummary struct {
	TotalFields   int     `json:"total_fields"`
	ValidFields   int     `json:"valid_fields"`
	InvalidFields int     `json:"invalid_fields"`
	WarningFields int     `json:"warning_fields"`
	OverallValid  bool    `json:"overall_valid"`
	Confidence    float64 `json:"confidence"`
}

ValidationSummary provides an overview of validation results
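
The package does not document how the summary is derived from per-field results. One plausible aggregation is sketched below, assuming that OverallValid means no field is invalid and that Confidence is the mean of the per-field confidences. The `result` and `summary` types are hypothetical, trimmed-down stand-ins for ValidationResult and ValidationSummary.

```go
package main

import "fmt"

// result is a hypothetical stand-in for ValidationResult.
type result struct {
	Valid      bool
	Warnings   []string
	Confidence float64
}

// summary is a hypothetical stand-in for ValidationSummary.
type summary struct {
	TotalFields   int
	ValidFields   int
	InvalidFields int
	WarningFields int
	OverallValid  bool
	Confidence    float64
}

// summarize counts field outcomes and averages confidence (assumed semantics).
func summarize(results map[string]result) summary {
	s := summary{TotalFields: len(results), OverallValid: true}
	var total float64
	for _, r := range results {
		if r.Valid {
			s.ValidFields++
		} else {
			s.InvalidFields++
			s.OverallValid = false
		}
		if len(r.Warnings) > 0 {
			s.WarningFields++
		}
		total += r.Confidence
	}
	if s.TotalFields > 0 {
		s.Confidence = total / float64(s.TotalFields)
	}
	return s
}

func main() {
	s := summarize(map[string]result{
		"title":  {Valid: true, Confidence: 1.0},
		"author": {Valid: false, Warnings: []string{"looks like a URL"}, Confidence: 0.5},
	})
	fmt.Printf("%+v\n", s)
}
```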

type ValidatorInterface

type ValidatorInterface interface {
	// Validate checks if the given value is valid according to validator rules
	Validate(value interface{}) error

	// Name returns the validator name for identification and error reporting
	Name() string

	// Type returns the data type this validator handles (string, url, date, etc.)
	Type() string

	// SetEnabled allows enabling/disabling validation for performance control
	SetEnabled(enabled bool)

	// IsEnabled returns whether validation is currently enabled
	IsEnabled() bool
}

ValidatorInterface defines the contract for all field validators
