anomaly_detection_job

package
v0.12.2
Published: Nov 19, 2025 License: Apache-2.0 Imports: 30 Imported by: 0

README

ML Anomaly Detection Job Resource

This resource creates and manages Machine Learning anomaly detection jobs in Elasticsearch. An anomaly detection job models the normal behavior of time series data from its history and flags values that deviate from that baseline.

Key Features

  • Complete API Coverage: Supports all ML anomaly detection job API options including:

    • Analysis configuration with detectors, influencers, and bucket span
    • Data description for time-based data
    • Analysis limits for memory management
    • Model plot configuration for detailed views
    • Datafeed configuration for data ingestion
    • Custom settings and retention policies
  • Job Lifecycle Management:

    • Create new anomaly detection jobs
    • Update job configurations (limited fields)
    • Delete jobs (automatically closes jobs before deletion)
    • Import existing jobs
  • Framework Migration: Built using Terraform Plugin Framework for better performance and type safety

Supported Operations

Create Job
  • PUT /_ml/anomaly_detectors/{job_id}
  • Supports all job configuration options
  • Includes optional datafeed configuration
Read Job
  • GET /_ml/anomaly_detectors/{job_id}
  • Retrieves current job configuration and status
Update Job
  • POST /_ml/anomaly_detectors/{job_id}/_update
  • Updates modifiable job properties:
    • description
    • groups
    • model_plot_config
    • analysis_limits.model_memory_limit
    • renormalization_window_days
    • results_retention_days
    • custom_settings
    • background_persist_interval
    • allow_lazy_open
    • model_snapshot_retention_days
    • daily_model_snapshot_retention_after_days
Delete Job
  • POST /_ml/anomaly_detectors/{job_id}/_close (if needed)
  • DELETE /_ml/anomaly_detectors/{job_id}
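
The request sequence above can be summarized as a small lookup. This is a minimal sketch rather than the provider's own code: the `request` type and the `requestsFor` function are hypothetical names introduced for illustration, and only the HTTP methods and paths are taken from the list above.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// request is a hypothetical pair of HTTP method and path.
type request struct {
	Method string
	Path   string
}

// requestsFor returns the Elasticsearch calls listed above for one
// lifecycle operation. For "delete" the close call is issued first.
func requestsFor(op, jobID string) ([]request, error) {
	base := "/_ml/anomaly_detectors/" + url.PathEscape(jobID)
	switch op {
	case "create":
		return []request{{http.MethodPut, base}}, nil
	case "read":
		return []request{{http.MethodGet, base}}, nil
	case "update":
		return []request{{http.MethodPost, base + "/_update"}}, nil
	case "delete":
		return []request{
			{http.MethodPost, base + "/_close"},
			{http.MethodDelete, base},
		}, nil
	default:
		return nil, fmt.Errorf("unknown operation %q", op)
	}
}

func main() {
	reqs, err := requestsFor("delete", "basic-count-job")
	if err != nil {
		panic(err)
	}
	for _, r := range reqs {
		fmt.Println(r.Method, r.Path)
	}
}
```

Keeping the close call as an explicit first step of the delete sequence mirrors the "if needed" note above: a real implementation would skip it when the job is already closed.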

Configuration Examples

Basic Count Detector
resource "elasticstack_elasticsearch_ml_anomaly_detection_job" "basic" {
  job_id      = "basic-count-job"
  description = "Basic count anomaly detection"

  analysis_config = {
    bucket_span = "15m"
    detectors = [
      {
        function = "count"
        detector_description = "Count anomalies"
      }
    ]
  }

  data_description = {
    time_field  = "@timestamp"
    time_format = "epoch_ms"
  }

  analysis_limits = {
    model_memory_limit = "10mb"
  }
}
Advanced Multi-Detector Job
resource "elasticstack_elasticsearch_ml_anomaly_detection_job" "advanced" {
  job_id      = "advanced-web-analytics"
  description = "Advanced web analytics anomaly detection"
  groups      = ["web", "analytics"]

  analysis_config = {
    bucket_span = "15m"
    detectors = [
      {
        function = "count"
        by_field_name = "client_ip"
        detector_description = "High request count per IP"
      },
      {
        function = "mean"
        field_name = "response_time"
        over_field_name = "url.path"
        detector_description = "Response time anomalies by URL"
      },
      {
        function = "distinct_count"
        field_name = "user_id"
        detector_description = "Unique user count anomalies"
      }
    ]
    influencers = ["client_ip", "url.path", "status_code"]
  }

  data_description = {
    time_field  = "@timestamp"
    time_format = "epoch_ms"
  }

  analysis_limits = {
    model_memory_limit = "100mb"
    categorization_examples_limit = 10
  }

  model_plot_config = {
    enabled = true
    annotations_enabled = true
  }

  datafeed_config = {
    datafeed_id = "datafeed-advanced-web-analytics"
    indices = ["web-logs-*"]
    query = jsonencode({
      bool = {
        filter = [
          {
            range = {
              "@timestamp" = {
                gte = "now-7d"
              }
            }
          }
        ]
      }
    })
    frequency = "30s"
    query_delay = "60s"
    scroll_size = 1000
  }

  model_snapshot_retention_days = 30
  results_retention_days = 90
  daily_model_snapshot_retention_after_days = 7
}
Categorization Job
resource "elasticstack_elasticsearch_ml_anomaly_detection_job" "categorization" {
  job_id      = "log-categorization"
  description = "Log message categorization job"

  analysis_config = {
    bucket_span = "1h"
    categorization_field_name = "message"
    categorization_filters = [
      "\\b\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\b",  # IP addresses
      "\\b[A-Fa-f0-9]{8,}\\b"                             # Hex values
    ]
    detectors = [
      {
        function = "count"
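        by_field_name = "mlcategory"
        # Elasticsearch requires partition_field_name on every mlcategory detector
        # when per_partition_categorization is enabled; "service.name" is illustrative.
        partition_field_name = "service.name"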
        detector_description = "Log category count anomalies"
      }
    ]
    per_partition_categorization = {
      enabled = true
      stop_on_warn = true
    }
  }

  data_description = {
    time_field  = "@timestamp"
    time_format = "epoch_ms"
  }

  analysis_limits = {
    model_memory_limit = "200mb"
    categorization_examples_limit = 20
  }
}

Field Validation

The resource validates the following fields:

  • job_id: Must contain only lowercase alphanumeric characters, hyphens, and underscores
  • bucket_span: Must be a valid time interval (e.g., "15m", "1h")
  • detector.function: Must be one of the supported ML functions
  • analysis_limits.model_memory_limit: Must be a valid memory size format (e.g., "10mb")
  • time_format: Supports epoch, epoch_ms, or custom patterns
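
The first two rules can be sketched as regular expressions. This is an illustration under stated assumptions, not the validators the resource ships: `validJobID` and `validBucketSpan` are hypothetical helpers, the job ID pattern encodes only the character set named above, and the unit list in the interval pattern assumes the usual Elasticsearch time units.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// Lowercase alphanumerics, hyphens, and underscores, as described above.
	jobIDPattern = regexp.MustCompile(`^[a-z0-9_-]+$`)
	// A positive integer followed by a time unit (assumed unit list).
	timeIntervalPattern = regexp.MustCompile(`^[0-9]+(?:nanos|micros|ms|s|m|h|d)$`)
)

// validJobID reports whether id uses only the permitted characters.
func validJobID(id string) bool {
	return jobIDPattern.MatchString(id)
}

// validBucketSpan reports whether span looks like "15m" or "1h".
func validBucketSpan(span string) bool {
	return timeIntervalPattern.MatchString(span)
}

func main() {
	fmt.Println(validJobID("basic-count-job"), validJobID("Basic Job"))
	fmt.Println(validBucketSpan("15m"), validBucketSpan("fifteen minutes"))
}
```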

Import Support

Existing ML anomaly detection jobs can be imported:

terraform import elasticstack_elasticsearch_ml_anomaly_detection_job.example existing-job-id

Error Handling

The resource handles various error scenarios:

  • Job not found: Gracefully removes resource from state
  • Insufficient ML capacity: Provides clear error messages
  • Configuration conflicts: Validates detector configurations
  • Memory limits: Warns about memory usage patterns
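
The "job not found" case can be sketched as a decision on the HTTP status of the read call. This is a hedged illustration only: `readOutcome` and `classifyRead` are hypothetical names, and the real resource presumably performs the equivalent step through the Terraform Plugin Framework state API.

```go
package main

import (
	"fmt"
	"net/http"
)

// readOutcome is a hypothetical result of inspecting a read response.
type readOutcome int

const (
	keepInState     readOutcome = iota // job exists, refresh state from the body
	removeFromState                    // job is gone, drop it so Terraform plans a re-create
	reportError                        // anything else is surfaced as a diagnostic
)

// classifyRead maps the status code of GET /_ml/anomaly_detectors/{job_id}
// to the action a Read implementation would take.
func classifyRead(statusCode int) readOutcome {
	switch {
	case statusCode == http.StatusNotFound:
		return removeFromState
	case statusCode >= 200 && statusCode < 300:
		return keepInState
	default:
		return reportError
	}
}

func main() {
	fmt.Println(classifyRead(http.StatusNotFound) == removeFromState)
	fmt.Println(classifyRead(http.StatusOK) == keepInState)
	fmt.Println(classifyRead(http.StatusInternalServerError) == reportError)
}
```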

Best Practices

  1. Memory Sizing: Start with conservative memory limits and increase as needed
  2. Bucket Span: Choose appropriate bucket spans based on data frequency
  3. Detectors: Use specific field names for better anomaly detection
  4. Influencers: Include relevant fields that might influence anomalies
  5. Datafeeds: Use appropriate query delays for real-time data

Limitations

  • Properties that are not in the update list above, such as the analysis_config structure, cannot be changed after creation
  • Jobs must be closed before deletion (handled automatically)
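  • Datafeed creation is included, but separate datafeed management is recommended for complex scenarios. AnomalyDetectionJobTFModel, as listed in the type index below, declares no datafeed_config attribute, so check GetSchema before relying on that block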

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func GetSchema

func GetSchema() schema.Schema

func NewAnomalyDetectionJobResource

func NewAnomalyDetectionJobResource() resource.Resource

Types

type AnalysisConfigAPIModel

type AnalysisConfigAPIModel struct {
	BucketSpan                 string                              `json:"bucket_span"`
	CategorizationFieldName    string                              `json:"categorization_field_name,omitempty"`
	CategorizationFilters      []string                            `json:"categorization_filters,omitempty"`
	Detectors                  []DetectorAPIModel                  `json:"detectors"`
	Influencers                []string                            `json:"influencers,omitempty"`
	Latency                    string                              `json:"latency,omitempty"`
	ModelPruneWindow           string                              `json:"model_prune_window,omitempty"`
	MultivariateByFields       *bool                               `json:"multivariate_by_fields,omitempty"`
	PerPartitionCategorization *PerPartitionCategorizationAPIModel `json:"per_partition_categorization,omitempty"`
	SummaryCountFieldName      string                              `json:"summary_count_field_name,omitempty"`
}

AnalysisConfigAPIModel represents the analysis configuration in API format
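
The `omitempty` tags determine which keys reach Elasticsearch. The sketch below uses trimmed local copies of the structs (`analysisConfig` and `detector` are hypothetical stand-ins carrying only a few of the fields from AnalysisConfigAPIModel and DetectorAPIModel) to show that unset optional fields are dropped from the request body while required fields are always present.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// detector is a trimmed, local copy of DetectorAPIModel for illustration.
type detector struct {
	DetectorDescription string `json:"detector_description,omitempty"`
	FieldName           string `json:"field_name,omitempty"`
	Function            string `json:"function"`
}

// analysisConfig is a trimmed, local copy of AnalysisConfigAPIModel.
type analysisConfig struct {
	BucketSpan           string     `json:"bucket_span"`
	Detectors            []detector `json:"detectors"`
	Influencers          []string   `json:"influencers,omitempty"`
	Latency              string     `json:"latency,omitempty"`
	MultivariateByFields *bool      `json:"multivariate_by_fields,omitempty"`
}

// encodeAnalysisConfig returns the JSON body for the given configuration.
func encodeAnalysisConfig(cfg analysisConfig) (string, error) {
	b, err := json.Marshal(cfg)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	cfg := analysisConfig{
		BucketSpan: "15m",
		Detectors: []detector{
			{Function: "count", DetectorDescription: "Count anomalies"},
		},
	}
	out, err := encodeAnalysisConfig(cfg)
	if err != nil {
		panic(err)
	}
	// influencers, latency, and multivariate_by_fields are absent from the output.
	fmt.Println(out)
}
```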

type AnalysisConfigTFModel

type AnalysisConfigTFModel struct {
	BucketSpan                 types.String `tfsdk:"bucket_span"`
	CategorizationFieldName    types.String `tfsdk:"categorization_field_name"`
	CategorizationFilters      types.List   `tfsdk:"categorization_filters"`
	Detectors                  types.List   `tfsdk:"detectors"`
	Influencers                types.List   `tfsdk:"influencers"`
	Latency                    types.String `tfsdk:"latency"`
	ModelPruneWindow           types.String `tfsdk:"model_prune_window"`
	MultivariateByFields       types.Bool   `tfsdk:"multivariate_by_fields"`
	PerPartitionCategorization types.Object `tfsdk:"per_partition_categorization"`
	SummaryCountFieldName      types.String `tfsdk:"summary_count_field_name"`
}

AnalysisConfigTFModel represents the analysis configuration

type AnalysisLimitsAPIModel

type AnalysisLimitsAPIModel struct {
	CategorizationExamplesLimit *int64 `json:"categorization_examples_limit,omitempty"`
	ModelMemoryLimit            string `json:"model_memory_limit,omitempty"`
}

AnalysisLimitsAPIModel represents analysis limits in API format

type AnalysisLimitsTFModel

type AnalysisLimitsTFModel struct {
	CategorizationExamplesLimit types.Int64            `tfsdk:"categorization_examples_limit"`
	ModelMemoryLimit            customtypes.MemorySize `tfsdk:"model_memory_limit"`
}

AnalysisLimitsTFModel represents analysis limits configuration

type AnomalyDetectionJobAPIModel

type AnomalyDetectionJobAPIModel struct {
	JobID                                string                   `json:"job_id"`
	Description                          string                   `json:"description,omitempty"`
	Groups                               []string                 `json:"groups,omitempty"`
	AnalysisConfig                       AnalysisConfigAPIModel   `json:"analysis_config"`
	AnalysisLimits                       *AnalysisLimitsAPIModel  `json:"analysis_limits,omitempty"`
	DataDescription                      DataDescriptionAPIModel  `json:"data_description"`
	ModelPlotConfig                      *ModelPlotConfigAPIModel `json:"model_plot_config,omitempty"`
	AllowLazyOpen                        *bool                    `json:"allow_lazy_open,omitempty"`
	BackgroundPersistInterval            string                   `json:"background_persist_interval,omitempty"`
	CustomSettings                       map[string]interface{}   `json:"custom_settings,omitempty"`
	DailyModelSnapshotRetentionAfterDays *int64                   `json:"daily_model_snapshot_retention_after_days,omitempty"`
	ModelSnapshotRetentionDays           *int64                   `json:"model_snapshot_retention_days,omitempty"`
	RenormalizationWindowDays            *int64                   `json:"renormalization_window_days,omitempty"`
	ResultsIndexName                     string                   `json:"results_index_name,omitempty"`
	ResultsRetentionDays                 *int64                   `json:"results_retention_days,omitempty"`

	// Read-only fields
	CreateTime      interface{} `json:"create_time,omitempty"`
	JobType         string      `json:"job_type,omitempty"`
	JobVersion      string      `json:"job_version,omitempty"`
	ModelSnapshotID string      `json:"model_snapshot_id,omitempty"`
}

AnomalyDetectionJobAPIModel represents the API model for ML anomaly detection jobs

type AnomalyDetectionJobTFModel

type AnomalyDetectionJobTFModel struct {
	ID                                   types.String         `tfsdk:"id"`
	ElasticsearchConnection              types.List           `tfsdk:"elasticsearch_connection"`
	JobID                                types.String         `tfsdk:"job_id"`
	Description                          types.String         `tfsdk:"description"`
	Groups                               types.Set            `tfsdk:"groups"`
	AnalysisConfig                       types.Object         `tfsdk:"analysis_config"`
	AnalysisLimits                       types.Object         `tfsdk:"analysis_limits"`
	DataDescription                      types.Object         `tfsdk:"data_description"`
	ModelPlotConfig                      types.Object         `tfsdk:"model_plot_config"`
	AllowLazyOpen                        types.Bool           `tfsdk:"allow_lazy_open"`
	BackgroundPersistInterval            types.String         `tfsdk:"background_persist_interval"`
	CustomSettings                       jsontypes.Normalized `tfsdk:"custom_settings"`
	DailyModelSnapshotRetentionAfterDays types.Int64          `tfsdk:"daily_model_snapshot_retention_after_days"`
	ModelSnapshotRetentionDays           types.Int64          `tfsdk:"model_snapshot_retention_days"`
	RenormalizationWindowDays            types.Int64          `tfsdk:"renormalization_window_days"`
	ResultsIndexName                     types.String         `tfsdk:"results_index_name"`
	ResultsRetentionDays                 types.Int64          `tfsdk:"results_retention_days"`

	// Read-only computed fields
	CreateTime      types.String `tfsdk:"create_time"`
	JobType         types.String `tfsdk:"job_type"`
	JobVersion      types.String `tfsdk:"job_version"`
	ModelSnapshotID types.String `tfsdk:"model_snapshot_id"`
}

AnomalyDetectionJobTFModel represents the Terraform resource model for ML anomaly detection jobs

type AnomalyDetectionJobUpdateAPIModel

type AnomalyDetectionJobUpdateAPIModel struct {
	Description                          *string                  `json:"description,omitempty"`
	Groups                               []string                 `json:"groups,omitempty"`
	AnalysisLimits                       *AnalysisLimitsAPIModel  `json:"analysis_limits,omitempty"`
	ModelPlotConfig                      *ModelPlotConfigAPIModel `json:"model_plot_config,omitempty"`
	AllowLazyOpen                        *bool                    `json:"allow_lazy_open,omitempty"`
	BackgroundPersistInterval            *string                  `json:"background_persist_interval,omitempty"`
	CustomSettings                       map[string]interface{}   `json:"custom_settings,omitempty"`
	DailyModelSnapshotRetentionAfterDays *int64                   `json:"daily_model_snapshot_retention_after_days,omitempty"`
	ModelSnapshotRetentionDays           *int64                   `json:"model_snapshot_retention_days,omitempty"`
	RenormalizationWindowDays            *int64                   `json:"renormalization_window_days,omitempty"`
	ResultsRetentionDays                 *int64                   `json:"results_retention_days,omitempty"`
}

AnomalyDetectionJobUpdateAPIModel represents the API model for updating ML anomaly detection jobs. It includes only the fields that can be updated after job creation.
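
Unlike the create model, several update fields are pointers. Presumably this lets an update distinguish "leave unchanged" (nil, omitted from the body) from "set to the zero value" (a pointer to an empty string or 0, which is still serialized). The sketch below demonstrates that encoding/json behavior on `jobUpdate`, a hypothetical trimmed copy of the struct.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// jobUpdate is a trimmed, local copy of AnomalyDetectionJobUpdateAPIModel.
type jobUpdate struct {
	Description          *string  `json:"description,omitempty"`
	Groups               []string `json:"groups,omitempty"`
	ResultsRetentionDays *int64   `json:"results_retention_days,omitempty"`
}

// encodeUpdate returns the JSON body that would be sent to the _update endpoint.
func encodeUpdate(u jobUpdate) (string, error) {
	b, err := json.Marshal(u)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// No fields set: nothing is sent, so nothing changes.
	unchanged, err := encodeUpdate(jobUpdate{})
	if err != nil {
		panic(err)
	}
	fmt.Println(unchanged)

	// A pointer to "" survives omitempty, so the description can be cleared.
	empty := ""
	cleared, err := encodeUpdate(jobUpdate{Description: &empty})
	if err != nil {
		panic(err)
	}
	fmt.Println(cleared)
}
```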

func (*AnomalyDetectionJobUpdateAPIModel) BuildFromPlan

BuildFromPlan populates the AnomalyDetectionJobUpdateAPIModel from the plan and state models

type ChunkingConfigAPIModel

type ChunkingConfigAPIModel struct {
	Mode     string `json:"mode"`
	TimeSpan string `json:"time_span,omitempty"`
}

ChunkingConfigAPIModel represents chunking configuration in API format

type CustomRuleAPIModel

type CustomRuleAPIModel struct {
	Actions    []interface{}           `json:"actions,omitempty"`
	Conditions []RuleConditionAPIModel `json:"conditions,omitempty"`
}

CustomRuleAPIModel represents a custom rule in API format

type CustomRuleTFModel

type CustomRuleTFModel struct {
	Actions    types.List `tfsdk:"actions"`
	Conditions types.List `tfsdk:"conditions"`
}

CustomRuleTFModel represents a custom rule configuration

type DataDescriptionAPIModel

type DataDescriptionAPIModel struct {
	TimeField  string `json:"time_field,omitempty"`
	TimeFormat string `json:"time_format,omitempty"`
}

DataDescriptionAPIModel represents data description in API format

type DataDescriptionTFModel

type DataDescriptionTFModel struct {
	TimeField  types.String `tfsdk:"time_field"`
	TimeFormat types.String `tfsdk:"time_format"`
}

DataDescriptionTFModel represents data description configuration

type DelayedDataCheckConfigAPIModel

type DelayedDataCheckConfigAPIModel struct {
	CheckWindow string `json:"check_window,omitempty"`
	Enabled     bool   `json:"enabled"`
}

DelayedDataCheckConfigAPIModel represents delayed data check configuration in API format

type DetectorAPIModel

type DetectorAPIModel struct {
	ByFieldName         string               `json:"by_field_name,omitempty"`
	DetectorDescription string               `json:"detector_description,omitempty"`
	ExcludeFrequent     string               `json:"exclude_frequent,omitempty"`
	FieldName           string               `json:"field_name,omitempty"`
	Function            string               `json:"function"`
	OverFieldName       string               `json:"over_field_name,omitempty"`
	PartitionFieldName  string               `json:"partition_field_name,omitempty"`
	UseNull             *bool                `json:"use_null,omitempty"`
	CustomRules         []CustomRuleAPIModel `json:"custom_rules,omitempty"`
}

DetectorAPIModel represents a detector configuration in API format

type DetectorTFModel

type DetectorTFModel struct {
	ByFieldName         types.String `tfsdk:"by_field_name"`
	DetectorDescription types.String `tfsdk:"detector_description"`
	ExcludeFrequent     types.String `tfsdk:"exclude_frequent"`
	FieldName           types.String `tfsdk:"field_name"`
	Function            types.String `tfsdk:"function"`
	OverFieldName       types.String `tfsdk:"over_field_name"`
	PartitionFieldName  types.String `tfsdk:"partition_field_name"`
	UseNull             types.Bool   `tfsdk:"use_null"`
	CustomRules         types.List   `tfsdk:"custom_rules"`
}

DetectorTFModel represents a detector configuration

type IndicesOptionsAPIModel

type IndicesOptionsAPIModel struct {
	ExpandWildcards   []string `json:"expand_wildcards,omitempty"`
	IgnoreUnavailable *bool    `json:"ignore_unavailable,omitempty"`
	AllowNoIndices    *bool    `json:"allow_no_indices,omitempty"`
	IgnoreThrottled   *bool    `json:"ignore_throttled,omitempty"`
}

IndicesOptionsAPIModel represents indices options in API format

type ModelPlotConfigAPIModel

type ModelPlotConfigAPIModel struct {
	AnnotationsEnabled *bool  `json:"annotations_enabled,omitempty"`
	Enabled            bool   `json:"enabled"`
	Terms              string `json:"terms,omitempty"`
}

ModelPlotConfigAPIModel represents model plot configuration in API format

type ModelPlotConfigTFModel

type ModelPlotConfigTFModel struct {
	AnnotationsEnabled types.Bool   `tfsdk:"annotations_enabled"`
	Enabled            types.Bool   `tfsdk:"enabled"`
	Terms              types.String `tfsdk:"terms"`
}

ModelPlotConfigTFModel represents model plot configuration

type PerPartitionCategorizationAPIModel

type PerPartitionCategorizationAPIModel struct {
	Enabled    bool  `json:"enabled"`
	StopOnWarn *bool `json:"stop_on_warn,omitempty"`
}

PerPartitionCategorizationAPIModel represents per-partition categorization in API format

type PerPartitionCategorizationTFModel

type PerPartitionCategorizationTFModel struct {
	Enabled    types.Bool `tfsdk:"enabled"`
	StopOnWarn types.Bool `tfsdk:"stop_on_warn"`
}

PerPartitionCategorizationTFModel represents per-partition categorization configuration

type RuleConditionAPIModel

type RuleConditionAPIModel struct {
	AppliesTo string  `json:"applies_to"`
	Operator  string  `json:"operator"`
	Value     float64 `json:"value"`
}

RuleConditionAPIModel represents a rule condition in API format
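
To make the three fields concrete, the sketch below evaluates a condition against an observed number. Elasticsearch evaluates custom rules itself, so none of this runs in the provider: `ruleCondition` is a local copy of the struct, `holds` is a hypothetical helper, and the operator names `gt`, `gte`, `lt`, and `lte` are assumed from the Elasticsearch custom rules documentation.

```go
package main

import "fmt"

// ruleCondition is a local copy of RuleConditionAPIModel.
type ruleCondition struct {
	AppliesTo string  `json:"applies_to"`
	Operator  string  `json:"operator"`
	Value     float64 `json:"value"`
}

// holds reports whether observed satisfies the condition. The caller is
// expected to pass the quantity named by AppliesTo (for example "actual").
func (c ruleCondition) holds(observed float64) (bool, error) {
	switch c.Operator {
	case "gt":
		return observed > c.Value, nil
	case "gte":
		return observed >= c.Value, nil
	case "lt":
		return observed < c.Value, nil
	case "lte":
		return observed <= c.Value, nil
	default:
		return false, fmt.Errorf("unsupported operator %q", c.Operator)
	}
}

func main() {
	// Condition: the actual value is below 10.
	cond := ruleCondition{AppliesTo: "actual", Operator: "lt", Value: 10}
	ok, err := cond.holds(3)
	if err != nil {
		panic(err)
	}
	fmt.Println(ok)
}
```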

type RuleConditionTFModel

type RuleConditionTFModel struct {
	AppliesTo types.String  `tfsdk:"applies_to"`
	Operator  types.String  `tfsdk:"operator"`
	Value     types.Float64 `tfsdk:"value"`
}

RuleConditionTFModel represents a rule condition
