anomaly_detection_job

package

Version: v0.12.2 (latest) | Published: Nov 19, 2025 | License: Apache-2.0 | Imports: 30 | Imported by: 0

README ¶

ML Anomaly Detection Job Resource

This resource creates and manages Machine Learning anomaly detection jobs in Elasticsearch. An anomaly detection job models the historical behavior of time series data and flags values that deviate from that model.

Key Features

Complete API Coverage: Supports all ML anomaly detection job API options including:
- Analysis configuration with detectors, influencers, and bucket span
- Data description for time-based data
- Analysis limits for memory management
- Model plot configuration for detailed views
- Datafeed configuration for data ingestion
- Custom settings and retention policies
Job Lifecycle Management:
- Create new anomaly detection jobs
- Update job configurations (limited fields)
- Delete jobs (automatically closes jobs before deletion)
- Import existing jobs
Framework Migration: Built using Terraform Plugin Framework for better performance and type safety

Supported Operations

Create Job

PUT /_ml/anomaly_detectors/{job_id}
Supports all job configuration options
Includes optional datafeed configuration

Read Job

GET /_ml/anomaly_detectors/{job_id}
Retrieves current job configuration and status

Update Job

POST /_ml/anomaly_detectors/{job_id}/_update
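Updates modifiable job properties (the fields carried by AnomalyDetectionJobUpdateAPIModel):
- description
- groups
- model_plot_config
- analysis_limits.model_memory_limit
- allow_lazy_open
- renormalization_window_days
- results_retention_days
- model_snapshot_retention_days
- daily_model_snapshot_retention_after_days
- custom_settings
- background_persist_interval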

Delete Job

POST /_ml/anomaly_detectors/{job_id}/_close (if needed)
DELETE /_ml/anomaly_detectors/{job_id}
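
The two calls above can be sketched in Go as follows. This is a minimal illustration, not the resource's actual code: deleteJob and recordDeleteCalls are hypothetical names, a stub HTTP server stands in for Elasticsearch, and the sketch always issues the close call, whereas the resource only closes the job if needed.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

// deleteJob issues the close call followed by the delete call.
// Hypothetical helper: a real implementation would check the job state
// first and authenticate against the cluster.
func deleteJob(client *http.Client, baseURL, jobID string) error {
	steps := []struct{ method, path string }{
		{http.MethodPost, "/_ml/anomaly_detectors/" + jobID + "/_close"},
		{http.MethodDelete, "/_ml/anomaly_detectors/" + jobID},
	}
	for _, s := range steps {
		req, err := http.NewRequest(s.method, baseURL+s.path, nil)
		if err != nil {
			return err
		}
		resp, err := client.Do(req)
		if err != nil {
			return err
		}
		resp.Body.Close()
		if resp.StatusCode >= 300 {
			return fmt.Errorf("%s %s: unexpected status %d", s.method, s.path, resp.StatusCode)
		}
	}
	return nil
}

// recordDeleteCalls runs deleteJob against a stub server and returns the
// requests the stub received, in order.
func recordDeleteCalls(jobID string) ([]string, error) {
	var (
		mu    sync.Mutex
		calls []string
	)
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		mu.Lock()
		calls = append(calls, r.Method+" "+r.URL.Path)
		mu.Unlock()
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	err := deleteJob(srv.Client(), srv.URL, jobID)
	mu.Lock()
	defer mu.Unlock()
	return calls, err
}

func main() {
	calls, err := recordDeleteCalls("basic-count-job")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, c := range calls {
		fmt.Println(c)
	}
}
```

The ordering matters: the delete call is only sent once the close call has returned a success status.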

Configuration Examples

Basic Count Detector

resource "elasticstack_elasticsearch_ml_anomaly_detection_job" "basic" {
  job_id      = "basic-count-job"
  description = "Basic count anomaly detection"

  analysis_config = {
    bucket_span = "15m"
    detectors = [
      {
        function = "count"
        detector_description = "Count anomalies"
      }
    ]
  }

  data_description = {
    time_field  = "@timestamp"
    time_format = "epoch_ms"
  }

  analysis_limits = {
    model_memory_limit = "10mb"
  }
}

Advanced Multi-Detector Job

resource "elasticstack_elasticsearch_ml_anomaly_detection_job" "advanced" {
  job_id      = "advanced-web-analytics"
  description = "Advanced web analytics anomaly detection"
  groups      = ["web", "analytics"]

  analysis_config = {
    bucket_span = "15m"
    detectors = [
      {
        function = "count"
        by_field_name = "client_ip"
        detector_description = "High request count per IP"
      },
      {
        function = "mean"
        field_name = "response_time"
        over_field_name = "url.path"
        detector_description = "Response time anomalies by URL"
      },
      {
        function = "distinct_count"
        field_name = "user_id"
        detector_description = "Unique user count anomalies"
      }
    ]
    influencers = ["client_ip", "url.path", "status_code"]
  }

  data_description = {
    time_field  = "@timestamp"
    time_format = "epoch_ms"
  }

  analysis_limits = {
    model_memory_limit = "100mb"
    categorization_examples_limit = 10
  }

  model_plot_config = {
    enabled = true
    annotations_enabled = true
  }

  datafeed_config = {
    datafeed_id = "datafeed-advanced-web-analytics"
    indices = ["web-logs-*"]
    query = jsonencode({
      bool = {
        filter = [
          {
            range = {
              "@timestamp" = {
                gte = "now-7d"
              }
            }
          }
        ]
      }
    })
    frequency = "30s"
    query_delay = "60s"
    scroll_size = 1000
  }

  model_snapshot_retention_days = 30
  results_retention_days = 90
  daily_model_snapshot_retention_after_days = 7
}

Categorization Job

resource "elasticstack_elasticsearch_ml_anomaly_detection_job" "categorization" {
  job_id      = "log-categorization"
  description = "Log message categorization job"

  analysis_config = {
    bucket_span = "1h"
    categorization_field_name = "message"
    categorization_filters = [
      "\\b\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\b",  # IP addresses
      "\\b[A-Fa-f0-9]{8,}\\b"                             # Hex values
    ]
    detectors = [
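      {
        function = "count"
        by_field_name = "mlcategory"
        # Elasticsearch requires partition_field_name on every mlcategory detector
        # when per_partition_categorization is enabled; "service" is a placeholder.
        partition_field_name = "service"
        detector_description = "Log category count anomalies"
      }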
    ]
    per_partition_categorization = {
      enabled = true
      stop_on_warn = true
    }
  }

  data_description = {
    time_field  = "@timestamp"
    time_format = "epoch_ms"
  }

  analysis_limits = {
    model_memory_limit = "200mb"
    categorization_examples_limit = 20
  }
}

Field Validation

The resource includes comprehensive validation:

job_id: Must contain only lowercase alphanumeric characters, hyphens, and underscores
bucket_span: Must be a valid time interval (e.g., "15m", "1h")
detector.function: Must be one of the supported ML functions
analysis_limits.model_memory_limit: Must be a valid memory size format (e.g., "10mb")
time_format: Supports epoch, epoch_ms, or custom patterns
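
The validators themselves are not shown on this page. As a rough sketch of the first two rules, the patterns below are assumptions: the job_id pattern additionally requires an alphanumeric first and last character (as Elasticsearch does), and the bucket_span pattern accepts only a small set of units. validateJobID and validateBucketSpan are hypothetical names.

```go
package main

import (
	"fmt"
	"regexp"
)

// jobIDPattern is an assumed pattern: lowercase alphanumerics, hyphens and
// underscores, starting and ending with an alphanumeric character.
var jobIDPattern = regexp.MustCompile(`^[a-z0-9](?:[a-z0-9_-]*[a-z0-9])?$`)

// bucketSpanPattern is an assumed pattern for intervals such as "15m" or "1h".
var bucketSpanPattern = regexp.MustCompile(`^[1-9][0-9]*(ms|s|m|h|d)$`)

// validateJobID returns an error when the ID does not match jobIDPattern.
func validateJobID(id string) error {
	if !jobIDPattern.MatchString(id) {
		return fmt.Errorf("invalid job_id %q", id)
	}
	return nil
}

// validateBucketSpan returns an error when the span does not match bucketSpanPattern.
func validateBucketSpan(span string) error {
	if !bucketSpanPattern.MatchString(span) {
		return fmt.Errorf("invalid bucket_span %q", span)
	}
	return nil
}

func main() {
	for _, id := range []string{"basic-count-job", "Bad Job"} {
		fmt.Printf("job_id %q: %v\n", id, validateJobID(id))
	}
	for _, span := range []string{"15m", "fifteen"} {
		fmt.Printf("bucket_span %q: %v\n", span, validateBucketSpan(span))
	}
}
```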

Import Support

Existing ML anomaly detection jobs can be imported:

terraform import elasticstack_elasticsearch_ml_anomaly_detection_job.example existing-job-id

Error Handling

The resource handles various error scenarios:

Job not found: Gracefully removes resource from state
Insufficient ML capacity: Provides clear error messages
Configuration conflicts: Validates detector configurations
Memory limits: Warns about memory usage patterns
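
The "job not found" case can be sketched as a read that treats HTTP 404 as "absent" rather than as an error, so the caller can drop the resource from state. This is an illustration under assumptions: jobExists and lookupAgainstStub are hypothetical names, and a stub server that knows a single job stands in for Elasticsearch.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// jobExists reports whether the job is present. A 404 maps to (false, nil)
// so the caller can remove the resource from state instead of failing.
func jobExists(client *http.Client, baseURL, jobID string) (bool, error) {
	resp, err := client.Get(baseURL + "/_ml/anomaly_detectors/" + jobID)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	switch resp.StatusCode {
	case http.StatusOK:
		return true, nil
	case http.StatusNotFound:
		return false, nil
	default:
		return false, fmt.Errorf("unexpected status %d", resp.StatusCode)
	}
}

// lookupAgainstStub runs jobExists against a stub server that only knows
// the job "basic-count-job".
func lookupAgainstStub(jobID string) (bool, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		id := strings.TrimPrefix(r.URL.Path, "/_ml/anomaly_detectors/")
		if id != "basic-count-job" {
			w.WriteHeader(http.StatusNotFound)
			return
		}
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()
	return jobExists(srv.Client(), srv.URL, jobID)
}

func main() {
	for _, id := range []string{"basic-count-job", "missing-job"} {
		found, err := lookupAgainstStub(id)
		fmt.Println(id, found, err)
	}
}
```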

Best Practices

Memory Sizing: Start with conservative memory limits and increase as needed
Bucket Span: Choose appropriate bucket spans based on data frequency
Detectors: Use specific field names for better anomaly detection
Influencers: Include relevant fields that might influence anomalies
Datafeeds: Use appropriate query delays for real-time data

Limitations

Some job properties, such as the analysis_config structure, cannot be updated after creation
Jobs must be closed before deletion (handled automatically)
Datafeed creation is included but separate datafeed management is recommended for complex scenarios

Documentation ¶

Index ¶

func GetSchema() schema.Schema
func NewAnomalyDetectionJobResource() resource.Resource
type AnalysisConfigAPIModel
type AnalysisConfigTFModel
type AnalysisLimitsAPIModel
type AnalysisLimitsTFModel
type AnomalyDetectionJobAPIModel
type AnomalyDetectionJobTFModel
type AnomalyDetectionJobUpdateAPIModel
- func (u *AnomalyDetectionJobUpdateAPIModel) BuildFromPlan(ctx context.Context, plan, state *AnomalyDetectionJobTFModel) (bool, fwdiags.Diagnostics)
type ChunkingConfigAPIModel
type CustomRuleAPIModel
type CustomRuleTFModel
type DataDescriptionAPIModel
type DataDescriptionTFModel
type DelayedDataCheckConfigAPIModel
type DetectorAPIModel
type DetectorTFModel
type IndicesOptionsAPIModel
type ModelPlotConfigAPIModel
type ModelPlotConfigTFModel
type PerPartitionCategorizationAPIModel
type PerPartitionCategorizationTFModel
type RuleConditionAPIModel
type RuleConditionTFModel

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

func GetSchema ¶

func GetSchema() schema.Schema

func NewAnomalyDetectionJobResource ¶

func NewAnomalyDetectionJobResource() resource.Resource

Types ¶

type AnalysisConfigAPIModel ¶

type AnalysisConfigAPIModel struct {
	BucketSpan                 string                              `json:"bucket_span"`
	CategorizationFieldName    string                              `json:"categorization_field_name,omitempty"`
	CategorizationFilters      []string                            `json:"categorization_filters,omitempty"`
	Detectors                  []DetectorAPIModel                  `json:"detectors"`
	Influencers                []string                            `json:"influencers,omitempty"`
	Latency                    string                              `json:"latency,omitempty"`
	ModelPruneWindow           string                              `json:"model_prune_window,omitempty"`
	MultivariateByFields       *bool                               `json:"multivariate_by_fields,omitempty"`
	PerPartitionCategorization *PerPartitionCategorizationAPIModel `json:"per_partition_categorization,omitempty"`
	SummaryCountFieldName      string                              `json:"summary_count_field_name,omitempty"`
}

AnalysisConfigAPIModel represents the analysis configuration in API format

type AnalysisConfigTFModel ¶

type AnalysisConfigTFModel struct {
	BucketSpan                 types.String `tfsdk:"bucket_span"`
	CategorizationFieldName    types.String `tfsdk:"categorization_field_name"`
	CategorizationFilters      types.List   `tfsdk:"categorization_filters"`
	Detectors                  types.List   `tfsdk:"detectors"`
	Influencers                types.List   `tfsdk:"influencers"`
	Latency                    types.String `tfsdk:"latency"`
	ModelPruneWindow           types.String `tfsdk:"model_prune_window"`
	MultivariateByFields       types.Bool   `tfsdk:"multivariate_by_fields"`
	PerPartitionCategorization types.Object `tfsdk:"per_partition_categorization"`
	SummaryCountFieldName      types.String `tfsdk:"summary_count_field_name"`
}

AnalysisConfigTFModel represents the analysis configuration

type AnalysisLimitsAPIModel ¶

type AnalysisLimitsAPIModel struct {
	CategorizationExamplesLimit *int64 `json:"categorization_examples_limit,omitempty"`
	ModelMemoryLimit            string `json:"model_memory_limit,omitempty"`
}

AnalysisLimitsAPIModel represents analysis limits in API format

type AnalysisLimitsTFModel ¶

type AnalysisLimitsTFModel struct {
	CategorizationExamplesLimit types.Int64            `tfsdk:"categorization_examples_limit"`
	ModelMemoryLimit            customtypes.MemorySize `tfsdk:"model_memory_limit"`
}

AnalysisLimitsTFModel represents analysis limits configuration

type AnomalyDetectionJobAPIModel ¶

type AnomalyDetectionJobAPIModel struct {
	JobID                                string                   `json:"job_id"`
	Description                          string                   `json:"description,omitempty"`
	Groups                               []string                 `json:"groups,omitempty"`
	AnalysisConfig                       AnalysisConfigAPIModel   `json:"analysis_config"`
	AnalysisLimits                       *AnalysisLimitsAPIModel  `json:"analysis_limits,omitempty"`
	DataDescription                      DataDescriptionAPIModel  `json:"data_description"`
	ModelPlotConfig                      *ModelPlotConfigAPIModel `json:"model_plot_config,omitempty"`
	AllowLazyOpen                        *bool                    `json:"allow_lazy_open,omitempty"`
	BackgroundPersistInterval            string                   `json:"background_persist_interval,omitempty"`
	CustomSettings                       map[string]interface{}   `json:"custom_settings,omitempty"`
	DailyModelSnapshotRetentionAfterDays *int64                   `json:"daily_model_snapshot_retention_after_days,omitempty"`
	ModelSnapshotRetentionDays           *int64                   `json:"model_snapshot_retention_days,omitempty"`
	RenormalizationWindowDays            *int64                   `json:"renormalization_window_days,omitempty"`
	ResultsIndexName                     string                   `json:"results_index_name,omitempty"`
	ResultsRetentionDays                 *int64                   `json:"results_retention_days,omitempty"`

	// Read-only fields
	CreateTime      interface{} `json:"create_time,omitempty"`
	JobType         string      `json:"job_type,omitempty"`
	JobVersion      string      `json:"job_version,omitempty"`
	ModelSnapshotID string      `json:"model_snapshot_id,omitempty"`
}

AnomalyDetectionJobAPIModel represents the API model for ML anomaly detection jobs

type AnomalyDetectionJobTFModel ¶

type AnomalyDetectionJobTFModel struct {
	ID                                   types.String         `tfsdk:"id"`
	ElasticsearchConnection              types.List           `tfsdk:"elasticsearch_connection"`
	JobID                                types.String         `tfsdk:"job_id"`
	Description                          types.String         `tfsdk:"description"`
	Groups                               types.Set            `tfsdk:"groups"`
	AnalysisConfig                       types.Object         `tfsdk:"analysis_config"`
	AnalysisLimits                       types.Object         `tfsdk:"analysis_limits"`
	DataDescription                      types.Object         `tfsdk:"data_description"`
	ModelPlotConfig                      types.Object         `tfsdk:"model_plot_config"`
	AllowLazyOpen                        types.Bool           `tfsdk:"allow_lazy_open"`
	BackgroundPersistInterval            types.String         `tfsdk:"background_persist_interval"`
	CustomSettings                       jsontypes.Normalized `tfsdk:"custom_settings"`
	DailyModelSnapshotRetentionAfterDays types.Int64          `tfsdk:"daily_model_snapshot_retention_after_days"`
	ModelSnapshotRetentionDays           types.Int64          `tfsdk:"model_snapshot_retention_days"`
	RenormalizationWindowDays            types.Int64          `tfsdk:"renormalization_window_days"`
	ResultsIndexName                     types.String         `tfsdk:"results_index_name"`
	ResultsRetentionDays                 types.Int64          `tfsdk:"results_retention_days"`

	// Read-only computed fields
	CreateTime      types.String `tfsdk:"create_time"`
	JobType         types.String `tfsdk:"job_type"`
	JobVersion      types.String `tfsdk:"job_version"`
	ModelSnapshotID types.String `tfsdk:"model_snapshot_id"`
}

AnomalyDetectionJobTFModel represents the Terraform resource model for ML anomaly detection jobs
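
The Terraform model holds custom_settings as a JSON string (jsontypes.Normalized), while the API model uses map[string]interface{}. How the package converts between the two is not shown here; the following is one plausible sketch using only encoding/json, with parseCustomSettings as a hypothetical helper name.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// parseCustomSettings decodes a JSON object string into the map form used
// by the API model. An empty string yields a nil map.
func parseCustomSettings(raw string) (map[string]interface{}, error) {
	if raw == "" {
		return nil, nil
	}
	var settings map[string]interface{}
	if err := json.Unmarshal([]byte(raw), &settings); err != nil {
		return nil, fmt.Errorf("custom_settings is not a JSON object: %w", err)
	}
	return settings, nil
}

func main() {
	settings, err := parseCustomSettings(`{"team":"sre","tier":2}`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Numbers decode as float64 when the target is interface{}.
	fmt.Println(settings["team"], settings["tier"])
}
```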

type AnomalyDetectionJobUpdateAPIModel ¶

type AnomalyDetectionJobUpdateAPIModel struct {
	Description                          *string                  `json:"description,omitempty"`
	Groups                               []string                 `json:"groups,omitempty"`
	AnalysisLimits                       *AnalysisLimitsAPIModel  `json:"analysis_limits,omitempty"`
	ModelPlotConfig                      *ModelPlotConfigAPIModel `json:"model_plot_config,omitempty"`
	AllowLazyOpen                        *bool                    `json:"allow_lazy_open,omitempty"`
	BackgroundPersistInterval            *string                  `json:"background_persist_interval,omitempty"`
	CustomSettings                       map[string]interface{}   `json:"custom_settings,omitempty"`
	DailyModelSnapshotRetentionAfterDays *int64                   `json:"daily_model_snapshot_retention_after_days,omitempty"`
	ModelSnapshotRetentionDays           *int64                   `json:"model_snapshot_retention_days,omitempty"`
	RenormalizationWindowDays            *int64                   `json:"renormalization_window_days,omitempty"`
	ResultsRetentionDays                 *int64                   `json:"results_retention_days,omitempty"`
}

AnomalyDetectionJobUpdateAPIModel represents the API model for updating ML anomaly detection jobs. It includes only the fields that can be updated after job creation.

func (*AnomalyDetectionJobUpdateAPIModel) BuildFromPlan ¶

func (u *AnomalyDetectionJobUpdateAPIModel) BuildFromPlan(ctx context.Context, plan, state *AnomalyDetectionJobTFModel) (bool, fwdiags.Diagnostics)

BuildFromPlan populates the AnomalyDetectionJobUpdateAPIModel from the plan and state models
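
The body of BuildFromPlan is not shown on this page. A common way to build such an update is to copy a field into the request only when the plan differs from the state, and to report whether anything changed. The sketch below illustrates that idea with plain Go types; jobSettings and jobUpdate are hypothetical stand-ins for the Terraform and update models, and the meaning of the boolean return value is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// jobSettings is a hypothetical, reduced stand-in for AnomalyDetectionJobTFModel.
type jobSettings struct {
	Description          string
	ResultsRetentionDays *int64
}

// jobUpdate is a hypothetical, reduced stand-in for AnomalyDetectionJobUpdateAPIModel.
type jobUpdate struct {
	Description          *string `json:"description,omitempty"`
	ResultsRetentionDays *int64  `json:"results_retention_days,omitempty"`
}

// equalInt64Ptr compares two optional integers by value.
func equalInt64Ptr(a, b *int64) bool {
	if a == nil || b == nil {
		return a == b
	}
	return *a == *b
}

// buildFromPlan copies only the fields that differ between plan and state
// and reports whether any field changed (assumed semantics).
func (u *jobUpdate) buildFromPlan(plan, state jobSettings) bool {
	changed := false
	if plan.Description != state.Description {
		desc := plan.Description
		u.Description = &desc
		changed = true
	}
	if !equalInt64Ptr(plan.ResultsRetentionDays, state.ResultsRetentionDays) {
		u.ResultsRetentionDays = plan.ResultsRetentionDays
		changed = true
	}
	return changed
}

func main() {
	days := int64(90)
	state := jobSettings{Description: "old description", ResultsRetentionDays: &days}
	plan := jobSettings{Description: "new description", ResultsRetentionDays: &days}

	var update jobUpdate
	if update.buildFromPlan(plan, state) {
		body, _ := json.Marshal(update)
		fmt.Println(string(body)) // only the changed field is serialized
	}
}
```

Because unchanged fields stay nil and carry omitempty, the request body contains only what needs to change, and a false return lets the caller skip the update call entirely.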

type ChunkingConfigAPIModel ¶

type ChunkingConfigAPIModel struct {
	Mode     string `json:"mode"`
	TimeSpan string `json:"time_span,omitempty"`
}

ChunkingConfigAPIModel represents chunking configuration in API format

type CustomRuleAPIModel ¶

type CustomRuleAPIModel struct {
	Actions    []interface{}           `json:"actions,omitempty"`
	Conditions []RuleConditionAPIModel `json:"conditions,omitempty"`
}

CustomRuleAPIModel represents a custom rule in API format

type CustomRuleTFModel ¶

type CustomRuleTFModel struct {
	Actions    types.List `tfsdk:"actions"`
	Conditions types.List `tfsdk:"conditions"`
}

CustomRuleTFModel represents a custom rule configuration

type DataDescriptionAPIModel ¶

type DataDescriptionAPIModel struct {
	TimeField  string `json:"time_field,omitempty"`
	TimeFormat string `json:"time_format,omitempty"`
}

DataDescriptionAPIModel represents data description in API format

type DataDescriptionTFModel ¶

type DataDescriptionTFModel struct {
	TimeField  types.String `tfsdk:"time_field"`
	TimeFormat types.String `tfsdk:"time_format"`
}

DataDescriptionTFModel represents data description configuration

type DelayedDataCheckConfigAPIModel ¶

type DelayedDataCheckConfigAPIModel struct {
	CheckWindow string `json:"check_window,omitempty"`
	Enabled     bool   `json:"enabled"`
}

DelayedDataCheckConfigAPIModel represents delayed data check configuration in API format

type DetectorAPIModel ¶

type DetectorAPIModel struct {
	ByFieldName         string               `json:"by_field_name,omitempty"`
	DetectorDescription string               `json:"detector_description,omitempty"`
	ExcludeFrequent     string               `json:"exclude_frequent,omitempty"`
	FieldName           string               `json:"field_name,omitempty"`
	Function            string               `json:"function"`
	OverFieldName       string               `json:"over_field_name,omitempty"`
	PartitionFieldName  string               `json:"partition_field_name,omitempty"`
	UseNull             *bool                `json:"use_null,omitempty"`
	CustomRules         []CustomRuleAPIModel `json:"custom_rules,omitempty"`
}

DetectorAPIModel represents a detector configuration in API format
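
To show how a detector with a custom rule serializes, the sketch below marshals reduced copies of DetectorAPIModel, CustomRuleAPIModel and RuleConditionAPIModel. The lowercase type names and encodeDetector are hypothetical; the values "skip_result", "actual" and "lt" are taken from the Elasticsearch custom rules API, and the field name comes from the README examples.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ruleCondition mirrors RuleConditionAPIModel.
type ruleCondition struct {
	AppliesTo string  `json:"applies_to"`
	Operator  string  `json:"operator"`
	Value     float64 `json:"value"`
}

// customRule mirrors CustomRuleAPIModel.
type customRule struct {
	Actions    []interface{}   `json:"actions,omitempty"`
	Conditions []ruleCondition `json:"conditions,omitempty"`
}

// detector is a reduced copy of DetectorAPIModel.
type detector struct {
	FieldName   string       `json:"field_name,omitempty"`
	Function    string       `json:"function"`
	CustomRules []customRule `json:"custom_rules,omitempty"`
}

// encodeDetector builds a mean detector that skips results when the
// actual value is below 10, and returns its JSON form.
func encodeDetector() (string, error) {
	d := detector{
		FieldName: "response_time",
		Function:  "mean",
		CustomRules: []customRule{{
			Actions: []interface{}{"skip_result"},
			Conditions: []ruleCondition{
				{AppliesTo: "actual", Operator: "lt", Value: 10},
			},
		}},
	}
	body, err := json.Marshal(d)
	return string(body), err
}

func main() {
	body, err := encodeDetector()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(body)
}
```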

type DetectorTFModel ¶

type DetectorTFModel struct {
	ByFieldName         types.String `tfsdk:"by_field_name"`
	DetectorDescription types.String `tfsdk:"detector_description"`
	ExcludeFrequent     types.String `tfsdk:"exclude_frequent"`
	FieldName           types.String `tfsdk:"field_name"`
	Function            types.String `tfsdk:"function"`
	OverFieldName       types.String `tfsdk:"over_field_name"`
	PartitionFieldName  types.String `tfsdk:"partition_field_name"`
	UseNull             types.Bool   `tfsdk:"use_null"`
	CustomRules         types.List   `tfsdk:"custom_rules"`
}

DetectorTFModel represents a detector configuration

type IndicesOptionsAPIModel ¶

type IndicesOptionsAPIModel struct {
	ExpandWildcards   []string `json:"expand_wildcards,omitempty"`
	IgnoreUnavailable *bool    `json:"ignore_unavailable,omitempty"`
	AllowNoIndices    *bool    `json:"allow_no_indices,omitempty"`
	IgnoreThrottled   *bool    `json:"ignore_throttled,omitempty"`
}

IndicesOptionsAPIModel represents indices options in API format

type ModelPlotConfigAPIModel ¶

type ModelPlotConfigAPIModel struct {
	AnnotationsEnabled *bool  `json:"annotations_enabled,omitempty"`
	Enabled            bool   `json:"enabled"`
	Terms              string `json:"terms,omitempty"`
}

ModelPlotConfigAPIModel represents model plot configuration in API format

type ModelPlotConfigTFModel ¶

type ModelPlotConfigTFModel struct {
	AnnotationsEnabled types.Bool   `tfsdk:"annotations_enabled"`
	Enabled            types.Bool   `tfsdk:"enabled"`
	Terms              types.String `tfsdk:"terms"`
}

ModelPlotConfigTFModel represents model plot configuration

type PerPartitionCategorizationAPIModel ¶

type PerPartitionCategorizationAPIModel struct {
	Enabled    bool  `json:"enabled"`
	StopOnWarn *bool `json:"stop_on_warn,omitempty"`
}

PerPartitionCategorizationAPIModel represents per-partition categorization in API format
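
The mix of bool and *bool in this struct affects what is sent to the API: Enabled has no omitempty tag and is always serialized, while StopOnWarn is a pointer with omitempty and is left out when nil, which lets an explicit false be distinguished from "not set". A small sketch with a copy of the struct (the lowercase type name and encode are hypothetical):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// perPartitionCategorization copies the fields and tags of
// PerPartitionCategorizationAPIModel.
type perPartitionCategorization struct {
	Enabled    bool  `json:"enabled"`
	StopOnWarn *bool `json:"stop_on_warn,omitempty"`
}

// encode returns the JSON form of the given value.
func encode(p perPartitionCategorization) string {
	body, err := json.Marshal(p)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(body)
}

func main() {
	explicitFalse := false

	// StopOnWarn is nil: the key is omitted entirely.
	fmt.Println(encode(perPartitionCategorization{}))

	// StopOnWarn points to false: the key is sent with the value false.
	fmt.Println(encode(perPartitionCategorization{StopOnWarn: &explicitFalse}))
}
```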

type PerPartitionCategorizationTFModel ¶

type PerPartitionCategorizationTFModel struct {
	Enabled    types.Bool `tfsdk:"enabled"`
	StopOnWarn types.Bool `tfsdk:"stop_on_warn"`
}

PerPartitionCategorizationTFModel represents per-partition categorization configuration

type RuleConditionAPIModel ¶

type RuleConditionAPIModel struct {
	AppliesTo string  `json:"applies_to"`
	Operator  string  `json:"operator"`
	Value     float64 `json:"value"`
}

RuleConditionAPIModel represents a rule condition in API format

type RuleConditionTFModel ¶

type RuleConditionTFModel struct {
	AppliesTo types.String  `tfsdk:"applies_to"`
	Operator  types.String  `tfsdk:"operator"`
	Value     types.Float64 `tfsdk:"value"`
}

RuleConditionTFModel represents a rule condition
