anomaly

package
v0.3.4
Warning

This package is not in the latest version of its module.

Published: Jun 5, 2026 | License: MIT | Imports: 25 | Imported by: 0

Documentation

Overview

Package anomaly provides log anomaly detection using ONNX models or a built-in heuristic fallback.

Package anomaly provides log anomaly detection structures and algorithms.

The standard demonstration templates and seed pool setups are compiled based on the evaluations from LLMLog (VLDB 2025) and CoLA (VLDB 2025). Licensed under Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0).

LLMLog (Large Language Model-based Log Template Generation via LLM-driven Multi-Round Annotation) is based on the greedy set-cover dynamic demonstration selection algorithm (Algorithm 3) proposed in: "LLMLog: Advanced Log Template Generation via LLM-driven Multi-Round Annotation" (VLDB 2025) by Fei Teng, Haoyang Li, and Lei Chen. Licensed under Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0). For license details, see: https://creativecommons.org/licenses/by-nc-nd/4.0/

Package anomaly provides anomaly detection algorithms based on LSH (Locality Sensitive Hashing) and other clustering techniques.

LogLSHD (Locality-Sensitive Hashing with Sequence-Alignment Clustering) is based on the algorithm and research proposed in: "RT-LogAAS: A Real-Time Log Anomaly Analysis System for Net-Cloud". Licensed under Creative Commons Attribution 4.0 International (CC BY 4.0). For license details, see: https://creativecommons.org/licenses/by/4.0/

Index

Constants

This section is empty.

Variables

var DefaultSeedPool = []DemoInstance{

	{
		Source:   "nginx",
		Template: `<*> - - [<*>] "GET <*> HTTP/1.1" 200 <*>`,
		Anomaly:  false,
		Score:    0.0,
		Reason:   "routine HTTP GET request with 200 success response",
	},
	{
		Source:   "nginx",
		Template: `<*> - - [<*>] "POST <*> HTTP/1.1" 201 <*>`,
		Anomaly:  false,
		Score:    0.05,
		Reason:   "routine HTTP POST request with 201 success response",
	},

	{
		Source:   "nginx",
		Template: `<*> - - [<*>] "GET <*> HTTP/1.1" 500 <*>`,
		Anomaly:  true,
		Score:    0.85,
		Reason:   "HTTP 500 internal server error indicating server-side application failure",
	},
	{
		Source:   "nginx",
		Template: `<*> - - [<*>] "GET /etc/passwd HTTP/1.1" 400 <*>`,
		Anomaly:  true,
		Score:    0.95,
		Reason:   "unauthorized directory traversal security probe attempting to read configuration",
	},

	{
		Source:   "postgres",
		Template: `<*> [info] connection received: host=<*>`,
		Anomaly:  false,
		Score:    0.0,
		Reason:   "routine database connection received from a client",
	},
	{
		Source:   "postgres",
		Template: `<*> [info] statement: SELECT <*> FROM <*>`,
		Anomaly:  false,
		Score:    0.0,
		Reason:   "routine database SELECT query execution",
	},

	{
		Source:   "postgres",
		Template: `<*> [error] password authentication failed for user <*>`,
		Anomaly:  true,
		Score:    0.90,
		Reason:   "failed database authentication attempt indicating potential brute force or misconfiguration",
	},
	{
		Source:   "postgres",
		Template: `<*> [fatal] remaining connection slots are reserved for non-replication superuser connections`,
		Anomaly:  true,
		Score:    0.95,
		Reason:   "database connection slots exhausted causing service disruption",
	},

	{
		Source:   "node",
		Template: `info: server listening on port <*>`,
		Anomaly:  false,
		Score:    0.0,
		Reason:   "routine application server initialization log",
	},
	{
		Source:   "node",
		Template: `debug: query executed in <*> ms`,
		Anomaly:  false,
		Score:    0.0,
		Reason:   "routine database query execution latency debugging",
	},

	{
		Source:   "node",
		Template: `error: uncaught exception: <*> at <*>`,
		Anomaly:  true,
		Score:    0.98,
		Reason:   "uncaught application exception leading to severe instability or crash",
	},
	{
		Source:   "node",
		Template: `warn: memory usage exceeded threshold: <*> mb`,
		Anomaly:  true,
		Score:    0.80,
		Reason:   "high memory consumption exceeding predefined safety limits",
	},
}

DefaultSeedPool contains standard templates for Nginx, PostgreSQL, and Node.js.

var PoolMu sync.RWMutex

PoolMu protects concurrent reading/writing of DefaultSeedPool.

Functions

func LFF added in v0.3.4

func LFF(line string) string

LFF transforms a raw log line into its Elastic Log Format Fingerprint (LFF), following the exact inside-out ES|QL execution order:

 1. Whitespace is collapsed to ' '.
 2. Letters are replaced with 'a'.
 3. Digits are replaced with '0'.
 4. Multiple 'a' tokens separated by a single space are collapsed to 'a'.
 5. Special characters and symbols are preserved.
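The steps listed for LFF can be sketched with four regular-expression passes. This is a minimal illustration, not the package's implementation: fingerprintSketch and its patterns are hypothetical, and it assumes that steps 2 and 3 collapse whole runs of letters and digits (rather than replacing each character individually) and that letters means ASCII letters.

```go
package main

import (
	"fmt"
	"regexp"
)

// Patterns assumed for this sketch; the real LFF may use other character classes.
var (
	whitespaceRun = regexp.MustCompile(`\s+`)
	letterRun     = regexp.MustCompile(`[A-Za-z]+`)
	digitRun      = regexp.MustCompile(`[0-9]+`)
	wordTokens    = regexp.MustCompile(`a( a)+`)
)

// fingerprintSketch is a hypothetical stand-in for LFF. It applies the
// documented steps in order; symbols are never matched, so they pass through.
func fingerprintSketch(line string) string {
	s := whitespaceRun.ReplaceAllString(line, " ") // step 1
	s = letterRun.ReplaceAllString(s, "a")         // step 2
	s = digitRun.ReplaceAllString(s, "0")          // step 3
	s = wordTokens.ReplaceAllString(s, "a")        // step 4
	return s
}

func main() {
	fmt.Println(fingerprintSketch("GET  /index.html 200 OK done")) // prints "a /a.a 0 a"
}
```

Because the fingerprint keeps punctuation and drops everything else, two lines with the same layout but different words or numbers map to the same string.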

func LoadFeedbackDemos added in v0.3.4

func LoadFeedbackDemos(filePath string) error

LoadFeedbackDemos reads a JSONL feedback file, parses feedback records, and prepends them to DefaultSeedPool so they are preferred during demonstration selection.

func Normalize added in v0.3.4

func Normalize(line string) string

Normalize cleans and normalizes a log line by stripping parameter noise (timestamps, IPs, UUIDs).
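As an illustration of the kind of parameter stripping Normalize describes, the sketch below replaces UUIDs, ISO-8601-style timestamps, and IPv4 addresses with a wildcard. The function name normalizeSketch, the three patterns, and the choice of "<*>" as the replacement token are assumptions; the package may recognize other formats or handle matches differently.

```go
package main

import (
	"fmt"
	"regexp"
)

// Hypothetical noise patterns, applied in this order. UUIDs go first so
// their digit groups cannot be mistaken for anything else.
var noisePatterns = []*regexp.Regexp{
	// UUID: 8-4-4-4-12 hex digits.
	regexp.MustCompile(`[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}`),
	// Timestamp: 2024-01-02T03:04:05, optional fraction and zone.
	regexp.MustCompile(`\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:?\d{2})?`),
	// IPv4 address: four dot-separated groups of 1-3 digits.
	regexp.MustCompile(`\b\d{1,3}(\.\d{1,3}){3}\b`),
}

// normalizeSketch is a hypothetical stand-in for Normalize.
func normalizeSketch(line string) string {
	for _, re := range noisePatterns {
		line = re.ReplaceAllString(line, "<*>")
	}
	return line
}

func main() {
	fmt.Println(normalizeSketch("2024-01-02T03:04:05Z client 10.0.0.1 connected"))
}
```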

Types

type BatchStorage added in v0.3.4

type BatchStorage struct {
	// contains filtered or unexported fields
}

BatchStorage wraps any Storage engine and performs in-memory LRU deduplication and buffered batch-writing inside a background worker goroutine.

func NewBatchStorage added in v0.3.4

func NewBatchStorage(inner Storage, batchSize int, timeout time.Duration) *BatchStorage

NewBatchStorage instantiates a BatchStorage wrapper with a 10,000-entry LRU deduplicator.

func (*BatchStorage) Close added in v0.3.4

func (b *BatchStorage) Close() error

Close flushes the remaining buffer and closes the inner storage engine.

func (*BatchStorage) Write added in v0.3.4

func (b *BatchStorage) Write(ctx context.Context, rec StorageRecord) error

Write atomically checks for the log's normalized template in the LRU cache and adds it if absent. If the template signature has already been written, the record is skipped.
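An atomic check-and-add on an LRU set can be sketched with container/list and a map guarded by one mutex. The lruSet type and its seenOrAdd method are hypothetical names for this example; BatchStorage's internal cache is unexported and may be built differently.

```go
package main

import (
	"container/list"
	"fmt"
	"sync"
)

// lruSet remembers up to capacity keys, evicting the least recently seen.
type lruSet struct {
	mu       sync.Mutex
	capacity int
	order    *list.List               // front = most recently seen
	items    map[string]*list.Element // key -> node in order
}

func newLRUSet(capacity int) *lruSet {
	return &lruSet{
		capacity: capacity,
		order:    list.New(),
		items:    make(map[string]*list.Element),
	}
}

// seenOrAdd reports whether key was already present. When it was not, the
// key is inserted. Lookup and insert happen under a single lock, so two
// concurrent callers cannot both be told that the same key is new.
func (s *lruSet) seenOrAdd(key string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if el, ok := s.items[key]; ok {
		s.order.MoveToFront(el)
		return true
	}
	s.items[key] = s.order.PushFront(key)
	if s.order.Len() > s.capacity {
		oldest := s.order.Back()
		s.order.Remove(oldest)
		delete(s.items, oldest.Value.(string))
	}
	return false
}

func main() {
	seen := newLRUSet(2)
	for _, tpl := range []string{"a", "a", "b", "c", "a"} {
		fmt.Println(tpl, "duplicate:", seen.seenOrAdd(tpl))
	}
}
```

A writer built on this would enqueue a record only when seenOrAdd returns false, which is the skip-if-already-written behavior the Write documentation describes.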

func (*BatchStorage) WriteBatch added in v0.3.4

func (b *BatchStorage) WriteBatch(ctx context.Context, records []StorageRecord) error

WriteBatch bypasses the queue and writes the batch directly to the inner storage.

type ClickHouseStorage added in v0.3.4

type ClickHouseStorage struct {
	// contains filtered or unexported fields
}

ClickHouseStorage persists records to a ClickHouse table.

func NewClickHouseStorage added in v0.3.4

func NewClickHouseStorage(dsn, table string) (*ClickHouseStorage, error)

NewClickHouseStorage connects to a ClickHouse instance via DSN.

func (*ClickHouseStorage) Close added in v0.3.4

func (c *ClickHouseStorage) Close() error

Close closes the database connection cleanly.

func (*ClickHouseStorage) Write added in v0.3.4

Write appends a single record as a batch of one.

func (*ClickHouseStorage) WriteBatch added in v0.3.4

func (c *ClickHouseStorage) WriteBatch(ctx context.Context, records []StorageRecord) error

WriteBatch performs high-speed bulk inserts using ClickHouse batch prepared statements.

type Cluster added in v0.3.4

type Cluster struct {
	// ID is the unique identifier for the cluster.
	ID int
	// Template is the log template representing this cluster.
	Template []string
	// SimHash is the similarity hash for the template.
	SimHash uint64
	// Count is the number of logs assigned to this cluster.
	Count int
}

Cluster represents a log cluster identified by the anomaly detector.

type DemoInstance added in v0.3.4

type DemoInstance struct {
	Source   string   // the source system name, e.g. "nginx", "postgres", "node"
	Tokens   []string // tokenized version of the template
	Template string   // the template string, e.g. "<*> - - [<*>] \"get /index.html http/1.1\" 200 <*>"
	Anomaly  bool     // whether it is an anomaly
	Score    float64  // the anomaly score, 0.0-1.0
	Reason   string   // reason description
}

DemoInstance represents a labeled log template demonstration for LLMLog.

func SelectDefaultDemonstrations added in v0.3.4

func SelectDefaultDemonstrations(targetTokens []string, maxDemos int) []DemoInstance

SelectDefaultDemonstrations is a thread-safe wrapper that acquires PoolMu.RLock() before calling SelectDemonstrations on DefaultSeedPool.

func SelectDemonstrations added in v0.3.4

func SelectDemonstrations(targetTokens []string, pool []DemoInstance, maxDemos int) []DemoInstance

SelectDemonstrations returns the best matching few-shot examples for the target tokens. The selection follows a greedy set-cover logic to maximize target token coverage.
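A greedy set-cover selection of the kind described above can be sketched as follows: in each round, pick the unused demonstration that covers the most still-uncovered target tokens, and stop when maxDemos is reached or nothing adds coverage. The demo type and selectGreedy are hypothetical; tie-breaking and any similarity weighting in the real function are not documented here.

```go
package main

import "fmt"

// demo is a pared-down, hypothetical stand-in for DemoInstance.
type demo struct {
	Template string
	Tokens   []string
}

// selectGreedy returns up to maxDemos demos chosen by greedy set cover over
// the target tokens. Ties go to the earlier pool entry in this sketch.
func selectGreedy(target []string, pool []demo, maxDemos int) []demo {
	uncovered := make(map[string]bool, len(target))
	for _, tok := range target {
		uncovered[tok] = true
	}
	used := make([]bool, len(pool))
	var picked []demo
	for len(picked) < maxDemos && len(uncovered) > 0 {
		best, bestGain := -1, 0
		for i, d := range pool {
			if used[i] {
				continue
			}
			counted := make(map[string]bool)
			gain := 0
			for _, tok := range d.Tokens {
				if uncovered[tok] && !counted[tok] {
					counted[tok] = true
					gain++
				}
			}
			if gain > bestGain {
				best, bestGain = i, gain
			}
		}
		if best < 0 {
			break // no remaining demo covers anything new
		}
		used[best] = true
		picked = append(picked, pool[best])
		for _, tok := range pool[best].Tokens {
			delete(uncovered, tok)
		}
	}
	return picked
}

func main() {
	pool := []demo{
		{Template: "p0", Tokens: []string{"a", "b"}},
		{Template: "p1", Tokens: []string{"b", "c"}},
		{Template: "p2", Tokens: []string{"c", "d"}},
	}
	for _, d := range selectGreedy([]string{"a", "b", "c", "d"}, pool, 3) {
		fmt.Println(d.Template)
	}
}
```

Under this tie-breaking rule, entries near the front of the pool win ties, which is one way records prepended by LoadFeedbackDemos could end up preferred.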

type Detector

type Detector interface {
	Score(ctx context.Context, line string) (Result, error)
}

Detector scores log lines for anomaly probability.

func NewLLMDetector added in v0.3.4

func NewLLMDetector(endpoint, model string, threshold float64, contextLines int) Detector

NewLLMDetector exposes the internal newLLMDetector function for public use.

type ElasticsearchStorage added in v0.3.4

type ElasticsearchStorage struct {
	// contains filtered or unexported fields
}

ElasticsearchStorage persists records to an OpenSearch index.

func NewElasticsearchStorage added in v0.3.4

func NewElasticsearchStorage(cfg opensearch.Config, index string) (*ElasticsearchStorage, error)

NewElasticsearchStorage connects to an OpenSearch instance via configuration.

func (*ElasticsearchStorage) Close added in v0.3.4

func (e *ElasticsearchStorage) Close() error

Close is a no-op for the HTTP client connection.

func (*ElasticsearchStorage) Write added in v0.3.4

Write appends a single record as a bulk batch of one.

func (*ElasticsearchStorage) WriteBatch added in v0.3.4

func (e *ElasticsearchStorage) WriteBatch(ctx context.Context, records []StorageRecord) error

WriteBatch performs high-speed bulk inserts using the OpenSearch bulk API.

type EmbeddedDetector

type EmbeddedDetector struct {
	// contains filtered or unexported fields
}

EmbeddedDetector wraps an ONNX model or falls back to a heuristic scorer.

func NewEmbeddedDetector

func NewEmbeddedDetector(opts Options) (*EmbeddedDetector, error)

NewEmbeddedDetector creates a detector backed by the ONNX model at opts.ModelPath, or a heuristic if empty.

func (*EmbeddedDetector) Score

func (d *EmbeddedDetector) Score(ctx context.Context, line string) (Result, error)

Score returns the anomaly score for a single log line.

type LSHDDetector added in v0.3.4

type LSHDDetector interface {
	Detector
	Template(line string) (string, error)
}

LSHDDetector defines the interface for LSHD anomaly detection.

func NewLSHDDetector added in v0.3.4

func NewLSHDDetector() LSHDDetector

NewLSHDDetector creates a new LSHDDetector and initializes its internal data structures.

type Options

type Options struct {
	ModelPath     string
	TokenizerPath string
	Threshold     float64
	Window        int

	// LLMEndpoint is the base URL of an OpenAI-compatible API (Ollama: http://localhost:11434/v1,
	// LM Studio: http://localhost:1234/v1). When set, log lines are scored via chat completions.
	LLMEndpoint string
	// LLMModel is the model name sent to the LLM endpoint. Defaults to "llama3" when empty.
	LLMModel string
	// LLMContextLines is the number of recent log lines sent as context with each LLM request.
	// 0 disables context (single-line mode). Default 5 when unset.
	LLMContextLines int
	// FilterThreshold enables CoLA-style two-tier detection when > 0. The fast detector
	// (heuristic when LLM-only, ONNX when ensemble) runs first; the LLM is only invoked
	// when the fast score is at or above this value. Lines below it are returned as normal
	// without an LLM call. Recommended value: 0.40. 0 disables filtering (default).
	FilterThreshold float64
	// FreqWindow is the short-window size used for rate-ratio burst detection. When a log
	// template's occurrence rate in the last FreqWindow lines exceeds FreqRatio × its
	// long-term baseline rate, it is flagged as a frequency spike. Default 100; 0 disables.
	FreqWindow int
	// FreqRatio is the short/long rate ratio that triggers a freq-spike score. Default 5.0.
	FreqRatio float64
	// Preprocessor configures a preprocessor to run before anomaly detection.
	Preprocessor string
}

Options configures the anomaly detector.
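The rate-ratio burst check described for FreqWindow and FreqRatio can be sketched with a ring buffer of recent templates plus running counts. The spikeTracker type is hypothetical, and two details are assumptions of this sketch rather than documented behavior: the long-term baseline is computed over lines that have already left the short window, and a template with no baseline history is never flagged.

```go
package main

import "fmt"

// spikeTracker compares a template's rate inside a short sliding window
// against its rate over all earlier lines.
type spikeTracker struct {
	window int
	ratio  float64
	recent []string       // ring buffer holding the last window templates
	next   int            // slot that the next observation overwrites
	filled int            // number of valid slots in recent
	short  map[string]int // per-template counts inside the window
	long   map[string]int // per-template counts over every line seen
	total  int            // lines seen so far
}

func newSpikeTracker(window int, ratio float64) *spikeTracker {
	return &spikeTracker{
		window: window,
		ratio:  ratio,
		recent: make([]string, window),
		short:  make(map[string]int),
		long:   make(map[string]int),
	}
}

// observe records one template and reports whether it is currently spiking.
func (s *spikeTracker) observe(tpl string) bool {
	if s.window <= 0 {
		return false // a window of 0 disables the check
	}
	if s.filled == s.window {
		s.short[s.recent[s.next]]-- // the oldest entry leaves the window
	} else {
		s.filled++
	}
	s.recent[s.next] = tpl
	s.next = (s.next + 1) % s.window
	s.short[tpl]++
	s.long[tpl]++
	s.total++

	baseLines := s.total - s.filled
	baseCount := s.long[tpl] - s.short[tpl]
	if baseLines == 0 || baseCount == 0 {
		return false // not enough history for a baseline
	}
	shortRate := float64(s.short[tpl]) / float64(s.filled)
	baseRate := float64(baseCount) / float64(baseLines)
	return shortRate > s.ratio*baseRate
}

func main() {
	t := newSpikeTracker(4, 5.0)
	for i := 1; i <= 20; i++ {
		tpl := "a"
		if i%10 == 0 {
			tpl = "b" // rare template: once every ten lines
		}
		t.observe(tpl)
	}
	fmt.Println("burst detected:", t.observe("b"))
}
```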

type Result

type Result struct {
	Score    float64
	Anomaly  bool
	Reason   string
	Original string
}

Result holds the outcome of scoring a single log line.

type Storage added in v0.3.4

type Storage interface {
	Write(ctx context.Context, rec StorageRecord) error
	WriteBatch(ctx context.Context, records []StorageRecord) error
	Close() error
}

Storage defines the interface for persisting scored/anomalous log records.

type StorageRecord added in v0.3.4

type StorageRecord struct {
	Timestamp string  `json:"ts"`
	Source    string  `json:"source"`
	Line      string  `json:"line"`
	Score     float64 `json:"score"`
	Reason    string  `json:"reason"`
	Anomaly   bool    `json:"anomaly"`
}

StorageRecord holds the structured metadata of an anomaly log record.

type UDPStorage added in v0.3.4

type UDPStorage struct {
	// contains filtered or unexported fields
}

UDPStorage streams JSON-formatted logs over connectionless UDP sockets (fire-and-forget).

func NewUDPStorage added in v0.3.4

func NewUDPStorage(address string) (*UDPStorage, error)

NewUDPStorage connects to a remote UDP address (e.g. "127.0.0.1:514").

func (*UDPStorage) Close added in v0.3.4

func (u *UDPStorage) Close() error

Close closes the UDP socket cleanly.

func (*UDPStorage) Write added in v0.3.4

func (u *UDPStorage) Write(_ context.Context, rec StorageRecord) error

Write marshals and writes a single JSON record to the UDP socket with a newline delimiter.
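Newline-delimited JSON over UDP, as described for Write, can be sketched with encoding/json and net.Dial. StorageRecord is redeclared locally with the documented field tags; encodeRecord and sendRecord are hypothetical helpers, and the address is only the example value from the NewUDPStorage documentation.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net"
)

// StorageRecord mirrors the struct documented in this package.
type StorageRecord struct {
	Timestamp string  `json:"ts"`
	Source    string  `json:"source"`
	Line      string  `json:"line"`
	Score     float64 `json:"score"`
	Reason    string  `json:"reason"`
	Anomaly   bool    `json:"anomaly"`
}

// encodeRecord marshals rec to JSON and appends the newline delimiter.
func encodeRecord(rec StorageRecord) ([]byte, error) {
	b, err := json.Marshal(rec)
	if err != nil {
		return nil, err
	}
	return append(b, '\n'), nil
}

// sendRecord writes one encoded record as a single datagram.
func sendRecord(conn net.Conn, rec StorageRecord) error {
	payload, err := encodeRecord(rec)
	if err != nil {
		return err
	}
	_, err = conn.Write(payload)
	return err
}

func main() {
	// Dialing UDP performs no handshake, so this succeeds without a listener.
	conn, err := net.Dial("udp", "127.0.0.1:514")
	if err != nil {
		fmt.Println("dial failed:", err)
		return
	}
	defer conn.Close()
	rec := StorageRecord{Timestamp: "2026-06-05T00:00:00Z", Source: "nginx", Line: "GET / 500", Score: 0.95, Reason: "HTTP 500", Anomaly: true}
	if err := sendRecord(conn, rec); err != nil {
		fmt.Println("send failed:", err)
	}
}
```

Since UDP gives no delivery guarantee, a nil error from sendRecord only means the datagram was handed to the local network stack.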

func (*UDPStorage) WriteBatch added in v0.3.4

func (u *UDPStorage) WriteBatch(ctx context.Context, records []StorageRecord) error

WriteBatch writes each record in the batch as an individual UDP packet (MTU-friendly).
