Documentation ¶
Overview ¶
Package anomaly provides log anomaly detection using ONNX models or a built-in heuristic fallback.
Package anomaly provides log anomaly detection structures and algorithms.
The standard demonstration templates and seed pool are compiled from the evaluations in LLMLog (VLDB 2025) and CoLA (VLDB 2025). Licensed under Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0).
The LLMLog (Large Language Model-based Log Template Generation via LLM-driven Multi-Round Annotation) implementation is based on the greedy set-cover dynamic demonstration selection algorithm (Algorithm 3) proposed in "LLMLog: Advanced Log Template Generation via LLM-driven Multi-Round Annotation" (VLDB 2025) by Fei Teng, Haoyang Li, and Lei Chen. Licensed under Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0). For license details, see: https://creativecommons.org/licenses/by-nc-nd/4.0/
Package anomaly provides anomaly detection algorithms based on LSH (Locality Sensitive Hashing) and other clustering techniques.
The LogLSHD (Locality-Sensitive Hashing with Sequence-Alignment Clustering) implementation is based on the algorithm and research proposed in "RT-LogAAS: A Real-Time Log Anomaly Analysis System for Net-Cloud". Licensed under Creative Commons Attribution 4.0 International (CC BY 4.0). For license details, see: https://creativecommons.org/licenses/by/4.0/
Index ¶
- Variables
- func LFF(line string) string
- func LoadFeedbackDemos(filePath string) error
- func Normalize(line string) string
- type BatchStorage
- type ClickHouseStorage
- type Cluster
- type DemoInstance
- type Detector
- type ElasticsearchStorage
- type EmbeddedDetector
- type LSHDDetector
- type Options
- type Result
- type Storage
- type StorageRecord
- type UDPStorage
Variables ¶
var DefaultSeedPool = []DemoInstance{
	{
		Source:   "nginx",
		Template: `<*> - - [<*>] "GET <*> HTTP/1.1" 200 <*>`,
		Anomaly:  false,
		Score:    0.0,
		Reason:   "routine HTTP GET request with 200 success response",
	},
	{
		Source:   "nginx",
		Template: `<*> - - [<*>] "POST <*> HTTP/1.1" 201 <*>`,
		Anomaly:  false,
		Score:    0.05,
		Reason:   "routine HTTP POST request with 201 success response",
	},
	{
		Source:   "nginx",
		Template: `<*> - - [<*>] "GET <*> HTTP/1.1" 500 <*>`,
		Anomaly:  true,
		Score:    0.85,
		Reason:   "HTTP 500 internal server error indicating server-side application failure",
	},
	{
		Source:   "nginx",
		Template: `<*> - - [<*>] "GET /etc/passwd HTTP/1.1" 400 <*>`,
		Anomaly:  true,
		Score:    0.95,
		Reason:   "unauthorized directory traversal security probe attempting to read configuration",
	},
	{
		Source:   "postgres",
		Template: `<*> [info] connection received: host=<*>`,
		Anomaly:  false,
		Score:    0.0,
		Reason:   "routine database connection received from a client",
	},
	{
		Source:   "postgres",
		Template: `<*> [info] statement: SELECT <*> FROM <*>`,
		Anomaly:  false,
		Score:    0.0,
		Reason:   "routine database SELECT query execution",
	},
	{
		Source:   "postgres",
		Template: `<*> [error] password authentication failed for user <*>`,
		Anomaly:  true,
		Score:    0.90,
		Reason:   "failed database authentication attempt indicating potential brute force or misconfiguration",
	},
	{
		Source:   "postgres",
		Template: `<*> [fatal] remaining connection slots are reserved for non-replication superuser connections`,
		Anomaly:  true,
		Score:    0.95,
		Reason:   "database connection slots exhausted causing service disruption",
	},
	{
		Source:   "node",
		Template: `info: server listening on port <*>`,
		Anomaly:  false,
		Score:    0.0,
		Reason:   "routine application server initialization log",
	},
	{
		Source:   "node",
		Template: `debug: query executed in <*> ms`,
		Anomaly:  false,
		Score:    0.0,
		Reason:   "routine database query execution latency debugging",
	},
	{
		Source:   "node",
		Template: `error: uncaught exception: <*> at <*>`,
		Anomaly:  true,
		Score:    0.98,
		Reason:   "uncaught application exception leading to severe instability or crash",
	},
	{
		Source:   "node",
		Template: `warn: memory usage exceeded threshold: <*> mb`,
		Anomaly:  true,
		Score:    0.80,
		Reason:   "high memory consumption exceeding predefined safety limits",
	},
}
DefaultSeedPool contains standard templates for Nginx, PostgreSQL, and Node.js.
var PoolMu sync.RWMutex
PoolMu protects concurrent reading/writing of DefaultSeedPool.
Functions ¶
func LFF ¶ added in v0.3.4
func LFF(line string) string
LFF transforms a raw log line into its Elastic Log Format Fingerprint (LFF), applying the replacements in the same inside-out order as the nested ES|QL expression:
1. Runs of whitespace are collapsed to a single ' '.
2. Letters are replaced with 'a'.
3. Digits are replaced with '0'.
4. Multiple 'a' tokens separated by single spaces are collapsed to one 'a'.
5. Special characters and symbols are preserved.
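The replacement order described for LFF can be sketched with four regular expressions applied in sequence. This is a minimal illustration, not the package's code: fingerprint is a hypothetical name, and the exact patterns (in particular, treating each run of letters or digits as one token) are assumptions.

```go
package main

import (
	"fmt"
	"regexp"
)

// Assumed patterns for the four replacement steps; the expressions used by
// the package itself may differ.
var (
	wsRun     = regexp.MustCompile(`[ \t\r\n]+`)
	letterRun = regexp.MustCompile(`[A-Za-z]+`)
	digitRun  = regexp.MustCompile(`[0-9]+`)
	wordRun   = regexp.MustCompile(`a( a)+`)
)

// fingerprint is a hypothetical stand-in for LFF. Symbols are never matched
// by any pattern, so they pass through unchanged.
func fingerprint(line string) string {
	s := wsRun.ReplaceAllString(line, " ")  // 1. collapse whitespace
	s = letterRun.ReplaceAllString(s, "a")  // 2. letters -> 'a'
	s = digitRun.ReplaceAllString(s, "0")   // 3. digits -> '0'
	return wordRun.ReplaceAllString(s, "a") // 4. "a a a" -> "a"
}

func main() {
	// prints "a 0.0.0.0"
	fmt.Println(fingerprint("connection   received from host 10.0.0.1"))
}
```

Because the order matters (letters must become 'a' before runs of 'a' can be collapsed), the steps are chained rather than combined into one expression.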
func LoadFeedbackDemos ¶ added in v0.3.4
func LoadFeedbackDemos(filePath string) error
LoadFeedbackDemos reads a JSONL feedback file, parses the feedback records, and prepends them to DefaultSeedPool so they are preferred during demonstration selection.
Types ¶
type BatchStorage ¶ added in v0.3.4
type BatchStorage struct {
	// contains filtered or unexported fields
}
BatchStorage wraps any Storage engine and performs in-memory LRU deduplication and buffered batch-writing inside a background worker goroutine.
func NewBatchStorage ¶ added in v0.3.4
func NewBatchStorage(inner Storage, batchSize int, timeout time.Duration) *BatchStorage
NewBatchStorage instantiates a BatchStorage wrapper with a 10,000-entry LRU deduplicator.
func (*BatchStorage) Close ¶ added in v0.3.4
func (b *BatchStorage) Close() error
Close flushes the remaining buffer and closes the inner storage engine.
func (*BatchStorage) Write ¶ added in v0.3.4
func (b *BatchStorage) Write(ctx context.Context, rec StorageRecord) error
Write atomically checks for the log's normalized template in the LRU cache and adds it if absent. If the template signature has already been written, the record is skipped.
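An atomic check-and-add on an LRU set can be sketched with container/list and a map guarded by one mutex. The names lruSet, newLRUSet, and checkAndAdd are hypothetical, and the package's actual cache may be built differently.

```go
package main

import (
	"container/list"
	"fmt"
	"sync"
)

// lruSet is a hypothetical fixed-capacity set with least-recently-used eviction.
type lruSet struct {
	mu       sync.Mutex
	capacity int
	order    *list.List               // front = most recently used
	items    map[string]*list.Element // key -> node in order
}

func newLRUSet(capacity int) *lruSet {
	return &lruSet{
		capacity: capacity,
		order:    list.New(),
		items:    make(map[string]*list.Element),
	}
}

// checkAndAdd reports whether key was already present. A missing key is
// inserted under the same lock, so two concurrent callers cannot both
// observe "absent" for the same key.
func (s *lruSet) checkAndAdd(key string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if el, ok := s.items[key]; ok {
		s.order.MoveToFront(el)
		return true
	}
	s.items[key] = s.order.PushFront(key)
	if s.order.Len() > s.capacity {
		oldest := s.order.Back()
		s.order.Remove(oldest)
		delete(s.items, oldest.Value.(string))
	}
	return false
}

func main() {
	seen := newLRUSet(2)
	for _, tpl := range []string{"a", "b", "a", "c", "b"} {
		fmt.Println(tpl, "duplicate:", seen.checkAndAdd(tpl))
	}
}
```

Doing the lookup and the insert under one lock is what makes the operation atomic; a separate Contains followed by Add would let two writers both pass the check.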
func (*BatchStorage) WriteBatch ¶ added in v0.3.4
func (b *BatchStorage) WriteBatch(ctx context.Context, records []StorageRecord) error
WriteBatch bypasses the queue and writes the batch directly to the inner storage.
type ClickHouseStorage ¶ added in v0.3.4
type ClickHouseStorage struct {
	// contains filtered or unexported fields
}
ClickHouseStorage persists records to a ClickHouse table.
func NewClickHouseStorage ¶ added in v0.3.4
func NewClickHouseStorage(dsn, table string) (*ClickHouseStorage, error)
NewClickHouseStorage connects to a ClickHouse instance via DSN.
func (*ClickHouseStorage) Close ¶ added in v0.3.4
func (c *ClickHouseStorage) Close() error
Close closes the database connection cleanly.
func (*ClickHouseStorage) Write ¶ added in v0.3.4
func (c *ClickHouseStorage) Write(ctx context.Context, rec StorageRecord) error
Write appends a single record as a batch of one.
func (*ClickHouseStorage) WriteBatch ¶ added in v0.3.4
func (c *ClickHouseStorage) WriteBatch(ctx context.Context, records []StorageRecord) error
WriteBatch performs high-speed bulk inserts using ClickHouse batch prepared statements.
type Cluster ¶ added in v0.3.4
type Cluster struct {
	// ID is the unique identifier for the cluster.
	ID int
	// Template is the log template representing this cluster.
	Template []string
	// SimHash is the similarity hash for the template.
	SimHash uint64
	// Count is the number of logs assigned to this cluster.
	Count int
}
Cluster represents a log cluster identified by the anomaly detector.
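How a 64-bit similarity hash over template tokens can work is sketched below. The documentation does not say how Cluster.SimHash is computed, so this is the textbook SimHash construction with FNV-1a as an assumed token hash; simHash and hammingDistance are hypothetical names.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"math/bits"
)

// simHash folds per-token 64-bit FNV-1a hashes into one fingerprint: each
// output bit is set when more tokens have that bit set than cleared.
func simHash(tokens []string) uint64 {
	var votes [64]int
	for _, tok := range tokens {
		h := fnv.New64a()
		h.Write([]byte(tok))
		sum := h.Sum64()
		for i := 0; i < 64; i++ {
			if sum&(uint64(1)<<uint(i)) != 0 {
				votes[i]++
			} else {
				votes[i]--
			}
		}
	}
	var out uint64
	for i, v := range votes {
		if v > 0 {
			out |= uint64(1) << uint(i)
		}
	}
	return out
}

// hammingDistance counts the bits in which two fingerprints differ.
func hammingDistance(a, b uint64) int { return bits.OnesCount64(a ^ b) }

func main() {
	a := simHash([]string{"GET", "<*>", "HTTP/1.1", "200", "<*>"})
	b := simHash([]string{"GET", "<*>", "HTTP/1.1", "500", "<*>"})
	c := simHash([]string{"password", "authentication", "failed", "for", "user", "<*>"})
	fmt.Println("similar templates:", hammingDistance(a, b))
	fmt.Println("unrelated templates:", hammingDistance(a, c))
}
```

Templates that share most tokens tend to produce fingerprints with a small Hamming distance, which is what makes the value usable as a cheap clustering key.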
type DemoInstance ¶ added in v0.3.4
type DemoInstance struct {
	Source   string   // the source system name, e.g. "nginx", "postgres", "node"
	Tokens   []string // tokenized version of the template
	Template string   // the template string, e.g. "<*> - - [<*>] \"get /index.html http/1.1\" 200 <*>"
	Anomaly  bool     // whether it is an anomaly
	Score    float64  // the anomaly score, 0.0-1.0
	Reason   string   // reason description
}
DemoInstance represents a labeled log template demonstration for LLMLog.
func SelectDefaultDemonstrations ¶ added in v0.3.4
func SelectDefaultDemonstrations(targetTokens []string, maxDemos int) []DemoInstance
SelectDefaultDemonstrations is a thread-safe wrapper that acquires PoolMu.RLock() before calling SelectDemonstrations on DefaultSeedPool.
func SelectDemonstrations ¶ added in v0.3.4
func SelectDemonstrations(targetTokens []string, pool []DemoInstance, maxDemos int) []DemoInstance
SelectDemonstrations returns the best matching few-shot examples for the target tokens. The selection follows a greedy set-cover logic to maximize target token coverage.
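Greedy set-cover selection can be sketched as repeatedly picking the demonstration that covers the most target tokens not yet covered. This is a generic greedy cover written for illustration; it is not claimed to match Algorithm 3 of the LLMLog paper or the package's implementation, and demo and selectGreedy are hypothetical names.

```go
package main

import "fmt"

// demo is a hypothetical, trimmed-down stand-in for DemoInstance.
type demo struct {
	Template string
	Tokens   []string
}

// selectGreedy picks, at each step, the unused demo covering the most
// still-uncovered target tokens. It stops after maxDemos picks, when every
// target token is covered, or when no remaining demo adds coverage.
func selectGreedy(target []string, pool []demo, maxDemos int) []demo {
	uncovered := make(map[string]bool, len(target))
	for _, t := range target {
		uncovered[t] = true
	}
	used := make([]bool, len(pool))
	var picked []demo
	for len(picked) < maxDemos && len(uncovered) > 0 {
		best, bestGain := -1, 0
		for i, d := range pool {
			if used[i] {
				continue
			}
			gain := 0
			seen := map[string]bool{} // count repeated tokens once
			for _, tok := range d.Tokens {
				if uncovered[tok] && !seen[tok] {
					seen[tok] = true
					gain++
				}
			}
			if gain > bestGain {
				best, bestGain = i, gain
			}
		}
		if best < 0 {
			break // nothing left adds coverage
		}
		used[best] = true
		picked = append(picked, pool[best])
		for _, tok := range pool[best].Tokens {
			delete(uncovered, tok)
		}
	}
	return picked
}

func main() {
	pool := []demo{
		{Template: "GET 200", Tokens: []string{"GET", "HTTP/1.1", "200"}},
		{Template: "GET 500", Tokens: []string{"GET", "HTTP/1.1", "500"}},
		{Template: "timeout", Tokens: []string{"error", "timeout"}},
	}
	target := []string{"GET", "HTTP/1.1", "500", "error"}
	for _, d := range selectGreedy(target, pool, 2) {
		fmt.Println(d.Template)
	}
}
```

Ties are broken in favour of the earlier pool entry, which is one way a pool with feedback records at the front would end up preferring them.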
type ElasticsearchStorage ¶ added in v0.3.4
type ElasticsearchStorage struct {
	// contains filtered or unexported fields
}
ElasticsearchStorage persists records to an OpenSearch index.
func NewElasticsearchStorage ¶ added in v0.3.4
func NewElasticsearchStorage(cfg opensearch.Config, index string) (*ElasticsearchStorage, error)
NewElasticsearchStorage connects to an OpenSearch instance via configuration.
func (*ElasticsearchStorage) Close ¶ added in v0.3.4
func (e *ElasticsearchStorage) Close() error
Close is a no-op for the HTTP client connection.
func (*ElasticsearchStorage) Write ¶ added in v0.3.4
func (e *ElasticsearchStorage) Write(ctx context.Context, rec StorageRecord) error
Write appends a single record as a bulk batch of one.
func (*ElasticsearchStorage) WriteBatch ¶ added in v0.3.4
func (e *ElasticsearchStorage) WriteBatch(ctx context.Context, records []StorageRecord) error
WriteBatch performs high-speed bulk inserts using the OpenSearch bulk API.
type EmbeddedDetector ¶
type EmbeddedDetector struct {
	// contains filtered or unexported fields
}
EmbeddedDetector wraps an ONNX model or falls back to a heuristic scorer.
func NewEmbeddedDetector ¶
func NewEmbeddedDetector(opts Options) (*EmbeddedDetector, error)
NewEmbeddedDetector creates a detector backed by the ONNX model at opts.ModelPath, or a heuristic if empty.
type LSHDDetector ¶ added in v0.3.4
LSHDDetector defines the interface for LSHD anomaly detection.
func NewLSHDDetector ¶ added in v0.3.4
func NewLSHDDetector() LSHDDetector
NewLSHDDetector creates a new LSHDDetector and initializes its internal data structures.
type Options ¶
type Options struct {
	ModelPath     string
	TokenizerPath string
	Threshold     float64
	Window        int
	// LLMEndpoint is the base URL of an OpenAI-compatible API (Ollama: http://localhost:11434/v1,
	// LM Studio: http://localhost:1234/v1). When set, log lines are scored via chat completions.
	LLMEndpoint string
	// LLMModel is the model name sent to the LLM endpoint. Defaults to "llama3" when empty.
	LLMModel string
	// LLMContextLines is the number of recent log lines sent as context with each LLM request.
	// 0 disables context (single-line mode). Default 5 when unset.
	LLMContextLines int
	// FilterThreshold enables CoLA-style two-tier detection when > 0. The fast detector
	// (heuristic when LLM-only, ONNX when ensemble) runs first; the LLM is only invoked
	// when the fast score is at or above this value. Lines below it are returned as normal
	// without an LLM call. Recommended value: 0.40. 0 disables filtering (default).
	FilterThreshold float64
	// FreqWindow is the short-window size used for rate-ratio burst detection. When a log
	// template's occurrence rate in the last FreqWindow lines exceeds FreqRatio × its
	// long-term baseline rate, it is flagged as a frequency spike. Default 100; 0 disables.
	FreqWindow int
	// FreqRatio is the short/long rate ratio that triggers a freq-spike score. Default 5.0.
	FreqRatio float64
	// Preprocessor configures a preprocessor to run before anomaly detection.
	Preprocessor string
}
Options configures the anomaly detector.
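The rate-ratio burst check described for FreqWindow and FreqRatio can be sketched as a comparison between a template's rate in a sliding short window and its rate over all lines seen so far. This is an illustration only: spikeDetector, newSpikeDetector, and observe are hypothetical names, and using the all-time rate (including the current burst) as the long-term baseline is an assumption.

```go
package main

import "fmt"

// spikeDetector is a hypothetical rate-ratio burst detector.
type spikeDetector struct {
	window int            // short-window size (the FreqWindow role)
	ratio  float64        // trigger ratio (the FreqRatio role)
	recent []string       // templates in the short window, oldest first
	short  map[string]int // per-template counts inside recent
	long   map[string]int // per-template counts over all lines
	total  int            // all lines seen
}

func newSpikeDetector(window int, ratio float64) *spikeDetector {
	return &spikeDetector{
		window: window,
		ratio:  ratio,
		short:  make(map[string]int),
		long:   make(map[string]int),
	}
}

// observe records one template and reports whether its short-window rate
// exceeds ratio × its long-term rate.
func (d *spikeDetector) observe(tpl string) bool {
	d.total++
	d.long[tpl]++
	d.recent = append(d.recent, tpl)
	d.short[tpl]++
	if len(d.recent) > d.window {
		old := d.recent[0]
		d.recent = d.recent[1:]
		d.short[old]--
	}
	if d.total <= d.window {
		return false // no separate baseline yet: both windows coincide
	}
	shortRate := float64(d.short[tpl]) / float64(len(d.recent))
	longRate := float64(d.long[tpl]) / float64(d.total)
	return shortRate > d.ratio*longRate
}

func main() {
	d := newSpikeDetector(4, 2.0)
	d.observe("login failed")
	for i := 0; i < 19; i++ {
		d.observe("request ok")
	}
	for i := 0; i < 4; i++ {
		fmt.Println("login failed spike:", d.observe("login failed"))
	}
}
```

A template that dominates the log at a steady rate never trips the check, because its short and long rates stay close; only a change in rate does.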
type Storage ¶ added in v0.3.4
type Storage interface {
	Write(ctx context.Context, rec StorageRecord) error
	WriteBatch(ctx context.Context, records []StorageRecord) error
	Close() error
}
Storage defines the interface for persisting scored/anomalous log records.
type StorageRecord ¶ added in v0.3.4
type StorageRecord struct {
	Timestamp string  `json:"ts"`
	Source    string  `json:"source"`
	Line      string  `json:"line"`
	Score     float64 `json:"score"`
	Reason    string  `json:"reason"`
	Anomaly   bool    `json:"anomaly"`
}
StorageRecord holds the structured metadata of an anomaly log record.
type UDPStorage ¶ added in v0.3.4
type UDPStorage struct {
	// contains filtered or unexported fields
}
UDPStorage streams JSON-formatted logs over connectionless UDP sockets (fire-and-forget).
func NewUDPStorage ¶ added in v0.3.4
func NewUDPStorage(address string) (*UDPStorage, error)
NewUDPStorage connects to a remote UDP address (e.g. "127.0.0.1:514").
func (*UDPStorage) Close ¶ added in v0.3.4
func (u *UDPStorage) Close() error
Close closes the UDP socket cleanly.
func (*UDPStorage) Write ¶ added in v0.3.4
func (u *UDPStorage) Write(_ context.Context, rec StorageRecord) error
Write marshals and writes a single JSON record to the UDP socket with a newline delimiter.
func (*UDPStorage) WriteBatch ¶ added in v0.3.4
func (u *UDPStorage) WriteBatch(ctx context.Context, records []StorageRecord) error
WriteBatch writes each record in the batch as an individual UDP packet (MTU-friendly).
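The wire format described for UDPStorage, one newline-terminated JSON object per datagram, can be sketched with encoding/json and net.Dial. The record type copies the JSON tags from StorageRecord; encodeRecord is a hypothetical helper, and the target address is just the example value from the documentation.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net"
)

// record copies the JSON layout of StorageRecord.
type record struct {
	Timestamp string  `json:"ts"`
	Source    string  `json:"source"`
	Line      string  `json:"line"`
	Score     float64 `json:"score"`
	Reason    string  `json:"reason"`
	Anomaly   bool    `json:"anomaly"`
}

// encodeRecord returns the datagram payload: JSON plus a newline delimiter.
func encodeRecord(rec record) ([]byte, error) {
	b, err := json.Marshal(rec)
	if err != nil {
		return nil, err
	}
	return append(b, '\n'), nil
}

func main() {
	payload, err := encodeRecord(record{Source: "nginx", Line: "GET / 500", Score: 0.85, Anomaly: true})
	if err != nil {
		fmt.Println("encode error:", err)
		return
	}
	fmt.Print(string(payload))

	// UDP is connectionless: Dial only fixes the destination address, and a
	// write gives no confirmation that anything received the packet.
	conn, err := net.Dial("udp", "127.0.0.1:514")
	if err != nil {
		fmt.Println("dial error:", err)
		return
	}
	defer conn.Close()
	if _, err := conn.Write(payload); err != nil {
		fmt.Println("send error:", err)
	}
}
```

Each Write call on a UDP connection produces one datagram, so sending records one at a time keeps every packet small, at the cost of no delivery guarantee.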