timeseries

package
v1.2.2 Latest
Published: May 14, 2026 License: MIT Imports: 12 Imported by: 0

README

timeseries

Package timeseries stores and reads time-bucketed aggregations for event streams. It supports counts, sums, averages, sample standard deviation, min/max, HLL distinct counts, TDigest percentiles, geometric mean, circular mean, and z-score dependencies through queries.Field.

A *client.Client can be passed directly to Execute, Read, and Delete. Tests can use smaller mocks that implement the operation methods required by the path being tested.

Modes

Request.Range chooses the storage mode:

  • Range > 0: fixed time-window mode using deterministic multi-granularity cells in set timeseries_set plus metadata in set timeseries_meta.
  • Range == 0: custom all-time mode using one record in timeseries_set.

Request.TTL is required in both modes. Fixed-mode cell TTLs are derived from Range; custom-mode writes use Request.TTL. The maximum range is MaxRange, which is two years.

Strengths

  • Execute, Read, and Delete accept small client interfaces, so production callers can pass *client.Client and tests can mock only the operations they exercise.
  • Fixed-mode keys are deterministic: a SHA-1 primary key is derived from Name and ordered GroupBy values, and cell keys add hierarchical cell IDs. Delete can reconstruct those keys without a separate key-tracking record.
  • Range > 0 mode bounds storage by writing only the active cells for the selected multi-granularity plan, while reads drill into boundary cells instead of scanning every possible fine-grained cell.
  • Range == 0 custom mode gives a simple all-time aggregate in one timeseries_set record when callers do not need a wall-clock window.
  • The package supports the aggregation fields implemented by queries.Field, including approximate HLL distinct counts and TDigest percentiles.

Weaknesses / Tradeoffs

  • TTL must be positive for every request. In fixed mode, cell TTL is derived from Range and metadata TTL is Range * 2; Request.TTL is not the fixed cell-retention knob. In custom mode, writes use Request.TTL.
  • Fixed ranges larger than MaxRange are rejected. Larger valid ranges also use coarser granularity levels, which reduces cell volume but gives less fine-grained storage resolution.
  • Total storage scales with the number of distinct group scopes in use. Per request, operation count scales with the number of requested fields and the number of cells touched by the selected range. FieldDistinct and FieldPercentile add server-side sketch/digest storage per active cell.
  • Fixed reads and writes use metadata lookups plus batch operations. Min/max fields and stale-cell cleanup can add extra reads or deletes, and deleting a large fixed range enumerates all possible hierarchical cell keys.
  • Missing GroupBy values become empty strings with MapSource; missing or non-numeric Value fields become 0. Missing event_time uses time.Now(), so callers that need deterministic event-time bucketing should provide it explicitly.
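
The defaulting behavior in the last bullet can be illustrated with a small sketch over a plain map. `groupValue` and `numericValue` are hypothetical helpers, not package functions. Which Go types count as numeric, and how non-string group values are handled, are assumptions here.

```go
package main

import "fmt"

// groupValue returns the string stored under key, or "" when the key is
// missing. Mapping non-string values to "" is an assumption of this sketch.
func groupValue(src map[string]any, key string) string {
	s, _ := src[key].(string)
	return s
}

// numericValue returns the value under key as a float64, or 0 when the key
// is missing or not numeric. The accepted numeric types are an assumption.
func numericValue(src map[string]any, key string) float64 {
	switch v := src[key].(type) {
	case float64:
		return v
	case float32:
		return float64(v)
	case int:
		return float64(v)
	case int64:
		return float64(v)
	default:
		return 0
	}
}

func main() {
	src := map[string]any{"amount": "not-a-number"}
	// The group key is absent and the amount is non-numeric, so both
	// lookups fall back to their zero values.
	fmt.Printf("%q %v\n", groupValue(src, "standard.user_id"), numericValue(src, "amount"))
}
```

A silent fallback of this kind means a misspelled source key aggregates under the empty-string scope or contributes 0 to sums, so callers may want to validate inputs before calling Execute.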

Request

Important request fields:

  • Name: variable name used in key derivation.
  • Namespace: FrogoDB namespace.
  • GroupBy: source keys whose string values scope the aggregation.
  • Range: time window, or zero for all-time custom mode.
  • Fields: aggregation flags from package queries.
  • Value: source key for the numeric value.
  • DistinctBy: source key used for FieldDistinct.
  • PercentileP: TDigest quantile for FieldPercentile, expressed from 0 to 1.
  • TTL: required positive expiration duration.
  • IncludeCurrent: when true, write before reading; when false, read before writing.

GroupBy order is significant for timeseries keys.
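
To see why order matters, consider a sketch of a SHA-1 key built from a name and ordered group values. `scopeKey` is hypothetical, and the zero-byte separator and hex encoding are assumptions; the package's actual key layout is not documented here.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
)

// scopeKey hashes the variable name followed by the group values in the
// order given. The 0x00 separator and hex output are assumptions of this
// sketch, not the package's documented key format.
func scopeKey(name string, groupValues []string) string {
	h := sha1.New()
	h.Write([]byte(name))
	for _, v := range groupValues {
		h.Write([]byte{0})
		h.Write([]byte(v))
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	a := scopeKey("user_amounts_24h", []string{"user-42", "merchant-7"})
	b := scopeKey("user_amounts_24h", []string{"merchant-7", "user-42"})
	fmt.Println(a == b) // the two orderings hash to different keys
}
```

Under these assumptions, swapping two GroupBy entries between a write and a later Read or Delete would address a different record, so the same order should be used everywhere a variable is referenced.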

Example

func updateStats(ctx context.Context) error {
    c, err := client.New("127.0.0.1:3000")
    if err != nil {
        return err
    }
    defer c.Close()

    src := queries.NewMapSource(map[string]any{
        "event_id":         "evt-123",
        "standard.user_id": "user-42",
        "amount":           42.5,
        "event_time":       time.Now(),
    })

    req := timeseries.Request{
        Name:           "user_amounts_24h",
        Namespace:      "scoring",
        GroupBy:        []string{"standard.user_id"},
        Range:          24 * time.Hour,
        Fields:         queries.FieldCount | queries.FieldAvg | queries.FieldSTD | queries.FieldMin | queries.FieldMax,
        Value:          "amount",
        TTL:            48 * time.Hour,
        IncludeCurrent: true,
    }

    result, err := timeseries.Execute(ctx, c, req, src)
    if err != nil {
        return err
    }

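    // Raw fields and helper methods are available on the result;
    // they are discarded here only to keep the example short.
    _ = result.Count
    _ = result.Average()
    _ = result.STD()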
    return nil
}

Use Read to query without writing the current event:

result, err := timeseries.Read(ctx, c, req, src)

Use Delete to remove stored data for the same request and source scope:

err := timeseries.Delete(ctx, c, req, src)

Results

Result exposes raw aggregate fields and helper methods:

  • Count, Sum, Min, Max, DistinctCount, and Percentile.
  • Average() returns Sum / Count, or zero when Count is zero.
  • STD() returns sample standard deviation, or zero with fewer than two samples.
  • GeometricMean() uses SumLog.
  • CircularMean() uses SumSin and SumCos.
  • ZScore(currentValue) uses the result average and standard deviation.

For FieldDistinct, set DistinctBy; otherwise no HLL element is written. For FieldPercentile, set PercentileP, for example 0.95.

Event time

Writes and range reads use queries.EventTime(src). Provide event_time as a time.Time, int64 Unix seconds, or int Unix seconds when events should be bucketed by event time. If omitted, the helper uses time.Now().

Documentation

Constants

const (
	GranularityMinutes = "minutes"
	GranularityHours   = "hours"
	GranularityDays    = "days"
	GranularityDays30  = "days30"
)

Granularity levels for timeseries cell bucketing.

const (
	CellsSizeMinutes int64 = 60
	CellsSizeHours   int64 = 24
	CellsSizeDays    int64 = 30
	CellsSizeDays30  int64 = 24
)

Default ring buffer capacities for fine-grained granularity levels.

const (
	SetTimeseries = "timeseries_set"
	SetMeta       = "timeseries_meta"
)

Set names for timeseries storage.

const KeyMultiplier int64 = 100

KeyMultiplier is the factor used to build hierarchical cell identifiers. Each coarser granularity shifts the finer cell ID left by this factor.

const MaxRange = 2 * 365 * 24 * time.Hour

MaxRange is the maximum supported time range (2 years). It matches the limit in the original estimator (ErrTwoBigTimerange).

Variables

var ErrRangeTooLarge = errors.New("time range exceeds 2 years maximum")

ErrRangeTooLarge is returned when Range exceeds MaxRange.

Functions

func Delete

func Delete(ctx context.Context, c deleteQueryClient, req Request, src queries.Source) error

Delete removes all stored data for a timeseries variable. It reconstructs all possible cell keys from the range and metadata, then batch-deletes them. No key-tracking set is needed because keys are deterministic.

For custom mode (Range == 0), Delete removes the single record. For fixed mode (Range > 0), it computes all hierarchical cell positions across all granularity levels and deletes them together with the metadata.

Types

type GranLevel

type GranLevel struct {
	Name     string
	CellSize int64
}

GranLevel describes a granularity level with its ring-buffer cell count. The coarsest level has a dynamic cell count derived from the time range; finer levels use fixed defaults.

type Request

type Request struct {
	Name           string        // variable name
	Namespace      string        // FrogoDB namespace
	GroupBy        []string      // source keys to group by
	Range          time.Duration // time window (0 = all-time custom mode)
	Fields         queries.Field // which fields to compute
	PercentileP    float64       // p value for percentile (0-1)
	Value          string        // source key for the numeric value
	DistinctBy     string        // source key for distinct counting
	TTL            time.Duration // REQUIRED. Expiration for stored data.
	IncludeCurrent bool          // true: write then read; false: read then write
}

Request describes a timeseries query. Range > 0 enables multi-granularity cell mode (fixed query). Range == 0 enables single-record all-time mode (custom query).

func (Request) Validate

func (r Request) Validate() error

Validate checks that required fields are set.

type Result

type Result struct {
	Count         int64
	Sum           float64
	Min           float64
	Max           float64
	DistinctCount int64
	SumQ          float64 // for STD
	SumLog        float64 // for GMean
	SumCos        float64 // for CMean
	SumSin        float64 // for CMean
	Percentile    float64 // from TDigest
}

Result holds aggregated timeseries values read from one or more cells.

func Execute

func Execute(ctx context.Context, c executeClient, req Request, src queries.Source) (Result, error)

Execute writes the current event and reads aggregated results. The order of write vs read is controlled by IncludeCurrent.
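
The effect of IncludeCurrent on ordering can be shown with an in-memory counter. `store` and `execute` are hypothetical simplifications that track only a count; they are not the package's implementation.

```go
package main

import "fmt"

// store is an in-memory stand-in for the aggregation storage.
type store struct{ count int64 }

// execute is a hypothetical, count-only simplification of Execute.
// It always records the current event; includeCurrent only decides
// whether the returned value reflects that event.
func execute(s *store, includeCurrent bool) int64 {
	if includeCurrent {
		s.count++ // write first
		return s.count
	}
	before := s.count // read first
	s.count++         // then write
	return before
}

func main() {
	s := &store{count: 2}
	fmt.Println(execute(s, true))  // result includes the current event
	fmt.Println(execute(s, false)) // result excludes the current event
	fmt.Println(s.count)           // both events were stored
}
```

In both cases the event is persisted. The flag matters mainly when the result feeds a comparison such as a z-score, where including the current value in its own baseline would dampen the signal.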

func Read

func Read(ctx context.Context, c readClient, req Request, src queries.Source) (Result, error)

Read performs a read-only query without writing the current event.

func (Result) Average

func (r Result) Average() float64

Average returns sum/count. Returns 0 when count is zero.

func (Result) CircularMean

func (r Result) CircularMean() float64

CircularMean returns atan2(sumSin, sumCos).

func (Result) GeometricMean

func (r Result) GeometricMean() float64

GeometricMean returns exp(sumLog / count). Returns 0 when count is zero.

func (Result) STD

func (r Result) STD() float64

STD returns the sample standard deviation: sqrt((sumQ - sum*sum/count) / (count-1)). Returns 0 when count < 2.

func (Result) ZScore

func (r Result) ZScore(currentValue float64) float64

ZScore returns (currentValue - avg) / std. Returns 0 when std is zero.
