resampler

package v1.0.2

Published: Dec 18, 2025 License: MIT Imports: 3 Imported by: 0

README

Resampler Package

The resampler package provides efficient time-series data downsampling algorithms for reducing the number of data points while preserving important characteristics of the data.

Overview

When working with time-series data, it's often necessary to reduce the number of data points for visualization, storage, or transmission purposes. This package implements two downsampling strategies:

  1. SimpleResampler: Fast, straightforward downsampling by selecting every nth point
  2. LargestTriangleThreeBucket (LTTB): Perceptually-aware downsampling that preserves visual characteristics

Algorithms

SimpleResampler

The SimpleResampler function performs simple downsampling by selecting every nth point from the input data.

Characteristics:

  • Speed: Fastest algorithm, O(n) time complexity
  • Quality: May miss important features like peaks and valleys
  • Use case: When speed is critical and data is relatively uniform

Example:

import (
    "log"

    "github.com/ClusterCockpit/cc-lib/resampler"
    "github.com/ClusterCockpit/cc-lib/schema"
)

// Lower the threshold so this short example is actually downsampled
// (by default, series shorter than MinimumRequiredPoints are returned unchanged).
resampler.SetMinimumRequiredPoints(4)

data := []schema.Float{1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0}
downsampled, newFreq, err := resampler.SimpleResampler(data, 1, 2)
if err != nil {
    log.Fatal(err)
}
// downsampled contains every 2nd point: [1.0, 3.0, 5.0, 7.0]
LargestTriangleThreeBucket (LTTB)

The LargestTriangleThreeBucket function implements a sophisticated downsampling algorithm that preserves the visual characteristics of time-series data by selecting points that form the largest triangles with their neighbors.

Characteristics:

  • Speed: Still efficient, O(n) time complexity
  • Quality: Excellent preservation of peaks, valleys, and trends
  • Use case: When visual fidelity is important (charts, graphs, monitoring dashboards)

How it works:

  1. The data is divided into buckets (first and last points are always kept)
  2. For each bucket, the algorithm selects the point that forms the largest triangle with:
    • The previously selected point
    • The average of the next bucket
  3. This maximizes visual area and preserves important features

Example:

import (
    "log"
    "math"

    "github.com/ClusterCockpit/cc-lib/resampler"
    "github.com/ClusterCockpit/cc-lib/schema"
)

// Generate some sample data with peaks
data := make([]schema.Float, 1000)
for i := range data {
    data[i] = schema.Float(math.Sin(float64(i) * 0.1))
}

// Downsample from 1000 points to 100 points
downsampled, newFreq, err := resampler.LargestTriangleThreeBucket(data, 1, 10)
if err != nil {
    log.Fatal(err)
}
// downsampled contains 100 points that preserve the sine wave's visual characteristics

API Reference

SimpleResampler
func SimpleResampler(data []schema.Float, oldFrequency int64, newFrequency int64) ([]schema.Float, int64, error)

Parameters:

  • data: Input time-series data points
  • oldFrequency: Original sampling interval (the timestep between consecutive points)
  • newFrequency: Target sampling interval (must be an integer multiple of oldFrequency; values larger than oldFrequency trigger downsampling)

Returns:

  • Downsampled data slice
  • Actual frequency used (may be oldFrequency if downsampling wasn't performed)
  • Error if newFrequency is not a multiple of oldFrequency
LargestTriangleThreeBucket
func LargestTriangleThreeBucket(data []schema.Float, oldFrequency int64, newFrequency int64) ([]schema.Float, int64, error)

Parameters:

  • data: Input time-series data points
  • oldFrequency: Original sampling interval (the timestep between consecutive points)
  • newFrequency: Target sampling interval (must be an integer multiple of oldFrequency; values larger than oldFrequency trigger downsampling)

Returns:

  • Downsampled data slice
  • Actual frequency used (may be oldFrequency if downsampling wasn't performed)
  • Error if newFrequency is not a multiple of oldFrequency

Behavior Notes

Both functions will return the original data unchanged if:

  • Either frequency is 0
  • newFrequency <= oldFrequency (no downsampling needed)
  • The resulting data would have fewer than 1 point
  • The original data has fewer than MinimumRequiredPoints points (1000 by default; adjustable via SetMinimumRequiredPoints)
  • The downsampled data would have the same or more points than the original

NaN Handling

Both algorithms properly handle NaN (Not a Number) values:

  • SimpleResampler: Preserves NaN values if they fall on selected points
  • LargestTriangleThreeBucket: Considers NaN values in area calculations and preserves them appropriately

Performance Comparison

Algorithm         Time Complexity   Space Complexity   Visual Quality   Speed
SimpleResampler   O(n)              O(m)               Good             Fastest
LTTB              O(n)              O(m)               Excellent        Fast

Where:

  • n = number of input points
  • m = number of output points

Benchmark results (10,000 input points → 1,000 output points):

BenchmarkSimpleResampler-8                  50000    ~25000 ns/op
BenchmarkLargestTriangleThreeBucket-8       10000    ~120000 ns/op

LTTB is approximately 4-5x slower than SimpleResampler but still very fast for most use cases.

When to Use Which Algorithm

Use SimpleResampler when:

  • Speed is the primary concern
  • Data is relatively uniform without important features
  • You need the absolute fastest downsampling
  • Visual quality is not critical

Use LargestTriangleThreeBucket when:

  • Displaying data in charts or graphs
  • Visual fidelity is important
  • Data contains peaks, valleys, or trends that must be preserved
  • You're building monitoring dashboards or visualization tools
  • The slight performance overhead is acceptable

License

Copyright (C) NHR@FAU, University Erlangen-Nuremberg.
Licensed under the MIT License. See LICENSE file for details.

Documentation

Overview

Package resampler provides time-series data downsampling algorithms.

This package implements two downsampling strategies for reducing the number of data points in time-series data while preserving important characteristics:

  • SimpleResampler: A fast, straightforward algorithm that selects every nth point
  • LargestTriangleThreeBucket (LTTB): A perceptually-aware algorithm that preserves visual characteristics by selecting points that maximize the area of triangles formed with neighboring points

Both algorithms are designed to work with schema.Float data and handle NaN values appropriately. They require that the new sampling frequency is a multiple of the old frequency.

Index

Constants

This section is empty.

Variables

var MinimumRequiredPoints int = 1000

Default number of points required to trigger resampling. Time series shorter than this are returned unchanged at their original timestep.

Functions

func LargestTriangleThreeBucket

func LargestTriangleThreeBucket(data []schema.Float, oldFrequency int64, newFrequency int64) ([]schema.Float, int64, error)

LargestTriangleThreeBucket (LTTB) performs perceptually-aware downsampling.

LTTB is a downsampling algorithm that preserves the visual characteristics of time-series data by selecting points that form the largest triangles with their neighbors. This ensures that important peaks, valleys, and trends are retained even when significantly reducing the number of points.

Algorithm Overview:

  1. The data is divided into buckets (the first and last points are always kept unchanged)
  2. For each bucket, the algorithm selects the point that forms the largest triangle with the previous selected point and the average of the next bucket
  3. This maximizes the visual area and preserves important features

Time Complexity: O(n), where n is the number of input points.
Space Complexity: O(m), where m is the number of output points.

Parameters:

  • data: input time-series data points
  • oldFrequency: original sampling interval (the timestep between consecutive points)
  • newFrequency: target sampling interval (must be an integer multiple of oldFrequency; values larger than oldFrequency trigger downsampling)

Returns:

  • Downsampled data slice
  • Actual frequency used (may be oldFrequency if downsampling wasn't performed)
  • Error if newFrequency is not a multiple of oldFrequency

The function returns the original data unchanged if:

  • Either frequency is 0
  • newFrequency <= oldFrequency (no downsampling needed)
  • The resulting data would have fewer than 1 point
  • The original data has fewer than MinimumRequiredPoints points (1000 by default; adjustable via SetMinimumRequiredPoints)
  • The downsampled data would have the same or more points than the original

NaN Handling: The algorithm properly handles NaN values and preserves them in the output when they represent the maximum area point in a bucket.

Example:

data := []schema.Float{1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0}
downsampled, freq, err := LargestTriangleThreeBucket(data, 1, 2)
// Returns downsampled data preserving visual characteristics
// (note: a series this short is returned unchanged unless the
// MinimumRequiredPoints threshold is lowered first)

func SetMinimumRequiredPoints added in v1.0.0

func SetMinimumRequiredPoints(setVal int)

func SimpleResampler

func SimpleResampler(data []schema.Float, oldFrequency int64, newFrequency int64) ([]schema.Float, int64, error)

SimpleResampler performs simple downsampling by selecting every nth point.

This is the fastest downsampling method but may miss important features in the data. It works by calculating a step size (newFrequency / oldFrequency) and selecting every step-th point from the original data.

Parameters:

  • data: input time-series data points
  • oldFrequency: original sampling interval (the timestep between consecutive points)
  • newFrequency: target sampling interval (must be an integer multiple of oldFrequency; values larger than oldFrequency trigger downsampling)

Returns:

  • Downsampled data slice
  • Actual frequency used (may be oldFrequency if downsampling wasn't performed)
  • Error if newFrequency is not a multiple of oldFrequency

The function returns the original data unchanged if:

  • Either frequency is 0
  • newFrequency <= oldFrequency (no downsampling needed)
  • The resulting data would have fewer than 1 point
  • The original data has fewer than MinimumRequiredPoints points (1000 by default; adjustable via SetMinimumRequiredPoints)
  • The downsampled data would have the same or more points than the original

Example:

data := []schema.Float{1.0, 2.0, 3.0, 4.0, 5.0, 6.0}
downsampled, freq, err := SimpleResampler(data, 1, 2)
// With the MinimumRequiredPoints threshold lowered beforehand,
// returns: [1.0, 3.0, 5.0], 2, nil

Types

This section is empty.
