histogram

Version: v3.0.2
Published: Jun 6, 2022 License: Apache-2.0 Imports: 2 Imported by: 1

README

Histogram

A fast Go implementation of Ben-Haim and Yom-Tov's streaming histogram algorithm, as described in their paper "A Streaming Parallel Decision Tree Algorithm" (2010).
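To make the idea concrete, the update step of a streaming histogram can be sketched as follows. This is a simplified illustration written for this page, not the package's implementation: the sketch and bin types and the add and shrink methods are hypothetical names, and maxBins is assumed to be at least 1. Each new value becomes a unit-weight bin, and when the bin budget is exceeded the two closest neighbouring bins are replaced by their weighted mean.

```go
package main

import (
	"fmt"
	"sort"
)

// bin is a centroid: a representative value and the weight it has absorbed.
type bin struct {
	value, weight float64
}

// sketch is a hypothetical, simplified streaming histogram. It is not the
// package's Histogram type. maxBins is assumed to be at least 1.
type sketch struct {
	maxBins int
	bins    []bin // kept sorted by value
}

// add inserts v with weight 1, then merges the two closest bins if the
// sketch has grown beyond maxBins.
func (s *sketch) add(v float64) {
	// Find the first bin whose value is >= v.
	i := sort.Search(len(s.bins), func(j int) bool { return s.bins[j].value >= v })
	if i < len(s.bins) && s.bins[i].value == v {
		s.bins[i].weight++
		return
	}
	// Insert a new unit-weight bin at position i.
	s.bins = append(s.bins, bin{})
	copy(s.bins[i+1:], s.bins[i:])
	s.bins[i] = bin{value: v, weight: 1}
	if len(s.bins) > s.maxBins {
		s.shrink()
	}
}

// shrink merges the adjacent pair of bins with the smallest gap into a
// single bin located at their weighted mean.
func (s *sketch) shrink() {
	best := 0
	for i := 1; i < len(s.bins)-1; i++ {
		if s.bins[i+1].value-s.bins[i].value < s.bins[best+1].value-s.bins[best].value {
			best = i
		}
	}
	a, b := s.bins[best], s.bins[best+1]
	w := a.weight + b.weight
	s.bins[best] = bin{value: (a.value*a.weight + b.value*b.weight) / w, weight: w}
	s.bins = append(s.bins[:best+1], s.bins[best+2:]...)
}

func main() {
	s := &sketch{maxBins: 3}
	for _, v := range []float64{1, 10, 20, 11} {
		s.add(v)
	}
	fmt.Println(s.bins) // [{1 1} {10.5 2} {20 1}]
}
```

With a budget of three bins, the fourth value forces a merge of 10 and 11 into a single bin at 10.5 with weight 2. Memory stays bounded regardless of how many values are streamed in.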

Documentation

Please see the API documentation for package and API descriptions and examples.

Documentation

Types

type Histogram

type Histogram struct {
	// contains filtered or unexported fields
}

Histogram is a probabilistic, fixed-size data structure, able to accommodate massive data streams while predicting distributions and quantiles much more accurately than a sample-based approach.

Please note that a histogram is not thread-safe. All operations must be protected by a mutex if used across multiple goroutines.

Example
// Create a new instance
h := histogram.New(16)

// Add values
for i := 1; i < 100; i++ {
	h.Add(float64(i))
}

fmt.Printf("min  : %.1f\n", h.Min())
fmt.Printf("max  : %.1f\n", h.Max())
fmt.Printf("sum  : %.0f\n", h.Sum())
fmt.Printf("mean : %.1f\n", h.Mean())
fmt.Printf("p50  : %.1f\n", h.Quantile(0.5))
fmt.Printf("p95  : %.1f\n", h.Quantile(0.95))
Output:
min  : 1.0
max  : 99.0
sum  : 4950
mean : 50.0
p50  : 49.8
p95  : 94.6
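
Because a histogram is not thread-safe, shared use needs a lock around every call. The following is one possible way to do that, sketched without importing the package so that it runs on its own: adder, counter, guarded and run are hypothetical names, and counter merely stands in for a histogram. In real code the guarded value would hold a *Histogram instead.

```go
package main

import (
	"fmt"
	"sync"
)

// adder is the part of the API this sketch needs. A *Histogram would satisfy
// it through its Add method.
type adder interface {
	Add(v float64)
}

// counter is a hypothetical stand-in for a histogram so that the sketch runs
// without the package. Like a histogram, it is not safe for concurrent use.
type counter struct{ n int }

func (c *counter) Add(float64) { c.n++ }

// guarded is a hypothetical wrapper that serialises every call with a mutex.
type guarded struct {
	mu sync.Mutex
	h  adder
}

func (g *guarded) Add(v float64) {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.h.Add(v)
}

// run adds perWorker values from each of workers goroutines and returns the
// number of Add calls the underlying counter saw.
func run(workers, perWorker int) int {
	c := &counter{}
	g := &guarded{h: c}

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < perWorker; i++ {
				g.Add(float64(i))
			}
		}()
	}
	wg.Wait()
	return c.n
}

func main() {
	fmt.Println(run(8, 1000)) // 8000
}
```

Read-only methods such as Quantile would need the same lock, since they may observe a histogram that another goroutine is in the middle of updating.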

func New

func New(sz int) *Histogram

New creates a new histogram with a maximum size of sz bins.

func (*Histogram) Add

func (h *Histogram) Add(v float64)

Add is the same as AddWeight(v, 1).

func (*Histogram) AddN

func (h *Histogram) AddN(v float64, n int)

AddN is the same as AddWeight(v, float64(n)).

func (*Histogram) AddWeight

func (h *Histogram) AddWeight(v, w float64)

AddWeight adds an observation of v with weight w to the distribution.
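
The relationship between Add, AddN and AddWeight can be illustrated with a small stand-in type. The acc type below is hypothetical and only tracks a weighted sum and a total weight. It mirrors the documented delegation and is not how the package stores its data.

```go
package main

import "fmt"

// acc is a hypothetical weighted accumulator, not the package's Histogram.
type acc struct {
	sum, weight float64
}

// AddWeight records v with weight w.
func (a *acc) AddWeight(v, w float64) {
	a.sum += v * w
	a.weight += w
}

// Add records v once, i.e. with weight 1.
func (a *acc) Add(v float64) { a.AddWeight(v, 1) }

// AddN records v n times, i.e. with weight n.
func (a *acc) AddN(v float64, n int) { a.AddWeight(v, float64(n)) }

func main() {
	var x, y acc

	// One weighted call ...
	x.AddN(2.5, 3)
	// ... matches three unit-weight calls.
	for i := 0; i < 3; i++ {
		y.Add(2.5)
	}

	fmt.Println(x == y) // true
}
```

Weights do not have to be whole numbers, which is why AddWeight takes a float64. A call such as AddWeight(10, 0.5) would count the observation at half strength.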

func (*Histogram) Bin

func (h *Histogram) Bin(i int) (value, weight float64)

Bin returns the value and weight of the bin (bucket) at index i. It panics unless 0 <= i < NumBins().

func (*Histogram) Copy

func (h *Histogram) Copy(x *Histogram) *Histogram

Copy copies h to x and returns x. If x is nil, a new Histogram is allocated and returned.

func (*Histogram) Count

func (h *Histogram) Count() int

Count returns the observed weight truncated to an integer.

func (*Histogram) Max

func (h *Histogram) Max() float64

Max returns the largest observed value. Returns NaN if Count is zero.

func (*Histogram) Mean

func (h *Histogram) Mean() float64

Mean returns the (approximate) average observed value. Returns NaN if Count is zero.

func (*Histogram) Merge

func (h *Histogram) Merge(x, y *Histogram)

Merge sets h to the union x ∪ y.

func (*Histogram) MergeWith

func (h *Histogram) MergeWith(x *Histogram)

MergeWith sets h to the union h ∪ x.
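
A union of two histograms can be pictured as pooling their bins and then compressing the result back to the size limit. The sketch below assumes that approach. The bin type, the union and totalWeight functions and the maxBins parameter are hypothetical names, and the package may implement Merge differently.

```go
package main

import (
	"fmt"
	"sort"
)

// bin is a centroid: a representative value and the weight it has absorbed.
type bin struct{ value, weight float64 }

// union pools the bins of x and y, sorts them by value, and merges the
// closest adjacent pair until at most maxBins remain.
func union(x, y []bin, maxBins int) []bin {
	out := make([]bin, 0, len(x)+len(y))
	out = append(out, x...)
	out = append(out, y...)
	sort.Slice(out, func(i, j int) bool { return out[i].value < out[j].value })

	for len(out) > maxBins && len(out) > 1 {
		// Locate the adjacent pair with the smallest gap.
		best := 0
		for i := 1; i < len(out)-1; i++ {
			if out[i+1].value-out[i].value < out[best+1].value-out[best].value {
				best = i
			}
		}
		// Replace the pair with one bin at their weighted mean.
		a, b := out[best], out[best+1]
		w := a.weight + b.weight
		out[best] = bin{value: (a.value*a.weight + b.value*b.weight) / w, weight: w}
		out = append(out[:best+1], out[best+2:]...)
	}
	return out
}

// totalWeight sums the weights of all bins.
func totalWeight(bins []bin) float64 {
	var w float64
	for _, b := range bins {
		w += b.weight
	}
	return w
}

func main() {
	x := []bin{{1, 2}, {5, 1}}
	y := []bin{{1.5, 2}, {9, 3}}
	u := union(x, y, 3)
	fmt.Println(u, totalWeight(u)) // [{1.25 4} {5 1} {9 3}] 8
}
```

The total weight is preserved by the merge, so counts and means computed from the union agree with those of the combined inputs. Only the positions of individual bins become approximate.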

func (*Histogram) Min

func (h *Histogram) Min() float64

Min returns the smallest observed value. Returns NaN if Count is zero.

func (*Histogram) NumBins

func (h *Histogram) NumBins() int

NumBins returns the number of bins (buckets).

func (*Histogram) Quantile

func (h *Histogram) Quantile(q float64) float64

Quantile returns the (approximate) q-quantile of the distribution. Accepted values for q are between 0.0 and 1.0. Returns NaN if Count is zero or q is out of range.
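
One simple way to estimate a quantile from weighted bins is to interpolate linearly between neighbouring bin centres by cumulative weight. The sketch below does exactly that and nothing more. It is a deliberately simplified stand-in with hypothetical names (bin, quantile), not the estimator used by the package or the procedure from the paper, so its results will generally differ from those of Quantile.

```go
package main

import (
	"fmt"
	"math"
)

// bin is a centroid: a representative value and the weight it has absorbed.
type bin struct{ value, weight float64 }

// quantile estimates the q-quantile from bins sorted by value. It treats
// each bin's weight as centred on its value and interpolates linearly
// between neighbouring centres. It returns NaN for empty input or when q
// is outside [0, 1].
func quantile(bins []bin, q float64) float64 {
	if len(bins) == 0 || math.IsNaN(q) || q < 0 || q > 1 {
		return math.NaN()
	}

	var total float64
	for _, b := range bins {
		total += b.weight
	}
	target := q * total

	// cum is the cumulative weight at the centre of the previous bin.
	cum := bins[0].weight / 2
	if target <= cum {
		return bins[0].value
	}
	for i := 1; i < len(bins); i++ {
		next := cum + (bins[i-1].weight+bins[i].weight)/2
		if target <= next {
			frac := (target - cum) / (next - cum)
			return bins[i-1].value + frac*(bins[i].value-bins[i-1].value)
		}
		cum = next
	}
	return bins[len(bins)-1].value
}

func main() {
	bins := []bin{{0, 1}, {10, 1}}
	fmt.Println(quantile(bins, 0.5)) // 5
	fmt.Println(quantile(bins, 1.5)) // NaN
}
```

Targets that fall before the first centre or after the last one are clamped to the outermost bin values in this sketch.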

func (*Histogram) Reset

func (h *Histogram) Reset(sz int)

Reset resets the struct to its initial state with a specific size.

func (*Histogram) Sum

func (h *Histogram) Sum() float64

Sum returns the (approximate) sum of all observed values. Returns NaN if Count is zero.

func (*Histogram) Variance

func (h *Histogram) Variance() float64

Variance returns the (approximate) sample variance of the distribution. Returns NaN if Count is zero.
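
As an illustration of what an approximate sample variance over bins can look like, the sketch below treats every bin as weight-many observations located exactly at the bin's value and divides by the total weight minus one. The bin type and variance function are hypothetical, and returning NaN whenever the total weight is at most one is a choice made for this sketch to avoid dividing by zero. The package's own formula may differ.

```go
package main

import (
	"fmt"
	"math"
)

// bin is a centroid: a representative value and the weight it has absorbed.
type bin struct{ value, weight float64 }

// variance returns the weighted sample variance of the bins, treating every
// bin as weight-many observations located exactly at its value. It returns
// NaN when the total weight is not greater than one.
func variance(bins []bin) float64 {
	var total, sum float64
	for _, b := range bins {
		total += b.weight
		sum += b.value * b.weight
	}
	if total <= 1 {
		return math.NaN()
	}

	mean := sum / total
	var ss float64 // weighted sum of squared deviations from the mean
	for _, b := range bins {
		d := b.value - mean
		ss += b.weight * d * d
	}
	return ss / (total - 1)
}

func main() {
	bins := []bin{{1, 1}, {2, 1}, {3, 1}}
	fmt.Println(variance(bins)) // 1
}
```

Because merged bins collapse nearby values onto one point, a bin-based variance of this kind tends to slightly underestimate the spread inside each bin.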

func (*Histogram) Weight

func (h *Histogram) Weight() float64

Weight returns the observed weight (usually, the number of items seen).
