stats

package
v0.1.3 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Apr 14, 2025 License: MIT Imports: 5 Imported by: 0

README

stats

stats provides a Starlark module for comprehensive statistics functions. It is a wrapper around the Go package: https://github.com/montanaflynn/stats

Functions

euclidean_distance(data1, data2) float

Calculates the straight line distance between two points in Euclidean space.

Parameters
name type description
data1 list First dataset of numerical values.
data2 list Second dataset of numerical values.
Examples

basic

Calculate Euclidean distance between two points.

load("stats", "euclidean_distance")
print(euclidean_distance([3, 4], [0, 0]))  # Output: 5.0
manhattan_distance(data1, data2) float

Computes the sum of the absolute differences of their coordinates.

Parameters
name type description
data1 list First dataset of numerical values.
data2 list Second dataset of numerical values.
Examples

basic

Calculate Manhattan distance between two points.

load("stats", "manhattan_distance")
print(manhattan_distance([3, 4], [0, 0]))  # Output: 7.0
softmax(data) list

Applies the Softmax function, useful for converting scores to probabilities.

Parameters
name type description
data list Dataset of numerical values.
Examples

basic

Apply Softmax function to a dataset.

load("stats", "softmax")
print(softmax([1, 1, 1]))  # Output: [0.3333333333333333, 0.3333333333333333, 0.3333333333333333]
sigmoid(data) list

Applies the Sigmoid function, often used for binary classification.

Parameters
name type description
data list Dataset of numerical values.
Examples

basic

Apply Sigmoid function to a dataset.

load("stats", "sigmoid")
print(sigmoid([0, 2, 4]))  # Output: [0.5, 0.8807970779778823, 0.9820137900379085]
mode(data) list

Determines the most frequently occurring data points in a dataset.

Parameters
name type description
data list Dataset of numerical values.
Examples

basic

Calculate mode of a dataset.

load("stats", "mode")
print(mode([1, 2, 2, 3, 4]))  # Output: [2.0]
sum(data) float

Computes the sum of a series of numbers.

Parameters
name type description
data list Dataset of numerical values.
Examples

basic

Calculate sum of a dataset.

load("stats", "sum")
print(sum([1, 2, 3, 4]))  # Output: 10.0
max(data) float

Finds the maximum value in a dataset.

Parameters
name type description
data list Dataset of numerical values.
Examples

basic

Find maximum value in a dataset.

load("stats", "max")
print(max([1, 2, 3, 4]))  # Output: 4.0
min(data) float

Finds the minimum value in a dataset.

Parameters
name type description
data list Dataset of numerical values.
Examples

basic

Find minimum value in a dataset.

load("stats", "min")
print(min([1, 2, 3, 4]))  # Output: 1.0
midrange(data) float

Calculates the midrange, the average of the maximum and minimum values.

Parameters
name type description
data list Dataset of numerical values.
Examples

basic

Calculate midrange of a dataset.

load("stats", "midrange")
print(midrange([1, 2, 3, 4]))  # Output: 2.5
average(data) float

Alias for mean. Calculates the arithmetic mean of a dataset.

Parameters
name type description
data list Dataset of numerical values.
Examples

basic

Calculate average of a dataset.

load("stats", "average")
print(average([1, 2, 2, 3]))  # Output: 2.0
mean(data) float

Calculates the arithmetic mean of a dataset.

Parameters
name type description
data list Dataset of numerical values.
Examples

basic

Calculate mean of a dataset.

load("stats", "mean")
print(mean([1, 2, 3, 4]))  # Output: 2.5
geometric_mean(data) float

Computes the geometric mean, useful for datasets with exponential growth.

Parameters
name type description
data list Dataset of numerical values.
Examples

basic

Calculate geometric mean of a dataset.

load("stats", "geometric_mean")
print(geometric_mean([1, 2, 3, 4]))  # Output: 2.213363839400643
harmonic_mean(data) float

Computes the harmonic mean, effective for rates and ratios.

Parameters
name type description
data list Dataset of numerical values.
Examples

basic

Calculate harmonic mean of a dataset.

load("stats", "harmonic_mean")
print(harmonic_mean([1, 2, 3, 6]))  # Output: 2.0
trimean(data) float

Calculates the trimean, a measure of a dataset's tendency.

Parameters
name type description
data list Dataset of numerical values.
Examples

basic

Calculate trimean of a dataset.

load("stats", "trimean")
print(trimean([1, 2, 3, 4, 5]))  # Output: 3.0
median(data) float

Finds the middle value of a dataset.

Parameters
name type description
data list Dataset of numerical values.
Examples

basic

Calculate median of a dataset.

load("stats", "median")
print(median([1, 2, 3, 4, 5]))  # Output: 3.0
percentile(data, p) float

Determines the value below which a given percentage of observations in a group of observations falls.

Parameters
name type description
data list Dataset of numerical values.
p float The percentile to compute.
Examples

basic

Calculate 50th percentile of a dataset.

load("stats", "percentile")
print(percentile([1, 2, 3, 4, 5], 50))  # Output: 2.5
percentile_nearest_rank(data, p) float

Determines the value below which a given percentage of observations in a group of observations falls using the nearest rank method.

Parameters
name type description
data list Dataset of numerical values.
p float The percentile to compute.
Examples

basic

Calculate 50th percentile using nearest rank method.

load("stats", "percentile_nearest_rank")
print(percentile_nearest_rank([1, 2, 3, 4, 5], 50))  # Output: 3.0
variance(data) float

Calculates the variance of a dataset, a measure of dispersion.

Parameters
name type description
data list Dataset of numerical values.
Examples

basic

Calculate variance of a dataset.

load("stats", "variance")
print(variance([1, 2, 3, 4, 5]))  # Output: 2.0
covariance(data1, data2) float

Measures how changes in one variable are associated with changes in another variable.

Parameters
name type description
data1 list First dataset of numerical values.
data2 list Second dataset of numerical values.
Examples

basic

Calculate covariance between two datasets.

load("stats", "covariance")
print(covariance([1, 2, 3], [4, 5, 6]))  # Output: 1.0
covariance_population(data1, data2) float

Calculates the covariance for an entire population.

Parameters
name type description
data1 list First dataset of numerical values.
data2 list Second dataset of numerical values.
Examples

basic

Calculate population covariance between two datasets.

load("stats", "covariance_population")
print(covariance_population([1, 2, 3], [4, 5, 6]))  # Output: 1.0
population_variance(data) float

Computes the variance of an entire population.

Parameters
name type description
data list Dataset of numerical values.
Examples

basic

Calculate population variance of a dataset.

load("stats", "population_variance")
print(population_variance([1, 2, 3, 4, 5]))  # Output: 2.0
sample_variance(data) float

Computes the variance of a sample from the population.

Parameters
name type description
data list Dataset of numerical values.
Examples

basic

Calculate sample variance of a dataset.

load("stats", "sample_variance")
print(sample_variance([1, 2, 3, 4, 5]))  # Output: 2.5
correlation(data1, data2) float

Computes the correlation coefficient between two datasets.

Parameters
name type description
data1 list First dataset of numerical values.
data2 list Second dataset of numerical values.
Examples

basic

Calculate correlation between two datasets.

load("stats", "correlation")
print(correlation([1, 2, 3], [1, 2, 3]))  # Output: 1.0
pearson(data1, data2) float

Computes Pearson's correlation coefficient.

Parameters
name type description
data1 list First dataset of numerical values.
data2 list Second dataset of numerical values.
Examples

basic

Calculate Pearson's correlation coefficient.

load("stats", "pearson")
print(pearson([1, 2, 3], [1, 2, 3]))  # Output: 1.0
standard_deviation(data) float

Calculates the standard deviation of a dataset.

Parameters
name type description
data list Dataset of numerical values.
Examples

basic

Calculate standard deviation of a dataset.

load("stats", "standard_deviation")
print(standard_deviation([1, 2, 3, 4, 5]))  # Output: 1.4142135623730951
stddev(data) float

Alias for standard_deviation. Calculates the standard deviation of a dataset.

Parameters
name type description
data list Dataset of numerical values.
Examples

basic

Calculate stddev of a dataset.

load("stats", "stddev")
print(stddev([1, 2, 3, 4, 5]))  # Output: 1.4142135623730951
stddev_sample(data) float

Calculates the standard deviation of a sample.

Parameters
name type description
data list Dataset of numerical values.
Examples

basic

Calculate standard deviation of a sample.

load("stats", "stddev_sample")
print(stddev_sample([1, 2, 3, 4, 5]))  # Output: 1.5811388300841898
sample(data, take, replace=False) list

Randomly samples elements from the dataset.

Parameters
name type description
data list Dataset of numerical values.
take int Number of elements to sample.
replace bool Optional. If True, sampling is done with replacement.
Examples

basic

Sample elements from a dataset.

load("stats", "sample")
print(sample(data=[1, 2, 3, 4], take=3, replace=False))

Documentation

Overview

Package stats provides a Starlark module for comprehensive statistics functions. It's a wrapper around the Go package: https://github.com/montanaflynn/stats

Index

Constants

View Source
const ModuleName = "stats"

ModuleName defines the expected name for this Module when used in starlark's load() function, eg: load('stats', 'md5')

Variables

This section is empty.

Functions

func LoadModule

func LoadModule() (starlark.StringDict, error)

LoadModule loads the hashlib module. It is concurrency-safe and idempotent.

Types

This section is empty.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL