cluster

package
v0.1.0
Published: Mar 16, 2026 License: MIT Imports: 4 Imported by: 0

Documentation

Overview

Package cluster provides clustering algorithms for vector indexing.

Constants

This section is empty.

Variables

This section is empty.

Functions

func CosineSimilarity

func CosineSimilarity(a, b []float64) float64

CosineSimilarity computes the cosine similarity between the vectors a and b.

func NormalizeVector

func NormalizeVector(v []float64) []float64

NormalizeVector returns a unit-length copy of the vector.

Types

type Config

type Config struct {
	K         int     // Number of clusters
	MaxIter   int     // Maximum iterations (default: 100)
	Tolerance float64 // Convergence tolerance (default: 1e-4)
	Seed      int64   // Random seed
	NumInit   int     // Number of initializations to try (default: 1). Best inertia wins.
}

Config holds k-means configuration.

func DefaultConfig

func DefaultConfig(k int) Config

DefaultConfig returns the default k-means configuration for k clusters.
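
The field comments on Config list the defaults (MaxIter 100, Tolerance 1e-4, NumInit 1). The sketch below mirrors them in a local, hypothetical config type to show the usual pattern of starting from the defaults and overriding individual fields. The default Seed is not documented; leaving it at zero here is an assumption.

```go
package main

import "fmt"

// config is a local mirror of the documented cluster.Config fields.
type config struct {
	K         int
	MaxIter   int
	Tolerance float64
	Seed      int64
	NumInit   int
}

// defaultConfig is a hypothetical stand-in for cluster.DefaultConfig.
// The values come from the field comments; Seed stays zero by assumption.
func defaultConfig(k int) config {
	return config{K: k, MaxIter: 100, Tolerance: 1e-4, NumInit: 1}
}

func main() {
	cfg := defaultConfig(8)
	cfg.Seed = 42   // fix the seed for reproducible runs
	cfg.NumInit = 5 // try five initializations and keep the best one
	fmt.Printf("%+v\n", cfg)
}
```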

type KMeans

type KMeans struct {
	K          int         // Number of clusters
	MaxIter    int         // Maximum iterations
	Tolerance  float64     // Convergence tolerance
	Seed       int64       // Random seed for reproducibility
	NumInit    int         // Number of initializations (best inertia wins)
	Centroids  [][]float64 // Cluster centroids (after Fit)
	Labels     []int       // Cluster assignment for each vector (after Fit)
	Iterations int         // Actual iterations run
	Inertia    float64     // Final inertia (sum of squared distances to centroids)
}

KMeans performs k-means clustering on vectors.

func NewKMeans

func NewKMeans(cfg Config) *KMeans

NewKMeans creates a new k-means clusterer.

func (*KMeans) Fit

func (km *KMeans) Fit(vectors [][]float64) error

Fit runs k-means clustering on the given vectors. If NumInit > 1, it runs multiple initializations in parallel and keeps the result with the lowest inertia.

func (*KMeans) GetClusterSizes

func (km *KMeans) GetClusterSizes() []int

GetClusterSizes returns the number of vectors in each cluster.
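
Cluster sizes can be derived from the Labels slice by counting how often each cluster index occurs. The following standalone sketch shows that counting step; clusterSizes is a hypothetical helper, and ignoring out-of-range labels is an assumption.

```go
package main

import "fmt"

// clusterSizes counts how many labels fall into each of the k clusters.
// Labels outside [0, k) are ignored (an assumption of this sketch).
func clusterSizes(labels []int, k int) []int {
	sizes := make([]int, k)
	for _, l := range labels {
		if l >= 0 && l < k {
			sizes[l]++
		}
	}
	return sizes
}

func main() {
	labels := []int{0, 1, 1, 2, 1}
	fmt.Println(clusterSizes(labels, 3)) // prints [1 3 1]
}
```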

func (*KMeans) Predict

func (km *KMeans) Predict(query []float64) int

Predict returns the index of the centroid nearest to the query vector.
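
A nearest-centroid lookup is a linear scan over the centroids. The sketch below assumes squared Euclidean distance; the documentation does not state which metric Predict uses, so cosine distance would be an equally plausible reading. The name predict and the -1 result for an empty centroid list are hypothetical.

```go
package main

import "fmt"

// predict returns the index of the centroid closest to query,
// or -1 when there are no centroids (an assumption of this sketch).
func predict(centroids [][]float64, query []float64) int {
	best, bestDist := -1, 0.0
	for c, centroid := range centroids {
		var dist float64
		for i := range query {
			d := query[i] - centroid[i]
			dist += d * d // squared Euclidean distance; the metric is assumed
		}
		if best == -1 || dist < bestDist {
			best, bestDist = c, dist
		}
	}
	return best
}

func main() {
	centroids := [][]float64{{0, 0}, {5, 5}, {1, 1}}
	fmt.Println(predict(centroids, []float64{4, 6})) // prints 1
}
```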

func (*KMeans) PredictTopK

func (km *KMeans) PredictTopK(query []float64, k int) []int

PredictTopK returns the indices of the k centroids nearest to the query vector.
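
Returning several nearest centroids is useful when a search should probe more than one cluster. One way to sketch it is to compute the distance to every centroid, sort the centroid indices by that distance, and keep the first k. As before, the squared Euclidean metric, the name predictTopK, and the clamping of k to the number of centroids are assumptions rather than documented behavior.

```go
package main

import (
	"fmt"
	"sort"
)

// predictTopK returns the indices of the k centroids closest to query,
// nearest first. k is clamped to [0, len(centroids)] (an assumption).
func predictTopK(centroids [][]float64, query []float64, k int) []int {
	dists := make([]float64, len(centroids))
	order := make([]int, len(centroids))
	for c, centroid := range centroids {
		order[c] = c
		for i := range query {
			d := query[i] - centroid[i]
			dists[c] += d * d // squared Euclidean distance; the metric is assumed
		}
	}
	// A stable sort keeps the lower index first when distances tie.
	sort.SliceStable(order, func(i, j int) bool {
		return dists[order[i]] < dists[order[j]]
	})
	if k < 0 {
		k = 0
	}
	if k > len(order) {
		k = len(order)
	}
	return order[:k]
}

func main() {
	centroids := [][]float64{{0, 0}, {5, 5}, {1, 1}}
	fmt.Println(predictTopK(centroids, []float64{0.9, 0.9}, 2)) // prints [2 0]
}
```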
