core

package
v1.1.8 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Nov 9, 2025 License: MIT Imports: 10 Imported by: 0

Documentation

Index

Constants

View Source
const (
	// Minimum variance/norm threshold to avoid division by zero
	MinVarianceThreshold = 1e-8
)

Variables

This section is empty.

Functions

func ApplyTransform

func ApplyTransform(data types.Matrix, columns []int, transform TransformType) (types.Matrix, error)

ApplyTransform applies a mathematical transformation to specified columns

func CalculateConfidenceEllipse

func CalculateConfidenceEllipse(x, y []float64, confidenceLevel float64) (centerX, centerY, majorAxis, minorAxis, angle float64, err error)

CalculateConfidenceEllipse computes the parameters for a confidence ellipse for a 2D set of points at the specified confidence level. Returns center coordinates, semi-major/minor axes, and rotation angle.

Reference: Johnson & Wichern (2007) Applied Multivariate Statistical Analysis

func CalculateGroupEllipses

func CalculateGroupEllipses(scores mat.Matrix, groups []string, pcX, pcY int, confidenceLevel float64) (map[string]EllipseParams, error)

CalculateGroupEllipses computes confidence ellipse parameters for each group in the data. scores is a 2D matrix where each row is an observation, columns are PC scores. groups is a slice indicating the group membership of each observation. pcX and pcY are the indices of the principal components to use (0-based).

func CalculateMaxComponents added in v0.9.10

func CalculateMaxComponents(rows, cols int) int

CalculateMaxComponents calculates the maximum number of components for a data matrix

func CalculateMetricsFromPCAResult

func CalculateMetricsFromPCAResult(result *types.PCAResult, preprocessedData types.Matrix) ([]types.SampleMetrics, error)

CalculateMetricsFromPCAResult is a convenience function that calculates metrics directly from PCAResult

func CheckForConstantColumns added in v0.9.10

func CheckForConstantColumns(data types.Matrix) ([]int, error)

CheckForConstantColumns checks if any columns have zero or near-zero variance

func ComputeAutoCorrelation added in v1.1.0

func ComputeAutoCorrelation(data [][]float64, maxLag int) ([][]float64, error)

ComputeAutoCorrelation computes the autocorrelation function for lag selection guidance Returns autocorrelation values for each variable up to maxLag

func CopyMatrixData added in v0.9.10

func CopyMatrixData(source *mat.Dense) []float64

CopyMatrixData creates a flat array copy of matrix data for gonum operations

func CreateFloat2DSlice added in v0.9.10

func CreateFloat2DSlice(rows, cols int) [][]float64

CreateFloat2DSlice creates a new 2D float64 slice with specified dimensions

func CreateFloatSlice added in v0.9.10

func CreateFloatSlice(size int) []float64

CreateFloatSlice creates a new float64 slice of specified size

func CreateWorkingCopy added in v0.9.10

func CreateWorkingCopy(original *mat.Dense) *mat.Dense

CreateWorkingCopy creates a working copy of a matrix for deflation operations

func EstimateTemporalPCAMemory added in v1.1.0

func EstimateTemporalPCAMemory(samples, variables, lags int) (bytes int64, warning string)

EstimateTemporalPCAMemory estimates memory usage for temporal PCA Returns estimated bytes and a warning message if memory usage is high

func ExtractColumn added in v0.9.10

func ExtractColumn(m *mat.Dense, col int) []float64

ExtractColumn extracts a column from a matrix as a slice

func ExtractRow added in v0.9.10

func ExtractRow(m *mat.Dense, row int) []float64

ExtractRow extracts a row from a matrix as a slice

func GetColumnRanks

func GetColumnRanks(data types.Matrix) ([]int, error)

GetColumnRanks returns column indices sorted by variance (descending)

func GetVarianceByColumn

func GetVarianceByColumn(data types.Matrix) ([]float64, error)

GetVarianceByColumn calculates variance for each column

func InitializeMatrix added in v0.9.10

func InitializeMatrix(rows, cols int) *mat.Dense

InitializeMatrix creates a new matrix with specified dimensions

func InitializeScoresAndLoadings added in v0.9.10

func InitializeScoresAndLoadings(nSamples, nFeatures, nComponents int) (*mat.Dense, *mat.Dense)

InitializeScoresAndLoadings creates new score and loading matrices for PCA algorithms

func InitializeSquareMatrix added in v0.9.10

func InitializeSquareMatrix(size int) *mat.Dense

InitializeSquareMatrix creates a new square matrix

func InitializeVector added in v0.9.10

func InitializeVector(size int) *mat.VecDense

InitializeVector creates a new vector of specified size

func NewKernelPCAEngine

func NewKernelPCAEngine() types.PCAEngine

NewKernelPCAEngine creates a new Kernel PCA engine

func NewPCAEngine

func NewPCAEngine() types.PCAEngine

NewPCAEngine creates a new PCA engine instance

func NewPCAEngineForMethod

func NewPCAEngineForMethod(method string) types.PCAEngine

NewPCAEngineForMethod creates a PCA engine for the specified method

func NewTemporalPCAEngine added in v1.1.0

func NewTemporalPCAEngine() types.PCAEngine

NewTemporalPCAEngine creates a new Temporal PCA engine instance

func RemoveOutliers

func RemoveOutliers(data types.Matrix, threshold float64) (types.Matrix, []int, error)

RemoveOutliers removes outliers based on z-score

func ValidateComponentCount added in v0.9.10

func ValidateComponentCount(components, maxComponents int) error

ValidateComponentCount validates the number of components requested

func ValidateDataForPCA

func ValidateDataForPCA(data types.Matrix, selectedCols []int) error

ValidateDataForPCA checks if data is suitable for PCA after handling missing values

func ValidateDataMatrix added in v0.9.10

func ValidateDataMatrix(data types.Matrix) error

ValidateDataMatrix validates the basic structure and content of a data matrix

func ValidateKernelConfig added in v0.9.10

func ValidateKernelConfig(config types.PCAConfig) error

ValidateKernelConfig validates kernel-specific configuration

func ValidateNaNValues added in v0.9.10

func ValidateNaNValues(data types.Matrix, allowNaN bool) error

ValidateNaNValues checks for NaN values in the data matrix

func ValidatePCAInput added in v0.9.10

func ValidatePCAInput(data types.Matrix, config types.PCAConfig) error

ValidatePCAInput performs complete validation for PCA input

func ValidateVectorPair added in v0.9.10

func ValidateVectorPair(x, y []float64) error

ValidateVectorPair validates that two vectors have the same length

Types

type CorrelationRequest

type CorrelationRequest struct {
	Scores              mat.Matrix           // PC scores matrix (samples × components)
	MetadataNumeric     map[string][]float64 // Numeric metadata columns
	MetadataCategorical map[string][]string  // Categorical metadata columns
	Components          []int                // Which PCs to include (0-based)
	Method              string               // "pearson" or "spearman"
}

CorrelationRequest defines the input for correlation calculations

type CorrelationResult

type CorrelationResult struct {
	Correlations map[string][]float64 // Variable name -> correlations with each PC
	PValues      map[string][]float64 // Variable name -> p-values
	Variables    []string             // Order of variables
	Components   []string             // PC labels
}

CorrelationResult contains the correlation analysis results

func CalculateEigencorrelations

func CalculateEigencorrelations(request CorrelationRequest) (*CorrelationResult, error)

CalculateEigencorrelations computes correlations between PC scores and metadata variables

This function calculates Pearson or Spearman correlations between principal component scores and external metadata variables (both numeric and categorical). For categorical variables, one-hot encoding is performed before correlation calculation.

Reference: Jolliffe, I.T. (2002). Principal Component Analysis, 2nd edition. Springer.

type EllipseParams

type EllipseParams struct {
	CenterX         float64
	CenterY         float64
	MajorAxis       float64
	MinorAxis       float64
	Angle           float64 // in radians
	ConfidenceLevel float64
}

EllipseParams contains parameters for drawing a confidence ellipse

type KernelPCAImpl

type KernelPCAImpl struct {
	// contains filtered or unexported fields
}

KernelPCAImpl implements the PCAEngine interface for Kernel PCA Kernel PCA performs nonlinear dimensionality reduction by projecting data into a higher-dimensional feature space using kernel functions, then performing PCA in that space. Reference: Schölkopf, B., Smola, A., & Müller, K.R. (1998). Nonlinear component analysis as a kernel eigenvalue problem. Neural Computation, 10(5), 1299-1319.

func (*KernelPCAImpl) Fit

func (kpca *KernelPCAImpl) Fit(data types.Matrix, config types.PCAConfig) (*types.PCAResult, error)

Fit trains the Kernel PCA model on the provided data

func (*KernelPCAImpl) FitTransform

func (kpca *KernelPCAImpl) FitTransform(data types.Matrix, config types.PCAConfig) (*types.PCAResult, error)

FitTransform fits the model and transforms the data in one step

func (*KernelPCAImpl) Transform

func (kpca *KernelPCAImpl) Transform(data types.Matrix) (types.Matrix, error)

Transform projects new data into the kernel PCA space

type KernelType

type KernelType string

KernelType represents the type of kernel function to use

const (
	// KernelRBF is the Radial Basis Function (Gaussian) kernel
	KernelRBF KernelType = "rbf"
	// KernelLinear is the linear kernel (equivalent to standard PCA)
	KernelLinear KernelType = "linear"
	// KernelPoly is the polynomial kernel
	KernelPoly KernelType = "poly"
)

type MissingValueHandler

type MissingValueHandler struct {
	// contains filtered or unexported fields
}

MissingValueHandler handles missing values in data matrices

func NewMissingValueHandler

func NewMissingValueHandler(strategy types.MissingValueStrategy) *MissingValueHandler

NewMissingValueHandler creates a new missing value handler

func (*MissingValueHandler) HandleMissingValues

func (h *MissingValueHandler) HandleMissingValues(data types.Matrix, missingInfo *types.MissingValueInfo, selectedCols []int) (types.Matrix, error)

HandleMissingValues processes missing values according to the specified strategy It only considers missing values in the selected columns

type PCAImpl

type PCAImpl struct {
	// contains filtered or unexported fields
}

PCAImpl implements the PCAEngine interface

func (*PCAImpl) Fit

func (p *PCAImpl) Fit(data types.Matrix, config types.PCAConfig) (*types.PCAResult, error)

Fit trains the PCA model on the provided data

func (*PCAImpl) FitTransform

func (p *PCAImpl) FitTransform(data types.Matrix, config types.PCAConfig) (*types.PCAResult, error)

FitTransform fits the model and transforms the data in one step

func (*PCAImpl) SetLoadings

func (p *PCAImpl) SetLoadings(loadings types.Matrix, nComponents int) error

SetLoadings sets the loadings matrix and marks the engine as fitted

func (*PCAImpl) SetPreprocessor

func (p *PCAImpl) SetPreprocessor(preprocessor *Preprocessor)

SetPreprocessor sets the preprocessor for the PCA engine

func (*PCAImpl) Transform

func (p *PCAImpl) Transform(data types.Matrix) (types.Matrix, error)

Transform applies the fitted PCA model to new data

type PCAMetricsCalculator

type PCAMetricsCalculator struct {
	// contains filtered or unexported fields
}

PCAMetricsCalculator calculates advanced metrics for PCA results

func NewPCAMetricsCalculator

func NewPCAMetricsCalculator(scores, loadings *mat.Dense, mean, stdDev []float64) *PCAMetricsCalculator

NewPCAMetricsCalculator creates a new metrics calculator

func (*PCAMetricsCalculator) CalculateMetrics

func (m *PCAMetricsCalculator) CalculateMetrics(originalData types.Matrix) ([]types.SampleMetrics, error)

CalculateMetrics computes all metrics for each sample

func (*PCAMetricsCalculator) CalculateQLimits

func (m *PCAMetricsCalculator) CalculateQLimits(eigenvalues []float64, totalComponents int) (limit95, limit99 float64)

CalculateQLimits calculates the confidence limits for Q-residuals (SPE - Squared Prediction Error) Reference: Jackson, J.E., & Mudholkar, G.S. (1979). Control procedures for residuals associated with principal component analysis. Technometrics, 21(3), 341-349.

func (*PCAMetricsCalculator) CalculateT2Limits

func (m *PCAMetricsCalculator) CalculateT2Limits() (limit95, limit99 float64)

CalculateT2Limits calculates the confidence limits for Hotelling's T² statistic

type Preprocessor

type Preprocessor struct {
	// Preprocessing parameters
	MeanCenter    bool
	StandardScale bool
	RobustScale   bool
	ScaleOnly     bool
	SNV           bool
	VectorNorm    bool
	// contains filtered or unexported fields
}

Preprocessor handles data preprocessing for PCA

func NewPreprocessor

func NewPreprocessor(meanCenter, standardScale, robustScale bool) *Preprocessor

NewPreprocessor creates a new preprocessor instance

func NewPreprocessorFull

func NewPreprocessorFull(meanCenter, standardScale, robustScale, snv, vectorNorm bool) *Preprocessor

NewPreprocessorFull creates a new preprocessor instance with all options

func NewPreprocessorWithScaleOnly

func NewPreprocessorWithScaleOnly(meanCenter, standardScale, robustScale, scaleOnly, snv, vectorNorm bool) *Preprocessor

NewPreprocessorWithScaleOnly creates a new preprocessor instance with scale-only option

func (*Preprocessor) Fit

func (p *Preprocessor) Fit(data types.Matrix) error

Fit calculates preprocessing parameters from the data

func (*Preprocessor) FitTransform

func (p *Preprocessor) FitTransform(data types.Matrix) (types.Matrix, error)

FitTransform fits the preprocessor and transforms the data

func (*Preprocessor) GetMADs

func (p *Preprocessor) GetMADs() []float64

GetMADs returns the fitted MAD (Median Absolute Deviation) values

func (*Preprocessor) GetMeans

func (p *Preprocessor) GetMeans() []float64

GetMeans returns the fitted mean values

func (*Preprocessor) GetMedians

func (p *Preprocessor) GetMedians() []float64

GetMedians returns the fitted median values

func (*Preprocessor) GetRowMeans

func (p *Preprocessor) GetRowMeans() []float64

GetRowMeans returns the fitted row mean values (for SNV)

func (*Preprocessor) GetRowStdDevs

func (p *Preprocessor) GetRowStdDevs() []float64

GetRowStdDevs returns the fitted row standard deviation values (for SNV)

func (*Preprocessor) GetStdDevs

func (p *Preprocessor) GetStdDevs() []float64

GetStdDevs returns the fitted standard deviation values (original, before scaling)

func (*Preprocessor) InverseTransform

func (p *Preprocessor) InverseTransform(data types.Matrix) (types.Matrix, error)

InverseTransform reverses the preprocessing Note: When SNV is combined with column-wise preprocessing, the inverse transform only reverses the column-wise operations. Full reversal of SNV after column preprocessing would require storing the full transformed matrix.

func (*Preprocessor) IsSNVEnabled

func (p *Preprocessor) IsSNVEnabled() bool

IsSNVEnabled returns whether SNV preprocessing is enabled

func (*Preprocessor) SetFittedParameters

func (p *Preprocessor) SetFittedParameters(means, stdDevs, medians, mads, rowMeans, rowStdDevs []float64) error

SetFittedParameters sets the fitted parameters for the preprocessor

func (*Preprocessor) Transform

func (p *Preprocessor) Transform(data types.Matrix) (types.Matrix, error)

Transform applies the preprocessing to data

type TemporalPCAImpl added in v1.1.0

type TemporalPCAImpl struct {
	// contains filtered or unexported fields
}

TemporalPCAImpl implements the PCAEngine interface for Temporal PCA (SSA-style) Based on Singular Spectrum Analysis (SSA) methodology.

References: - Broomhead & King (1986): "Extracting qualitative dynamics from experimental data" - Vautard & Ghil (1989): "Singular spectrum analysis in nonlinear dynamics" - Golyandina et al. (2001): "Analysis of Time Series Structure: SSA and related techniques" - Ghil et al. (2002): "Advanced spectral methods for climatic time series"

func (*TemporalPCAImpl) ComputeVariableImportance added in v1.1.0

func (t *TemporalPCAImpl) ComputeVariableImportance() ([][]float64, error)

ComputeVariableImportance aggregates loadings across lags to show variable importance Returns a matrix where rows are components and columns are original variables Uses RMS (Root Mean Square) aggregation to capture overall contribution strength

func (*TemporalPCAImpl) Fit added in v1.1.0

func (t *TemporalPCAImpl) Fit(data types.Matrix, config types.PCAConfig) (*types.PCAResult, error)

Fit trains the Temporal PCA model on the provided data Algorithm complexity: O((T-L+1) × (p×L)²) for SVD computation where T is samples, p is variables, L is lags

func (*TemporalPCAImpl) FitTransform added in v1.1.0

func (t *TemporalPCAImpl) FitTransform(data types.Matrix, config types.PCAConfig) (*types.PCAResult, error)

FitTransform fits the model and transforms the data in one step

func (*TemporalPCAImpl) GetLagContributions added in v1.1.0

func (t *TemporalPCAImpl) GetLagContributions() ([][]float64, error)

GetLagContributions returns the contribution of each lag to the total variance Returns a matrix where rows are components and columns are lags

func (*TemporalPCAImpl) GetLoadingForLag added in v1.1.0

func (t *TemporalPCAImpl) GetLoadingForLag(variable, lag, component int) (float64, error)

GetLoadingForLag returns the loading vector for a specific variable and lag variable: 0-indexed variable number lag: 0-indexed lag number (0 = current time, 1 = t-1, etc.) component: 0-indexed component number

func (*TemporalPCAImpl) ReconstructionError added in v1.1.0

func (t *TemporalPCAImpl) ReconstructionError(data types.Matrix) ([]float64, error)

ReconstructionError computes the reconstruction error for each sample This is computed in the standardized lag space as per SSA methodology Algorithm complexity: O((T-L+1) × (p×L) × k) where k is the number of components

func (*TemporalPCAImpl) Transform added in v1.1.0

func (t *TemporalPCAImpl) Transform(data types.Matrix) (types.Matrix, error)

Transform applies the fitted Temporal PCA model to new data

type TransformType

type TransformType string

VariableTransform applies mathematical transformations to variables

const (
	TransformLog        TransformType = "log"
	TransformSqrt       TransformType = "sqrt"
	TransformSquare     TransformType = "square"
	TransformReciprocal TransformType = "reciprocal"
)

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL