sparse

package
v0.36.0
Published: Mar 19, 2026 License: MIT Imports: 19 Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func EnsureModelDownloaded

func EnsureModelDownloaded(ctx context.Context) (string, error)

EnsureModelDownloaded downloads the model files into the local cache if they are missing. Only the tokenizer files are needed for sparse vector generation.

func GenerateSparseVector

func GenerateSparseVector(ctx context.Context, text string) (*schema.SparseVector, error)

GenerateSparseVector builds a normalized sparse vector from text using the registered provider. If no provider is registered, it uses the default BoWProvider for backward compatibility.

func RegisterProvider

func RegisterProvider(p Provider)

RegisterProvider registers a sparse vector provider, replacing the default.

Types

type BoWProvider

type BoWProvider struct {
	// contains filtered or unexported fields
}

BoWProvider implements the Provider interface using a Bag-of-Words approach with a pretrained tokenizer.

func NewBoWProvider

func NewBoWProvider() *BoWProvider

NewBoWProvider creates a new Bag-of-Words sparse provider.

func (*BoWProvider) GenerateSparseVector

func (p *BoWProvider) GenerateSparseVector(ctx context.Context, text string) (*schema.SparseVector, error)

GenerateSparseVector builds a normalized BoW sparse vector from text. Special tokens (PAD, CLS, SEP) are filtered out to reduce noise.

type Provider

type Provider interface {
	GenerateSparseVector(ctx context.Context, text string) (*schema.SparseVector, error)
}

Provider defines the interface for generating sparse vectors.
