stringscore

package
v0.60.5
Warning

This package is not in the latest version of its module.

Published: Feb 19, 2026 License: Apache-2.0 Imports: 9 Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func BestPairCombinationJaroWinkler added in v0.51.0

func BestPairCombinationJaroWinkler(searchTokens []string, indexedTokens []string) float64

BestPairCombinationJaroWinkler compares a search query to an indexed term with improved handling of short words and spacing variations.

func BestPairCombinationJaroWinklerWeighted added in v0.57.0

func BestPairCombinationJaroWinklerWeighted(searchTokens []string, indexedTokens []string, searchWeights []float64, indexWeights []float64) float64

BestPairCombinationJaroWinklerWeighted is like BestPairCombinationJaroWinkler but uses TF-IDF weights.

func BestPairsJaroWinkler

func BestPairsJaroWinkler(searchTokens []string, indexedTokens []string) float64

BestPairsJaroWinkler compares a search query to an indexed term (name, address, etc.) and returns a decimal fraction score.

The algorithm splits each string into tokens and computes a Jaro-Winkler score for every pair of tokens (the outer product). It then chooses the best match for each search token, with each index token matched at most once.

The pairwise scores are combined into an average that corrects for character length and for the fraction of the indexed term that did not match.
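The matching step described above can be sketched as follows. This is an illustration only, not the package's implementation: bestPairs, pair, and prefixSimilarity are hypothetical names, the greedy highest-score-first selection is an assumption, and a simple shared-prefix ratio stands in for Jaro-Winkler. The sketch returns the unmatched fraction of the indexed term separately, because the exact way the package folds it into the final score is not documented here.

```go
package main

import (
	"fmt"
	"sort"
)

// pair records the similarity between one search token and one index token.
type pair struct {
	search, index int
	score         float64
}

// prefixSimilarity is a stand-in for Jaro-Winkler (assumption): the length of
// the shared prefix divided by the length of the longer token.
func prefixSimilarity(a, b string) float64 {
	if len(a) == 0 || len(b) == 0 {
		return 0
	}
	n := 0
	for n < len(a) && n < len(b) && a[n] == b[n] {
		n++
	}
	longest := len(a)
	if len(b) > longest {
		longest = len(b)
	}
	return float64(n) / float64(longest)
}

// bestPairs scores every search/index token pair, then greedily keeps the
// highest-scoring pairs so that each token is used at most once. It returns a
// mean score weighted by search-token length and the fraction of index
// characters that were left unmatched.
func bestPairs(search, index []string, sim func(a, b string) float64) (score, unmatched float64) {
	var pairs []pair
	for i, s := range search {
		for j, t := range index {
			pairs = append(pairs, pair{search: i, index: j, score: sim(s, t)})
		}
	}
	sort.SliceStable(pairs, func(a, b int) bool { return pairs[a].score > pairs[b].score })

	usedSearch := make([]bool, len(search))
	usedIndex := make([]bool, len(index))
	var weighted float64
	for _, p := range pairs {
		if p.score == 0 {
			break // remaining pairs share nothing; leave them unmatched
		}
		if usedSearch[p.search] || usedIndex[p.index] {
			continue
		}
		usedSearch[p.search], usedIndex[p.index] = true, true
		weighted += p.score * float64(len(search[p.search]))
	}

	var searchChars, indexChars, unmatchedChars float64
	for _, s := range search {
		searchChars += float64(len(s))
	}
	for j, t := range index {
		indexChars += float64(len(t))
		if !usedIndex[j] {
			unmatchedChars += float64(len(t))
		}
	}
	if searchChars > 0 {
		score = weighted / searchChars
	}
	if indexChars > 0 {
		unmatched = unmatchedChars / indexChars
	}
	return score, unmatched
}

func main() {
	score, unmatched := bestPairs(
		[]string{"john", "smith"},
		[]string{"smith", "john", "junior"},
		prefixSimilarity,
	)
	fmt.Printf("score=%.2f unmatched=%.2f\n", score, unmatched)
}
```

Passing the similarity function as a parameter keeps the matching logic independent of the scorer, so a real Jaro-Winkler function could be substituted without changing bestPairs.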

func BestPairsJaroWinklerWeighted added in v0.57.0

func BestPairsJaroWinklerWeighted(searchTokens []string, indexedTokens []string, searchWeights []float64, indexWeights []float64) float64

BestPairsJaroWinklerWeighted compares a search query to an indexed term using TF-IDF weights. The algorithm is similar to BestPairsJaroWinkler but uses TF-IDF weights instead of character length to weight the importance of each matched term pair.

searchWeights and indexWeights should have the same length as their corresponding token slices. If the weights are nil or their lengths differ from the token slices, the function falls back to unweighted scoring.
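The fallback rule can be illustrated with a small sketch. It assumes one score and one weight per matched token; combineScores is a hypothetical helper, not part of the package, and the real function may combine search and index weights differently.

```go
package main

import "fmt"

// combineScores averages per-token scores. When weights is nil or its length
// differs from scores, every score counts equally (the unweighted fallback).
func combineScores(scores, weights []float64) float64 {
	if len(scores) == 0 {
		return 0
	}
	useWeights := weights != nil && len(weights) == len(scores)

	var sum, total float64
	for i, s := range scores {
		w := 1.0
		if useWeights {
			w = weights[i]
		}
		sum += s * w
		total += w
	}
	if total == 0 {
		return 0
	}
	return sum / total
}

func main() {
	scores := []float64{1.0, 0.5}
	fmt.Println(combineScores(scores, []float64{3, 1})) // weighted mean
	fmt.Println(combineScores(scores, nil))             // nil weights: plain mean
	fmt.Println(combineScores(scores, []float64{3}))    // length mismatch: plain mean
}
```

With weights, a rare (high-weight) token that matches well pulls the result up more than a common token does; without usable weights, the result is the plain mean.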

func GenerateWordCombinations added in v0.51.0

func GenerateWordCombinations(tokens []string) [][]string

GenerateWordCombinations creates variations of the input words by combining short words with their neighbors, to handle cases like "JSC ARGUMENT" vs "JSCARGUMENT".
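A minimal sketch of the idea, under stated assumptions: a word counts as "short" at three characters or fewer (the shortWordLen threshold is hypothetical), and only two variations are produced, one merging short words forward and one merging them backward. The names wordCombinations, mergeForward, and mergeBackward are illustrative, and the package's actual rules may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// shortWordLen is a hypothetical threshold for what counts as a short word.
const shortWordLen = 3

// mergeForward joins each short token onto the token that follows it.
func mergeForward(tokens []string) []string {
	var out []string
	for i := 0; i < len(tokens); i++ {
		if len(tokens[i]) <= shortWordLen && i+1 < len(tokens) {
			out = append(out, tokens[i]+tokens[i+1])
			i++ // the next token was consumed by the merge
			continue
		}
		out = append(out, tokens[i])
	}
	return out
}

// mergeBackward joins each short token onto the token that precedes it.
func mergeBackward(tokens []string) []string {
	var out []string
	for _, t := range tokens {
		if len(t) <= shortWordLen && len(out) > 0 {
			out[len(out)-1] += t
			continue
		}
		out = append(out, t)
	}
	return out
}

// wordCombinations returns the original tokens followed by any merged
// variation that differs from what has already been collected.
func wordCombinations(tokens []string) [][]string {
	result := [][]string{tokens}
	seen := map[string]bool{strings.Join(tokens, " "): true}
	for _, variant := range [][]string{mergeForward(tokens), mergeBackward(tokens)} {
		key := strings.Join(variant, " ")
		if !seen[key] {
			seen[key] = true
			result = append(result, variant)
		}
	}
	return result
}

func main() {
	for _, combo := range wordCombinations([]string{"JSC", "ARGUMENT"}) {
		fmt.Println(strings.Join(combo, " "))
	}
}
```

Scoring a query against each variation and keeping the best result is one way a caller could use such combinations to tolerate missing or extra spaces.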

func JaroWinkler

func JaroWinkler(s1, s2 string) float64

JaroWinkler runs the similarly named algorithm over the two input strings and averages their match percentages according to the second string (assumed to be the user's query).

Each term is compared against a few adjacent terms, and the highest near-neighbor match is accumulated.

For more details see https://en.wikipedia.org/wiki/Jaro%E2%80%93Winkler_distance
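For reference, the textbook single-string Jaro-Winkler similarity from the linked article can be sketched as below. This is not the package's JaroWinkler, which additionally works term by term as described above; the function names jaro and jaroWinkler are illustrative, and the 0.1 prefix scale with a 4-character prefix cap are the commonly used defaults.

```go
package main

import "fmt"

// jaro returns the Jaro similarity of a and b in the range [0, 1].
func jaro(a, b string) float64 {
	r1, r2 := []rune(a), []rune(b)
	if len(r1) == 0 && len(r2) == 0 {
		return 1
	}
	if len(r1) == 0 || len(r2) == 0 {
		return 0
	}
	// Two equal characters count as a match only within this distance.
	longest := len(r1)
	if len(r2) > longest {
		longest = len(r2)
	}
	window := longest/2 - 1
	if window < 0 {
		window = 0
	}

	matched1 := make([]bool, len(r1))
	matched2 := make([]bool, len(r2))
	matches := 0
	for i := range r1 {
		lo, hi := i-window, i+window+1
		if lo < 0 {
			lo = 0
		}
		if hi > len(r2) {
			hi = len(r2)
		}
		for j := lo; j < hi; j++ {
			if !matched2[j] && r1[i] == r2[j] {
				matched1[i], matched2[j] = true, true
				matches++
				break
			}
		}
	}
	if matches == 0 {
		return 0
	}

	// Count matched characters that appear in a different order.
	outOfOrder, k := 0, 0
	for i := range r1 {
		if !matched1[i] {
			continue
		}
		for !matched2[k] {
			k++
		}
		if r1[i] != r2[k] {
			outOfOrder++
		}
		k++
	}
	m := float64(matches)
	t := float64(outOfOrder) / 2 // transpositions
	return (m/float64(len(r1)) + m/float64(len(r2)) + (m-t)/m) / 3
}

// jaroWinkler boosts the Jaro score for strings sharing a prefix of up to
// four characters, using the conventional scaling factor of 0.1.
func jaroWinkler(a, b string) float64 {
	j := jaro(a, b)
	r1, r2 := []rune(a), []rune(b)
	prefix := 0
	for prefix < len(r1) && prefix < len(r2) && prefix < 4 && r1[prefix] == r2[prefix] {
		prefix++
	}
	return j + float64(prefix)*0.1*(1-j)
}

func main() {
	fmt.Printf("%.4f\n", jaroWinkler("MARTHA", "MARHTA"))
}
```

For "MARTHA" and "MARHTA" all six characters match with one transposition, giving a Jaro score of about 0.944; the shared three-character prefix raises the Jaro-Winkler score to about 0.961.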

func JaroWinklerWithFavoritism

func JaroWinklerWithFavoritism(indexedTerm, query string, favoritism float64) float64

Types

This section is empty.
