Documentation ¶
Overview ¶
Fuzzy searching allows for flexibly matching a string with partial input, useful for filtering data very quickly based on lightweight user input.
Index ¶
- func Analyze(text string) []string
- func Find(source string, targets []string) []string
- func FindFold(source string, targets []string) []string
- func FindNormalized(source string, targets []string) []string
- func FindNormalizedFold(source string, targets []string) []string
- func Intersection(a []int, b []int) []int
- func LevenshteinDistance(s, t string) int
- func LowercaseFilter(tokens []string) []string
- func Match(source, target string) bool
- func MatchFold(source, target string) bool
- func MatchNormalized(source, target string) bool
- func MatchNormalizedFold(source, target string) bool
- func RankMatch(source, target string) int
- func RankMatchFold(source, target string) int
- func RankMatchNormalized(source, target string) int
- func RankMatchNormalizedFold(source, target string) int
- func StopwordFilter(tokens []string, stopwords set.GenericDataSet[string]) []string
- func Tokenize(text string) []string
- type Document
- type Index
- type Rank
- type Ranks
Examples ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func Find ¶
Find returns the strings in targets that fuzzy match source.
Example ¶
fmt.Print(Find("whl", []string{"cartwheel", "foobar", "wheel", "baz"}))
Output: [cartwheel wheel]
func FindNormalized ¶
FindNormalized is a unicode-normalized version of Find.
func FindNormalizedFold ¶
FindNormalizedFold is a unicode-normalized and case-insensitive version of Find.
func LevenshteinDistance ¶
LevenshteinDistance measures the difference between two strings. The Levenshtein distance between two words is the minimum number of single-character edits (i.e. insertions, deletions or substitutions) required to change one word into the other.
This implementation is optimized to use O(min(m,n)) space and is based on the optimized C version found here: http://en.wikibooks.org/wiki/Algorithm_implementation/Strings/Levenshtein_distance#C
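The space bound can be illustrated with a minimal sketch that keeps a single column of the dynamic-programming table, sized by the shorter input. The names levenshteinSketch and min3 are hypothetical and this is not the package's own source. It compares runes rather than bytes, which is an assumption, and the rune slices it builds are extra to the O(min(m,n)) column.

```go
package main

import "fmt"

// min3 returns the smallest of three ints.
func min3(a, b, c int) int {
	m := a
	if b < m {
		m = b
	}
	if c < m {
		m = c
	}
	return m
}

// levenshteinSketch (hypothetical name) returns the edit distance between
// s and t while storing only one column of the table.
func levenshteinSketch(s, t string) int {
	a, b := []rune(s), []rune(t)
	if len(a) > len(b) {
		a, b = b, a // a is now the shorter input, so the column has min(m,n)+1 cells
	}
	col := make([]int, len(a)+1)
	for i := range col {
		col[i] = i // distance from the first i runes of a to the empty string
	}
	for j := 1; j <= len(b); j++ {
		prevDiag := col[0] // cell (0, j-1)
		col[0] = j
		for i := 1; i <= len(a); i++ {
			old := col[i] // cell (i, j-1); becomes the diagonal for the next row
			cost := 1
			if a[i-1] == b[j-1] {
				cost = 0
			}
			col[i] = min3(col[i]+1, col[i-1]+1, prevDiag+cost)
			prevDiag = old
		}
	}
	return col[len(a)]
}

func main() {
	fmt.Println(levenshteinSketch("kitten", "sitting")) // prints 3
}
```

Swapping the inputs so that the shorter one indexes the column is safe because edit distance is symmetric.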
func LowercaseFilter ¶
LowercaseFilter lowercases all tokens.
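A minimal sketch of a filter with this behavior, assuming it returns a new slice and leaves the input untouched (the documentation does not say which). The name lowercaseSketch is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// lowercaseSketch (hypothetical name) returns a copy of tokens with every
// token lowercased; the input slice is not modified.
func lowercaseSketch(tokens []string) []string {
	out := make([]string, len(tokens))
	for i, tok := range tokens {
		out[i] = strings.ToLower(tok)
	}
	return out
}

func main() {
	fmt.Println(lowercaseSketch([]string{"Fuzzy", "SEARCH", "go"})) // prints [fuzzy search go]
}
```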
func Match ¶
Match returns true if source matches target using a fuzzy-searching algorithm. It does not compute Levenshtein distance (see RankMatch for that); it is a simplified test with no approximation. It returns true only if every character in source can be found in target, each one occurring after the preceding match.
Example ¶
fmt.Print(Match("twl", "cartwheel"))
Output: true
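The rule described above amounts to an ordered subsequence test. The following is a minimal sketch of that idea, not the package's own source; matchSketch is a hypothetical name, and treating an empty source as a match is an assumption of the sketch.

```go
package main

import "fmt"

// matchSketch (hypothetical name) reports whether every rune of source
// appears in target in the same order, with any number of runes in between.
func matchSketch(source, target string) bool {
	src := []rune(source)
	next := 0 // index of the next source rune still to be found
	for _, r := range target {
		if next < len(src) && r == src[next] {
			next++
		}
	}
	return next == len(src)
}

func main() {
	fmt.Println(matchSketch("twl", "cartwheel")) // prints true
	fmt.Println(matchSketch("lwt", "cartwheel")) // prints false: same letters, wrong order
}
```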
func MatchNormalized ¶
MatchNormalized is a unicode-normalized version of Match.
func MatchNormalizedFold ¶
MatchNormalizedFold is a unicode-normalized and case-insensitive version of Match.
func RankMatch ¶
RankMatch is similar to Match, except that it measures the Levenshtein distance between source and target and returns the result. If there is no match, it returns -1. Given the requirements of Match, RankMatch only needs to perform a subset of the Levenshtein calculation: only deletions need to be considered, because any required additions or substitutions would fail the match test.
Example ¶
fmt.Print(RankMatch("twl", "cartwheel"))
Output: 6
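Because only deletions count, the rank of a successful match equals the number of target characters that are skipped while matching. A minimal sketch under that reading follows; rankMatchSketch is a hypothetical name and this is not the package's own source.

```go
package main

import "fmt"

// rankMatchSketch (hypothetical name) walks target once. Runes that advance
// the match are consumed; every other rune counts as one deletion.
// It returns -1 when source is not an ordered subsequence of target.
func rankMatchSketch(source, target string) int {
	src := []rune(source)
	matched, skipped := 0, 0
	for _, r := range target {
		if matched < len(src) && r == src[matched] {
			matched++
		} else {
			skipped++
		}
	}
	if matched != len(src) {
		return -1
	}
	return skipped
}

func main() {
	fmt.Println(rankMatchSketch("twl", "cartwheel")) // prints 6
	fmt.Println(rankMatchSketch("xyz", "cartwheel")) // prints -1
}
```

For "twl" against "cartwheel", the skipped characters are c, a, r, h, e, e, which gives the distance of 6 shown in the example.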
func RankMatchFold ¶
RankMatchFold is a case-insensitive version of RankMatch.
func RankMatchNormalized ¶
RankMatchNormalized is a unicode-normalized version of RankMatch.
func RankMatchNormalizedFold ¶
RankMatchNormalizedFold is a unicode-normalized and case-insensitive version of RankMatch.
func StopwordFilter ¶
func StopwordFilter(tokens []string, stopwords set.GenericDataSet[string]) []string
StopwordFilter filters stopwords out of tokens.
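A minimal sketch of stopword filtering. The API of set.GenericDataSet[string] is not shown in this documentation, so the sketch substitutes a plain map[string]struct{} as the set. Both that substitution and the name stopwordSketch are assumptions made for illustration.

```go
package main

import "fmt"

// stopwordSketch (hypothetical name) returns the tokens that are not in
// stopwords, preserving order. A map stands in for set.GenericDataSet[string].
func stopwordSketch(tokens []string, stopwords map[string]struct{}) []string {
	kept := make([]string, 0, len(tokens))
	for _, tok := range tokens {
		if _, isStop := stopwords[tok]; !isStop {
			kept = append(kept, tok)
		}
	}
	return kept
}

func main() {
	stop := map[string]struct{}{"the": {}, "a": {}}
	fmt.Println(stopwordSketch([]string{"the", "quick", "a", "fox"}, stop)) // prints [quick fox]
}
```

The lookup is exact, so tokens would normally be lowercased first for a stopword such as "the" to catch "The".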
Types ¶
type Ranks ¶
type Ranks []Rank
func RankFind ¶
RankFind is similar to Find, except it will also rank all matches using Levenshtein distance.
Example ¶
fmt.Printf("%+v", RankFind("whl", []string{"cartwheel", "foobar", "wheel", "baz"}))
Output: [{Source:whl Target:cartwheel Distance:6 OriginalIndex:0} {Source:whl Target:wheel Distance:2 OriginalIndex:2}]
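The output above can be reproduced with a minimal sketch that ranks each target and records its position in the input. The field names are taken from the printed output; the type rankSketch, the functions distanceSketch and rankFindSketch, and the final sort by distance are assumptions made for illustration, not the package's own source.

```go
package main

import (
	"fmt"
	"sort"
)

// rankSketch (hypothetical type) mirrors the fields visible in the output above.
type rankSketch struct {
	Source        string
	Target        string
	Distance      int
	OriginalIndex int
}

// distanceSketch counts the target runes skipped while matching source in
// order, or returns -1 if source is not an ordered subsequence of target.
func distanceSketch(source, target string) int {
	src := []rune(source)
	matched, skipped := 0, 0
	for _, r := range target {
		if matched < len(src) && r == src[matched] {
			matched++
		} else {
			skipped++
		}
	}
	if matched != len(src) {
		return -1
	}
	return skipped
}

// rankFindSketch keeps every matching target together with its distance
// and its index in the original targets slice.
func rankFindSketch(source string, targets []string) []rankSketch {
	var ranks []rankSketch
	for i, t := range targets {
		if d := distanceSketch(source, t); d >= 0 {
			ranks = append(ranks, rankSketch{Source: source, Target: t, Distance: d, OriginalIndex: i})
		}
	}
	return ranks
}

func main() {
	ranks := rankFindSketch("whl", []string{"cartwheel", "foobar", "wheel", "baz"})
	fmt.Printf("%+v\n", ranks)

	// Sorting by distance puts the closest match first.
	sort.SliceStable(ranks, func(i, j int) bool { return ranks[i].Distance < ranks[j].Distance })
	fmt.Println(ranks[0].Target) // prints wheel
}
```

Keeping OriginalIndex lets a caller map a ranked result back to its position in targets after sorting.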
func RankFindFold ¶
RankFindFold is a case-insensitive version of RankFind.
func RankFindNormalized ¶
RankFindNormalized is a unicode-normalized version of RankFind.
func RankFindNormalizedFold ¶
RankFindNormalizedFold is a unicode-normalized and case-insensitive version of RankFind.