fuzzysearch

package
Version: v1.0.7
Published: Jul 16, 2025 License: GPL-3.0 Imports: 8 Imported by: 0

Documentation

Overview

Fuzzy searching allows for flexibly matching a string with partial input, useful for filtering data very quickly based on lightweight user input.

Functions

func Analyze

func Analyze(text string) []string

Analyze analyzes text and returns the resulting tokens.

func Find

func Find(source string, targets []string) []string

Find returns the strings in targets that fuzzy-match source.

Example
fmt.Print(Find("whl", []string{"cartwheel", "foobar", "wheel", "baz"}))
Output:

[cartwheel wheel]

func FindFold

func FindFold(source string, targets []string) []string

FindFold is a case-insensitive version of Find.

func FindNormalized

func FindNormalized(source string, targets []string) []string

FindNormalized is a unicode-normalized version of Find.

func FindNormalizedFold

func FindNormalizedFold(source string, targets []string) []string

FindNormalizedFold is a unicode-normalized and case-insensitive version of Find.

func Intersection

func Intersection(a []int, b []int) []int

Intersection returns the intersection of the two int slices a and b.
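
The documentation does not say whether the inputs must be sorted or how duplicates are treated. As a rough illustration of slice intersection in general, the sketch below uses a hypothetical helper, intersect, which is not the package's implementation. It assumes the result should keep the order of a and contain each value once.

```go
package main

import "fmt"

// intersect is a hypothetical stand-in, not the package's Intersection.
// It returns the values present in both a and b, in the order they
// first appear in a, without duplicates.
func intersect(a, b []int) []int {
	inB := make(map[int]bool, len(b))
	for _, v := range b {
		inB[v] = true
	}
	seen := make(map[int]bool)
	var out []int
	for _, v := range a {
		if inB[v] && !seen[v] {
			seen[v] = true
			out = append(out, v)
		}
	}
	return out
}

func main() {
	fmt.Println(intersect([]int{1, 2, 3, 4}, []int{2, 4, 6})) // [2 4]
}
```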

func LevenshteinDistance

func LevenshteinDistance(s, t string) int

LevenshteinDistance measures the difference between two strings. The Levenshtein distance between two words is the minimum number of single-character edits (i.e. insertions, deletions or substitutions) required to change one word into the other.

This implementation is optimized to use O(min(m,n)) space and is based on the optimized C version found here: http://en.wikibooks.org/wiki/Algorithm_implementation/Strings/Levenshtein_distance#C
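
To make the space bound concrete, here is a minimal sketch of a single-row Levenshtein computation. It is an independent illustration, not the package's source: the names levenshtein and min3 are hypothetical, and the sketch assumes the distance is counted in runes rather than bytes.

```go
package main

import "fmt"

// min3 returns the smallest of three ints.
func min3(a, b, c int) int {
	if b < a {
		a = b
	}
	if c < a {
		a = c
	}
	return a
}

// levenshtein is a hypothetical single-row implementation. The working
// row has len(shorter)+1 entries, so the table itself needs only
// O(min(m,n)) space.
func levenshtein(s, t string) int {
	a, b := []rune(s), []rune(t)
	if len(a) > len(b) {
		a, b = b, a // keep a as the shorter input
	}
	// row[i] is the distance between a[:i] and the prefix of b seen so far.
	row := make([]int, len(a)+1)
	for i := range row {
		row[i] = i
	}
	for j := 1; j <= len(b); j++ {
		prev := row[0] // diagonal value: distance for a[:i-1], b[:j-1]
		row[0] = j
		for i := 1; i <= len(a); i++ {
			old := row[i] // distance for a[:i], b[:j-1], saved before overwrite
			cost := 1
			if a[i-1] == b[j-1] {
				cost = 0
			}
			row[i] = min3(old+1, row[i-1]+1, prev+cost)
			prev = old
		}
	}
	return row[len(a)]
}

func main() {
	fmt.Println(levenshtein("kitten", "sitting")) // 3
	fmt.Println(levenshtein("flaw", "lawn"))      // 2
}
```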

func LowercaseFilter

func LowercaseFilter(tokens []string) []string

LowercaseFilter lowercases all tokens.

func Match

func Match(source, target string) bool

Match returns true if source matches target using a fuzzy-searching algorithm. Note that it doesn't implement Levenshtein distance (see RankMatch instead), but rather a simplified version where there's no approximation. The method will return true only if each character in the source can be found in the target and occurs after the preceding matches.

Example
fmt.Print(Match("twl", "cartwheel"))
Output:

true
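
The rule described above (every character of source appears in target, in order) is an ordered-subsequence test. The following sketch shows one way to express it. The function name subsequenceMatch is hypothetical and the code is not taken from the package; it compares rune by rune, which is an assumption.

```go
package main

import "fmt"

// subsequenceMatch is a hypothetical illustration of the matching rule:
// it reports whether the runes of source occur in target in the same
// order, not necessarily adjacent to each other.
func subsequenceMatch(source, target string) bool {
	src := []rune(source)
	i := 0 // index of the next source rune still to be found
	for _, r := range target {
		if i < len(src) && src[i] == r {
			i++
		}
	}
	return i == len(src)
}

func main() {
	fmt.Println(subsequenceMatch("twl", "cartwheel")) // true
	fmt.Println(subsequenceMatch("whl", "foobar"))    // false
	fmt.Println(subsequenceMatch("lwt", "cartwheel")) // false: order matters
}
```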

func MatchFold

func MatchFold(source, target string) bool

MatchFold is a case-insensitive version of Match.
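
The documentation does not specify how case is folded. One plausible reading, sketched below with the hypothetical name matchFoldSketch, lowercases each rune with unicode.ToLower before comparing. This is an assumption, not the package's code.

```go
package main

import (
	"fmt"
	"unicode"
)

// matchFoldSketch is a hypothetical case-insensitive subsequence match.
// Folding via unicode.ToLower is an assumption made for illustration.
func matchFoldSketch(source, target string) bool {
	src := []rune(source)
	i := 0
	for _, r := range target {
		if i < len(src) && unicode.ToLower(src[i]) == unicode.ToLower(r) {
			i++
		}
	}
	return i == len(src)
}

func main() {
	fmt.Println(matchFoldSketch("TWL", "cartwheel")) // true
	fmt.Println(matchFoldSketch("twl", "CartWheel")) // true
	fmt.Println(matchFoldSketch("xyz", "CartWheel")) // false
}
```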

func MatchNormalized

func MatchNormalized(source, target string) bool

MatchNormalized is a unicode-normalized version of Match.

func MatchNormalizedFold

func MatchNormalizedFold(source, target string) bool

MatchNormalizedFold is a unicode-normalized and case-insensitive version of Match.

func RankMatch

func RankMatch(source, target string) int

RankMatch is similar to Match, except that it measures the Levenshtein distance between source and target and returns the result. If there is no match, it returns -1. Given the requirements of Match, RankMatch only needs to perform a subset of the Levenshtein calculation: only deletions need to be considered, because any required addition or substitution would already fail the match test.

Example
fmt.Print(RankMatch("twl", "cartwheel"))
Output:

6
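
Because only deletions count, the distance for a successful match equals the number of target characters that are not used by the match. The sketch below illustrates that reasoning with a hypothetical function, rankMatchSketch; it is not the package's implementation and assumes characters are counted as runes.

```go
package main

import "fmt"

// rankMatchSketch is a hypothetical illustration: it returns -1 when
// source is not an ordered subsequence of target, and otherwise the
// number of target runes that would have to be deleted to obtain source.
func rankMatchSketch(source, target string) int {
	src := []rune(source)
	matched, total := 0, 0
	for _, r := range target {
		total++
		if matched < len(src) && src[matched] == r {
			matched++
		}
	}
	if matched != len(src) {
		return -1
	}
	return total - len(src)
}

func main() {
	fmt.Println(rankMatchSketch("twl", "cartwheel")) // 6
	fmt.Println(rankMatchSketch("whl", "wheel"))     // 2
	fmt.Println(rankMatchSketch("whl", "foobar"))    // -1
}
```

The first two results agree with the distances shown in the RankMatch and RankFind examples on this page.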

func RankMatchFold

func RankMatchFold(source, target string) int

RankMatchFold is a case-insensitive version of RankMatch.

func RankMatchNormalized

func RankMatchNormalized(source, target string) int

RankMatchNormalized is a unicode-normalized version of RankMatch.

func RankMatchNormalizedFold

func RankMatchNormalizedFold(source, target string) int

RankMatchNormalizedFold is a unicode-normalized and case-insensitive version of RankMatch.

func StopwordFilter

func StopwordFilter(tokens []string, stopwords set.GenericDataSet[string]) []string

StopwordFilter removes the tokens that appear in stopwords.

func Tokenize

func Tokenize(text string) []string

Tokenize splits text into tokens.
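
Tokenize, LowercaseFilter and StopwordFilter look like the stages of a typical text-analysis pipeline, and Analyze may chain them, although the documentation does not say so. The sketch below shows such a pipeline under explicit assumptions: all function names are hypothetical, tokens are split on any rune that is not a letter or number, and a plain map stands in for set.GenericDataSet[string], whose API is not shown here.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// tokenize splits text on every rune that is neither a letter nor a number.
// The splitting rule is an assumption made for illustration.
func tokenize(text string) []string {
	return strings.FieldsFunc(text, func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsNumber(r)
	})
}

// lowercase returns a copy of tokens with every token lowercased.
func lowercase(tokens []string) []string {
	out := make([]string, len(tokens))
	for i, t := range tokens {
		out[i] = strings.ToLower(t)
	}
	return out
}

// dropStopwords removes tokens found in stop. A map is used here as a
// stand-in for the package's set type.
func dropStopwords(tokens []string, stop map[string]struct{}) []string {
	out := make([]string, 0, len(tokens))
	for _, t := range tokens {
		if _, isStop := stop[t]; !isStop {
			out = append(out, t)
		}
	}
	return out
}

// analyze chains the three stages; this ordering is an assumption.
func analyze(text string, stop map[string]struct{}) []string {
	return dropStopwords(lowercase(tokenize(text)), stop)
}

func main() {
	stop := map[string]struct{}{"the": {}, "a": {}}
	fmt.Println(analyze("The quick brown fox, a JUMPER!", stop))
	// [quick brown fox jumper]
}
```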

Types

type Document

type Document struct {
	Text string
	ID   int
}

Document is a piece of text identified by an integer ID.

func NewDocument

func NewDocument(id int, text string) Document

NewDocument returns a new Document with the given id and text.

func (Document) Terms

func (d Document) Terms() []string

type Index

type Index map[string][]int

Index is a search index, stored as a map from string keys to int slices.

func NewIndex

func NewIndex() Index

NewIndex returns a new Index.

func (Index) Add

func (idx Index) Add(docs ...Document)

Add adds documents to the index.

func (Index) Remove

func (idx Index) Remove(docs ...Document)

Remove removes documents from the index.

func (Index) Search

func (idx Index) Search(text string) []int

Search searches the indexed documents for text.
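
The underlying type map[string][]int suggests an inverted index, in which each term maps to the IDs of the documents that contain it. The documentation does not confirm this, so the sketch below is only an illustration under that assumption. The type index and the helpers terms, add, search and intersect are hypothetical, terms are split on whitespace, and a multi-term query is assumed to return only documents that contain every term.

```go
package main

import (
	"fmt"
	"strings"
)

// index is a hypothetical inverted index: term -> IDs of documents containing it.
type index map[string][]int

// terms lowercases text and splits it on whitespace (a simplification).
func terms(text string) []string {
	return strings.Fields(strings.ToLower(text))
}

// add records every term of text under the given document ID.
func (idx index) add(id int, text string) {
	for _, term := range terms(text) {
		ids := idx[term]
		if len(ids) > 0 && ids[len(ids)-1] == id {
			continue // term already recorded for this document
		}
		idx[term] = append(ids, id)
	}
}

// intersect keeps the values of a that also occur in b.
func intersect(a, b []int) []int {
	inB := make(map[int]bool, len(b))
	for _, v := range b {
		inB[v] = true
	}
	var out []int
	for _, v := range a {
		if inB[v] {
			out = append(out, v)
		}
	}
	return out
}

// search returns the IDs of documents that contain every term of text.
func (idx index) search(text string) []int {
	var result []int
	for i, term := range terms(text) {
		if i == 0 {
			result = append([]int(nil), idx[term]...)
			continue
		}
		result = intersect(result, idx[term])
	}
	return result
}

func main() {
	idx := index{}
	idx.add(1, "a donut on a glass plate")
	idx.add(2, "only the donut")
	idx.add(3, "listen to the drum machine")

	fmt.Println(idx.search("donut"))       // [1 2]
	fmt.Println(idx.search("the donut"))   // [2]
	fmt.Println(idx.search("glass donut")) // [1]
}
```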

type Rank

type Rank struct {
	// Source is used as the source for matching.
	Source string

	// Target is the word matched against.
	Target string

	// Distance is the Levenshtein distance between Source and Target.
	Distance int

	// OriginalIndex is the location of Target in the original list.
	OriginalIndex int
}

type Ranks

type Ranks []Rank

func RankFind

func RankFind(source string, targets []string) Ranks

RankFind is similar to Find, except it will also rank all matches using Levenshtein distance.

Example
fmt.Printf("%+v", RankFind("whl", []string{"cartwheel", "foobar", "wheel", "baz"}))
Output:

[{Source:whl Target:cartwheel Distance:6 OriginalIndex:0} {Source:whl Target:wheel Distance:2 OriginalIndex:2}]

func RankFindFold

func RankFindFold(source string, targets []string) Ranks

RankFindFold is a case-insensitive version of RankFind.

func RankFindNormalized

func RankFindNormalized(source string, targets []string) Ranks

RankFindNormalized is a unicode-normalized version of RankFind.

func RankFindNormalizedFold

func RankFindNormalizedFold(source string, targets []string) Ranks

RankFindNormalizedFold is a unicode-normalized and case-insensitive version of RankFind.

func (Ranks) Len

func (r Ranks) Len() int

func (Ranks) Less

func (r Ranks) Less(i, j int) bool

func (Ranks) Swap

func (r Ranks) Swap(i, j int)
