matcher

package
v2.8.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Jan 1, 2026 License: GPL-3.0 Imports: 6 Imported by: 0

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func FindTokenSignatureMatches

func FindTokenSignatureMatches(
	mediaType slugs.MediaType,
	queryName string,
	candidates []database.MediaTitle,
) []string

FindTokenSignatureMatches finds candidates where the token signature exactly matches the query signature. This enables word-order independent matching: "Crystal Space Quest" matches "Quest Space Crystal".

The query and candidates must have word boundaries (e.g., "Super Mario World"), not slugs. The mediaType ensures consistent parsing between query and indexed candidates. Returns the slugs of matched titles.

func GenerateTokenSignature

func GenerateTokenSignature(mediaType slugs.MediaType, gameName string) string

GenerateTokenSignature creates a normalized, sorted token signature for word-order independent matching. Uses the same tokenization pipeline as slugification to ensure consistency.

IMPORTANT: Requires input with word boundaries (e.g., "Super Mario World"), not slugs. Slugs have already lost word boundary information and will produce incorrect signatures.

The mediaType parameter ensures media-type-specific parsing is applied (e.g., ParseGame for games), matching the indexing pipeline used in GenerateSlugWithMetadata.

Example:

GenerateTokenSignature(slugs.MediaTypeGame, "Super Mario World") → "mario_super_world"
GenerateTokenSignature(slugs.MediaTypeGame, "Mario World Super") → "mario_super_world"

Types

type FuzzyMatch

type FuzzyMatch struct {
	Slug       string
	Similarity float32
}

FuzzyMatch represents a slug that matches the query with a similarity score.

func ApplyDamerauLevenshteinTieBreaker

func ApplyDamerauLevenshteinTieBreaker(query string, matches []FuzzyMatch, topN int) []FuzzyMatch

ApplyDamerauLevenshteinTieBreaker refines fuzzy matches using Damerau-Levenshtein distance to handle transposition errors (e.g., "crono tigger" → "Chrono Trigger").

It takes the top N candidates from Jaro-Winkler and re-ranks them by edit distance. This two-stage approach is more accurate than either algorithm alone while remaining fast.

func FindFuzzyMatches

func FindFuzzyMatches(query string, candidates []string, maxDistance int, minSimilarity float32) []FuzzyMatch

FindFuzzyMatches returns slugs that fuzzy match the query using Jaro-Winkler similarity. Jaro-Winkler is optimized for short strings and heavily weights matching prefixes, making it ideal for game titles where users typically get the start correct. It also naturally handles British/American spelling variations (e.g., "colour" vs "color"). Results are filtered by maxDistance and minSimilarity, sorted by similarity (best first).

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL