testsuite

package
v0.0.0-...-e40b9d7
Warning

This package is not in the latest version of its module.

Published: Jun 19, 2026 License: AGPL-3.0 Imports: 18 Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func AliasesFor

func AliasesFor(in, v []byte) bool

AliasesFor reports whether v points inside the backing array of in. Borrowed tokens must alias the input; owned tokens must not.
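The check can be pictured as a pointer-range comparison. The sketch below is a minimal illustration, not the package's implementation: the name aliases is hypothetical, and it assumes "backing array" means the region from the first element of in up to its capacity.

```go
package main

import (
	"fmt"
	"unsafe"
)

// aliases reports whether v starts inside the memory reachable from in.
// Hypothetical stand-in for AliasesFor; the real helper may differ.
func aliases(in, v []byte) bool {
	if cap(in) == 0 || cap(v) == 0 {
		return false
	}
	full := in[:cap(in)] // extend to everything reachable from in
	start := uintptr(unsafe.Pointer(&full[0]))
	end := start + uintptr(len(full))
	p := uintptr(unsafe.Pointer(&v[:1][0]))
	return p >= start && p < end
}

func main() {
	input := []byte("running shoes")
	borrowed := input[8:]                      // shares input's backing array
	owned := append([]byte(nil), input[:3]...) // fresh allocation
	fmt.Println(aliases(input, borrowed), aliases(input, owned)) // true false
}
```

A borrowed token is a subslice of the input, so its first byte falls inside the range; an owned token lives in a separate allocation and does not.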

func AssertTerms

func AssertTerms(t *testing.T, fn tokenizer.Tokenizer, in string, want []Term)

func BruteForceLevenshteinMatches

func BruteForceLevenshteinMatches(k, m int, keyword string, terms ...string) []string

BruteForceLevenshteinMatches is the oracle counterpart of CollectLevenshteinMatches: it scans every term, keeps those within edit distance k of keyword, sorts them byte-ascending (the automaton's yield order) and caps the result at m (m <= 0 means unlimited).
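The oracle's steps (scan, filter by distance, sort, cap) can be sketched with the standard library alone. The names bruteForceMatches and editDistance are hypothetical; the real helper presumably uses LevenshteinDistance, and its exact behaviour on duplicate terms is not stated in the documentation.

```go
package main

import (
	"fmt"
	"sort"
)

// editDistance is a two-row, byte-level Levenshtein distance.
func editDistance(a, b string) int {
	prev := make([]int, len(b)+1)
	cur := make([]int, len(b)+1)
	for j := range prev {
		prev[j] = j
	}
	for i := 1; i <= len(a); i++ {
		cur[0] = i
		for j := 1; j <= len(b); j++ {
			best := prev[j-1]
			if a[i-1] != b[j-1] {
				best++
			}
			if d := prev[j] + 1; d < best {
				best = d
			}
			if d := cur[j-1] + 1; d < best {
				best = d
			}
			cur[j] = best
		}
		prev, cur = cur, prev
	}
	return prev[len(b)]
}

// bruteForceMatches keeps terms within distance k, sorts them
// byte-ascending, and caps the result at m (m <= 0 means unlimited).
func bruteForceMatches(k, m int, keyword string, terms ...string) []string {
	var out []string
	for _, t := range terms {
		if editDistance(keyword, t) <= k {
			out = append(out, t)
		}
	}
	sort.Strings(out) // Go compares strings byte by byte
	if m > 0 && len(out) > m {
		out = out[:m]
	}
	return out
}

func main() {
	fmt.Println(bruteForceMatches(1, 2, "cat", "cut", "dog", "cart", "bat", "cat")) // [bat cart]
}
```

Four terms are within distance 1 of "cat"; after the byte-ascending sort the cap of 2 keeps only the first two.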

func CollectLevenshteinMatches

func CollectLevenshteinMatches(k, m int, keyword string, terms ...string) []string

CollectLevenshteinMatches builds a token tree from terms, runs the Levenshtein automaton with keyword, k and m, and returns the matched values as strings, in the order the automaton yielded them. Values are copied out because Matches aliases tree keys. It returns nil when levenshtein.New rejects the parameters.

func CompileEnglishQuery

func CompileEnglishQuery(t *testing.T, q string) *query.SimpleQuery

func CompileQueryWith

func CompileQueryWith(t *testing.T, q string, def tokenizer.Tokenizer, fields map[uint64]tokenizer.Tokenizer) *query.SimpleQuery

func CompileSpanishQuery

func CompileSpanishQuery(t *testing.T, q string) *query.SimpleQuery

func EnglishMatchedSet

func EnglishMatchedSet(t *testing.T, q string, s *storage.Storage) []string

func IndexOfDocument

func IndexOfDocument(s *storage.Storage, id string) (uint64, bool)

IndexOfDocument returns the internal index assigned to an external doc id after the alphabetical sort performed by SortAndBuildFrom.
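The relationship between external ids and internal indices can be illustrated with a sorted slice. This is a sketch under assumptions: indexOf is a hypothetical name, and "alphabetical" is taken to mean Go's byte-wise string order.

```go
package main

import (
	"fmt"
	"sort"
)

// indexOf returns the position of id in sortedIDs, which must already
// be sorted ascending. The second result is false when id is absent.
func indexOf(sortedIDs []string, id string) (uint64, bool) {
	i := sort.SearchStrings(sortedIDs, id)
	if i < len(sortedIDs) && sortedIDs[i] == id {
		return uint64(i), true
	}
	return 0, false
}

func main() {
	ids := []string{"doc-c", "doc-a", "doc-b"} // insertion order
	sort.Strings(ids)                          // assumed to mirror the build-time sort
	idx, ok := indexOf(ids, "doc-c")
	fmt.Println(idx, ok) // 2 true
}
```

The document added first ends up with the highest index, which is why tests should look indices up rather than assume insertion order.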

func LevenshteinDistance

func LevenshteinDistance(a, b []byte) int

LevenshteinDistance is a reference byte-level edit distance (insert, delete, substitute; all cost 1) used to verify the automaton against brute force.
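A reference distance of this kind is usually the textbook dynamic program. The version below is a sketch (the name levenshtein is hypothetical), and its second call shows what "byte-level" implies: a two-byte UTF-8 character counts as two edits.

```go
package main

import "fmt"

// levenshtein computes the edit distance between a and b over bytes,
// with insert, delete and substitute each costing 1.
func levenshtein(a, b []byte) int {
	prev := make([]int, len(b)+1)
	cur := make([]int, len(b)+1)
	for j := range prev {
		prev[j] = j // turning "" into b[:j] takes j inserts
	}
	for i := 1; i <= len(a); i++ {
		cur[0] = i // turning a[:i] into "" takes i deletes
		for j := 1; j <= len(b); j++ {
			best := prev[j-1] // match, or substitute below
			if a[i-1] != b[j-1] {
				best++
			}
			if d := prev[j] + 1; d < best { // delete a[i-1]
				best = d
			}
			if d := cur[j-1] + 1; d < best { // insert b[j-1]
				best = d
			}
			cur[j] = best
		}
		prev, cur = cur, prev
	}
	return prev[len(b)]
}

func main() {
	fmt.Println(levenshtein([]byte("kitten"), []byte("sitting"))) // 3
	fmt.Println(levenshtein([]byte("e"), []byte("é")))            // 2
}
```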

func MakeDoc

func MakeDoc[T ~string | ~[]byte](id T, fields ...*storage.FieldDefinition) *storage.Document

MakeDoc creates a Document with the given external ID and field definitions. The ID must be unique across the index; documents are ordered alphabetically by ID during SortAndBuildFrom / BuildFrom.

func MakeField

func MakeField(hash uint64, length uint64, tokens ...*storage.TokenDefinition) *storage.FieldDefinition

MakeField creates a FieldDefinition with the given xxh3 field hash, total token length for this document, and the list of token definitions.

func MakeToken

func MakeToken[T ~string | []byte](value T, freq uint64) *storage.TokenDefinition

MakeToken creates a TokenDefinition with the given normalized value and frequency. The caller is responsible for normalization before passing the value.

func MakeTokenTree

func MakeTokenTree(terms ...string) storage.Tokens

MakeTokenTree builds a byte-sorted token BTree like the one Storage produces, using the same TokenLessFunc and NoLocks options as production. Duplicate terms collapse into a single key.
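The two visible properties of the tree, byte-sorted iteration and one key per distinct term, can be shown without a BTree. The sketch below swaps in a sorted, deduplicated slice; sortedUniqueTerms is a hypothetical name and says nothing about the real storage.Tokens type.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedUniqueTerms returns terms in byte-ascending order with
// duplicates collapsed, leaving the caller's slice untouched.
func sortedUniqueTerms(terms ...string) []string {
	out := append([]string(nil), terms...)
	sort.Strings(out)
	n := 0
	for i, t := range out {
		if i == 0 || t != out[n-1] {
			out[n] = t
			n++
		}
	}
	return out[:n]
}

func main() {
	fmt.Println(sortedUniqueTerms("banana", "apple", "banana", "Apple")) // [Apple apple banana]
}
```

Byte order places uppercase ASCII before lowercase, so "Apple" precedes "apple".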

func MatchedSetWith

func MatchedSetWith(t *testing.T, q string, s *storage.Storage, def tokenizer.Tokenizer, fields map[uint64]tokenizer.Tokenizer) []string

func ResolveDocumentIndexes

func ResolveDocumentIndexes(s *storage.Storage, idxs []uint64) []string

ResolveDocumentIndexes maps a ranked slice of internal indices back to external ids.

func RoundTrip

func RoundTrip(tb testing.TB, s *storage.Storage) *storage.Storage

RoundTrip saves the storage to a buffer, loads it back into a fresh Storage, and returns the loaded storage. Because the function has no error result, any save or load failure is reported through tb rather than returned to the caller.

func RunFieldScore

func RunFieldScore(s *storage.Storage, fieldHash uint64, candidates []uint64) (idxs []uint64, ctx *query.QueryContext)

RunFieldScore builds a candidate bitmap, runs FieldScore against fieldHash, then resolves to a ranked slice (best first). Passing candidates == nil means "every document in the corpus"; a non-nil (even empty) slice restricts the candidate set to exactly those internal indices.
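The nil versus empty distinction is easy to get wrong in Go, because both slices have length zero. The sketch below shows the rule in isolation; candidateSet is a hypothetical name and a map stands in for the real bitmap.

```go
package main

import "fmt"

// candidateSet expands candidates into the set of internal indices to
// score. A nil slice selects all indices in [0, total); any non-nil
// slice, even an empty one, selects exactly its elements.
func candidateSet(total uint64, candidates []uint64) map[uint64]struct{} {
	set := make(map[uint64]struct{})
	if candidates == nil {
		for i := uint64(0); i < total; i++ {
			set[i] = struct{}{}
		}
		return set
	}
	for _, c := range candidates {
		set[c] = struct{}{}
	}
	return set
}

func main() {
	fmt.Println(len(candidateSet(5, nil)))               // 5
	fmt.Println(len(candidateSet(5, []uint64{})))        // 0
	fmt.Println(len(candidateSet(5, []uint64{1, 3})))    // 2
}
```

A test that wants "no candidates" must therefore pass an empty literal such as []uint64{}, not an uninitialized slice variable.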

func RunQuery

func RunQuery(q *query.SimpleQuery, s *storage.Storage) (idxs []uint64, ctx *query.QueryContext)

RunQuery filters then scores a query against s, returning the ranked doc indices (best first) alongside the populated context so assertions can read raw scores and the resolved bitmap.

func SortableDate

func SortableDate(t *testing.T, s string) []byte

func SortableFloat64

func SortableFloat64(v float64) []byte

SortableFloat64 is the float counterpart of SortableInt64: it encodes v so that token byte order matches numeric order.

func SortableInt64

func SortableInt64(v int64) []byte

SortableInt64 encodes v with the same sortable byte layout production uses for integer fields, so token byte order matches numeric order.

func SpanishMatchedSet

func SpanishMatchedSet(t *testing.T, q string, s *storage.Storage) []string

func TempDirectory

func TempDirectory(tb testing.TB, pattern string) (name string)

func TempFilename

func TempFilename(tb testing.TB, pattern string) (name string)

Types

type Term

type Term struct {
	Value string
	Owned bool
}

func CollectTerms

func CollectTerms(t *testing.T, fn tokenizer.Tokenizer, in []byte) []Term

CollectTerms drains the sequence and asserts the ownership invariant on every token: IsStem is true exactly when Value is an owned allocation that does not alias the input. The Token pointer is reused, so values are copied out here.
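The reason values must be copied out can be shown with a toy producer that reuses a single token. The token type, each and collect below are hypothetical and unrelated to the real tokenizer API; they only demonstrate the pitfall of retaining a reused pointer.

```go
package main

import "fmt"

// token is a hypothetical stand-in for a tokenizer's token type.
type token struct{ Value []byte }

// each calls yield with the same *token for every word, overwriting
// its Value in place between calls.
func each(words []string, yield func(*token)) {
	var tok token
	for _, w := range words {
		tok.Value = append(tok.Value[:0], w...)
		yield(&tok)
	}
}

// collect keeps both the raw pointers and copied-out strings.
func collect(words []string) (kept []*token, copied []string) {
	each(words, func(t *token) {
		kept = append(kept, t)                   // retains the reused pointer
		copied = append(copied, string(t.Value)) // copies the bytes out
	})
	return kept, copied
}

func main() {
	kept, copied := collect([]string{"alpha", "beta"})
	fmt.Println(string(kept[0].Value), copied[0]) // beta alpha
}
```

Every retained pointer ends up showing the last token, while the copied strings preserve each value as it was yielded.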

Directories

Path Synopsis
cmd/wikipedia command
