search

package
v0.25.0
Published: Jun 25, 2026 License: MIT Imports: 9 Imported by: 0

Documentation

Overview

Package search provides a Scout-style full-text search abstraction for the lagodev framework. It defines a small Engine interface that higher layers program against, plus two implementations:

  • Memory: a dependency-free in-memory inverted index (the default). It tokenizes documents, folds case, drops stop-words, supports term and prefix matching, TF-style ranking, equality filters, and pagination.
  • Postgres: a full-text engine that compiles tsvector/tsquery SQL through the framework's database.Connection (Grammar + Executor). Every query is parameterized, and the SQL generation is unit-tested independently of a live server.

A Searchable model interface plus an Indexer helper let ORM rows opt into indexing without coupling the search package to the orm package: a model describes its Document, and the application wires Index/Delete into its ORM lifecycle hooks. See Searchable and Indexer.

Basic usage:

eng := search.NewMemory()
_ = eng.Index(ctx, "posts",
    search.Doc("1", map[string]any{"title": "Hello World", "body": "first post"}),
    search.Doc("2", map[string]any{"title": "Goodbye World", "body": "last post"}),
)
res, _ := eng.Search(ctx, "posts", "hello", search.Options{})
for _, hit := range res.Hits {
    fmt.Println(hit.ID, hit.Score)
}

Constants

const DefaultPerPage = 15

DefaultPerPage is used when Options.PerPage is unset.

Functions

func IsStopWord

func IsStopWord(tok string) bool

IsStopWord reports whether tok is in the built-in stop-word set. The input is folded to lower-case before the lookup.

func Tokenize

func Tokenize(text string) []string

Tokenize splits text into normalized, folded, stop-word-filtered tokens. It is the shared primitive used to index documents and parse queries so both sides use the same vocabulary. Splitting happens on any non-letter, non-digit rune.
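
The pipeline described above can be approximated with the standard library alone. The following is a minimal sketch, not the package's implementation: the lower-case tokenize function and the five-word stopWords set are hypothetical stand-ins, since the built-in stop-word list is not shown on this page.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// stopWords is a hypothetical, deliberately tiny stop-word set; the package's
// built-in list is not reproduced here.
var stopWords = map[string]bool{"a": true, "an": true, "and": true, "of": true, "the": true}

// tokenize splits text on every rune that is neither a letter nor a digit,
// lower-cases each piece, and drops stop-words.
func tokenize(text string) []string {
	parts := strings.FieldsFunc(text, func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
	tokens := make([]string, 0, len(parts))
	for _, p := range parts {
		tok := strings.ToLower(p)
		if stopWords[tok] {
			continue
		}
		tokens = append(tokens, tok)
	}
	return tokens
}

func main() {
	fmt.Println(tokenize("The Hello-World post, 2nd edition!"))
	// [hello world post 2nd edition]
}
```

Running the same function over documents and queries is what keeps both sides on one vocabulary: a query for "World" and a stored "world," reduce to the same token.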

func WrapDB

func WrapDB(db *sql.DB) querier

WrapDB adapts a standard *sql.DB to the querier expected by NewPostgres.

Types

type Document

type Document struct {
	// ID uniquely identifies the document within its index.
	ID string
	// Fields holds the searchable/filterable attributes. Keys are field names.
	Fields map[string]any
}

Document is a single indexable record: a stable ID plus a set of searchable fields. Field values may be strings, numbers, or bools; non-string values are stringified for the full-text index and kept verbatim for equality filters.

func Doc

func Doc(id string, fields map[string]any) Document

Doc is a convenience constructor for a Document.

type Engine

type Engine interface {
	// Index inserts or replaces documents in the named index. Re-indexing a
	// document with an existing ID replaces it.
	Index(ctx context.Context, index string, docs ...Document) error
	// Delete removes documents by ID from the named index. Unknown IDs are
	// ignored.
	Delete(ctx context.Context, index string, ids ...string) error
	// Search runs query against the named index and returns a ranked,
	// paginated page of hits.
	Search(ctx context.Context, index, query string, opts Options) (Results, error)
}

Engine is the search backend contract. Implementations must be safe for concurrent use.
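
One way a custom implementation might meet the concurrency requirement, together with the replace-by-ID and ignore-unknown-ID semantics in the interface comments, is a map guarded by a sync.RWMutex. The sketch below is an assumption about a possible design, not the bundled engines' code; the store type and its put, remove, and count methods are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// store is a hypothetical stand-in for an engine's document storage:
// index name -> document ID -> fields.
type store struct {
	mu   sync.RWMutex
	docs map[string]map[string]map[string]any
}

func newStore() *store {
	return &store{docs: make(map[string]map[string]map[string]any)}
}

// put inserts the document, replacing any existing document with the same ID.
func (s *store) put(index, id string, fields map[string]any) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.docs[index] == nil {
		s.docs[index] = make(map[string]map[string]any)
	}
	s.docs[index][id] = fields
}

// remove deletes documents by ID; unknown IDs and unknown indexes are ignored.
func (s *store) remove(index string, ids ...string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	for _, id := range ids {
		delete(s.docs[index], id) // delete on a nil map is a no-op
	}
}

// count returns the number of documents held in an index.
func (s *store) count(index string) int {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return len(s.docs[index])
}

func main() {
	s := newStore()
	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			// IDs 0..4 are each written twice; the second write replaces the first.
			s.put("posts", fmt.Sprint(i%5), map[string]any{"n": i})
		}(i)
	}
	wg.Wait()
	s.remove("posts", "0", "missing")
	fmt.Println(s.count("posts")) // 4
}
```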

type Hit

type Hit struct {
	ID     string
	Score  float64
	Fields map[string]any
}

Hit is a single search result: the matched document ID, its relevance score (higher is more relevant), and the original fields (when the engine retains them).

type Indexer added in v0.25.0

type Indexer struct {
	// contains filtered or unexported fields
}

Indexer mirrors model writes into a search Engine. It is a thin, stateless adapter around an Engine that speaks in terms of Searchable models and index IDs, so application code never has to thread index names and Document construction through every call site. The zero value is not usable; build one with NewIndexer.

Because it holds no state of its own, an Indexer is safe for concurrent use whenever the underlying Engine is (both bundled engines are).

func NewIndexer added in v0.25.0

func NewIndexer(eng Engine) *Indexer

NewIndexer returns an Indexer that writes to eng. It panics if eng is nil, since an Indexer with no engine can never do useful work.

func (*Indexer) Backfill added in v0.25.0

func (ix *Indexer) Backfill(ctx context.Context, items ...Searchable) error

Backfill (re)indexes a batch of Searchable models. It groups documents by index so that each index receives a single bulk Index call, which is far cheaper than indexing rows one at a time when warming an index from the database on boot. The items may belong to different indexes. A nil entry in items is an error, and the call stops at the first failing Engine write.
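
The grouping step can be sketched as follows. This is an illustration under stated assumptions rather than the package's code: Document and Searchable are local mirrors of the types documented on this page, while groupByIndex, errNilItem, and row are hypothetical names. The sketch visits indexes in order of first appearance, which is a choice made here for determinism; the page does not say what order Backfill uses.

```go
package main

import (
	"errors"
	"fmt"
)

// Document and Searchable mirror the shapes documented on this page so the
// sketch compiles on its own.
type Document struct {
	ID     string
	Fields map[string]any
}

type Searchable interface {
	SearchIndex() string
	SearchDocument() Document
}

var errNilItem = errors.New("nil Searchable")

// groupByIndex buckets documents per index and returns the index names in
// order of first appearance. A nil interface value is rejected; a typed nil
// pointer stored in the interface would not be caught by this check.
func groupByIndex(items ...Searchable) ([]string, map[string][]Document, error) {
	var order []string
	groups := make(map[string][]Document)
	for i, it := range items {
		if it == nil {
			return nil, nil, fmt.Errorf("item %d: %w", i, errNilItem)
		}
		idx := it.SearchIndex()
		if _, seen := groups[idx]; !seen {
			order = append(order, idx)
		}
		groups[idx] = append(groups[idx], it.SearchDocument())
	}
	return order, groups, nil
}

// row is a hypothetical model used only to exercise the sketch.
type row struct{ index, id string }

func (r row) SearchIndex() string      { return r.index }
func (r row) SearchDocument() Document { return Document{ID: r.id} }

func main() {
	order, groups, err := groupByIndex(row{"posts", "1"}, row{"users", "7"}, row{"posts", "2"})
	if err != nil {
		fmt.Println("backfill:", err)
		return
	}
	for _, idx := range order {
		// Each line corresponds to one bulk Index call on the engine.
		fmt.Println(idx, len(groups[idx]))
	}
	// posts 2
	// users 1
}
```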

func (*Indexer) Delete added in v0.25.0

func (ix *Indexer) Delete(ctx context.Context, index, id string) error

Delete evicts a document by ID from the named index. It is the natural body of an AfterDelete hook; pass the same index name and ID that Index used. Unknown IDs are ignored by the Engine.

func (*Indexer) DeleteModel added in v0.25.0

func (ix *Indexer) DeleteModel(ctx context.Context, m Searchable) error

DeleteModel evicts m using its own declared index and document ID. It is a convenience over Delete for the common case where a Searchable value is in hand (e.g. inside an AfterDelete hook on the model itself).

func (*Indexer) Engine added in v0.25.0

func (ix *Indexer) Engine() Engine

Engine returns the underlying search Engine the Indexer writes to.

func (*Indexer) Index added in v0.25.0

func (ix *Indexer) Index(ctx context.Context, m Searchable) error

Index inserts or replaces m in its declared index. Because the Engine replaces a document by ID, a single Index call covers both inserts and updates, which makes it the natural body of an AfterSave hook. The supplied context is passed straight through to the Engine.

type Memory

type Memory struct {
	// contains filtered or unexported fields
}

Memory is a dependency-free, in-memory search Engine. It keeps every indexed document in a per-index map and answers queries by tokenizing both the stored documents and the query, then scoring by term-frequency overlap. It is the default engine, is constructed with NewMemory, and is safe for concurrent use.

Ranking is intentionally simple: each query token contributes the number of times it appears across a document's fields (its term frequency). When Options.Prefix is set, a query token also matches any document token that has it as a prefix. Documents with a higher accumulated score rank first; ties break on document ID for a stable ordering.
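
The ranking rule can be illustrated with a short, self-contained sketch. It is not the Memory engine's code: rank and hit are hypothetical names, each document is reduced to one string of concatenated field text, and tokenizing is simplified to lower-casing and whitespace splitting with no stop-word removal.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// hit is a hypothetical result row: document ID plus accumulated score.
type hit struct {
	ID    string
	Score float64
}

// rank scores docs (ID -> concatenated field text) against query. Each query
// token adds one point per matching document token; with prefix set, a
// document token also matches when it starts with the query token.
func rank(docs map[string]string, query string, prefix bool) []hit {
	qTokens := strings.Fields(strings.ToLower(query))
	var hits []hit
	for id, text := range docs {
		var score float64
		for _, dt := range strings.Fields(strings.ToLower(text)) {
			for _, qt := range qTokens {
				if dt == qt || (prefix && strings.HasPrefix(dt, qt)) {
					score++
				}
			}
		}
		if score > 0 {
			hits = append(hits, hit{ID: id, Score: score})
		}
	}
	// Higher score first; ties break on ID so the order is stable.
	sort.Slice(hits, func(i, j int) bool {
		if hits[i].Score != hits[j].Score {
			return hits[i].Score > hits[j].Score
		}
		return hits[i].ID < hits[j].ID
	})
	return hits
}

func main() {
	docs := map[string]string{
		"1": "Hello World first post",
		"2": "Goodbye World last post",
		"3": "World World World all about the world",
	}
	for _, h := range rank(docs, "world", false) {
		fmt.Printf("%s score=%.0f\n", h.ID, h.Score)
	}
	// 3 score=4
	// 1 score=1
	// 2 score=1
}
```

With prefix set to true, the query "hel" would match document 1 through "hello"; with it unset, the same query matches nothing.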

Example

ExampleMemory indexes a handful of documents into the in-memory engine and runs a query, printing the ranked result IDs. Ranking is by term-frequency overlap; ties break on ID, so the order is deterministic.

package main

import (
	"context"
	"fmt"

	"github.com/devituz/lagodev/search"
)

func main() {
	ctx := context.Background()
	eng := search.NewMemory()

	_ = eng.Index(ctx, "posts",
		search.Doc("1", map[string]any{"title": "Hello World", "body": "first post"}),
		search.Doc("2", map[string]any{"title": "Goodbye World", "body": "last post"}),
		search.Doc("3", map[string]any{"title": "World World World", "body": "all about the world"}),
	)

	res, err := eng.Search(ctx, "posts", "world", search.Options{})
	if err != nil {
		fmt.Println("search:", err)
		return
	}

	fmt.Println("total:", res.Total)
	for _, hit := range res.Hits {
		fmt.Printf("%s score=%.0f\n", hit.ID, hit.Score)
	}
}

Output:
total: 3
3 score=4
1 score=1
2 score=1

func NewMemory

func NewMemory() *Memory

NewMemory constructs an empty in-memory search Engine.

func (*Memory) Delete

func (m *Memory) Delete(_ context.Context, index string, ids ...string) error

Delete removes documents by ID from the named index. Unknown IDs and unknown indexes are silently ignored.

func (*Memory) Index

func (m *Memory) Index(_ context.Context, index string, docs ...Document) error

Index inserts or replaces docs in the named index. A document whose ID already exists is replaced wholesale (its previous tokens and fields are discarded).

func (*Memory) Search

func (m *Memory) Search(_ context.Context, index, query string, opts Options) (Results, error)

Search tokenizes query, scores every document in the named index, applies equality Filters, sorts by descending score, and returns the requested page. Total reflects the number of matches before pagination. A blank query (no tokens) yields no hits.
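
The equality-filter step might look like the sketch below. It assumes that "string-compared" means both sides are compared by their fmt.Sprint form and that a document lacking the filtered field does not match; neither detail is confirmed by this page, and matches is a hypothetical helper name.

```go
package main

import "fmt"

// matches reports whether fields satisfies every filter (filters are AND-ed).
// Values are compared by their fmt.Sprint form, so the int 2024 equals the
// string "2024". A missing field never matches (an assumption of this sketch).
func matches(fields, filters map[string]any) bool {
	for key, want := range filters {
		got, ok := fields[key]
		if !ok || fmt.Sprint(got) != fmt.Sprint(want) {
			return false
		}
	}
	return true
}

func main() {
	fields := map[string]any{"status": "published", "year": 2024}
	fmt.Println(matches(fields, map[string]any{"status": "published", "year": "2024"})) // true
	fmt.Println(matches(fields, map[string]any{"status": "draft"}))                     // false
	fmt.Println(matches(fields, nil))                                                   // true
}
```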

type Options

type Options struct {
	// Page is 1-indexed; values < 1 are treated as 1.
	Page int
	// PerPage is the page size; values < 1 fall back to DefaultPerPage.
	PerPage int
	// Filters constrains results to documents whose field equals the given
	// value (string-compared). Multiple filters are AND-ed.
	Filters map[string]any
	// Prefix enables prefix matching in addition to exact term matching, so a
	// query token "hel" matches "hello". Affects the Memory engine; the
	// Postgres engine appends ":*" to query lexemes.
	Prefix bool
}

Options controls a Search call: pagination, equality filters, and prefix matching. The zero value is valid (first page, default size, term matching).
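
The pagination defaults described in the field comments can be sketched as a small helper. pageBounds and defaultPerPage are hypothetical names (the constant mirrors DefaultPerPage); clamping an out-of-range page to an empty slice is an assumption of this sketch.

```go
package main

import "fmt"

const defaultPerPage = 15 // mirrors DefaultPerPage

// pageBounds normalizes page and perPage, then returns the half-open slice
// range [start, end) into a result list of length total.
func pageBounds(page, perPage, total int) (start, end int) {
	if page < 1 {
		page = 1
	}
	if perPage < 1 {
		perPage = defaultPerPage
	}
	start = (page - 1) * perPage
	if start > total {
		start = total
	}
	end = start + perPage
	if end > total {
		end = total
	}
	return start, end
}

func main() {
	fmt.Println(pageBounds(0, 0, 40))  // 0 15
	fmt.Println(pageBounds(3, 15, 40)) // 30 40
	fmt.Println(pageBounds(4, 15, 40)) // 40 40
}
```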

type PgConfig

type PgConfig struct {
	// IDColumn is the primary-key column scanned into Hit.ID. Default: "id".
	IDColumn string
	// VectorColumn is the tsvector column matched against the tsquery and fed
	// to ts_rank for scoring. Default: "search".
	VectorColumn string
	// FieldsColumn, when non-empty, is selected and scanned into Hit.Fields via
	// a sql.Scanner-compatible destination supplied by the caller's driver.
	// Default: "" (Hit.Fields left nil).
	FieldsColumn string
	// Config is the Postgres text-search configuration passed to the tsquery/
	// tsvector functions (e.g. "english", "simple"). Default: "english".
	Config string
	// WebSearch selects websearch_to_tsquery instead of plainto_tsquery. It is
	// ignored when Prefix matching is requested, since websearch syntax does
	// not compose with the ":*" prefix operator.
	WebSearch bool
}

PgConfig customizes the SQL the Postgres engine emits. The zero value is valid and targets a conventional layout: one physical table per index, with a text "id" column and a generated or otherwise maintained tsvector "search" column. Hit.Fields is left nil by default; set FieldsColumn to a JSON/JSONB column (for example "fields") to have it scanned back into Hit.Fields.

type Postgres

type Postgres struct {
	// contains filtered or unexported fields
}

Postgres is a full-text search Engine backed by database/sql and Postgres' tsvector/tsquery machinery. It compiles parameterized SELECTs that rank rows with ts_rank and paginate with LIMIT/OFFSET. The engine never interpolates user input into SQL text: index names and column identifiers come from configuration, while the query and filter values are always bound parameters.

func NewPostgres

func NewPostgres(db querier, cfg PgConfig) *Postgres

NewPostgres constructs a Postgres engine over the given querier (typically a *sql.DB wrapped via WrapDB) using cfg, applying defaults for unset fields.

func (*Postgres) Delete

func (p *Postgres) Delete(ctx context.Context, index string, ids ...string) error

Delete removes rows by ID from the index table. It emits a single parameterized DELETE ... WHERE id IN ($1, $2, ...).
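
A statement of that shape can be generated as in the sketch below. It is a hypothetical reconstruction, not the engine's code: buildDelete and quoteIdent are assumed names, and quoting the identifiers and returning an empty statement for an empty ID list are choices made here.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteIdent double-quotes a SQL identifier, doubling any embedded quotes.
func quoteIdent(name string) string {
	return `"` + strings.ReplaceAll(name, `"`, `""`) + `"`
}

// buildDelete returns a parameterized DELETE and its arguments. With no IDs
// it returns an empty statement so the caller can skip the round trip.
func buildDelete(table, idColumn string, ids []string) (string, []any) {
	if len(ids) == 0 {
		return "", nil
	}
	placeholders := make([]string, len(ids))
	args := make([]any, len(ids))
	for i, id := range ids {
		placeholders[i] = fmt.Sprintf("$%d", i+1)
		args[i] = id
	}
	stmt := fmt.Sprintf("DELETE FROM %s WHERE %s IN (%s)",
		quoteIdent(table), quoteIdent(idColumn), strings.Join(placeholders, ", "))
	return stmt, args
}

func main() {
	stmt, args := buildDelete("posts", "id", []string{"1", "2", "3"})
	fmt.Println(stmt) // DELETE FROM "posts" WHERE "id" IN ($1, $2, $3)
	fmt.Println(args) // [1 2 3]
}
```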

func (*Postgres) Index

func (p *Postgres) Index(_ context.Context, _ string, _ ...Document) error

Index is a no-op for the Postgres engine: documents live in the underlying table and are populated by the application's normal writes (the tsvector column is expected to be generated or trigger-maintained). It satisfies the Engine interface so a Postgres-backed app can share search-agnostic code.

func (*Postgres) Search

func (p *Postgres) Search(ctx context.Context, index, query string, opts Options) (Results, error)

Search compiles and runs a tsquery search against the index table, ordering by ts_rank descending and paginating with LIMIT/OFFSET. A blank query (no lexemes after tokenizing) short-circuits to an empty result without a round trip. Equality Filters become additional WHERE predicates with bound values.
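
To make the shape of such a query concrete, here is a hypothetical builder. The exact SQL the engine emits is not shown on this page, so everything below is an assumption: the buildSearch and quoteIdent names, binding the text-search configuration as a parameter cast to regconfig, the hard-coded "id" and "search" columns, and the sorted filter order. It covers only the plainto_tsquery path (no Prefix or WebSearch) and expects page and perPage to be normalized already.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// quoteIdent double-quotes a SQL identifier, doubling any embedded quotes.
func quoteIdent(name string) string {
	return `"` + strings.ReplaceAll(name, `"`, `""`) + `"`
}

// buildSearch returns a parameterized SELECT and its arguments. The query
// text and filter values are only ever bound, never interpolated.
func buildSearch(table, config, query string, filters map[string]any, page, perPage int) (string, []any) {
	args := []any{config, query}
	tsq := "plainto_tsquery($1::regconfig, $2)"
	where := []string{`"search" @@ ` + tsq}

	keys := make([]string, 0, len(filters))
	for k := range filters {
		keys = append(keys, k)
	}
	sort.Strings(keys) // map iteration order is random; sort for stable SQL
	for _, k := range keys {
		args = append(args, filters[k])
		where = append(where, fmt.Sprintf("%s = $%d", quoteIdent(k), len(args)))
	}

	args = append(args, perPage, (page-1)*perPage)
	stmt := fmt.Sprintf(
		`SELECT "id", ts_rank("search", %s) AS score FROM %s WHERE %s ORDER BY score DESC LIMIT $%d OFFSET $%d`,
		tsq, quoteIdent(table), strings.Join(where, " AND "), len(args)-1, len(args))
	return stmt, args
}

func main() {
	stmt, args := buildSearch("posts", "english", "hello world",
		map[string]any{"status": "published"}, 2, 15)
	fmt.Println(stmt)
	fmt.Println(args) // [english hello world published 15 15]
}
```

For the call in main, the statement ends in LIMIT $4 OFFSET $5, with the filter bound as $3.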

type Results

type Results struct {
	Hits  []Hit
	Total int
}

Results is a page of search hits plus the total number of matches across all pages (before pagination).

type Searchable added in v0.25.0

type Searchable interface {
	// SearchIndex is the index name this model indexes into.
	SearchIndex() string
	// SearchDocument is the model's projection into an indexable Document.
	SearchDocument() Document
}

Searchable is implemented by a model that knows how to project itself into a search Document. It is deliberately ORM-agnostic: the search package never imports orm, so a model can opt into indexing by satisfying this interface without coupling the two packages. The wiring into ORM lifecycle hooks lives in application code (see the package docs and docs/SEARCH.md).

SearchIndex returns the name of the index the model belongs to (e.g. "posts"). All rows of the same model normally share one index.

SearchDocument returns the indexable projection of the model: a stable ID plus the searchable/filterable fields. The returned Document.ID is the value used to replace or evict the row, so it must be stable across saves (the primary key is the natural choice).
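
A model might satisfy the interface as in the following sketch. Post and its fields are hypothetical, and Document and Searchable are redeclared locally so the example compiles without the package; real code would use the search package's types instead.

```go
package main

import (
	"fmt"
	"strconv"
)

// Document and Searchable mirror the shapes documented on this page.
type Document struct {
	ID     string
	Fields map[string]any
}

type Searchable interface {
	SearchIndex() string
	SearchDocument() Document
}

// Post is a hypothetical ORM model.
type Post struct {
	ID    int64
	Title string
	Body  string
	Draft bool
}

// SearchIndex places every Post in the same index.
func (p Post) SearchIndex() string { return "posts" }

// SearchDocument projects the row; the primary key gives a stable document ID.
func (p Post) SearchDocument() Document {
	return Document{
		ID: strconv.FormatInt(p.ID, 10),
		Fields: map[string]any{
			"title": p.Title,
			"body":  p.Body,
			"draft": p.Draft, // non-string values are allowed in Fields
		},
	}
}

// Compile-time check that Post satisfies Searchable.
var _ Searchable = Post{}

func main() {
	var s Searchable = Post{ID: 42, Title: "Hello World", Body: "first post"}
	doc := s.SearchDocument()
	fmt.Println(s.SearchIndex(), doc.ID, doc.Fields["title"]) // posts 42 Hello World
}
```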
