microrag

package
v0.14.3 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: May 2, 2026 License: Apache-2.0 Imports: 18 Imported by: 0

Documentation

Overview

Package microrag implements the just-in-time knowledge injection engine. See design_docs/planned/v0_15_0/m-brain-microrag.md for the design.

Index

Constants

View Source
const (
	EnvEnabled = "AILANG_MICRORAG_ENABLED"
	EnvRoutes  = "AILANG_MICRORAG_ROUTES"
	EnvDryrun  = "AILANG_MICRORAG_DRYRUN"
	EnvSession = "AILANG_MICRORAG_SESSION"
)

EnvEnabled is the master eval-toggle env var (0/1). Default: enabled.

Variables

This section is empty.

Functions

func DefaultSessionDir

func DefaultSessionDir() string

DefaultSessionDir returns the session-scoped state directory. Honors AILANG_MICRORAG_SESSION; otherwise uses a per-pid fallback so that concurrent sessions don't share a ledger.

func DryrunFromEnv

func DryrunFromEnv() bool

DryrunFromEnv returns whether dryrun mode is active.

func EnabledFromEnv

func EnabledFromEnv() bool

EnabledFromEnv returns the engine on/off state from env, defaulting to true. Invalid values default to true ("don't fail closed" — see design Risks table).

func RoutesAllowlist

func RoutesAllowlist() map[string]bool

RoutesAllowlist returns the AILANG_MICRORAG_ROUTES allowlist as a set, or nil if no allowlist is set (means: allow all).

Types

type BuiltinResolver

type BuiltinResolver interface {
	Resolve(name string) (*BuiltinSpecJSON, error)
}

BuiltinResolver looks up a builtin spec by name and returns the parsed JSON envelope from `ailang builtins show <name> --json`. Tests can stub this.

type BuiltinSpecJSON

type BuiltinSpecJSON struct {
	Name      string `json:"name"`
	Module    string `json:"module"`
	Signature string `json:"signature"`
	IsPure    bool   `json:"is_pure"`
	Effect    string `json:"effect,omitempty"`
	Metadata  *struct {
		Description string `json:"description,omitempty"`
		Examples    []struct {
			Code        string `json:"code"`
			Description string `json:"description,omitempty"`
		} `json:"examples,omitempty"`
		Since string `json:"since,omitempty"`
	} `json:"metadata,omitempty"`
}

BuiltinSpecJSON mirrors the envelope emitted by `ailang builtins show --json`. Only the fields the lint nudge needs are unmarshalled.

type CLIBuiltinResolver

type CLIBuiltinResolver struct {
	Binary string
}

CLIBuiltinResolver shells out to `ailang builtins show <name> --json`.

func (*CLIBuiltinResolver) Resolve

func (r *CLIBuiltinResolver) Resolve(name string) (*BuiltinSpecJSON, error)

type CLISearcher

type CLISearcher struct {
	Binary  string        // path to ailang binary; "" → "ailang" on PATH
	Timeout time.Duration // per-call timeout; 0 → no timeout
}

CLISearcher shells out to `ailang cache search --json`. Production wiring.

func (*CLISearcher) Search

func (c *CLISearcher) Search(query, namespace string, limit int) ([]SearchHit, error)

Search runs `ailang cache search --namespace <ns> --limit <n> --json <query>`.

type Config

type Config struct {
	Enabled       bool        `yaml:"enabled"`
	Routes        []Route     `yaml:"routes"`
	Dedup         DedupConfig `yaml:"dedup"`
	SessionBudget int         `yaml:"session_budget"`
	MarkerStyle   string      `yaml:"marker_style"`
}

Config is the full micro-rag config (~/.ailang/microrag.yaml).

func LoadConfig

func LoadConfig(path string) (*Config, error)

LoadConfig reads ~/.ailang/microrag.yaml (or path) and applies defaults. A missing file is not an error — the engine runs with defaults + the fallback "**/*.ail" route pointing at ailang-syntax.

func (*Config) BypassFor

func (c *Config) BypassFor(ns string) float64

BypassFor returns the relevance-bypass threshold for the given namespace.

func (*Config) MatchRoute

func (c *Config) MatchRoute(filePath string) *Route

MatchRoute returns the first matching route for a file path, or nil. Routes with kb=="skip" return as a sentinel that callers must check.

func (*Config) WindowFor

func (c *Config) WindowFor(ns string) int

WindowFor returns the dedup token window for the given namespace.

type ContextResult

type ContextResult struct {
	Injection *Injection `json:"injection,omitempty"`
	State     string     `json:"microrag_state"`
	Reason    string     `json:"reason,omitempty"` // why no injection (disabled, no_route, dedup, ...)
}

ContextResult is what the CLI shell-out emits.

type DedupConfig

type DedupConfig struct {
	Windows          map[string]int     `yaml:"windows"`
	RelevanceBypass  map[string]float64 `yaml:"relevance_bypass"`
	WallClockMaxSecs int                `yaml:"wall_clock_max"`
}

DedupConfig holds per-namespace dedup tuning.

type Engine

type Engine struct {
	Cfg        *Config
	Searcher   Searcher
	SessionDir string
	Now        func() time.Time // injectable for tests
}

Engine holds runtime configuration + side-effect roots (filesystem, searcher). All disk paths are derived from sessionDir.

func (*Engine) Context

func (e *Engine) Context(req Request) (*ContextResult, error)

Context is the main entry point. Returns a ContextResult describing whether an injection happened and why. Side effect: appends to injections.jsonl on success (or on dryrun).

func (*Engine) UserPrompt added in v0.14.3

func (e *Engine) UserPrompt(req UserPromptRequest) (*UserPromptResult, error)

UserPrompt searches the configured namespaces using `req.Prompt` as the query and returns the highest-scoring injection that clears the user- prompt relevance floor. Reuses the engine's dedup ledger and session budget so a popular chunk doesn't fire twice in one session.

type Injection

type Injection struct {
	InjectionText string  `json:"injection_text"`
	SnippetID     string  `json:"snippet_id"`
	Tokens        int     `json:"tokens"`
	Namespace     string  `json:"ns"`
	Score         float64 `json:"score"`
	State         string  `json:"microrag_state"` // "on" | "dryrun" | "disabled"
}

Injection is the structured output of one engine call.

type LedgerEntry

type LedgerEntry struct {
	TS        int64   `json:"ts"`
	SnippetID string  `json:"snippet_id"`
	Tokens    int     `json:"tokens"`
	FilePath  string  `json:"file_path"`
	Namespace string  `json:"ns"`
	State     string  `json:"microrag_state"`
	Score     float64 `json:"score"`
}

LedgerEntry is one line in injections.jsonl.

type LintNudge

type LintNudge struct {
	Name          string `json:"name"`
	Module        string `json:"module"`
	Signature     string `json:"signature"`
	Description   string `json:"description,omitempty"`
	Example       string `json:"example,omitempty"`
	InjectionText string `json:"injection_text"`
	Tokens        int    `json:"tokens"`
}

LintNudge is one builtin first-use nudge.

type LintRequest

type LintRequest struct {
	FilePath string
	Code     string
}

LintRequest is the input to a lint pass. Code is the file content being inspected (post-edit).

type LintResult

type LintResult struct {
	Nudges []LintNudge `json:"nudges,omitempty"`
	State  string      `json:"microrag_state"`
	Reason string      `json:"reason,omitempty"`
}

LintResult is the structured output of one lint pass.

type Linter

type Linter struct {
	Resolver   BuiltinResolver
	SessionDir string
	MaxNudges  int // hard cap per call; 0 → 2
	MaxTokens  int // per-nudge budget; 0 → 80
}

Linter holds runtime configuration for the first-use builtin nudge pass.

func (*Linter) Lint

func (l *Linter) Lint(req LintRequest) (*LintResult, error)

Lint scans req.Code for first-use builtin invocations and emits short signature nudges for any builtin not yet seen this session.

type Request

type Request struct {
	ToolName string // Edit | Write | Read
	FilePath string
	Content  string // for Edit/Write — first ~2KB used for query enrichment
}

Request is the input to one engine call.

type Route

type Route struct {
	Glob                  string  `yaml:"glob"`
	KB                    string  `yaml:"kb"`
	MaxTokensPerInjection int     `yaml:"max_tokens_per_injection"`
	RelevanceFloor        float64 `yaml:"relevance_floor"`
}

Route maps a file glob to a knowledge-base namespace and per-route limits.

type SearchEnvelope

type SearchEnvelope struct {
	Scope   string      `json:"scope"`
	Count   int         `json:"count"`
	QueryMs int64       `json:"query_ms"`
	Results []SearchHit `json:"results"`
}

SearchEnvelope mirrors the JSON written by `ailang cache search --json`.

type SearchHit

type SearchHit struct {
	Tier      string  `json:"tier"`
	Namespace string  `json:"namespace"`
	Key       string  `json:"key"`
	Score     float64 `json:"score"`
	Content   string  `json:"content"`
	UpdatedAt int64   `json:"updated_at_ms"`
	Source    string  `json:"source"`
}

SearchHit is the engine's view of a single brain search result. Decoupled from internal/effects.BrainSearchResult so the engine can be tested without spinning up SQLite.

type Searcher

type Searcher interface {
	Search(query, namespace string, limit int) ([]SearchHit, error)
}

Searcher abstracts the brain — production wires to `ailang cache search --json`, tests wire to a stub.

type UserPromptRequest added in v0.14.3

type UserPromptRequest struct {
	Prompt     string
	Namespaces []string // empty → defaultUserPromptNamespaces
}

UserPromptRequest is the input to a UserPromptSubmit-driven engine call. The user's prompt itself becomes the embedding query — no filepath, no content blob to dilute the signal. This is the embedding-query shape embeddings excel at (per ADR-002).

type UserPromptResult added in v0.14.3

type UserPromptResult struct {
	Injection *Injection `json:"injection,omitempty"`
	State     string     `json:"microrag_state"`
	Reason    string     `json:"reason,omitempty"`
}

UserPromptResult is the CLI shell-out envelope for a UserPromptSubmit call. Mirrors ContextResult but with no file-routing concept — namespaces are queried directly.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL