index

package
v0.0.5 Latest
Warning

This package is not in the latest version of its module.

Published: Apr 10, 2026 License: MPL-2.0 Imports: 12 Imported by: 0

Documentation

Constants

const (
	DefaultChunkSize = 50    // lines per chunk
	MaxChunkSize     = 100   // absolute max lines per chunk
	MaxChunkBytes    = 24000 // ~6000 tokens at ~4 bytes/token; well under 8192 token context
)

Variables

var DefaultIgnoreDirs = map[string]bool{
	".git": true, ".sl": true, "node_modules": true, "vendor": true,
	".next": true, "dist": true, "build": true, "__pycache__": true,
	".cache": true, "target": true,
	"package-lock.json": true, "yarn.lock": true, "pnpm-lock.yaml": true,
	"go.sum": true, "Cargo.lock": true,
	".env": true,
}

DefaultIgnoreDirs lists the directory and file names skipped when no ignore set is provided. It is only the fallback for when no config is loaded; otherwise the config's DefaultIndexIgnore is authoritative.

Functions

func BytesToFloat32

func BytesToFloat32(data []byte) []float32

BytesToFloat32 converts a byte slice to a float32 slice. The byte slice must have length divisible by 4.

func CosineSimilarity

func CosineSimilarity(a, b []float32) float64

CosineSimilarity computes the cosine similarity between two vectors.
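As an illustration of the computation, here is a minimal sketch of cosine similarity, the dot product divided by the product of the two vector magnitudes. The function name `cosine` is hypothetical, and returning 0 for mismatched lengths or zero-magnitude vectors is an assumption; the package's actual handling of those cases is not documented here.

```go
package main

import (
	"fmt"
	"math"
)

// cosine is a hypothetical stand-in for CosineSimilarity.
// Assumption: mismatched lengths and zero-magnitude vectors yield 0.
func cosine(a, b []float32) float64 {
	if len(a) != len(b) || len(a) == 0 {
		return 0
	}
	var dot, normA, normB float64
	for i := range a {
		x, y := float64(a[i]), float64(b[i])
		dot += x * y
		normA += x * x
		normB += y * y
	}
	if normA == 0 || normB == 0 {
		return 0
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB))
}

func main() {
	fmt.Println(cosine([]float32{1, 0}, []float32{1, 0}))  // same direction
	fmt.Println(cosine([]float32{1, 0}, []float32{0, 1}))  // orthogonal
	fmt.Println(cosine([]float32{1, 0}, []float32{-1, 0})) // opposite direction
}
```

Accumulating in float64 keeps rounding error small even though the inputs are float32, which matches the float64 return type of the documented signature.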

func Float32ToBytes

func Float32ToBytes(data []float32) []byte

Float32ToBytes converts a float32 slice to a byte slice.
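The two conversion functions are presumably inverses, used to store embedding vectors as blobs. The sketch below shows one way such a round trip could work. The names `encodeFloat32s` and `decodeFloat32s` are hypothetical, and the little-endian byte order is an assumption; the package does not state which byte order it uses.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// encodeFloat32s packs each float32 into 4 bytes.
// Assumption: little-endian byte order.
func encodeFloat32s(data []float32) []byte {
	out := make([]byte, 4*len(data))
	for i, v := range data {
		binary.LittleEndian.PutUint32(out[4*i:], math.Float32bits(v))
	}
	return out
}

// decodeFloat32s reverses encodeFloat32s. Trailing bytes that do not
// fill a complete 4-byte group are ignored in this sketch.
func decodeFloat32s(data []byte) []float32 {
	out := make([]float32, len(data)/4)
	for i := range out {
		out[i] = math.Float32frombits(binary.LittleEndian.Uint32(data[4*i:]))
	}
	return out
}

func main() {
	vec := []float32{0.5, -1.25, 3}
	raw := encodeFloat32s(vec)
	fmt.Println(len(raw), decodeFloat32s(raw)) // 12 bytes, then the original values
}
```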

func IsMinified

func IsMinified(path string) bool

IsMinified returns true if the file appears to be minified or bundled, based on filename patterns or on content sampling (average line length > 500).

func ShouldIndex

func ShouldIndex(path string) bool

ShouldIndex returns true if the file should be indexed.

func WalkRepo

func WalkRepo(repoPath string, ignoreDirs ...map[string]bool) ([]string, error)

WalkRepo walks a repo directory and returns all indexable files. If ignoreDirs is nil, DefaultIgnoreDirs is used.
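To illustrate how an ignore set of this shape can drive a directory walk, here is a minimal sketch built on `filepath.WalkDir`. The names `walk` and `demo` are hypothetical. Matching on base names, pruning ignored directories with `filepath.SkipDir`, and returning repo-relative paths are assumptions; the sketch also omits the ShouldIndex and IsMinified filtering that the real function presumably applies.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// walk returns repo-relative file paths, skipping any directory or file
// whose base name appears in ignore (hypothetical stand-in for WalkRepo).
func walk(root string, ignore map[string]bool) ([]string, error) {
	var files []string
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if path != root && ignore[d.Name()] {
			if d.IsDir() {
				return filepath.SkipDir // prune the whole subtree
			}
			return nil // ignored file, such as go.sum
		}
		if d.IsDir() {
			return nil
		}
		rel, err := filepath.Rel(root, path)
		if err != nil {
			return err
		}
		files = append(files, rel)
		return nil
	})
	return files, err
}

// demo builds a throwaway tree and walks it with a small ignore set.
func demo() ([]string, error) {
	root, err := os.MkdirTemp("", "walk-sketch")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(root)
	if err := os.MkdirAll(filepath.Join(root, "node_modules"), 0o755); err != nil {
		return nil, err
	}
	for _, name := range []string{"main.go", "go.sum", filepath.Join("node_modules", "x.js")} {
		if err := os.WriteFile(filepath.Join(root, name), []byte("x"), 0o644); err != nil {
			return nil, err
		}
	}
	return walk(root, map[string]bool{"node_modules": true, "go.sum": true})
}

func main() {
	files, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(files) // only main.go survives the ignore set
}
```

Returning `filepath.SkipDir` for an ignored directory avoids descending into large trees such as node_modules at all, rather than visiting and discarding every file inside them.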

Types

type Chunk

type Chunk struct {
	Repo      string `json:"repo"`
	File      string `json:"file"`
	Index     int    `json:"index"`
	LineStart int    `json:"line_start"`
	LineEnd   int    `json:"line_end"`
	Content   string `json:"content"`
}

Chunk represents a piece of a source file.

func ChunkFile

func ChunkFile(repo, filePath string, chunkSize int) ([]Chunk, error)

ChunkFile splits a file into chunks of roughly chunkSize lines, also respecting a byte budget (MaxChunkBytes) so chunks fit within the embedding model's token context window.

type IndexStats

type IndexStats struct {
	Indexed         int
	Unchanged       int
	SkippedMinified []string // files skipped because they are minified
}

IndexStats tracks what happened during indexing.

type Indexer

type Indexer struct {
	// contains filtered or unexported fields
}

Indexer builds and maintains the semantic index.

func NewIndexer

func NewIndexer(db *statedb.DB, embedder embedding.Embedder) *Indexer

NewIndexer creates a new indexer.

func (*Indexer) IndexRepo

func (idx *Indexer) IndexRepo(ctx context.Context, repoName, repoPath string) (IndexStats, error)

IndexRepo indexes all files in a repo, skipping unchanged files.

func (*Indexer) Search

func (idx *Indexer) Search(ctx context.Context, query string, limit int) ([]SearchResult, error)

Search performs a semantic search across all indexed content.

func (*Indexer) SetIgnoreDirs

func (idx *Indexer) SetIgnoreDirs(dirs map[string]bool)

SetIgnoreDirs sets the directories to skip during indexing.

func (*Indexer) SetProgress

func (idx *Indexer) SetProgress(fn ProgressFunc)

SetProgress sets the progress callback.

type ProgressFunc

type ProgressFunc func(current, total int, file string)

ProgressFunc is called during indexing to report progress. current is the number of the file being processed, total is the total file count, and file is the relative path of the current file.
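A callback with this signature might render a simple progress line, as in the sketch below. The local `progressFunc` type mirrors the documented signature so the example compiles on its own; `formatProgress` is a hypothetical helper, and treating current as 1-based is an assumption.

```go
package main

import "fmt"

// progressFunc mirrors the documented ProgressFunc signature.
type progressFunc func(current, total int, file string)

// formatProgress renders one progress line (hypothetical helper).
func formatProgress(current, total int, file string) string {
	pct := 0
	if total > 0 {
		pct = current * 100 / total
	}
	return fmt.Sprintf("[%d/%d %3d%%] %s", current, total, pct, file)
}

func main() {
	var report progressFunc = func(current, total int, file string) {
		fmt.Println(formatProgress(current, total, file))
	}
	// Simulate an indexer invoking the callback once per file.
	files := []string{"main.go", "index/chunk.go"}
	for i, f := range files {
		report(i+1, len(files), f) // assumption: current counts from 1
	}
}
```

In real use, a function of this shape would be passed to SetProgress before calling IndexRepo.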

type SearchResult

type SearchResult struct {
	Repo      string  `json:"repo"`
	File      string  `json:"file"`
	LineStart int     `json:"line_start"`
	LineEnd   int     `json:"line_end"`
	Score     float64 `json:"score"`
	Content   string  `json:"content,omitempty"`
}

SearchResult represents a search hit.

func RankResults

func RankResults(results []SearchResult, limit int) []SearchResult

RankResults sorts search results by score in descending order and returns at most the top limit results.
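The ranking step can be sketched as a sort followed by a truncation. The names `result` and `rank` are hypothetical. Sorting a copy rather than in place, using a stable sort so ties keep their input order, and treating a negative limit as "no limit" are assumptions about behavior the documentation does not specify.

```go
package main

import (
	"fmt"
	"sort"
)

// result is a trimmed-down, hypothetical version of SearchResult.
type result struct {
	File  string
	Score float64
}

// rank returns up to limit results ordered by descending score.
// Assumptions: the input slice is left untouched; ties keep input order.
func rank(results []result, limit int) []result {
	sorted := make([]result, len(results))
	copy(sorted, results)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].Score > sorted[j].Score
	})
	if limit >= 0 && limit < len(sorted) {
		sorted = sorted[:limit]
	}
	return sorted
}

func main() {
	hits := []result{
		{File: "a.go", Score: 0.42},
		{File: "b.go", Score: 0.91},
		{File: "c.go", Score: 0.67},
	}
	for _, r := range rank(hits, 2) {
		fmt.Println(r.File, r.Score)
	}
}
```

With the three hits above and a limit of 2, the sketch returns b.go followed by c.go.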

type WalkResult

type WalkResult struct {
	Files           []string
	SkippedMinified []string
	SkippedDirs     []string
}

WalkResult contains the results of walking a repo directory.

func WalkRepoDetailed

func WalkRepoDetailed(repoPath string, ignoreDirs ...map[string]bool) (WalkResult, error)

WalkRepoDetailed walks a repo directory and returns detailed results including skips.
