indexer

package
v0.6.0 Latest
Warning

This package is not in the latest version of its module.

Published: May 21, 2026 License: MIT Imports: 17 Imported by: 0

Documentation

Overview

Package indexer orchestrates source code extraction and graph indexing.

The indexer walks repository files, dispatches them to the appropriate language-specific Extractor, stores the resulting nodes and edges, computes Merkle snapshots, and runs cross-repo edge resolution. It supports incremental re-indexing by skipping files whose content hash has not changed.

Key components:

  • Indexer: top-level orchestrator (indexer.go)
  • ExtractorRegistry: dispatches files to the first matching extractor (this file)
  • parallelExtract: worker pool for concurrent file extraction (worker.go)

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type ExtractorRegistry

type ExtractorRegistry struct {
	// contains filtered or unexported fields
}

ExtractorRegistry maintains an ordered list of registered extractors and dispatches files to the first extractor whose CanHandle returns true. Extractors are checked in registration order, so more specific extractors should be registered before generic ones.
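The ordering rule above can be illustrated with a minimal, self-contained sketch. It does not use the real package: `Extractor` here is a local stand-in for types.Extractor (only CanHandle is named in this documentation, so `Name` is an assumption added for printing), and `suffixExtractor` and `registry` are hypothetical types written for the example.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// Extractor is a local stand-in for types.Extractor. Only CanHandle is
// mentioned in the package documentation; Name is a hypothetical addition.
type Extractor interface {
	Name() string
	CanHandle(path string) bool
}

// suffixExtractor is a hypothetical extractor that matches on a file-name suffix.
type suffixExtractor struct {
	name   string
	suffix string
}

func (e suffixExtractor) Name() string { return e.name }

func (e suffixExtractor) CanHandle(path string) bool {
	return strings.HasSuffix(filepath.Base(path), e.suffix)
}

// registry mimics the documented behaviour: an ordered slice, first match wins.
type registry struct {
	extractors []Extractor
}

func (r *registry) Register(ext Extractor) {
	r.extractors = append(r.extractors, ext)
}

// FindExtractor returns the first extractor that accepts path, or nil.
func (r *registry) FindExtractor(path string) Extractor {
	for _, ext := range r.extractors {
		if ext.CanHandle(path) {
			return ext
		}
	}
	return nil
}

// firstMatchName returns the name of the chosen extractor, or "" if none matches.
func firstMatchName(r *registry, path string) string {
	if ext := r.FindExtractor(path); ext != nil {
		return ext.Name()
	}
	return ""
}

func main() {
	r := &registry{}
	// The specific extractor is registered before the generic one.
	r.Register(suffixExtractor{name: "gitlab-ci", suffix: ".gitlab-ci.yml"})
	r.Register(suffixExtractor{name: "yaml", suffix: ".yml"})

	fmt.Println(firstMatchName(r, "repo/.gitlab-ci.yml"))
	fmt.Println(firstMatchName(r, "repo/deploy.yml"))
	fmt.Println(firstMatchName(r, "repo/README.md") == "")
}
```

If the two Register calls were swapped, the generic `yaml` extractor would claim `.gitlab-ci.yml` first, which is why specific extractors are registered ahead of generic ones.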

func NewExtractorRegistry

func NewExtractorRegistry() *ExtractorRegistry

NewExtractorRegistry creates an empty ExtractorRegistry.

func (*ExtractorRegistry) FindAllExtractors

func (r *ExtractorRegistry) FindAllExtractors(path string) []types.Extractor

FindAllExtractors returns all registered extractors that can handle the given file path. This enables multi-extractor dispatch: a .go file can be processed by both the Go extractor (functions, types) and the event extractor (Kafka/NATS patterns).
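A rough sketch of multi-extractor dispatch, under the same assumptions as any stand-in for this package: `Extractor`, `extExtractor`, `findAll`, and `names` are hypothetical and defined locally, and the extension lists are illustrative rather than taken from the real extractors.

```go
package main

import (
	"fmt"
	"strings"
)

// Extractor is a local stand-in for types.Extractor; Name is hypothetical.
type Extractor interface {
	Name() string
	CanHandle(path string) bool
}

// extExtractor is a hypothetical extractor that accepts a set of file extensions.
type extExtractor struct {
	name string
	exts []string
}

func (e extExtractor) Name() string { return e.name }

func (e extExtractor) CanHandle(path string) bool {
	for _, ext := range e.exts {
		if strings.HasSuffix(path, ext) {
			return true
		}
	}
	return false
}

// findAll returns every extractor that accepts path, in registration order.
func findAll(registered []Extractor, path string) []Extractor {
	var out []Extractor
	for _, ext := range registered {
		if ext.CanHandle(path) {
			out = append(out, ext)
		}
	}
	return out
}

// names collects extractor names for display.
func names(exts []Extractor) []string {
	out := make([]string, 0, len(exts))
	for _, ext := range exts {
		out = append(out, ext.Name())
	}
	return out
}

func main() {
	registered := []Extractor{
		extExtractor{name: "go", exts: []string{".go"}},
		extExtractor{name: "event", exts: []string{".go", ".ts", ".py", ".java"}},
		extExtractor{name: "sql", exts: []string{".sql"}},
	}
	fmt.Println(names(findAll(registered, "svc/publisher.go")))
	fmt.Println(names(findAll(registered, "db/schema.sql")))
}
```

In this sketch a `.go` path is claimed by both the language extractor and the supplementary event extractor, while a `.sql` path is claimed by one extractor only.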

func (*ExtractorRegistry) FindExtractor

func (r *ExtractorRegistry) FindExtractor(path string) types.Extractor

FindExtractor returns the first registered extractor that can handle the given file path, or nil if none matches.

func (*ExtractorRegistry) Register

func (r *ExtractorRegistry) Register(ext types.Extractor)

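Register adds an extractor to the registry. Because extractors are checked in registration order, register more specific extractors before generic ones.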

type Indexer

type Indexer struct {
	Concurrency int  // number of parallel extraction workers; 0 means 8
	SkipBlame   bool // skip git blame authorship extraction (expensive on large repos)
	// contains filtered or unexported fields
}

Indexer orchestrates extractors to index a repository's source code into the knowledge graph. It manages the full lifecycle: file discovery, content-based change detection, extraction dispatch, batch storage, snapshot computation, edge event recording, and cross-repo resolution.
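The Concurrency field suggests a bounded worker pool with a default of 8 workers. The following is a minimal sketch of that idea, not the package's parallelExtract: `extractAll` and `defaultConcurrency` are hypothetical names, the extraction step is reduced to a string function, and treating negative values like zero is an assumption.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// defaultConcurrency mirrors the documented "0 means 8" default.
const defaultConcurrency = 8

// extractAll applies extract to every path using a bounded pool of workers.
// Results keep the order of paths. Non-positive concurrency falls back to
// the default (handling of negative values is an assumption of this sketch).
func extractAll(paths []string, concurrency int, extract func(string) string) []string {
	if concurrency <= 0 {
		concurrency = defaultConcurrency
	}
	results := make([]string, len(paths))
	jobs := make(chan int)

	var wg sync.WaitGroup
	for w := 0; w < concurrency; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				// Each index is sent once, so each slot has a single writer.
				results[i] = extract(paths[i])
			}
		}()
	}

	for i := range paths {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return results
}

func main() {
	out := extractAll([]string{"a.go", "b.ts", "c.py"}, 0, strings.ToUpper)
	fmt.Println(out)
}
```

Writing results by index avoids a mutex and keeps the output order deterministic regardless of which worker finishes first.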

func NewIndexer

func NewIndexer(store types.GraphStore, snapshot SnapshotComputer) *Indexer

NewIndexer creates an Indexer with the given store and snapshot computer.

func (*Indexer) IndexFile

func (idx *Indexer) IndexFile(ctx context.Context, opts types.ExtractOptions) (*types.ExtractResult, error)

IndexFile extracts nodes and edges from a single file and stores them.

func (*Indexer) IndexRepo

func (idx *Indexer) IndexRepo(ctx context.Context, repoURL, repoPath, commitHash string) (*types.Snapshot, error)

IndexRepo indexes all source files in a repository, skipping files whose content hash has not changed since the last index.
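Content-based change detection of this kind can be sketched as follows. This is an illustration only: the documentation does not say which hash function the indexer uses or where previous hashes are stored, so SHA-256, the in-memory map, and the names `contentHash` and `changedFiles` are all assumptions.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// contentHash returns a hex digest of the file content.
// SHA-256 is an assumption; the real indexer's hash is not documented here.
func contentHash(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

// changedFiles returns the sorted paths whose content hash differs from the
// one recorded in previous, and records the new hashes for the next run.
func changedFiles(previous map[string]string, files map[string][]byte) []string {
	var changed []string
	for path, data := range files {
		h := contentHash(data)
		if previous[path] == h {
			continue // unchanged since the last run: skip re-extraction
		}
		previous[path] = h
		changed = append(changed, path)
	}
	sort.Strings(changed)
	return changed
}

func main() {
	seen := map[string]string{}
	files := map[string][]byte{
		"a.go": []byte("package a"),
		"b.go": []byte("package b"),
	}
	fmt.Println(changedFiles(seen, files)) // first run: every file is new

	files["b.go"] = []byte("package b // edited")
	fmt.Println(changedFiles(seen, files)) // second run: only the edited file
}
```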

func (*Indexer) LastChangedFiles

func (idx *Indexer) LastChangedFiles() []string

LastChangedFiles returns the file paths that changed in the most recent IndexRepo call.

func (*Indexer) Register

func (idx *Indexer) Register(ext types.Extractor)

Register adds an extractor to the indexer's registry.

func (*Indexer) ResolveEdges

func (idx *Indexer) ResolveEdges(ctx context.Context) (*resolver.ResolveStats, error)

ResolveEdges runs the cross-repo edge resolver to retarget dangling edges whose target hashes were computed with the wrong repo URL.

type SnapshotComputer

type SnapshotComputer interface {
	ComputeSnapshot(ctx context.Context, repoHash types.Hash, commitHash string) (*types.Snapshot, error)
}

SnapshotComputer computes a point-in-time snapshot for a repository. Defined locally to avoid importing internal/snapshot.
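Declaring the interface on the consumer side means any type with a matching ComputeSnapshot method satisfies it implicitly, which also makes test doubles easy to write. A minimal sketch follows, with simplified stand-ins: `Hash` and `Snapshot` replace types.Hash and types.Snapshot (whose real definitions are not shown here), and `fakeComputer`, `snapshotAfterIndex`, and `demo` are hypothetical.

```go
package main

import (
	"context"
	"fmt"
)

// Hash and Snapshot are simplified stand-ins for types.Hash and
// types.Snapshot; the real definitions are not part of this page.
type Hash string

type Snapshot struct {
	RepoHash   Hash
	CommitHash string
}

// SnapshotComputer mirrors the shape of the documented interface,
// using the local stand-in types.
type SnapshotComputer interface {
	ComputeSnapshot(ctx context.Context, repoHash Hash, commitHash string) (*Snapshot, error)
}

// fakeComputer is a hypothetical test double. It satisfies SnapshotComputer
// without importing or being imported by any snapshot package.
type fakeComputer struct {
	calls int
}

func (f *fakeComputer) ComputeSnapshot(ctx context.Context, repoHash Hash, commitHash string) (*Snapshot, error) {
	if err := ctx.Err(); err != nil {
		return nil, err
	}
	f.calls++
	return &Snapshot{RepoHash: repoHash, CommitHash: commitHash}, nil
}

// snapshotAfterIndex stands in for the point where an indexer requests a
// snapshot; it depends only on the small local interface.
func snapshotAfterIndex(ctx context.Context, sc SnapshotComputer, repo Hash, commit string) (*Snapshot, error) {
	return sc.ComputeSnapshot(ctx, repo, commit)
}

// demo wires the fake into the consumer and reports what happened.
func demo() (*Snapshot, int, error) {
	fc := &fakeComputer{}
	snap, err := snapshotAfterIndex(context.Background(), fc, Hash("repo-hash"), "abc123")
	return snap, fc.calls, err
}

func main() {
	snap, calls, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(snap.RepoHash, snap.CommitHash, calls)
}
```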

Directories

Path Synopsis
Package authorship extracts authored_by edges from git blame data.
Package cloudextractor extracts cloud infrastructure and CI/CD resource definitions and their relationships from YAML configuration files.
Package csharpextractor provides C# extraction with ASP.NET attribute route detection.
Package cssextractor extracts CSS/SCSS selectors, custom properties, and import relationships.
Package dockerfileextractor provides an extractor for Dockerfile files.
Package envextractor provides an extractor for environment variable files.
Package eventextractor provides a supplementary extractor that detects message queue producer and consumer patterns across Go, TypeScript, Python, and Java source code.
Package gitlabciextractor provides an extractor for GitLab CI configuration files.
Package goextractor provides Go-specific extraction using go/packages for full type resolution.
Package gotsextractor provides Go extraction using tree-sitter for fast AST parsing with route detection.
Package graphqlextractor provides an extractor for GraphQL schema files.
Package helmextractor provides an extractor for Helm chart files.
Package javaextractor provides Java extraction with Spring annotation route detection.
Package k8sextractor extracts Kubernetes resource definitions and their deployment relationships.
Package makefileextractor provides an extractor for Makefile and .mk files.
Package ownership parses CODEOWNERS files and emits owned_by edges from file nodes to synthetic team/user nodes.
Package packagejsonextractor provides an extractor for package.json files.
Package protoextractor provides a tree-sitter based extractor for Protocol Buffer (.proto) files.
Package rubyextractor provides a tree-sitter based extractor for Ruby files.
Package rustextractor provides Rust extraction with Actix/Axum/Rocket route detection.
Package schemaextractor provides an extractor for OpenAPI 3.x, Swagger 2.x, and JSON Schema files.
Package scipingest parses SCIP (Source Code Intelligence Protocol) index files and imports their symbol definitions and references into the knowing knowledge graph.
Package sqlextractor extracts SQL tables, views, functions, and their relationships.
Package terraformextractor extracts Terraform HCL resources, modules, and dependency relationships.
Package treesitter provides a Python extractor using tree-sitter grammars.
Package tsextractor provides TypeScript/JavaScript extraction with framework route detection.
