scanner

package
v0.2.0
Published: May 11, 2026 License: MIT Imports: 39 Imported by: 0

Documentation

Constants

const CrossFileCacheVersion = 8

CrossFileCacheVersion is bumped whenever the on-disk layout or the Symbol / Reference shapes change. A version mismatch is treated as a miss.

v2: switched payload to a columnar form with a shared string table. Every Reference previously serialized its File path in full (~100 bytes times many rows); interning collapses that to uint32 indexes and a de-duplicated string list.

v4: embedded the assembled lookup maps (symbolsByName, refCountByName, refFilesByName, nonCommentRefFilesByName, nonCommentRefCountByNameFile) plus the serialized bloom filter so a warm hit can skip the lookup-map rebuild phase entirely.

v5: force rebuild after the FlatFindChild sentinel-collision fix (see crossFileShardVersion v3). The monolithic payload mirrors the shard data and is similarly corrupted pre-fix.

v6: zstd-wrap the monolithic gob payload. The shards were already compressed; this keeps the fallback/full-index cache from dominating .krit size on small-to-medium projects.

v7: Symbol carries language/package/FQN/owner/signature/arity and Java modifiers so Java declarations can participate in the source index.

v8: meta records per-file content hashes and optional small-change overlays so one-file edits do not force rewriting payload.gob.

Variables

var (
	NodeTypeTable []string
)

Functions

func AndroidFindingsCacheDir

func AndroidFindingsCacheDir(repoDir string) string

AndroidFindingsCacheDir returns the cache root for a repo.

func AndroidFindingsKey

func AndroidFindingsKey(in AndroidFindingsKeyInputs) string

AndroidFindingsKey composes the inputs into a stable hex digest. Order is fixed; field separators are NUL bytes that cannot appear inside any of the hex/string fields.
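To illustrate why a separator matters, here is a small sketch of composing fields into a hex digest with a NUL byte after each field. The composeKey helper is hypothetical, and SHA-256 is an assumption made for the example; the hash this package actually uses is not stated here.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// composeKey hashes the fields in the order given, writing a NUL byte after
// each one so that adjacent fields cannot run together.
// Hypothetical helper; SHA-256 is an assumption for this sketch.
func composeKey(fields ...string) string {
	h := sha256.New()
	for _, f := range fields {
		h.Write([]byte(f))
		h.Write([]byte{0})
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	// Without a separator, ("ab", "c") and ("a", "bc") would hash the same bytes.
	k1 := composeKey("manifest", "ab", "c")
	k2 := composeKey("manifest", "a", "bc")
	fmt.Println(k1 == k2)
}
```

Because the separator cannot occur inside a field, two different field tuples never serialize to the same byte stream.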

func BaselineID

func BaselineID(f Finding, signature string, basePath string) string

BaselineID generates a baseline ID for a finding. Format: "RuleName:filename:signature". The signature is typically the entity name (function, class, etc.).

When basePath is set, the ID uses the file's relative path instead of just the filename. This avoids collisions between same-named files in different modules.
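The ID format can be sketched as follows. This is an illustration under assumptions: the baselineID helper below takes plain strings rather than a Finding, and how the package computes the relative path (here filepath.Rel with forward slashes) is a guess.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// baselineID builds "RuleName:filename:signature". Hypothetical helper that
// mirrors the documented format, not the package's own implementation.
func baselineID(rule, file, signature, basePath string) string {
	name := filepath.Base(file)
	if basePath != "" {
		// Assumption: the relative path is taken against basePath.
		if rel, err := filepath.Rel(basePath, file); err == nil {
			name = filepath.ToSlash(rel)
		}
	}
	return rule + ":" + name + ":" + signature
}

func main() {
	file := "/repo/app/src/Foo.kt"
	fmt.Println(baselineID("MagicNumber", file, "bar", ""))
	fmt.Println(baselineID("MagicNumber", file, "bar", "/repo"))
}
```

Two files named Foo.kt in different modules produce the same ID in the first form and distinct IDs in the second.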

func BaselineIDAt

func BaselineIDAt(columns *FindingColumns, row int, signature string, basePath string) string

BaselineIDAt generates a baseline ID directly from a columnar finding row without reconstructing a Finding value first.

func BaselineIDFilenameOnly

func BaselineIDFilenameOnly(f Finding, signature string) string

BaselineIDFilenameOnly generates a baseline ID using only the file basename. Use this when comparing against baseline files that do not include module paths.

func BaselineIDFilenameOnlyAt

func BaselineIDFilenameOnlyAt(columns *FindingColumns, row int, signature string) string

BaselineIDFilenameOnlyAt generates a filename-only baseline ID directly from a columnar finding row.

func ClearAndroidFindingsCache

func ClearAndroidFindingsCache(dir string) error

ClearAndroidFindingsCache removes the on-disk cache. Safe to call when absent.

func ClearCrossFileCache

func ClearCrossFileCache(dir string) error

ClearCrossFileCache removes every file under the cache dir.

func ClearCrossFindingsCache

func ClearCrossFindingsCache(dir string) error

ClearCrossFindingsCache removes the on-disk cross-findings cache. Safe to call when the directory is absent.

func ClearFindingsBundleCache

func ClearFindingsBundleCache(dir string) error

func ClearParseCache

func ClearParseCache(repoDir string) error

ClearParseCache removes the parse-cache directory under repoDir. Used by --clear-cache at the CLI boundary; a no-op when the cache directory does not exist.

func CollectJavaFiles

func CollectJavaFiles(paths []string, excludes []string) ([]string, error)

CollectJavaFiles finds all .java files under the given paths.

func CollectKotlinAndJavaFiles

func CollectKotlinAndJavaFiles(paths []string, excludes []string) (kotlin []string, java []string, err error)

CollectKotlinAndJavaFiles walks the tree once and returns Kotlin and Java files separately. Equivalent to calling CollectKotlinFiles and CollectJavaFiles but does a single filesystem traversal.

func CollectKotlinFiles

func CollectKotlinFiles(paths []string, excludes []string) ([]string, error)

CollectKotlinFiles finds all .kt and .kts files under the given paths.

func CrossFileCacheDir

func CrossFileCacheDir(repoDir string) string

CrossFileCacheDir returns the cache root for a repo. The directory is created lazily on write.

func CrossFindingsCacheDir

func CrossFindingsCacheDir(repoDir string) string

CrossFindingsCacheDir returns the cache root for a repo.

func CrossFindingsKey

func CrossFindingsKey(indexFingerprint, ruleHash string) string

CrossFindingsKey composes the cache lookup key from the codeIndex/parsed-files fingerprint and the cross-rule ruleHash. The returned hex digest covers both inputs plus the cache version, so changing any one of them produces a new entry.

func FileStructuralFingerprint added in v0.2.0

func FileStructuralFingerprint(file *File) string

FileStructuralFingerprint hashes a single file's contribution to the cross-file CodeIndex: its declared Symbols (sorted by FQN + Name + Kind + Visibility + Signature) and its References (sorted by Name + InComment flag). Per-line positions are deliberately excluded — line-number drift from intra-file body edits should not invalidate cross-file findings for unchanged files.

Mismatches are precise: any added/removed/renamed declaration moves the fingerprint, any added/removed reference moves it, but a body edit that touches no Symbols and no References (whitespace, comment, constant value, local variable rename) keeps it stable. The ConservativeDeltaPlanner's "CrossFile stable" gate uses the aggregate of per-file structural fps to decide whether the delta path is safe — see pipeline.crossFileStructuralFingerprint.
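The stability property can be shown with a reduced sketch: hash a sorted list of declaration keys and leave line numbers out. The decl type, the structuralFingerprint helper, and the choice of SHA-256 are assumptions for illustration; the real fingerprint covers more fields, as described above.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// decl is a hypothetical, reduced stand-in for a declared symbol.
type decl struct {
	FQN  string
	Kind string
	Line int // deliberately not hashed
}

// structuralFingerprint hashes the sorted (FQN, Kind) pairs and ignores Line,
// so moving a declaration within the file does not change the result.
func structuralFingerprint(decls []decl) string {
	keys := make([]string, 0, len(decls))
	for _, d := range decls {
		keys = append(keys, d.FQN+"\x00"+d.Kind)
	}
	sort.Strings(keys)
	h := sha256.New()
	for _, k := range keys {
		h.Write([]byte(k))
		h.Write([]byte{'\n'})
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	before := []decl{{FQN: "p.A", Kind: "class", Line: 3}, {FQN: "p.run", Kind: "function", Line: 10}}
	shifted := []decl{{FQN: "p.A", Kind: "class", Line: 5}, {FQN: "p.run", Kind: "function", Line: 42}}
	renamed := []decl{{FQN: "p.A", Kind: "class", Line: 3}, {FQN: "p.start", Kind: "function", Line: 10}}
	fmt.Println(structuralFingerprint(before) == structuralFingerprint(shifted))
	fmt.Println(structuralFingerprint(before) == structuralFingerprint(renamed))
}
```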

func FindingSignature

func FindingSignature(f Finding, lines []string) string

FindingSignature extracts a signature for a finding based on the source line.

func FindingsBundleCacheDir

func FindingsBundleCacheDir(repoDir string) string

func FindingsBundleKey

func FindingsBundleKey(fp RunFingerprint) string

func FindingsBundleManifestKey added in v0.2.0

func FindingsBundleManifestKey(repoDir string, scanPaths []string) string

FindingsBundleManifestKey derives a stable manifest identifier from a project root + sorted scan paths. The repoDir is included so daemons running against multiple projects don't collide.

func FlatFindChild

func FlatFindChild(tree *FlatTree, parent uint32, childType string) (uint32, bool)

FlatFindChild finds the first direct child with the given type. The second return value reports whether a matching child was found; when it is false, the first return value is 0 and must not be used as a node index. Previously this function returned a bare uint32 and conflated "not found" with "node 0" (the source_file root), which silently produced whole-file source reads when the result was passed to FlatNodeText.
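The calling convention is the usual Go "comma ok" form. The sketch below uses a toy tree rather than FlatTree; the node type, the findChild helper, and the node type names are illustrative assumptions.

```go
package main

import "fmt"

// node is a hypothetical flattened-tree entry; index 0 is the root.
type node struct {
	typ      string
	children []uint32
}

// findChild returns the first direct child of parent with the given type.
// The bool distinguishes "not found" from a legitimate index.
func findChild(nodes []node, parent uint32, childType string) (uint32, bool) {
	for _, c := range nodes[parent].children {
		if nodes[c].typ == childType {
			return c, true
		}
	}
	return 0, false
}

func main() {
	nodes := []node{
		{typ: "source_file", children: []uint32{1}},
		{typ: "class_declaration", children: []uint32{2}},
		{typ: "simple_identifier"},
	}
	if idx, ok := findChild(nodes, 1, "modifiers"); ok {
		fmt.Println("found at", idx)
	} else {
		// Using the returned 0 here would address the root node by mistake.
		fmt.Println("no modifiers child")
	}
}
```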

func FlatHasModifier

func FlatHasModifier(tree *FlatTree, idx uint32, content []byte, modifier string) bool

FlatHasModifier checks whether a flattened declaration has the given modifier.

func FlatNodeBytes

func FlatNodeBytes(tree *FlatTree, idx uint32, content []byte) []byte

FlatNodeBytes returns the source bytes spanned by the flattened node. The returned slice aliases content and is only valid while content is live.

func FlatNodeString

func FlatNodeString(tree *FlatTree, idx uint32, content []byte, pool *StringPool) string

FlatNodeString returns an interned string for a flattened node.

func FlatNodeText

func FlatNodeText(tree *FlatTree, idx uint32, content []byte) string

FlatNodeText returns the source text spanned by the flattened node.

func FlatNodeTextEquals

func FlatNodeTextEquals(tree *FlatTree, idx uint32, content []byte, s string) bool

FlatNodeTextEquals reports whether the node's source text equals s. Zero-alloc: the compiler optimises string(b)==s to a direct byte comparison regardless of whether s is a constant or a variable (confirmed via escape analysis — the temporary string is stack-allocated, never escapes to heap).

func FlatWalkNodes

func FlatWalkNodes(tree *FlatTree, nodeType string, fn func(uint32))

FlatWalkNodes calls fn for every node of the given type.

func GetKotlinParser

func GetKotlinParser() *sitter.Parser

GetKotlinParser returns a fresh Kotlin parser. Callers must call PutKotlinParser when done.

func GitLsFilesByExt

func GitLsFilesByExt(dir string, extensions []string) ([]string, bool)

GitLsFilesByExt returns paths tracked by the git repo rooted at dir whose names end in any of the given extensions. Paths are repository-relative; the caller joins with dir to form absolute paths.

Returns ok=false when dir is not the top of a git work tree, when git is unavailable, or when ls-files exits non-zero — callers fall back to a manual walk. ls-files reports paths relative to the repo top, so we require dir to be the top to keep caller path-joining sound.

func GrammarVersion

func GrammarVersion() string

GrammarVersion returns the Kotlin grammar identifier. Kept for back-compat with callers that pre-date per-language keys; new code should prefer KotlinGrammarVersion / JavaGrammarVersion so the intent is explicit.

func InitTestPaths

func InitTestPaths(config []string, override []string)

func InternString

func InternString(s string) string

InternString returns a canonical copy of s from the global string pool. Other packages can use this to reduce duplicate string allocations.

func IsCommentLine

func IsCommentLine(line string) bool

IsCommentLine returns true if the trimmed line is a comment (// or * prefix).

func IsTestFile

func IsTestFile(path string) bool

func JavaGrammarVersion

func JavaGrammarVersion() string

JavaGrammarVersion returns a stable identifier for the tree-sitter Java grammar binding in use. Keyed independently of Kotlin so a tree-sitter-java bump evicts only Java cache entries.

func KotlinGrammarVersion

func KotlinGrammarVersion() string

KotlinGrammarVersion returns a stable identifier for the tree-sitter Kotlin grammar binding in use. The SymbolCount is appended so a regenerated-but-same-dep-version grammar (rare but possible) still invalidates cached entries.

func LoadCrossFileCache

func LoadCrossFileCache(dir, wantFingerprint string) ([]Symbol, []Reference, bool)

LoadCrossFileCache returns (symbols, refs, true) when the on-disk fingerprint matches wantFingerprint. Any other outcome — missing files, version mismatch, decode error, fingerprint drift — is a miss. A miss is never an error; callers fall back to BuildIndex.

func LoadCurrentCrossFileCacheIndex

func LoadCurrentCrossFileCacheIndex(dir string) (*CodeIndex, CrossFileCacheMeta, bool)

LoadCurrentCrossFileCacheIndex returns the current cached index and meta without requiring the caller to already know the cached fingerprint.

func LookupFlatNodeType

func LookupFlatNodeType(nodeType string) (uint16, bool)

LookupFlatNodeType resolves a node type string to its flattened type ID.

func NodeTypeTableSize

func NodeTypeTableSize() int

NodeTypeTableSize returns the current size of the node type table via the lock-free snapshot. Safe to call concurrently with internNodeType.

func PutKotlinParser

func PutKotlinParser(p *sitter.Parser)

PutKotlinParser releases a Kotlin parser.

func ReadLines

func ReadLines(path string) ([]string, error)

ReadLines reads a file and returns its lines (used for gitignore files and similar inputs).

func SaveAndroidFindings

func SaveAndroidFindings(cacheDir, key string, cols FindingColumns) error

SaveAndroidFindings writes cached findings for the given key. Best effort: any error is returned but the analysis path treats failure as non-fatal.

func SaveCrossFileCache

func SaveCrossFileCache(dir, fingerprint string, meta CrossFileCacheMeta, symbols []Symbol, refs []Reference) error

SaveCrossFileCache writes the symbols and references slices under the given fingerprint. Payloads stream directly into tempfiles and are atomically renamed so concurrent readers never observe a truncated cache.
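The write-then-rename pattern mentioned above can be sketched in a few lines. The writeFileAtomic and roundTrip helpers and the file name are hypothetical; this is a generic illustration, not this package's code. The temp file is created in the destination directory so that the rename stays on one filesystem.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeFileAtomic writes data to a temp file next to path, then renames it
// into place. Readers see either the old file or the new one, never a
// partially written file (on POSIX filesystems).
func writeFileAtomic(path string, data []byte) error {
	tmp, err := os.CreateTemp(filepath.Dir(path), filepath.Base(path)+".tmp-*")
	if err != nil {
		return err
	}
	// Cleans up on failure; after a successful rename this finds nothing.
	defer os.Remove(tmp.Name())
	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}

// roundTrip writes two versions to the same path and reads back the result.
func roundTrip() (string, error) {
	dir, err := os.MkdirTemp("", "cache-sketch-")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "payload.bin")
	for _, v := range []string{"v1", "v2"} {
		if err := writeFileAtomic(path, []byte(v)); err != nil {
			return "", err
		}
	}
	b, err := os.ReadFile(path)
	return string(b), err
}

func main() {
	got, err := roundTrip()
	fmt.Println(got, err)
}
```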

func SaveCrossFileCacheIndex

func SaveCrossFileCacheIndex(dir, fingerprint string, meta CrossFileCacheMeta, idx *CodeIndex) error

SaveCrossFileCacheIndex persists a fully-built CodeIndex so warm loads can skip the lookup-map rebuild.

func SaveCrossFileCacheOverlay

func SaveCrossFileCacheOverlay(dir, fingerprint string, meta CrossFileCacheMeta, idx *CodeIndex) error

SaveCrossFileCacheOverlay persists only metadata for a small incremental update. The existing payload.gob remains the compacted base; overlay entries point at per-file shards that replace or extend that base.

func SaveCrossFindings

func SaveCrossFindings(cacheDir, key string, cols FindingColumns) error

SaveCrossFindings writes cached cross-rule findings for the given key. Best-effort: on any failure it records a miss counter and returns nil, since cache failures should never break analysis.

func SaveFindingsBundleManifest added in v0.2.0

func SaveFindingsBundleManifest(repoDir, key string, manifest FindingsBundleManifest) error

SaveFindingsBundleManifest persists the run manifest atomically. Best-effort: errors are returned but callers (RunProject's save path) typically log-and-continue rather than failing the whole verb on a manifest write error.

func SortIndexReferences added in v0.2.0

func SortIndexReferences(refs []Reference)

SortIndexReferences is the Reference counterpart of SortIndexSymbols. Composite key: (File, Line, Name). References don't carry a byte offset; line is sufficient because the rule consumers care about source position, not exact column.

func SortIndexSymbols added in v0.2.0

func SortIndexSymbols(symbols []Symbol)

SortIndexSymbols orders Symbol slices canonically. Used at the scanner index merge seam (post-fan-in) so any consumer iterating `CodeIndex.Symbols` or `symbolsByName[Name]` sees the same sequence every run regardless of which worker contributed which shard. Composite key: (File, StartByte, FQN, Name). FQN before Name handles the case where two declarations share a short name in different packages but produce the same `Symbol.Name`.
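A composite-key ordering of this kind can be written with sort.Slice by comparing fields in priority order. The sym type and sortSymbols helper below are reduced, hypothetical stand-ins for Symbol and SortIndexSymbols.

```go
package main

import (
	"fmt"
	"sort"
)

// sym is a hypothetical, reduced stand-in for Symbol.
type sym struct {
	File      string
	StartByte int
	FQN       string
	Name      string
}

// sortSymbols orders by (File, StartByte, FQN, Name); later fields only
// break ties among earlier ones.
func sortSymbols(s []sym) {
	sort.Slice(s, func(i, j int) bool {
		a, b := s[i], s[j]
		if a.File != b.File {
			return a.File < b.File
		}
		if a.StartByte != b.StartByte {
			return a.StartByte < b.StartByte
		}
		if a.FQN != b.FQN {
			return a.FQN < b.FQN
		}
		return a.Name < b.Name
	})
}

func main() {
	s := []sym{
		{File: "b.kt", StartByte: 0, FQN: "q.B", Name: "B"},
		{File: "a.kt", StartByte: 40, FQN: "p.Z", Name: "Z"},
		{File: "a.kt", StartByte: 7, FQN: "p.A", Name: "A"},
	}
	sortSymbols(s)
	for _, x := range s {
		fmt.Println(x.File, x.StartByte, x.FQN)
	}
}
```

Because the key is a total order over the fields that identify a declaration, the output sequence does not depend on the order in which workers delivered their shards.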

func SplitNUL

func SplitNUL(data []byte, atEOF bool) (advance int, token []byte, err error)

SplitNUL is a bufio.SplitFunc for NUL-separated streams (e.g. the `git ls-files -z` output).
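For readers unfamiliar with bufio.SplitFunc, the following is one way such a splitter can be written and used. The splitNUL and scanNUL functions here are an independent sketch and may differ from the package's implementation in edge cases.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"strings"
)

// splitNUL is a bufio.SplitFunc that yields NUL-terminated tokens. A final
// token without a trailing NUL is still returned at EOF.
func splitNUL(data []byte, atEOF bool) (advance int, token []byte, err error) {
	if atEOF && len(data) == 0 {
		return 0, nil, nil
	}
	if i := bytes.IndexByte(data, 0); i >= 0 {
		return i + 1, data[:i], nil
	}
	if atEOF {
		return len(data), data, nil
	}
	// Ask the scanner for more data.
	return 0, nil, nil
}

// scanNUL collects every token from a NUL-separated string.
func scanNUL(input string) []string {
	sc := bufio.NewScanner(strings.NewReader(input))
	sc.Split(splitNUL)
	var out []string
	for sc.Scan() {
		out = append(out, sc.Text())
	}
	return out
}

func main() {
	fmt.Printf("%q\n", scanNUL("app/Main.kt\x00lib/Util.kt\x00"))
}
```

NUL separation is useful for path lists because a NUL byte cannot appear in a file name, whereas a newline can.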

func WriteBaseline

func WriteBaseline(path string, findings []Finding, basePath string) error

WriteBaseline writes findings as a baseline XML file. If basePath is set, uses relative paths for multi-module disambiguation. If basePath is empty, uses filename-only IDs for legacy compatibility.

func WriteBaselineColumns

func WriteBaselineColumns(path string, columns *FindingColumns, basePath string) error

WriteBaselineColumns writes columnar findings as a baseline XML file.

func WriteBaselineJSON

func WriteBaselineJSON(path string, b *Baseline) error

WriteBaselineJSON writes a Baseline to a JSON file. The output is sorted for stable diffs. Atomically replaces any existing file.

Types

type AndroidCacheWriter

type AndroidCacheWriter struct {
	// contains filtered or unexported fields
}

AndroidCacheWriter defers SaveAndroidFindings calls onto a bounded background pool so the AndroidPhase doesn't block on disk writes between input units (manifests, gradle files, resource dirs). Mirrors internal/oracle.CacheWriter.

func NewAndroidCacheWriter

func NewAndroidCacheWriter(workers int) *AndroidCacheWriter

NewAndroidCacheWriter starts a bounded writer for Android findings cache entries. Worker counts below one are clamped to one and counts above four are capped to keep cold persistence from competing too aggressively with rule execution.

func (*AndroidCacheWriter) AddPerfEntries

func (w *AndroidCacheWriter) AddPerfEntries(t perf.Tracker)

AddPerfEntries records summary timings/counts on t when enabled.

func (*AndroidCacheWriter) Close

func (w *AndroidCacheWriter) Close() error

Close flushes synchronously.

func (*AndroidCacheWriter) Flush

func (w *AndroidCacheWriter) Flush(ctx context.Context) error

Flush waits for queued writes to finish. A canceled context returns early, but the underlying writer continues draining in its goroutine.

func (*AndroidCacheWriter) Save

func (w *AndroidCacheWriter) Save(cacheDir, key string, cols FindingColumns)

Save queues a write of cols under key. When the writer is nil or the queue is saturated the save runs synchronously so cache persistence remains best-effort complete.
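The queue-or-run-inline behaviour can be sketched with a buffered channel and a select with a default case. Everything below (the writer type, newWriter, save, close, runSaves, and the queue size) is hypothetical and only mirrors the documented behaviour, including the clamping of the worker count to the range one to four.

```go
package main

import (
	"fmt"
	"sync"
)

// writer is a hypothetical bounded background writer.
type writer struct {
	jobs      chan func()
	wg        sync.WaitGroup
	syncSaves int // saves that ran on the caller's goroutine
}

func newWriter(workers, queue int) *writer {
	if workers < 1 {
		workers = 1
	}
	if workers > 4 {
		workers = 4
	}
	w := &writer{jobs: make(chan func(), queue)}
	for i := 0; i < workers; i++ {
		w.wg.Add(1)
		go func() {
			defer w.wg.Done()
			for job := range w.jobs {
				job()
			}
		}()
	}
	return w
}

// save enqueues job, or runs it inline when w is nil or the queue is full.
// This sketch assumes save is called from a single goroutine.
func (w *writer) save(job func()) {
	if w == nil {
		job()
		return
	}
	select {
	case w.jobs <- job:
	default:
		w.syncSaves++
		job()
	}
}

// close stops accepting work and waits for queued jobs to finish.
func (w *writer) close() {
	close(w.jobs)
	w.wg.Wait()
}

// runSaves submits n jobs and reports how many completed and how many ran inline.
func runSaves(n int) (completed, syncSaves int) {
	var mu sync.Mutex
	w := newWriter(2, 1)
	for i := 0; i < n; i++ {
		w.save(func() {
			mu.Lock()
			completed++
			mu.Unlock()
		})
	}
	w.close()
	return completed, w.syncSaves
}

func main() {
	completed, inline := runSaves(50)
	fmt.Println("completed:", completed, "ran inline:", inline)
}
```

No job is dropped when the queue is saturated; the caller simply pays for the write itself, which is what keeps persistence complete while staying non-blocking in the common case.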

func (*AndroidCacheWriter) Stats

type AndroidCacheWriterStats

type AndroidCacheWriterStats struct {
	Queued    int64
	Completed int64
	Failed    int64
	Bytes     int64
	SyncSaves int64
}

AndroidCacheWriterStats is a point-in-time snapshot of deferred Android findings cache writes.

type AndroidFindingsKeyInputs

type AndroidFindingsKeyInputs struct {
	// Kind is the input family; manifests, gradle files, resource dirs,
	// icons, and per-source resource lookups never share entries.
	Kind AndroidFindingsKind
	// RuleHash covers active rule IDs and their config (from
	// cache.ComputeConfigHash).
	RuleHash string
	// LibraryFactsFP is librarymodel.Facts.Fingerprint().
	LibraryFactsFP string
	// JavaSemanticFactsFP is javafacts.Facts.Fingerprint().
	JavaSemanticFactsFP string
	// InputFP is the kind-specific input fingerprint: a manifest's
	// content hash, a Gradle file's content hash, a merged resource-index
	// hash, etc. The discipline is content-based, never mtime.
	InputFP string
	// Extra is a kind-specific supplementary fingerprint mixed in after
	// InputFP. Use it for context that isn't captured by the named
	// fingerprints above (e.g. concatenated manifest-merge ancestors,
	// merged ResourceIndex hash for resource-source rules). Empty when
	// the kind's findings depend on no extra context.
	Extra string
}

AndroidFindingsKeyInputs are the components mixed into a cache lookup key. Every field that influences findings for the cached unit MUST be represented here — a missing input is a false-hit waiting to happen.

type AndroidFindingsKind

type AndroidFindingsKind string

AndroidFindingsKind tags the input family inside an AndroidFindings cache key so two different families with coincidentally equal content fingerprints can never share a cache entry.

const (
	AndroidFindingsKindManifest             AndroidFindingsKind = "manifest"
	AndroidFindingsKindGradle               AndroidFindingsKind = "gradle"
	AndroidFindingsKindResources            AndroidFindingsKind = "resources"
	AndroidFindingsKindIcons                AndroidFindingsKind = "icons"
	AndroidFindingsKindManifestBundle       AndroidFindingsKind = "manifest-bundle"
	AndroidFindingsKindResourceBundle       AndroidFindingsKind = "resource-bundle"
	AndroidFindingsKindGradleBundle         AndroidFindingsKind = "gradle-bundle"
	AndroidFindingsKindIconBundle           AndroidFindingsKind = "icon-bundle"
	AndroidFindingsKindResourceSource       AndroidFindingsKind = "resource-source"
	AndroidFindingsKindResourceSourceBundle AndroidFindingsKind = "resource-source-bundle"
	AndroidFindingsKindProject              AndroidFindingsKind = "project"
)

type Baseline

type Baseline struct {
	ManuallySuppressed map[string]bool // IDs manually suppressed by user
	CurrentIssues      map[string]bool // IDs from last run (auto-generated)
}

Baseline represents a set of suppressed finding IDs loaded from a Krit baseline file. XML baselines use the SmellBaseline schema for interoperability with existing Kotlin analyzer workflows.

func LoadBaseline

func LoadBaseline(path string) (*Baseline, error)

LoadBaseline reads a baseline file. Detects format automatically: files ending in .json are read as JSON; all others are parsed as baseline XML.

func LoadBaselineJSON

func LoadBaselineJSON(path string) (*Baseline, error)

LoadBaselineJSON reads a krit-native JSON baseline file.

func LoadBaselineXML

func LoadBaselineXML(path string) (*Baseline, error)

LoadBaselineXML reads a baseline XML file.

func (*Baseline) Contains

func (b *Baseline) Contains(id string) bool

Contains checks if a finding ID is in the baseline (either section).

func (*Baseline) Entries

func (b *Baseline) Entries() []BaselineEntry

Entries returns parsed, sorted entries from both baseline sections.

type BaselineEntry

type BaselineEntry struct {
	ID        string
	Section   string
	Rule      string
	Path      string
	Signature string
}

BaselineEntry is a parsed baseline ID from one of the baseline sections.

func ParseBaselineEntry

func ParseBaselineEntry(section, id string) BaselineEntry

ParseBaselineEntry splits a baseline ID into its parts.

type BaselineIDList

type BaselineIDList struct {
	IDs []string `xml:"ID"`
}

type BaselineXML

type BaselineXML struct {
	XMLName            xml.Name       `xml:"SmellBaseline"`
	ManuallySuppressed BaselineIDList `xml:"ManuallySuppressedIssues"`
	CurrentIssues      BaselineIDList `xml:"CurrentIssues"`
}

BaselineXML is the XML structure used by Krit baseline files.

type BinaryFix

type BinaryFix struct {
	Type         BinaryFixType
	SourcePath   string // original file
	TargetPath   string // new file (empty = generate alongside with new extension)
	Description  string
	DeleteSource bool   // delete source file after successful conversion
	Content      []byte // file content for BinaryFixCreateFile
	HintOnly     bool   // when true, the fix is informational only (no automatic action)
	MinSdk       int    // minimum SDK for this fix to be safe (0 = no restriction)
}

BinaryFix represents a fix that operates on binary files (images, etc.).

type BinaryFixType

type BinaryFixType int

BinaryFixType enumerates the kinds of binary file operations.

const (
	BinaryFixConvertWebP BinaryFixType = iota
	BinaryFixDeleteFile
	BinaryFixCreateFile
	BinaryFixMoveFile
	BinaryFixOptimizePNG
)

type ClassFanInStat

type ClassFanInStat struct {
	Symbol           Symbol
	ReferencingFiles []string
	FanIn            int
}

ClassFanInStat captures how many external files reference a class-like symbol.

type CodeIndex

type CodeIndex struct {
	Symbols    []Symbol
	References []Reference
	Files      []*File

	// Fingerprint is the cache fingerprint computed from the input file
	// set's content hashes. Populated by BuildIndexCached on both hit
	// and miss paths so downstream callers (e.g. cross-file findings
	// cache) can reuse it as part of their own cache keys without
	// rehashing every file.
	Fingerprint string
	// contains filtered or unexported fields
}

CodeIndex holds the cross-file symbol table.

func BuildIndex

func BuildIndex(files []*File, workers int, javaFiles ...*File) *CodeIndex

BuildIndex constructs a cross-file index from parsed Kotlin and Java files.

func BuildIndexCached

func BuildIndexCached(cacheDir string, files []*File, workers int, tracker perf.Tracker, javaFiles ...*File) (*CodeIndex, bool)

BuildIndexCached behaves like BuildIndexWithTracker but tries the on-disk cross-file index cache first. When cacheDir is empty, the cache is bypassed entirely and this reduces to BuildIndexWithTracker. On a miss (or when persistence fails) the full build path runs and the result is written back. Returns the index and a bool reporting whether the cache was hit.

func BuildIndexFromData

func BuildIndexFromData(symbols []Symbol, refs []Reference) *CodeIndex

BuildIndexFromData constructs a CodeIndex from pre-collected symbols and references. This lets callers reuse indexing work instead of rescanning ASTs.

func BuildIndexFromDataWithBloom

func BuildIndexFromDataWithBloom(symbols []Symbol, refs []Reference, prebuilt *bloom.BloomFilter, tracker perf.Tracker) *CodeIndex

BuildIndexFromDataWithBloom is like BuildIndexFromDataWithTracker but accepts a pre-built bloom filter. When prebuilt is non-nil it replaces the AddString loop in lookup-map construction, so warm-load paths that already unioned per-shard blooms don't pay the per-reference hash cost again. prebuilt must cover at least every ref's Name; extra items are fine (bloom false positives are already tolerated by callers) but missing items would produce false negatives and are considered a bug.

func BuildIndexFromDataWithTracker

func BuildIndexFromDataWithTracker(symbols []Symbol, refs []Reference, tracker perf.Tracker) *CodeIndex

BuildIndexFromDataWithTracker constructs a CodeIndex from pre-collected symbols and references and records sub-phase timings when tracker is enabled.

func BuildIndexIncremental

func BuildIndexIncremental(base *CodeIndex, removePaths map[string]bool, addSymbols []Symbol, addRefs []Reference) *CodeIndex

BuildIndexIncremental returns base with the listed file contributions removed and the supplied fresh contributions added. It is used by the cross-file overlay cache so a small edit can update the lookup maps without rescanning unchanged files or rewriting the compacted full payload.
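The remove-then-add step can be pictured on a plain slice of references. This sketch ignores the lookup maps and bloom filter that the real index maintains; the ref type and applyDelta helper are hypothetical.

```go
package main

import "fmt"

// ref is a hypothetical, reduced stand-in for Reference.
type ref struct {
	Name string
	File string
}

// applyDelta drops every entry contributed by a path in removePaths and
// appends the fresh entries. base is left unmodified.
func applyDelta(base []ref, removePaths map[string]bool, add []ref) []ref {
	out := make([]ref, 0, len(base)+len(add))
	for _, r := range base {
		if !removePaths[r.File] {
			out = append(out, r)
		}
	}
	return append(out, add...)
}

func main() {
	base := []ref{{Name: "Foo", File: "a.kt"}, {Name: "Bar", File: "b.kt"}}
	// b.kt was edited: discard its old contribution and add the re-scanned one.
	updated := applyDelta(base, map[string]bool{"b.kt": true}, []ref{{Name: "Baz", File: "b.kt"}})
	fmt.Println(updated)
}
```

An edited file is handled by listing its path in removePaths and passing its freshly scanned contributions as the additions; unchanged files are never rescanned.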

func BuildIndexWithTracker

func BuildIndexWithTracker(files []*File, workers int, tracker perf.Tracker, javaFiles ...*File) *CodeIndex

BuildIndexWithTracker constructs a cross-file index and records sub-phase timings when tracker is enabled.

func LoadCrossFileCacheIndex

func LoadCrossFileCacheIndex(dir, wantFingerprint string) (*CodeIndex, bool)

LoadCrossFileCacheIndex returns a fully-assembled CodeIndex (symbols, references, lookup maps, and bloom filter) when the on-disk fingerprint matches. A miss — missing files, version mismatch, decode error, fingerprint drift, or missing lookup section — is never an error; callers fall back to BuildIndex.

func (*CodeIndex) BloomStats

func (idx *CodeIndex) BloomStats() (refBits, crossBits uint)

BloomStats returns the bloom filter memory usage in bytes.

func (*CodeIndex) ClassLikeFanInStats

func (idx *CodeIndex) ClassLikeFanInStats(ignoreCommentRefs bool) []ClassFanInStat

ClassLikeFanInStats returns class-like declarations with their distinct external referencing files, sorted by descending fan-in.

func (*CodeIndex) CountNonCommentRefsInFile

func (idx *CodeIndex) CountNonCommentRefsInFile(name, file string) int

CountNonCommentRefsInFile counts references to a name in a file that are NOT inside comments.

func (*CodeIndex) IsReferencedOutsideFile

func (idx *CodeIndex) IsReferencedOutsideFile(name, file string) bool

IsReferencedOutsideFile checks if a name is referenced in any file other than the given one.

func (*CodeIndex) IsReferencedOutsideFileExcludingComments

func (idx *CodeIndex) IsReferencedOutsideFileExcludingComments(name, file string) bool

IsReferencedOutsideFileExcludingComments checks if a name has any non-comment reference in a file other than the given one.

func (*CodeIndex) IsSymbolReferencedOutsideFile

func (idx *CodeIndex) IsSymbolReferencedOutsideFile(sym Symbol, ignoreCommentRefs bool) bool

IsSymbolReferencedOutsideFile checks whether sym is referenced from another file by either simple name or fully-qualified name.

func (*CodeIndex) MayHaveReference

func (idx *CodeIndex) MayHaveReference(name string) bool

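MayHaveReference reports whether name may be referenced in any file. Judging by its name and bool result, this is presumably a fast pre-check backed by the index's bloom filter, so a true result can be a false positive; use ReferenceCount for an exact count.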

func (*CodeIndex) ReferenceCount

func (idx *CodeIndex) ReferenceCount(name string) int

ReferenceCount returns how many times a name is referenced across all files.

func (*CodeIndex) ReferenceFiles

func (idx *CodeIndex) ReferenceFiles(name string) map[string]bool

ReferenceFiles returns the set of files that reference a name.

func (*CodeIndex) ResolveCallable

func (idx *CodeIndex) ResolveCallable(file *File, receiver, name string, arity int) []ResolvedSymbol

ResolveCallable resolves a function/method/property callable from the mixed source index. arity < 0 disables arity filtering.

func (*CodeIndex) ResolveType

func (idx *CodeIndex) ResolveType(file *File, name string) []ResolvedSymbol

ResolveType resolves a type name from a Kotlin or Java source file using source-visible package and import information plus declarations in the mixed-language CodeIndex.

func (*CodeIndex) SymbolByFQN

func (idx *CodeIndex) SymbolByFQN(fqn string) (Symbol, bool)

SymbolByFQN returns the declaration with the exact fully-qualified name.

func (*CodeIndex) SymbolReferenceCount

func (idx *CodeIndex) SymbolReferenceCount(sym Symbol) int

SymbolReferenceCount returns the total number of references that can identify sym by either simple name or fully-qualified name.

func (*CodeIndex) SymbolsNamed

func (idx *CodeIndex) SymbolsNamed(name string) []Symbol

SymbolsNamed returns declarations indexed under a simple or fully-qualified name. A nil result means no matching declaration is known.

func (*CodeIndex) UnusedSymbols

func (idx *CodeIndex) UnusedSymbols(ignoreCommentRefs bool) []Symbol

UnusedSymbols returns symbols that are never referenced from any other file. If ignoreCommentRefs is true, references inside comments don't count as usage.

type ConservativeDeltaPlanner

type ConservativeDeltaPlanner struct{}

func (ConservativeDeltaPlanner) Plan

func (ConservativeDeltaPlanner) Plan(previous, current RunFingerprint, changed []string) DeltaPlan

type CrossFileCacheMeta

type CrossFileCacheMeta struct {
	Version             int                `json:"version"`
	Fingerprint         string             `json:"fingerprint"`
	KotlinFiles         int                `json:"kotlin_files"`
	JavaFiles           int                `json:"java_files"`
	XMLFiles            int                `json:"xml_files"`
	SymbolCount         int                `json:"symbol_count"`
	ReferenceCount      int                `json:"reference_count"`
	Entries             []fingerprintEntry `json:"entries,omitempty"`
	PayloadEntries      []fingerprintEntry `json:"payload_entries,omitempty"`
	OverlayEntries      []fingerprintEntry `json:"overlay_entries,omitempty"`
	RemovedPayloadPaths []string           `json:"removed_payload_paths,omitempty"`
	WrittenAt           time.Time          `json:"written_at"`
	KritVersion         string             `json:"krit_version,omitempty"`
}

CrossFileCacheMeta is persisted alongside the serialized symbols and references. JSON-encoded for human inspection.

func LoadCurrentCrossFileCacheMeta

func LoadCurrentCrossFileCacheMeta(dir string) (CrossFileCacheMeta, bool)

LoadCurrentCrossFileCacheMeta returns the current cache metadata without reading or decoding the full index payload.

type DeltaPlan

type DeltaPlan struct {
	ReusePrevious bool
	ChangedPaths  []string
	AffectedPaths []string
}

type DeltaPlanner

type DeltaPlanner interface {
	Plan(previous, current RunFingerprint, changed []string) DeltaPlan
}

type DependentsIndex added in v0.2.0

type DependentsIndex struct {
	// contains filtered or unexported fields
}

DependentsIndex is a reverse-dependency map keyed by imported FQN. Given a changed file plus the FQNs that file declares (or removes), callers can compute the tight set of source files whose rule output might change. Used by watch-mode and LSP didChange to narrow incremental rerun scope from "every file" to "files reachable from the diff".

The index records explicit imports only (`import a.b.C`). Wildcard imports (`import a.b.*`) are recorded as the package name. Aliases resolve to their underlying FQN.

func BuildDependentsIndex added in v0.2.0

func BuildDependentsIndex(files []*File) *DependentsIndex

BuildDependentsIndex walks each file's import_header nodes and constructs the per-file FQN list and its inverse. Pass parsed Kotlin files; non-Kotlin files contribute nothing. Files with no imports are still recorded with an empty slice so ImportsOfFile distinguishes "indexed but importless" from "not indexed".

func (*DependentsIndex) FilesAffectedBy added in v0.2.0

func (d *DependentsIndex) FilesAffectedBy(changedFiles, changedFQNs []string) []string

FilesAffectedBy returns the union of:

  • the changed files themselves
  • every file that explicitly imports any FQN in changedFQNs
  • every file with a wildcard import covering any FQN in changedFQNs

The result is sorted and deduped. Used by watch-mode reruns to compute "given that file F changed and declares FQNs X, which files do I need to re-run rules on?" without scanning the whole project.

changedFQNs may be nil, in which case only changedFiles are returned — useful when a file's edits affected nothing externally observable.
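The union above can be sketched with two reverse maps. This is a minimal illustration under stated assumptions, not the package's implementation: the `depIndex` type, its `byFQN` and `byPkg` fields, and the `packageOf` helper are hypothetical stand-ins, and a wildcard import is assumed to cover only FQNs declared directly in that package.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// depIndex is a hypothetical stand-in for DependentsIndex.
type depIndex struct {
	byFQN map[string][]string // FQN -> files with an explicit import
	byPkg map[string][]string // package -> files with a wildcard import
}

// packageOf strips the last dotted segment: "a.b.C" -> "a.b".
func packageOf(fqn string) string {
	if i := strings.LastIndex(fqn, "."); i >= 0 {
		return fqn[:i]
	}
	return ""
}

// filesAffectedBy unions the changed files with explicit and wildcard
// importers of each changed FQN, then sorts the deduplicated result.
func (d *depIndex) filesAffectedBy(changedFiles, changedFQNs []string) []string {
	seen := map[string]struct{}{}
	for _, f := range changedFiles {
		seen[f] = struct{}{}
	}
	for _, fqn := range changedFQNs {
		for _, f := range d.byFQN[fqn] {
			seen[f] = struct{}{}
		}
		for _, f := range d.byPkg[packageOf(fqn)] {
			seen[f] = struct{}{}
		}
	}
	out := make([]string, 0, len(seen))
	for f := range seen {
		out = append(out, f)
	}
	sort.Strings(out)
	return out
}

func main() {
	d := &depIndex{
		byFQN: map[string][]string{"a.b.C": {"x.kt"}},
		byPkg: map[string][]string{"a.b": {"y.kt"}},
	}
	fmt.Println(d.filesAffectedBy([]string{"c.kt"}, []string{"a.b.C"})) // [c.kt x.kt y.kt]
	fmt.Println(d.filesAffectedBy([]string{"c.kt"}, nil))               // [c.kt]
}
```

With a nil `changedFQNs` both inner loops are skipped, so only the changed files come back.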

func (*DependentsIndex) FilesImporting added in v0.2.0

func (d *DependentsIndex) FilesImporting(fqn string) []string

FilesImporting returns the file paths that import the given FQN explicitly. Sorted ascending. Wildcard importers are not included here — use FilesImportingPackage for those.

func (*DependentsIndex) FilesImportingPackage added in v0.2.0

func (d *DependentsIndex) FilesImportingPackage(pkg string) []string

FilesImportingPackage returns the file paths that import the given package via a wildcard (`import pkg.*`). Sorted ascending.

func (*DependentsIndex) ImportsOfFile added in v0.2.0

func (d *DependentsIndex) ImportsOfFile(path string) []string

ImportsOfFile returns the explicit-import FQNs declared in the given file path. The returned slice is sorted and deduped. Returns nil for files the index doesn't know about.

type DiskFindingsBundleStore

type DiskFindingsBundleStore struct{}

func (DiskFindingsBundleStore) Load

func (DiskFindingsBundleStore) Save

type File

type File struct {
	Path     string
	Language Language
	Content  []byte
	Lines    []string
	FlatTree *FlatTree

	// Generated is true when the file came from a build/generated/**
	// directory and was kept by the parse phase's known-safe-generator
	// allowlist (Hilt, KSP, Kapt, ViewBinding, DataBinding, etc.).
	// Rules that should not lint generated code — but still want their
	// resolver to index it — gate on this. Pure source files always
	// have Generated == false.
	Generated bool

	// Metadata carries language-specific parsed structures (e.g.
	// *android.ManifestMeta, *android.ResourceMeta, *android.BuildConfig)
	// for non-source-language files. Nil for Kotlin/Java.
	Metadata any

	// PrecomputedReferences optionally stores cross-file references
	// collected during a specialized source parse path. ReferencesPrecomputed
	// distinguishes an intentionally empty reference set from "not computed".
	PrecomputedReferences []Reference
	ReferencesPrecomputed bool

	// SuppressionIdx is the byte-range annotation index. Populated by
	// the pipeline.Parse phase as a side-effect of building Suppression;
	// retained as its own field for legacy callers and tests that have
	// not yet migrated to the unified filter.
	SuppressionIdx *SuppressionIndex

	// Suppression is the unified per-file suppression filter combining
	// annotations, config excludes, baseline, and inline comments. Built
	// once in pipeline.Parse and consulted by the dispatcher, cross-file
	// phase, and any other post-collect filter. Nil when the caller
	// (LSP/MCP ParseSingle) builds files without running Parse; the
	// dispatcher handles the nil case by lazily building a filter.
	Suppression *SuppressionFilter
	// contains filtered or unexported fields
}

File holds parsed source in flat form. The cgo parse tree is used only during flattening and is not retained on the File.

func NewParsedFile

func NewParsedFile(path string, content []byte, tree *sitter.Tree) *File

NewParsedFile builds a scanner.File from already-parsed Kotlin source. The incoming tree is flattened immediately and not retained.

func ParseFile

func ParseFile(path string) (*File, error)

ParseFile parses a Kotlin file and returns a File with its AST.

func ParseJavaFile

func ParseJavaFile(path string) (*File, error)

ParseJavaFile parses a Java file and returns a File with its AST.

func ParseJavaFileCached

func ParseJavaFileCached(path string, pc *ParseCache) (*File, error)

ParseJavaFileCached parses a Java file, consulting the parse cache first when pc is non-nil. On a cache hit the tree-sitter parse and flattenTree walk are both skipped. A nil pc behaves exactly like an uncached parse.

func ParseJavaFileCachedForIndex

func ParseJavaFileCachedForIndex(path string, pc *ParseCache, stats *JavaIndexPerf) (*File, error)

ParseJavaFileCachedForIndex is a reference-indexing-only Java parse path. It skips line splitting and, on parse-cache misses, precomputes Java references so index construction can reuse the same flattened tree walk.

func ParseKotlinFileCached

func ParseKotlinFileCached(path string, pc *ParseCache) (*File, error)

ParseKotlinFileCached parses a Kotlin file, consulting the parse cache first when pc is non-nil. On a cache hit the tree-sitter parse and flattenTree walk are both skipped. A nil pc behaves exactly like ParseFile.

func ScanFiles

func ScanFiles(paths []string, workers int) ([]*File, []error)

ScanFiles parses all files in parallel and returns parsed File objects.

func ScanFilesCached

func ScanFilesCached(paths []string, workers int, pc *ParseCache) ([]*File, []error)

ScanFilesCached is like ScanFiles but routes every file through ParseKotlinFileCached so the on-disk parse cache is consulted (and populated) on each file. A nil pc is a no-op cache.

func ScanJavaFiles

func ScanJavaFiles(paths []string, workers int) ([]*File, []error)

ScanJavaFiles parses all Java files in parallel (for reference indexing only).

func ScanJavaFilesCached

func ScanJavaFilesCached(paths []string, workers int, pc *ParseCache) ([]*File, []error)

ScanJavaFilesCached is like ScanJavaFiles but routes every file through ParseJavaFileCached so the on-disk parse cache is consulted (and populated) on each file. A nil pc is a no-op cache.

func ScanJavaFilesCachedForIndex

func ScanJavaFilesCachedForIndex(paths []string, workers int, pc *ParseCache, stats *JavaIndexPerf) ([]*File, []error)

ScanJavaFilesCachedForIndex parses Java files for cross-file indexing.

func (*File) FlatChild

func (f *File) FlatChild(parent uint32, childIdx int) uint32

func (*File) FlatChildBytesOrNil

func (f *File) FlatChildBytesOrNil(parent uint32, childType string) []byte

FlatChildBytesOrNil mirrors FlatChildTextOrEmpty for the []byte form.

func (*File) FlatChildCount

func (f *File) FlatChildCount(idx uint32) int

func (*File) FlatChildTextOrEmpty

func (f *File) FlatChildTextOrEmpty(parent uint32, childType string) string

FlatChildTextOrEmpty returns the text of the first child of the given type, or "" if no such child exists. Replaces the sentinel-zero pattern where FlatNodeText(FlatFindChild(...)) silently returned the entire file source.
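The sentinel-zero hazard can be reproduced in a few lines. The sketch below uses hypothetical `node` and `file` types rather than the package's FlatNode and File, and the node type names are illustrative only. It assumes index 0 is the root node spanning the whole source, which is why a "not found" result of 0 is dangerous when passed on to a text accessor.

```go
package main

import "fmt"

// node and file are hypothetical, trimmed-down stand-ins for FlatNode and File.
type node struct {
	Type       string
	Parent     uint32
	Start, End uint32
}

type file struct {
	Content []byte
	Nodes   []node // Nodes[0] is the root and spans the whole file
}

func (f *file) nodeText(idx uint32) string {
	n := f.Nodes[idx]
	return string(f.Content[n.Start:n.End])
}

// findChild reports a miss through the bool, so index 0 is never ambiguous.
func (f *file) findChild(parent uint32, typ string) (uint32, bool) {
	for i := 1; i < len(f.Nodes); i++ {
		if f.Nodes[i].Parent == parent && f.Nodes[i].Type == typ {
			return uint32(i), true
		}
	}
	return 0, false
}

// childTextOrEmpty returns "" on a miss instead of the root's text.
func (f *file) childTextOrEmpty(parent uint32, typ string) string {
	if idx, ok := f.findChild(parent, typ); ok {
		return f.nodeText(idx)
	}
	return ""
}

func main() {
	src := []byte("fun f() {}")
	f := &file{Content: src, Nodes: []node{
		{Type: "source_file", Start: 0, End: uint32(len(src))},
		{Type: "simple_identifier", Parent: 0, Start: 4, End: 5},
	}}
	idx, _ := f.findChild(0, "modifiers") // miss: idx is 0
	fmt.Printf("%q\n", f.nodeText(idx))   // ignoring the bool yields the whole file
	fmt.Printf("%q\n", f.childTextOrEmpty(0, "modifiers"))
	fmt.Printf("%q\n", f.childTextOrEmpty(0, "simple_identifier"))
}
```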

func (*File) FlatCol

func (f *File) FlatCol(idx uint32) int

func (*File) FlatCountNodes

func (f *File) FlatCountNodes(root uint32, nodeType string) int

func (*File) FlatEndByte

func (f *File) FlatEndByte(idx uint32) uint32

func (*File) FlatFindChild

func (f *File) FlatFindChild(parent uint32, childType string) (uint32, bool)

func (*File) FlatFindModifierNode

func (f *File) FlatFindModifierNode(idx uint32, modifier string) uint32

func (*File) FlatFirstChild

func (f *File) FlatFirstChild(parent uint32) uint32

FlatFirstChild returns the first child index of parent, or 0 if parent has no children. O(1). Intended for linked-list iteration over children:

for c := file.FlatFirstChild(p); c != 0; c = file.FlatNextSib(c) {
    // ... use c ...
}

Prefer this over `for i := 0; i < FlatChildCount(p); i++ { FlatChild(p, i) }`, which is O(i) for each access to child i and therefore O(N²) across a parent with N children.
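The cost difference between the two loops can be seen on a toy node array. This sketch uses a hypothetical `node` type with only the two link fields; `childAt` is an assumed model of an indexed accessor that walks sibling links, inferred from the complexity note above rather than from the package source.

```go
package main

import "fmt"

// node is a hypothetical, trimmed-down mirror of FlatNode.
type node struct {
	FirstChild uint32 // 0 means "no children"
	NextSib    uint32 // 0 means "last sibling"
}

// childrenLinked visits children by following NextSib links: one step per child.
func childrenLinked(nodes []node, parent uint32) []uint32 {
	var out []uint32
	for c := nodes[parent].FirstChild; c != 0; c = nodes[c].NextSib {
		out = append(out, c)
	}
	return out
}

// childAt models an indexed accessor that walks i links to reach child i.
func childAt(nodes []node, parent uint32, i int) uint32 {
	c := nodes[parent].FirstChild
	for ; i > 0 && c != 0; i-- {
		c = nodes[c].NextSib
	}
	return c
}

func main() {
	// Node 0 is the root with children 1, 2, 3.
	nodes := []node{
		{FirstChild: 1},
		{NextSib: 2},
		{NextSib: 3},
		{},
	}
	fmt.Println(childrenLinked(nodes, 0)) // [1 2 3]
	fmt.Println(childAt(nodes, 0, 2))     // 3
}
```

Because the root occupies index 0 and can never be a child or a sibling, 0 is free to serve as the end-of-list marker in both link fields.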

func (*File) FlatForEachChild

func (f *File) FlatForEachChild(parent uint32, fn func(uint32))

func (*File) FlatHasAncestorOfType

func (f *File) FlatHasAncestorOfType(idx uint32, ancestorType string) bool

func (*File) FlatHasAnyAncestorOfType

func (f *File) FlatHasAnyAncestorOfType(idx uint32, ancestorTypes ...uint16) bool

func (*File) FlatHasChildOfType

func (f *File) FlatHasChildOfType(parent uint32, childType string) bool

func (*File) FlatHasModifier

func (f *File) FlatHasModifier(idx uint32, modifier string) bool

func (*File) FlatIsNamed

func (f *File) FlatIsNamed(idx uint32) bool

FlatIsNamed reports whether the node at idx is a named tree-sitter node (i.e., not an anonymous punctuation / keyword token). O(1). Use inside linked-list child iteration to replicate the semantics of FlatNamedChild:

for c := file.FlatFirstChild(p); c != 0; c = file.FlatNextSib(c) {
    if !file.FlatIsNamed(c) { continue }
    // ... use c as a named child ...
}

func (*File) FlatNamedChild

func (f *File) FlatNamedChild(parent uint32, childIdx int) uint32

func (*File) FlatNamedChildCount

func (f *File) FlatNamedChildCount(idx uint32) int

func (*File) FlatNamedDescendantForByteRange

func (f *File) FlatNamedDescendantForByteRange(startByte, endByte uint32) (uint32, bool)

func (*File) FlatNextSib

func (f *File) FlatNextSib(idx uint32) uint32

FlatNextSib returns the next sibling index, or 0 if idx is the last child. O(1). Simpler variant of FlatNextSibling (which returns (uint32, bool)) for the linked-list iteration idiom.

func (*File) FlatNextSibling

func (f *File) FlatNextSibling(idx uint32) (uint32, bool)

func (*File) FlatNodeBytes

func (f *File) FlatNodeBytes(idx uint32) []byte

func (*File) FlatNodeString

func (f *File) FlatNodeString(idx uint32, pool *StringPool) string

func (*File) FlatNodeText

func (f *File) FlatNodeText(idx uint32) string

func (*File) FlatNodeTextEquals

func (f *File) FlatNodeTextEquals(idx uint32, s string) bool

func (*File) FlatParent

func (f *File) FlatParent(idx uint32) (uint32, bool)

func (*File) FlatPrevSibling

func (f *File) FlatPrevSibling(idx uint32) (uint32, bool)

func (*File) FlatRow

func (f *File) FlatRow(idx uint32) int

func (*File) FlatStartByte

func (f *File) FlatStartByte(idx uint32) uint32

func (*File) FlatType

func (f *File) FlatType(idx uint32) string

func (*File) FlatWalkAllNodes

func (f *File) FlatWalkAllNodes(root uint32, fn func(uint32))

func (*File) FlatWalkNodes

func (f *File) FlatWalkNodes(root uint32, nodeType string, fn func(uint32))

func (*File) LineOffset

func (f *File) LineOffset(lineIdx int) int

LineOffset returns the byte offset for the start of the given line index (0-based). If lineIdx is out of range, returns len(Content).

func (*File) LineOffsets

func (f *File) LineOffsets() []int

LineOffsets returns the byte offset of each line start, computed lazily and cached.
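A minimal sketch of the lazy line-offset table, assuming lines are split on '\n' and that a negative index is treated like any other out-of-range index. The `file` type here is a hypothetical stand-in, and the caching shown is not safe for concurrent use.

```go
package main

import "fmt"

// file is a hypothetical stand-in for File with only the fields needed here.
type file struct {
	Content     []byte
	lineOffsets []int // computed on first use
}

// LineOffsets returns the byte offset of each line start, cached after the first call.
func (f *file) LineOffsets() []int {
	if f.lineOffsets == nil {
		offs := []int{0}
		for i, b := range f.Content {
			if b == '\n' {
				offs = append(offs, i+1)
			}
		}
		f.lineOffsets = offs
	}
	return f.lineOffsets
}

// LineOffset returns the start of the 0-based line, or len(Content) when out of range.
func (f *file) LineOffset(lineIdx int) int {
	offs := f.LineOffsets()
	if lineIdx < 0 || lineIdx >= len(offs) {
		return len(f.Content)
	}
	return offs[lineIdx]
}

func main() {
	f := &file{Content: []byte("ab\ncd")}
	fmt.Println(f.LineOffsets())                  // [0 3]
	fmt.Println(f.LineOffset(1), f.LineOffset(5)) // 3 5
}
```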

type FileStat added in v0.2.0

type FileStat struct {
	Size            int64 `json:"size"`
	ModTimeUnixNano int64 `json:"modTimeUnixNano"`
}

type Finding

type Finding struct {
	File       string
	Line       int
	Col        int
	StartByte  int
	EndByte    int
	RuleSet    string
	Rule       string
	Severity   string
	Message    string
	Fix        *Fix       // nil if no auto-fix available
	BinaryFix  *BinaryFix // nil if no binary fix available
	Confidence float64    // 0.0-1.0, 0 means not set
}

Finding is the serialization-boundary representation of a single lint finding. Internally krit stores findings in columnar form via FindingColumns (see findings.go) — Finding is the per-row struct used at boundaries: output formatters (JSON/SARIF/plain/checkstyle) marshal from it, rule bodies produce it for Context.Emit (which immediately writes into a FindingCollector), and tests construct it to seed columns via CollectFindings. New internal code should prefer the columnar accessors; construct Finding only at serialization or emit boundaries.

func FilterByBaseline

func FilterByBaseline(findings []Finding, baseline *Baseline, basePath string) []Finding

FilterByBaseline removes findings that are in the baseline. Tries both relative-path IDs and filename-only IDs for compatibility with baseline files generated with or without module-relative paths.

type FindingCollector

type FindingCollector struct {
	// contains filtered or unexported fields
}

FindingCollector incrementally builds a FindingColumns instance.

FindingCollector is NOT safe for concurrent use: its intern maps and column slices are unsynchronized. To parallelize rule or per-file execution, each worker goroutine should hold its own FindingCollector and the phase owner should serially merge them at a phase boundary via AppendColumns or MergeCollectors. Deterministic output is recovered with SortByFileLine (or SortedRowOrderByFileLine) after the merge — never per-worker — so cross-worker interleavings do not affect the final row order.

func MergeCollectors

func MergeCollectors(dst *FindingCollector, workers ...*FindingCollector) *FindingCollector

MergeCollectors serially folds the columns from each worker-local collector into a single destination collector, preserving the order in which workers are passed and each worker's own insertion order. It is intended to be called once at a phase boundary after all worker goroutines have stopped appending; MergeCollectors itself is single-threaded.

A nil entry in workers is skipped. The returned collector is the same as dst; when dst is nil a fresh collector sized for the combined row count is allocated.
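The worker-local pattern described for FindingCollector and MergeCollectors can be sketched as follows. The `finding`, `collector`, `merge`, and `run` names are hypothetical and much simpler than the real columnar types; the point is the shape: one collector per goroutine, a serial merge after `wg.Wait()`, and a single sort after the merge.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

type finding struct {
	File string
	Line int
}

// collector is unsynchronized on purpose: each goroutine owns exactly one.
type collector struct{ rows []finding }

func (c *collector) Append(f finding) { c.rows = append(c.rows, f) }

// merge folds workers into dst in argument order, skipping nil entries.
// A nil dst is replaced by a fresh collector sized for all rows.
func merge(dst *collector, workers ...*collector) *collector {
	if dst == nil {
		n := 0
		for _, w := range workers {
			if w != nil {
				n += len(w.rows)
			}
		}
		dst = &collector{rows: make([]finding, 0, n)}
	}
	for _, w := range workers {
		if w != nil {
			dst.rows = append(dst.rows, w.rows...)
		}
	}
	return dst
}

// run gives each batch its own collector, merges serially, then sorts once.
func run(batches [][]string) []finding {
	locals := make([]*collector, len(batches))
	var wg sync.WaitGroup
	for i, batch := range batches {
		locals[i] = &collector{}
		wg.Add(1)
		go func(c *collector, batch []string) {
			defer wg.Done()
			for j, name := range batch {
				c.Append(finding{File: name, Line: j + 1})
			}
		}(locals[i], batch)
	}
	wg.Wait()
	merged := merge(nil, locals...)
	sort.SliceStable(merged.rows, func(i, j int) bool {
		a, b := merged.rows[i], merged.rows[j]
		if a.File != b.File {
			return a.File < b.File
		}
		return a.Line < b.Line
	})
	return merged.rows
}

func main() {
	fmt.Println(run([][]string{{"b.kt", "a.kt"}, {"c.kt"}}))
}
```

Sorting only after the merge means the result does not depend on which goroutine finished first.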

func NewFindingCollector

func NewFindingCollector(capacity int) *FindingCollector

NewFindingCollector creates a collector sized for an expected number of rows.

func (*FindingCollector) Append

func (c *FindingCollector) Append(f Finding)

Append adds a finding row to the collector.

func (*FindingCollector) AppendAll

func (c *FindingCollector) AppendAll(findings []Finding)

AppendAll adds each finding in order.

func (*FindingCollector) AppendColumns

func (c *FindingCollector) AppendColumns(columns *FindingColumns)

AppendColumns merges an existing columnar finding set into the collector.

func (*FindingCollector) AppendRow

func (c *FindingCollector) AppendRow(columns *FindingColumns, row int)

AppendRow copies a single row from an existing columnar finding set.

func (*FindingCollector) Columns

func (c *FindingCollector) Columns() *FindingColumns

Columns returns the built columns.

type FindingColumns

type FindingColumns struct {
	Files         []string
	RuleSets      []string
	Rules         []string
	Messages      []string
	FixPool       []Fix
	BinaryFixPool []BinaryFix

	FileIdx        []uint32
	Line           []uint32
	Col            []uint16
	StartByte      []uint32
	EndByte        []uint32
	RuleSetIdx     []uint16
	RuleIdx        []uint16
	SeverityID     []uint8
	MessageIdx     []uint32
	Confidence     []uint8
	FixStart       []uint32
	BinaryFixStart []uint32
	N              int
	// contains filtered or unexported fields
}

FindingColumns stores finding rows in mostly-scalar parallel slices. String-heavy fields are interned into side tables so the hot row data stays flat.
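As an illustration of the interning idea, the sketch below stores each distinct file path once in a side table and keeps only a uint32 index per row. The `columns` and `collector` types are hypothetical and cover just the file and line columns; the real struct interns rulesets, rules, and messages in a similar way, as its field list suggests.

```go
package main

import "fmt"

// columns is a hypothetical two-column slice of FindingColumns.
type columns struct {
	Files   []string // side table: each distinct path stored once
	FileIdx []uint32 // per-row index into Files
	Line    []uint32 // per-row line number
}

type collector struct {
	cols    columns
	fileIDs map[string]uint32 // intern map: path -> index in cols.Files
}

func newCollector() *collector { return &collector{fileIDs: map[string]uint32{}} }

// Append interns the file path and writes one scalar per column.
func (c *collector) Append(file string, line int) {
	id, ok := c.fileIDs[file]
	if !ok {
		id = uint32(len(c.cols.Files))
		c.cols.Files = append(c.cols.Files, file)
		c.fileIDs[file] = id
	}
	c.cols.FileIdx = append(c.cols.FileIdx, id)
	c.cols.Line = append(c.cols.Line, uint32(line))
}

// FileAt resolves row i back to its path through the side table.
func (c *columns) FileAt(i int) string { return c.Files[c.FileIdx[i]] }

func main() {
	c := newCollector()
	c.Append("a.kt", 3)
	c.Append("b.kt", 7)
	c.Append("a.kt", 9)
	fmt.Println(len(c.cols.Files), c.cols.FileIdx, c.cols.FileAt(2)) // 2 [0 1 0] a.kt
}
```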

func ApplyDelta

func ApplyDelta(previous *FindingColumns, replacement *FindingColumns, affected []string) FindingColumns

func CollectFindings

func CollectFindings(findings []Finding) FindingColumns

CollectFindings materializes findings into columnar storage in one step.

func FilterColumnsByBaseline

func FilterColumnsByBaseline(columns *FindingColumns, baseline *Baseline, basePath string) FindingColumns

FilterColumnsByBaseline removes findings that are in the baseline without materializing intermediate []Finding slices.

func FilterColumnsByFilePaths

func FilterColumnsByFilePaths(columns *FindingColumns, allowedPaths map[string]bool) FindingColumns

FilterColumnsByFilePaths keeps only rows whose file path resolves to an absolute path present in allowedPaths.

func LoadAndroidFindings

func LoadAndroidFindings(cacheDir, key string) (FindingColumns, bool)

LoadAndroidFindings reads cached findings for the given key. Returns (cols, true) on hit and (FindingColumns{}, false) on any miss (missing file, version/CRC mismatch, key collision, decode error).

func LoadCrossFindings

func LoadCrossFindings(cacheDir, key string) (FindingColumns, bool)

LoadCrossFindings reads cached cross-rule findings for the given key. Returns (cols, true) on hit and (FindingColumns{}, false) on any miss (including key mismatch, version mismatch, CRC failure, missing file).

func LoadLastCrossFindings added in v0.2.0

func LoadLastCrossFindings(cacheDir string) (FindingColumns, bool)

LoadLastCrossFindings reads the single on-disk cross-findings snapshot without requiring the caller to know its key. This is for warm-delta planning: the cache file is intentionally a last-value slot, so a body-only edit can reuse the previous snapshot before the current content fingerprint is available.

func (*FindingColumns) BinaryFixAt

func (c *FindingColumns) BinaryFixAt(i int) *BinaryFix

BinaryFixAt returns a cloned binary fix for row i, or nil when absent.

func (*FindingColumns) Clone

func (c *FindingColumns) Clone() FindingColumns

Clone returns a deep copy of the columns, including pooled fix state.

func (*FindingColumns) ColumnAt

func (c *FindingColumns) ColumnAt(i int) int

ColumnAt returns the 1-based column number for row i.

func (*FindingColumns) ConfidenceAt

func (c *FindingColumns) ConfidenceAt(i int) float64

ConfidenceAt returns the confidence value for row i in the 0.0-1.0 range.

func (*FindingColumns) CountTextFixes

func (c *FindingColumns) CountTextFixes() int

CountTextFixes returns the number of rows with a text auto-fix.

func (*FindingColumns) EndByteAt

func (c *FindingColumns) EndByteAt(i int) int

EndByteAt returns the byte offset where row i ends, or 0 when unset.

func (*FindingColumns) FileAt

func (c *FindingColumns) FileAt(i int) string

FileAt returns the file path for row i.

func (*FindingColumns) FilterByMinConfidence

func (c *FindingColumns) FilterByMinConfidence(minVal float64) FindingColumns

FilterByMinConfidence drops rows whose confidence is strictly below the given threshold (0.0-1.0). A row with confidence == 0 is treated as "unset": it is kept when minVal is 0 and dropped when minVal > 0. Returns a new FindingColumns; the original is unchanged.

func (*FindingColumns) FilterRows

func (c *FindingColumns) FilterRows(keep func(row int) bool) FindingColumns

FilterRows keeps rows for which keep returns true while preserving fix pools and interned string tables via the collector append path.

func (*FindingColumns) Finding

func (c *FindingColumns) Finding(i int) Finding

Finding reconstructs the i'th row as a compatibility Finding value.

func (*FindingColumns) Findings

func (c *FindingColumns) Findings() []Finding

Findings reconstructs all rows as compatibility Finding values.

func (*FindingColumns) FindingsWithFixes

func (c *FindingColumns) FindingsWithFixes() []Finding

FindingsWithFixes reconstructs only rows that carry a text or binary fix. Row order is preserved so downstream fix application stays deterministic.

func (*FindingColumns) FixAt

func (c *FindingColumns) FixAt(i int) *Fix

FixAt returns a copy of the text auto-fix for row i, or nil when absent.

func (*FindingColumns) HasFix

func (c *FindingColumns) HasFix(i int) bool

HasFix reports whether row i has a text auto-fix.

func (*FindingColumns) Len

func (c *FindingColumns) Len() int

Len returns the number of stored findings.

func (*FindingColumns) LineAt

func (c *FindingColumns) LineAt(i int) int

LineAt returns the 1-based line number for row i.

func (FindingColumns) MarshalJSON

func (c FindingColumns) MarshalJSON() ([]byte, error)

MarshalJSON persists finding columns with a stable lowercase schema rather than exposing Go field names directly.

func (*FindingColumns) MessageAt

func (c *FindingColumns) MessageAt(i int) string

MessageAt returns the message string for row i.

func (*FindingColumns) PromoteWarningsToErrors

func (c *FindingColumns) PromoteWarningsToErrors()

PromoteWarningsToErrors rewrites warning severities in-place without materializing Finding structs.

func (*FindingColumns) RuleAt

func (c *FindingColumns) RuleAt(i int) string

RuleAt returns the rule name for row i.

func (*FindingColumns) RuleSetAt

func (c *FindingColumns) RuleSetAt(i int) string

RuleSetAt returns the ruleset name for row i.

func (*FindingColumns) SeverityAt

func (c *FindingColumns) SeverityAt(i int) string

SeverityAt returns the severity string for row i.

func (*FindingColumns) SortByFileLine

func (c *FindingColumns) SortByFileLine()

SortByFileLine reorders rows in-place using file, line, then column ordering.

func (*FindingColumns) SortedRowOrderByFileLine

func (c *FindingColumns) SortedRowOrderByFileLine() []int

SortedRowOrderByFileLine returns row indexes ordered by file, line, column, then lexical ruleset/rule tie-breakers. The column data is not mutated.

func (*FindingColumns) StartByteAt

func (c *FindingColumns) StartByteAt(i int) int

StartByteAt returns the byte offset where row i starts, or 0 when unset.

func (*FindingColumns) StripTextFixes

func (c *FindingColumns) StripTextFixes(drop func(row int) bool) int

StripTextFixes removes text auto-fixes from rows matching drop and returns the number of stripped fixes. Binary fixes are preserved.

func (*FindingColumns) UnmarshalJSON

func (c *FindingColumns) UnmarshalJSON(data []byte) error

UnmarshalJSON accepts the stable lowercase schema and the prior exported Go field-name schema written by older iterations of the cache.

func (*FindingColumns) VisitRowsWithBinaryFixes

func (c *FindingColumns) VisitRowsWithBinaryFixes(yield func(row int))

VisitRowsWithBinaryFixes visits row indexes that carry a binary auto-fix.

func (*FindingColumns) VisitRowsWithTextFixes

func (c *FindingColumns) VisitRowsWithTextFixes(yield func(row int))

VisitRowsWithTextFixes visits row indexes that carry a text auto-fix.

func (*FindingColumns) VisitSortedByFileLine

func (c *FindingColumns) VisitSortedByFileLine(yield func(row int))

VisitSortedByFileLine visits row indexes ordered by file, line, column, then lexical ruleset/rule tie-breakers without allocating a result slice.

type FindingsBundleManifest added in v0.2.0

type FindingsBundleManifest struct {
	Version       int                 `json:"version"`
	Key           string              `json:"key"`
	BundleKey     string              `json:"bundleKey"`
	Fingerprint   RunFingerprint      `json:"fingerprint"`
	ContentHashes map[string]string   `json:"contentHashes"`
	StructuralFPs map[string]string   `json:"structuralFps,omitempty"`
	FileStats     map[string]FileStat `json:"fileStats,omitempty"`
}

func LoadFindingsBundleManifest added in v0.2.0

func LoadFindingsBundleManifest(repoDir, key string) (FindingsBundleManifest, bool)

LoadFindingsBundleManifest reads the manifest for the given run key, returning (manifest, true) on success or (zero, false) when it's missing or invalid. Manifest-version mismatches are treated as missing so a krit upgrade doesn't serve stale entries.

type FindingsBundleStore

type FindingsBundleStore interface {
	Load(root string, fp RunFingerprint) (*FindingColumns, bool)
	Save(root string, fp RunFingerprint, cols *FindingColumns) error
}

type Fix

type Fix struct {
	// Line-based replacement: replace lines[StartLine-1:EndLine] with Replacement
	StartLine   int
	EndLine     int
	Replacement string
	// Byte-based replacement (more precise): replace content[StartByte:EndByte]
	StartByte int
	EndByte   int
	ByteMode  bool // if true, use byte offsets instead of line offsets
}

Fix describes an auto-fix for a finding.
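The two replacement modes in the struct comments can be sketched as below. The `fix` type mirrors the exported fields, but `applyFix` is a hypothetical helper, not a function of this package. It assumes valid ranges (no bounds checking), '\n' line endings, and a line-mode Replacement without a trailing newline.

```go
package main

import (
	"fmt"
	"strings"
)

// fix mirrors the fields of Fix for illustration.
type fix struct {
	StartLine, EndLine int // 1-based, inclusive
	Replacement        string
	StartByte, EndByte int // half-open byte range
	ByteMode           bool
}

// applyFix is a hypothetical helper showing both replacement modes.
func applyFix(content string, f fix) string {
	if f.ByteMode {
		// Replace content[StartByte:EndByte].
		return content[:f.StartByte] + f.Replacement + content[f.EndByte:]
	}
	// Replace lines[StartLine-1:EndLine] with a single replacement entry.
	lines := strings.Split(content, "\n")
	out := append([]string{}, lines[:f.StartLine-1]...)
	out = append(out, f.Replacement)
	out = append(out, lines[f.EndLine:]...)
	return strings.Join(out, "\n")
}

func main() {
	src := "val a = 1\nvar b = 2\nval c = 3"
	byLine := applyFix(src, fix{StartLine: 2, EndLine: 2, Replacement: "val b = 2"})
	byByte := applyFix(src, fix{StartByte: 10, EndByte: 13, Replacement: "val", ByteMode: true})
	fmt.Println(byLine == byByte) // true
	fmt.Println(byByte)
}
```

The byte-mode fix touches only the three bytes of `var`, which is why it is described as the more precise form.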

type FlatNode

type FlatNode struct {
	Type       uint16
	Parent     uint32
	FirstChild uint32
	NextSib    uint32
	PrevSib    uint32
	StartByte  uint32
	EndByte    uint32
	StartRow   uint16
	StartCol   uint16
	ChildCount uint16
	NamedCount uint16
	Flags      uint8
}

FlatNode stores a tree-sitter node in a compact, cgo-free form. Size: 40 bytes. PrevSib is stored explicitly so FlatPrevSibling can be O(1); without it, prev-sibling access required walking from FirstChild which gave O(sibling_index) per call and O(N²) across adjacent callers.

func (FlatNode) HasError

func (n FlatNode) HasError() bool

HasError reports whether this node or its subtree contains a parse error.

func (FlatNode) IsNamed

func (n FlatNode) IsNamed() bool

IsNamed reports whether this node is a named tree-sitter node.

func (FlatNode) TypeName

func (n FlatNode) TypeName() string

TypeName resolves the node's interned type back to its string name.

type FlatTree

type FlatTree struct {
	Nodes []FlatNode
}

FlatTree holds a preorder-flattened syntax tree.

type JavaIndexPerf

type JavaIndexPerf struct {
	Files       atomic.Int64
	Bytes       atomic.Int64
	CacheHits   atomic.Int64
	CacheMisses atomic.Int64

	FileReadNs            atomic.Int64
	ParseCacheLoadNs      atomic.Int64
	TreeSitterParseNs     atomic.Int64
	FlattenTreeNs         atomic.Int64
	QueueParseCacheSaveNs atomic.Int64
	ReferenceExtractionNs atomic.Int64
}

JavaIndexPerf aggregates Java parse/reference timings across parallel workers. Durations are stored in nanoseconds and emitted once by callers.

func (*JavaIndexPerf) Snapshot

func (p *JavaIndexPerf) Snapshot() JavaIndexPerfSnapshot

Snapshot returns a point-in-time copy of the aggregate counters.

type JavaIndexPerfSnapshot

type JavaIndexPerfSnapshot struct {
	Files       int64
	Bytes       int64
	CacheHits   int64
	CacheMisses int64

	FileReadNs            int64
	ParseCacheLoadNs      int64
	TreeSitterParseNs     int64
	FlattenTreeNs         int64
	QueueParseCacheSaveNs int64
	ReferenceExtractionNs int64
}

JavaIndexPerfSnapshot is an immutable copy of JavaIndexPerf counters.

type Language

type Language uint8

Language identifies which source language a File holds. Used by the dispatcher to skip rules whose declared Languages list excludes this file.

const (
	// LangKotlin is the default for files parsed by ParseFile. Rules with
	// no declared Languages list default to targeting Kotlin only.
	LangKotlin Language = iota
	LangJava
	// LangXML covers both AndroidManifest.xml and res/ XML files. The
	// specific kind (manifest vs resource) lives in File.Metadata.
	LangXML
	LangGradle
	LangVersionCatalog
)

func (Language) String

func (l Language) String() string

String returns a short human-readable name for the language.

type LocalPool

type LocalPool struct {
	// contains filtered or unexported fields
}

LocalPool caches recently-seen values without synchronization and falls back to the shared global pool when a string is first observed.

func NewLocalPool

func NewLocalPool(fallback *StringPool) *LocalPool

NewLocalPool creates an unsynchronized pool backed by fallback.

func (*LocalPool) Intern

func (p *LocalPool) Intern(s string) string

Intern returns a canonical string value and promotes it into the local cache.
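The two-level lookup can be sketched as a private map in front of a mutex-guarded shared pool. Both types below are hypothetical stand-ins: the real StringPool's internals are not shown on this page, and the real local cache may be bounded, whereas this one grows without limit.

```go
package main

import (
	"fmt"
	"sync"
)

// sharedPool is a hypothetical stand-in for the global StringPool.
type sharedPool struct {
	mu      sync.Mutex
	values  map[string]string
	lookups int // counts how often the lock is taken, for the demo
}

func newSharedPool() *sharedPool { return &sharedPool{values: map[string]string{}} }

func (p *sharedPool) Intern(s string) string {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.lookups++
	if v, ok := p.values[s]; ok {
		return v
	}
	p.values[s] = s
	return s
}

// localPool is owned by one goroutine, so its map needs no lock.
type localPool struct {
	cache    map[string]string
	fallback *sharedPool
}

func newLocalPool(fallback *sharedPool) *localPool {
	return &localPool{cache: map[string]string{}, fallback: fallback}
}

// Intern checks the private cache first and promotes misses into it.
func (p *localPool) Intern(s string) string {
	if v, ok := p.cache[s]; ok {
		return v
	}
	v := p.fallback.Intern(s)
	p.cache[s] = v
	return v
}

func main() {
	shared := newSharedPool()
	local := newLocalPool(shared)
	local.Intern("simple_identifier")
	local.Intern("simple_identifier")
	fmt.Println(shared.lookups) // 1
}
```

Only the first sighting of a string pays for the shared lock; repeats are served from the goroutine-private map.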

type ParseCache

type ParseCache struct {
	// contains filtered or unexported fields
}

ParseCache persists FlatTree parse results keyed by content hash. A nil *ParseCache is a valid disabled cache — every method is a safe no-op.

Each language holds its own LRU size cap; when a langCache's on-disk total exceeds its CapBytes, Save evicts the least-recently-accessed entries down to LowWaterFrac (80%) of the cap. Caps are per-language so a huge Kotlin corpus doesn't starve the Java cache and vice versa.
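The eviction rule can be sketched as a single pass over one language's entries. The `entry` type and `evict` function are hypothetical; the sketch assumes eviction triggers only when the total exceeds the cap, removes the least recently accessed entries first, and stops once the total is at or below cap × low-water fraction. A cap of zero or less disables eviction, matching the NewParseCacheWithCap description.

```go
package main

import (
	"fmt"
	"sort"
)

// entry is a hypothetical record of one cached file.
type entry struct {
	Key        string
	Size       int64
	LastAccess int64 // larger means more recent
}

// evict returns the surviving entries and the keys removed, oldest first.
func evict(entries []entry, capBytes int64, lowWaterFrac float64) (kept []entry, evicted []string) {
	kept = append(kept, entries...) // copy so the caller's slice is not reordered
	var total int64
	for _, e := range kept {
		total += e.Size
	}
	if capBytes <= 0 || total <= capBytes {
		return kept, nil
	}
	sort.Slice(kept, func(i, j int) bool { return kept[i].LastAccess < kept[j].LastAccess })
	target := int64(float64(capBytes) * lowWaterFrac)
	for len(kept) > 0 && total > target {
		evicted = append(evicted, kept[0].Key)
		total -= kept[0].Size
		kept = kept[1:]
	}
	return kept, evicted
}

func main() {
	entries := []entry{
		{Key: "a", Size: 40, LastAccess: 1},
		{Key: "b", Size: 40, LastAccess: 3},
		{Key: "c", Size: 40, LastAccess: 2},
	}
	kept, evicted := evict(entries, 100, 0.8)
	fmt.Println(len(kept), evicted) // 2 [a]
}
```

Evicting down to a low-water mark rather than to the cap itself leaves headroom, so the next few writes do not immediately trigger another pass.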

func NewParseCache

func NewParseCache(repoDir string) (*ParseCache, error)

NewParseCache returns a ParseCache rooted at repoDir/.krit/parse-cache. A schema-version, hash-algo, or grammar-version mismatch in the existing metadata clears the affected language's entries subtree. Kotlin and Java are versioned independently. The default per-language size cap (cacheutil.DefaultParseCacheCapBytes) is applied.

func NewParseCacheWithCap

func NewParseCacheWithCap(repoDir string, capBytes int64) (*ParseCache, error)

NewParseCacheWithCap is NewParseCache with an explicit per-language byte cap. capBytes <= 0 disables the cap (no eviction). The cap applies to each language's subtree independently so a Kotlin-heavy repo doesn't starve Java cached entries.

func (*ParseCache) AddPerfEntries

func (pc *ParseCache) AddPerfEntries(t perf.Tracker)

func (*ParseCache) AsyncStats

func (pc *ParseCache) AsyncStats() cacheutil.AsyncWriterStats

AsyncStats returns the background writer counters when async persistence is enabled.

func (*ParseCache) Clear

func (pc *ParseCache) Clear() error

Clear removes every cache entry across both languages. The version / grammar-version metadata files are left in place so a subsequent NewParseCache call does not see a schema mismatch.

func (*ParseCache) Close

func (pc *ParseCache) Close() error

Close flushes any async writes and the per-language LRU sidecars. Safe to call multiple times; a nil ParseCache Close is a no-op so callers can always invoke it.
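The "nil is a valid disabled cache" contract relies on Go allowing method calls on nil pointer receivers. A minimal sketch of that pattern follows, with a hypothetical in-memory `parseCache` in place of the real on-disk type.

```go
package main

import "fmt"

// parseCache is a hypothetical in-memory stand-in for ParseCache.
type parseCache struct{ entries map[string][]byte }

// Load is always a miss on a nil receiver.
func (pc *parseCache) Load(key string) ([]byte, bool) {
	if pc == nil {
		return nil, false
	}
	v, ok := pc.entries[key]
	return v, ok
}

// Save silently does nothing on a nil receiver.
func (pc *parseCache) Save(key string, v []byte) error {
	if pc == nil {
		return nil
	}
	pc.entries[key] = v
	return nil
}

// Close is safe to call on nil and safe to call more than once.
func (pc *parseCache) Close() error {
	if pc == nil {
		return nil
	}
	return nil // the real type would flush pending writes here
}

func main() {
	var pc *parseCache // caching disabled
	_, hit := pc.Load("k")
	fmt.Println(hit, pc.Save("k", []byte("tree")), pc.Close()) // false <nil> <nil>
}
```

Callers can then write `defer pc.Close()` unconditionally instead of guarding every call site with a nil check.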

func (*ParseCache) CloseIdle

func (pc *ParseCache) CloseIdle() error

CloseIdle shuts down background workers without flushing LRU metadata or applying eviction. Use it for read-only runs: cache hits may dirty access times, but persisting those touches is not worth blocking process exit.

func (*ParseCache) Dir

func (pc *ParseCache) Dir() string

Dir returns the Kotlin subtree root for the on-disk cache. Kept pointing at the Kotlin dir (not the parse-cache parent) so existing tests that check for cached Kotlin entries under {Dir}/entries keep working. Use Root to get the parent directory containing both languages.

func (*ParseCache) Evict

func (pc *ParseCache) Evict()

Evict forces a cap eviction pass on both per-language LRUs. The hot write path defers eviction (it would otherwise sort+delete on every batch); production callers run eviction once at Close. Tests that want to observe eviction without going through Close call this.

func (*ParseCache) Flush

func (pc *ParseCache) Flush() error

Flush waits for all accepted async write jobs to finish. Synchronous caches have nothing to drain.

func (*ParseCache) HasWrites

func (pc *ParseCache) HasWrites() bool

HasWrites reports whether this process queued or performed parse-cache writes. Cache-hit LRU touches are intentionally excluded so callers can choose a read-only close path.

func (*ParseCache) JavaDir

func (pc *ParseCache) JavaDir() string

JavaDir returns the Java subtree root for the on-disk cache. Empty when pc is nil.

func (*ParseCache) LRUStats

func (pc *ParseCache) LRUStats() cacheutil.LRUStats

LRUStats returns a combined LRU snapshot across both languages. Entries and Bytes are summed; Cap reflects the per-language cap (both languages share the same configured cap).

func (*ParseCache) Load

func (pc *ParseCache) Load(path string, content []byte) (*FlatTree, bool)

Load tries to load a cached Kotlin FlatTree for the given content. Returns (tree, true) on hit, (nil, false) on miss, small file, or any read/decode error. A nil ParseCache is always a miss. When path is non-empty, the content hash is also recorded in the shared hashutil.Memo so downstream subsystems (cross-file index, oracle, incremental cache) reuse it without re-reading or re-hashing.

func (*ParseCache) LoadJava

func (pc *ParseCache) LoadJava(path string, content []byte) (*FlatTree, bool)

LoadJava is the Java-language equivalent of Load.

func (*ParseCache) Root

func (pc *ParseCache) Root() string

Root returns the parent directory that contains both language subtrees. Exposed for diagnostics; callers that want to target a specific language should use the language-specific Load/Save entrypoints.

func (*ParseCache) Save

func (pc *ParseCache) Save(path string, content []byte, tree *FlatTree) error

Save persists the Kotlin parse result for content under its content hash. Small files are skipped. A returned error means the write failed and the next run will miss; callers typically discard it.

func (*ParseCache) SaveAsync

func (pc *ParseCache) SaveAsync(path string, content []byte, tree *FlatTree) error

SaveAsync persists the Kotlin parse result using the configured background writer when present. The content hash and FlatTree node snapshot are captured before Submit returns so downstream cache users still benefit from the shared hash memo and the job does not retain a mutable caller-owned slice.

func (*ParseCache) SaveJava

func (pc *ParseCache) SaveJava(path string, content []byte, tree *FlatTree) error

SaveJava is the Java-language equivalent of Save.

func (*ParseCache) SaveJavaAsync

func (pc *ParseCache) SaveJavaAsync(path string, content []byte, tree *FlatTree) error

SaveJavaAsync is the Java-language equivalent of SaveAsync.

func (*ParseCache) SetAsyncWriter

func (pc *ParseCache) SetAsyncWriter(w *cacheutil.AsyncWriter)

SetAsyncWriter enables bounded background persistence for SaveAsync and SaveJavaAsync. Passing nil restores synchronous behavior.
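As a rough sketch of the behaviour described for SaveAsync and SetAsyncWriter, the following shows a bounded background writer with a synchronous fallback when no writer is set. The `asyncWriter` and `cache` types are hypothetical stand-ins; the real `cacheutil.AsyncWriter` API is not shown in this section and may differ.

```go
package main

import (
	"fmt"
	"sync"
)

// asyncWriter is a hypothetical bounded job queue with one worker.
type asyncWriter struct {
	jobs chan func()
	wg   sync.WaitGroup
}

func newAsyncWriter(capacity int) *asyncWriter {
	w := &asyncWriter{jobs: make(chan func(), capacity)}
	w.wg.Add(1)
	go func() {
		defer w.wg.Done()
		for job := range w.jobs {
			job()
		}
	}()
	return w
}

// Close drains the queue and waits for the worker to finish.
func (w *asyncWriter) Close() {
	close(w.jobs)
	w.wg.Wait()
}

// cache is a hypothetical stand-in for ParseCache.
type cache struct {
	writer *asyncWriter // nil means synchronous persistence
	mu     sync.Mutex
	saved  []string
}

// saveAsync snapshots content before returning, so the queued job never
// reads the caller's (possibly mutated) slice.
func (c *cache) saveAsync(content []byte) {
	snapshot := append([]byte(nil), content...)
	job := func() {
		c.mu.Lock()
		c.saved = append(c.saved, string(snapshot))
		c.mu.Unlock()
	}
	if c.writer == nil {
		job() // no writer configured: run inline
		return
	}
	c.writer.jobs <- job // blocks when the queue is full (bounded)
}

func main() {
	c := &cache{writer: newAsyncWriter(4)}
	buf := []byte("abc")
	c.saveAsync(buf)
	buf[0] = 'X' // caller reuses its buffer after the call returns
	c.writer.Close()
	fmt.Println(c.saved[0]) // abc
}
```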

func (*ParseCache) Stats

func (pc *ParseCache) Stats() cacheutil.CacheStats

Stats returns a unified snapshot summed across both languages. Counter fields are running totals for the current process; Entries and Bytes come from the LRU sidecars (which themselves reflect disk state at open time).

func (*ParseCache) WriterStats

func (pc *ParseCache) WriterStats() ParseCacheWriterStats

type ParseCacheWriterStats

type ParseCacheWriterStats struct {
	Queued          int64
	Completed       int64
	Failed          int64
	Bytes           int64
	KotlinEntries   int64
	JavaEntries     int64
	KotlinBytes     int64
	JavaBytes       int64
	EncodeDuration  time.Duration
	PackWriteTime   time.Duration
	FlushWriteTime  time.Duration
	FlushEntries    int64
	FlushBytes      int64
	LRUUpdateTime   time.Duration
	LRUCloseTime    time.Duration
	LRUCloseEntries int64
	PackWrites      int64
}

ParseCacheWriterStats is a point-in-time snapshot of parse-cache async persistence. Encode/pack durations are aggregate worker time; close-time LRU eviction is measured as wall-clock time.

type Reference

type Reference struct {
	Name      string
	File      string
	Line      int
	InComment bool // true if this reference is inside a comment node
	// StartByte/EndByte locate the identifier's text in File.Content. They
	// are populated for Kotlin and Java references; XML and other text-based
	// references leave them as 0 and are not safe to rewrite by offset.
	StartByte int
	EndByte   int
	Language  Language
}

Reference represents a usage of a name in the codebase.
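The comment on StartByte/EndByte implies that offset-based rewrites should be guarded. A minimal sketch of such a guard follows; `ref` and `renameAt` are hypothetical names, and the extra check that the bytes at the range still equal Name is an added safety assumption, not documented behaviour.

```go
package main

import "fmt"

// ref mirrors the subset of Reference fields needed for a rewrite.
type ref struct {
	Name      string
	StartByte int
	EndByte   int
}

// renameAt replaces the identifier at r's byte range with newName.
// It refuses when the range is empty (for example the zero offsets left by
// text-based references), out of bounds, or no longer matches r.Name.
func renameAt(content []byte, r ref, newName string) ([]byte, bool) {
	if r.StartByte < 0 || r.EndByte <= r.StartByte || r.EndByte > len(content) {
		return content, false
	}
	if string(content[r.StartByte:r.EndByte]) != r.Name {
		return content, false
	}
	out := make([]byte, 0, len(content)-len(r.Name)+len(newName))
	out = append(out, content[:r.StartByte]...)
	out = append(out, newName...)
	out = append(out, content[r.EndByte:]...)
	return out, true
}

func main() {
	src := []byte("val x = foo()")

	out, ok := renameAt(src, ref{Name: "foo", StartByte: 8, EndByte: 11}, "bar")
	fmt.Println(string(out), ok) // val x = bar() true

	// A reference with zero offsets is rejected rather than rewritten.
	_, ok = renameAt(src, ref{Name: "foo"}, "bar")
	fmt.Println(ok) // false
}
```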

type ResolvedSymbol

type ResolvedSymbol struct {
	FQN      string
	Language Language
	Owner    string
	Kind     string
	Symbol   Symbol
}

ResolvedSymbol is a language-tagged source declaration resolved from the mixed Kotlin/Java source index.

type RunFingerprint

type RunFingerprint struct {
	Version      string
	Rules        string
	Config       string
	SourceSet    string
	CrossFile    string
	Android      string
	LibraryFacts string
}

type StringPool

type StringPool struct {
	// contains filtered or unexported fields
}

StringPool deduplicates repeated string values across scanner hot paths.

func NewStringPool

func NewStringPool() *StringPool

NewStringPool creates a pool ready for concurrent use.

func (*StringPool) Intern

func (p *StringPool) Intern(s string) string

Intern returns a canonical copy of s. The stored value is cloned on first insert so callers can safely pass zero-copy string views backed by file bytes.
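One plausible way to get these semantics is a mutex-guarded map that clones on first insert. The sketch below is an assumption about the general technique, not the package's actual data structure; `pool`, `newPool`, and `intern` are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// pool is a hypothetical stand-in for StringPool.
type pool struct {
	mu sync.Mutex
	m  map[string]string
}

func newPool() *pool { return &pool{m: make(map[string]string)} }

// intern returns the canonical copy of s. The first insert stores a clone,
// so the pool never pins a larger buffer that s might be a view into.
func (p *pool) intern(s string) string {
	p.mu.Lock()
	defer p.mu.Unlock()
	if v, ok := p.m[s]; ok {
		return v
	}
	c := strings.Clone(s)
	p.m[c] = c
	return c
}

func main() {
	p := newPool()
	buf := []byte("com.example.Foo com.example.Foo")
	a := p.intern(string(buf[:15]))
	b := p.intern(string(buf[16:]))
	fmt.Println(a == b, len(p.m)) // true 1
}
```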

type Suppression

type Suppression struct {
	StartByte int
	EndByte   int
	Rules     map[string]bool // rule names that are suppressed; nil = suppress all
}

Suppression represents a range of bytes where specific rules are suppressed.
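The "nil = suppress all" convention on Rules can be sketched with a simple linear check. The half-open range `[StartByte, EndByte)` is an assumption here, since the docs do not state whether EndByte is inclusive; `suppression` and `suppressed` are hypothetical names.

```go
package main

import "fmt"

// suppression mirrors the documented Suppression fields.
type suppression struct {
	StartByte int
	EndByte   int
	Rules     map[string]bool // nil = suppress all
}

// suppressed reports whether rule is suppressed at offset by any range.
// Ranges are treated as half-open [StartByte, EndByte); this is an assumption.
func suppressed(ranges []suppression, offset int, rule string) bool {
	for _, s := range ranges {
		if offset < s.StartByte || offset >= s.EndByte {
			continue
		}
		if s.Rules == nil || s.Rules[rule] {
			return true
		}
	}
	return false
}

func main() {
	ranges := []suppression{
		{StartByte: 10, EndByte: 50, Rules: map[string]bool{"MagicNumber": true}},
		{StartByte: 60, EndByte: 80, Rules: nil},
	}
	fmt.Println(suppressed(ranges, 20, "MagicNumber")) // true
	fmt.Println(suppressed(ranges, 20, "LongMethod"))  // false
	fmt.Println(suppressed(ranges, 70, "LongMethod"))  // true: nil Rules suppresses all
	fmt.Println(suppressed(ranges, 55, "MagicNumber")) // false: outside every range
}
```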

type SuppressionFilter

type SuppressionFilter struct {
	// contains filtered or unexported fields
}

SuppressionFilter is the single per-file query object that combines every suppression source Krit understands:

  • @Suppress / @SuppressWarnings annotations (byte-range, via SuppressionIndex)
  • config-level per-rule `excludes` glob patterns
  • baseline entries (project-level; pointer only, filtering is per-finding and requires the full Finding struct, so callers apply it via FilterByBaseline / FilterColumnsByBaseline)
  • inline `// krit:ignore[RuleA,RuleB]` line comments (line-scoped)

Built once per file in the Parse phase and cached on File.Suppression. The dispatcher, cross-file phase, and any other post-collect filter all ask the same filter, so adding a new suppression source is a single BuildSuppressionFilter edit instead of four disconnected code paths.
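To make the inline comment form concrete, here is a small sketch that extracts rule names from a `// krit:ignore[RuleA,RuleB]` comment. `parseIgnore` is a hypothetical helper; it handles only the bracketed form shown above, and whether Krit tolerates surrounding whitespace or accepts other variants is an assumption not confirmed by this page.

```go
package main

import (
	"fmt"
	"strings"
)

// parseIgnore returns the rule names listed in a krit:ignore line comment.
// It reports false when the marker is absent, unterminated, or empty.
func parseIgnore(line string) ([]string, bool) {
	const marker = "// krit:ignore["
	i := strings.Index(line, marker)
	if i < 0 {
		return nil, false
	}
	rest := line[i+len(marker):]
	j := strings.IndexByte(rest, ']')
	if j < 0 {
		return nil, false
	}
	var rules []string
	for _, r := range strings.Split(rest[:j], ",") {
		if r = strings.TrimSpace(r); r != "" {
			rules = append(rules, r)
		}
	}
	return rules, len(rules) > 0
}

func main() {
	rules, ok := parseIgnore("val x = 42 // krit:ignore[MagicNumber, UnusedVariable]")
	fmt.Println(rules, ok) // [MagicNumber UnusedVariable] true

	_, ok = parseIgnore("val y = 1 // plain comment")
	fmt.Println(ok) // false
}
```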

func BuildSuppressionFilter

func BuildSuppressionFilter(file *File, baseline *Baseline, excludes map[string][]string, basePath string) *SuppressionFilter

BuildSuppressionFilter collects every per-file suppression source for a single parsed file. baseline and excludes are project-level inputs passed in by the caller (the pipeline Parse phase snapshots rules.GetAllRuleExcludes() and threads it through); the rest come directly from file contents.

Safe to call with nil file; returns a non-nil filter that always reports IsSuppressed == false so dispatcher / cross-file call sites do not need nil checks.

func (*SuppressionFilter) Annotations

func (f *SuppressionFilter) Annotations() *SuppressionIndex

Annotations exposes the underlying @Suppress index so compat callers (legacy tests, the File.SuppressionIdx shim) can reuse the same data without rebuilding.

func (*SuppressionFilter) Baseline

func (f *SuppressionFilter) Baseline() *Baseline

Baseline returns the project-level baseline the filter was built against, or nil if none was configured.

func (*SuppressionFilter) IsFileExcluded

func (f *SuppressionFilter) IsFileExcluded(ruleID string) bool

IsFileExcluded reports whether the given rule is globally excluded for this file via config globs. Used by the dispatcher to skip rule execution entirely rather than filtering findings after the fact.

func (*SuppressionFilter) IsSuppressed

func (f *SuppressionFilter) IsSuppressed(ruleID, ruleSet string, line int) bool

IsSuppressed reports whether a finding at (ruleID, ruleSet, line) is suppressed by any non-baseline source. Baseline filtering is applied separately via FilterByBaseline / FilterColumnsByBaseline because it requires the full Finding struct (message + signature).

A nil filter reports false, matching the pre-filter "no suppression data available" behaviour.

func (*SuppressionFilter) WithRuleAliases

func (f *SuppressionFilter) WithRuleAliases(aliases map[string][]string) *SuppressionFilter

WithRuleAliases attaches an alias index to the filter and returns it for chaining. Map key is a canonical rule ID; value is the list of alternate IDs (legacy names, renames) that should also suppress the canonical rule when written in @Suppress / krit:ignore.

Safe on a nil receiver (returns nil). Callers without alias data simply omit the call; alias matching is purely additive.
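The alias map shape (canonical rule ID to alternate IDs) can be illustrated with a small matching helper. `matchesRule` and the rule names are hypothetical; the sketch shows only the additive lookup described above, not how the filter stores or indexes aliases internally.

```go
package main

import "fmt"

// matchesRule reports whether a name written in @Suppress or krit:ignore
// suppresses the canonical rule, either directly or through an alias.
// A nil alias map simply disables alias matching.
func matchesRule(aliases map[string][]string, canonical, written string) bool {
	if written == canonical {
		return true
	}
	for _, alt := range aliases[canonical] {
		if alt == written {
			return true
		}
	}
	return false
}

func main() {
	// Hypothetical rule names, for illustration only.
	aliases := map[string][]string{
		"UnusedVariable": {"UnusedVar", "unused"},
	}
	fmt.Println(matchesRule(aliases, "UnusedVariable", "UnusedVar"))   // true
	fmt.Println(matchesRule(aliases, "UnusedVariable", "MagicNumber")) // false
	fmt.Println(matchesRule(nil, "UnusedVariable", "UnusedVariable"))  // true
}
```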

type SuppressionIndex

type SuppressionIndex struct {
	// contains filtered or unexported fields
}

SuppressionIndex provides O(log n) lookup for whether a finding is suppressed.
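An O(log n) lookup of this kind is commonly a binary search over sorted byte ranges. The sketch below assumes the ranges are sorted by start and do not overlap, which is a simplification: nested annotations (a class-level @Suppress containing a function-level one) would need extra handling, and the real index layout is not documented here. `byteRange` and `contains` are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
)

// byteRange is a half-open [start, end) span.
type byteRange struct{ start, end int }

// contains reports whether offset falls inside any range.
// ranges must be sorted by start and pairwise disjoint (an assumption).
func contains(ranges []byteRange, offset int) bool {
	// Find the first range that ends after offset.
	i := sort.Search(len(ranges), func(i int) bool {
		return ranges[i].end > offset
	})
	return i < len(ranges) && ranges[i].start <= offset
}

func main() {
	ranges := []byteRange{{10, 20}, {30, 40}, {50, 60}}
	fmt.Println(contains(ranges, 35)) // true
	fmt.Println(contains(ranges, 25)) // false: falls in the gap
	fmt.Println(contains(ranges, 60)) // false: end is exclusive
}
```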

func BuildSuppressionIndexFlat

func BuildSuppressionIndexFlat(tree *FlatTree, content []byte) *SuppressionIndex

BuildSuppressionIndexFlat walks the flat tree once to find all @Suppress/@SuppressWarnings annotations and builds an index of suppressed byte ranges.

func (*SuppressionIndex) IsSuppressed

func (idx *SuppressionIndex) IsSuppressed(byteOffset int, ruleName string, ruleSetName string) bool

IsSuppressed checks if a finding at the given byte offset is suppressed for the given rule.

type Symbol

type Symbol struct {
	Name       string
	Kind       string // "function", "class", "property", "object", "interface"
	Visibility string // "public", "private", "internal", "protected"
	File       string
	Line       int
	StartByte  int
	EndByte    int
	Language   Language
	Package    string
	FQN        string
	Owner      string
	Signature  string
	Arity      int
	IsOverride bool
	IsTest     bool
	IsMain     bool
	IsStatic   bool
	IsFinal    bool
}

Symbol represents a declared symbol in the codebase.
