Documentation
Overview ¶
Package gotreesitter implements a pure Go tree-sitter runtime.
This file defines the core data structures that mirror tree-sitter's TSLanguage C struct and related types. They form the foundation on which the lexer, parser, query engine, and syntax tree are built.
Index ¶
- Constants
- Variables
- func DecodeUTF16Bytes(source []byte, order UTF16ByteOrder) ([]uint16, error)
- func DrainArenaPools()
- func EnableArenaBreakdown(enabled bool)
- func EnableArenaProfile(enabled bool)
- func EnableGLREquivAudit(enabled bool)
- func EnableRuntimeAudit(enabled bool)
- func RegisterHighlighterInjection(parentLanguage string, spec HighlighterInjectionSpec)
- func RepairNoLookaheadLexModes(lang *Language)
- func ResetArenaProfile()
- func ResetParseEnvConfigCacheForTests()
- func ResetPerfCounters()
- func RunExternalScanner(lang *Language, payload any, lexer *ExternalLexer, validSymbols []bool) bool
- func SetGLRForestEnabled(on bool)
- func SetInternLeavesObserveEnabled(on bool)
- func SetInternLeavesSubstituteEnabled(on bool)
- func Walk(node *Node, fn func(node *Node, depth int) WalkAction)
- type AmbiguityKey
- type AmbiguityProfile
- func (p *AmbiguityProfile) Reset()
- func (p *AmbiguityProfile) SnapshotReduceChainTotals() AmbiguityStat
- func (p *AmbiguityProfile) SnapshotTop(limit int) []AmbiguityStat
- func (p *AmbiguityProfile) SnapshotTopMergeStates(limit int) []AmbiguityStat
- func (p *AmbiguityProfile) SnapshotTopReduceChainRuns(limit int) []AmbiguityStat
- func (p *AmbiguityProfile) SnapshotTopReduceChains(limit int) []AmbiguityStat
- type AmbiguityStat
- type ArenaBreakdown
- type ArenaProfile
- type BoundTree
- func (bt *BoundTree) ChildByField(n *Node, fieldName string) *Node
- func (bt *BoundTree) Language() *Language
- func (bt *BoundTree) NodeText(n *Node) string
- func (bt *BoundTree) NodeType(n *Node) string
- func (bt *BoundTree) Release()
- func (bt *BoundTree) RootNode() *Node
- func (bt *BoundTree) Source() []byte
- func (bt *BoundTree) TreeCursor() *TreeCursor
- type ByteSkippableTokenSource
- type ExternalLexer
- type ExternalScanner
- type ExternalScannerState
- type ExternalSymbolResolver
- type ExternalVMInstr
- func VMAdvance(skip bool) ExternalVMInstr
- func VMEmit(sym Symbol) ExternalVMInstr
- func VMFail() ExternalVMInstr
- func VMIfRuneClass(class ExternalVMRuneClass, alt int) ExternalVMInstr
- func VMIfRuneEq(r rune, alt int) ExternalVMInstr
- func VMIfRuneInRange(start, end rune, alt int) ExternalVMInstr
- func VMJump(target int) ExternalVMInstr
- func VMMarkEnd() ExternalVMInstr
- func VMRequireStateEq(state uint32, alt int) ExternalVMInstr
- func VMRequireValid(validSymbolIndex, alt int) ExternalVMInstr
- func VMSetState(state uint32) ExternalVMInstr
- type ExternalVMOp
- type ExternalVMProgram
- type ExternalVMRuneClass
- type ExternalVMScanner
- func (s *ExternalVMScanner) Create() any
- func (s *ExternalVMScanner) Deserialize(payload any, buf []byte)
- func (s *ExternalVMScanner) Destroy(payload any)
- func (s *ExternalVMScanner) Scan(payload any, lexer *ExternalLexer, validSymbols []bool) bool
- func (s *ExternalVMScanner) Serialize(payload any, buf []byte) int
- type FieldID
- type FieldMapEntry
- type HighlightRange
- type Highlighter
- func (h *Highlighter) Highlight(source []byte) []HighlightRange
- func (h *Highlighter) HighlightIncremental(source []byte, oldTree *Tree) ([]HighlightRange, *Tree)
- func (h *Highlighter) HighlightIncrementalUTF16(source []uint16, oldTree *Tree) ([]UTF16HighlightRange, *Tree)
- func (h *Highlighter) HighlightIncrementalUTF16Bytes(source []byte, oldTree *Tree, order UTF16ByteOrder) ([]UTF16HighlightRange, *Tree, error)
- func (h *Highlighter) HighlightUTF16(source []uint16) []UTF16HighlightRange
- func (h *Highlighter) HighlightUTF16Bytes(source []byte, order UTF16ByteOrder) ([]UTF16HighlightRange, error)
- type HighlighterInjectionResolver
- type HighlighterInjectionSpec
- type HighlighterOption
- type ImportExtractResult
- type ImportExtractStatus
- type ImportRef
- type IncrementalParseProfile
- type IncrementalReuseExternalScanner
- type IncrementalReuseTokenSource
- type Injection
- type InjectionParser
- func (ip *InjectionParser) Parse(source []byte, parentLang string) (*InjectionResult, error)
- func (ip *InjectionParser) ParseIncremental(source []byte, parentLang string, oldResult *InjectionResult) (*InjectionResult, error)
- func (ip *InjectionParser) ParseIncrementalUTF16(source []uint16, parentLang string, oldResult *UTF16InjectionResult) (*UTF16InjectionResult, error)
- func (ip *InjectionParser) ParseIncrementalUTF16Bytes(source []byte, parentLang string, oldResult *UTF16InjectionResult, ...) (*UTF16InjectionResult, error)
- func (ip *InjectionParser) ParseUTF16(source []uint16, parentLang string) (*UTF16InjectionResult, error)
- func (ip *InjectionParser) ParseUTF16Bytes(source []byte, parentLang string, order UTF16ByteOrder) (*UTF16InjectionResult, error)
- func (ip *InjectionParser) RegisterInjectionQuery(parentLang string, query string) error
- func (ip *InjectionParser) RegisterLanguage(name string, lang *Language)
- func (ip *InjectionParser) SetMaxDepth(depth int)
- type InjectionResult
- type InputEdit
- type InputEncoding
- type InternObservationStats
- type Language
- func (l *Language) CompatibleWithRuntime() bool
- func (l *Language) FieldByName(name string) (FieldID, bool)
- func (l *Language) IsSupertype(sym Symbol) bool
- func (l *Language) KeywordLexAsciiTable() [][128]int32
- func (l *Language) LexAsciiTable() [][128]int32
- func (l *Language) LexModeStarts() []lexModeStart
- func (l *Language) PublicSymbol(sym Symbol) Symbol
- func (l *Language) PublicSymbolForNamedness(sym Symbol, named bool) Symbol
- func (l *Language) SupertypeChildren(sym Symbol) []Symbol
- func (l *Language) SymbolByName(name string) (Symbol, bool)
- func (l *Language) TokenSymbolsByName(name string) []Symbol
- func (l *Language) Version() uint32
- type LanguageMetadata
- type LexMode
- type LexState
- type LexTransition
- type Lexer
- type LookaheadIterator
- type Node
- func (n *Node) Child(i int) *Node
- func (n *Node) ChildByFieldName(name string, lang *Language) *Node
- func (n *Node) ChildCount() int
- func (n *Node) Children() []*Node
- func (n *Node) DescendantForByteRange(startByte, endByte uint32) *Node
- func (n *Node) DescendantForPointRange(startPoint, endPoint Point) *Node
- func (n *Node) Edit(edit InputEdit)
- func (n *Node) EndByte() uint32
- func (n *Node) EndPoint() Point
- func (n *Node) FieldNameForChild(i int, lang *Language) string
- func (n *Node) HasChanges() bool
- func (n *Node) HasError() bool
- func (n *Node) IsError() bool
- func (n *Node) IsExtra() bool
- func (n *Node) IsMissing() bool
- func (n *Node) IsNamed() bool
- func (n *Node) NamedChild(i int) *Node
- func (n *Node) NamedChildCount() int
- func (n *Node) NamedDescendantForByteRange(startByte, endByte uint32) *Node
- func (n *Node) NamedDescendantForPointRange(startPoint, endPoint Point) *Node
- func (n *Node) NextSibling() *Node
- func (n *Node) Parent() *Node
- func (n *Node) ParseState() StateID
- func (n *Node) PreGotoState() StateID
- func (n *Node) PrevSibling() *Node
- func (n *Node) Range() Range
- func (n *Node) SExpr(lang *Language) string
- func (n *Node) StartByte() uint32
- func (n *Node) StartPoint() Point
- func (n *Node) Symbol() Symbol
- func (n *Node) Text(source []byte) string
- func (n *Node) Type(lang *Language) string
- type NormalizationPassRuntime
- type ParseAction
- type ParseActionEntry
- type ParseActionTiming
- type ParseActionType
- type ParseEquivStateRuntime
- type ParseOption
- type ParseReduceTiming
- type ParseResult
- type ParseRuntime
- type ParseStopReason
- type Parser
- func (p *Parser) CancellationFlag() *uint32
- func (p *Parser) IncludedRanges() []Range
- func (p *Parser) InferredRootSymbol() (Symbol, bool)
- func (p *Parser) Language() *Language
- func (p *Parser) Logger() ParserLogger
- func (p *Parser) Parse(source []byte) (*Tree, error)
- func (p *Parser) ParseForestExperimental(source []byte) (*Tree, bool)
- func (p *Parser) ParseIncremental(source []byte, oldTree *Tree) (*Tree, error)
- func (p *Parser) ParseIncrementalProfiled(source []byte, oldTree *Tree) (*Tree, IncrementalParseProfile, error)
- func (p *Parser) ParseIncrementalUTF16(source []uint16, oldTree *Tree) (*Tree, error)
- func (p *Parser) ParseIncrementalUTF16Bytes(source []byte, oldTree *Tree, order UTF16ByteOrder) (*Tree, error)
- func (p *Parser) ParseIncrementalUTF16BytesWithTokenSourceFactory(source []byte, oldTree *Tree, order UTF16ByteOrder, factory TokenSourceFactory) (*Tree, error)
- func (p *Parser) ParseIncrementalUTF16WithTokenSourceFactory(source []uint16, oldTree *Tree, factory TokenSourceFactory) (*Tree, error)
- func (p *Parser) ParseIncrementalWithTokenSource(source []byte, oldTree *Tree, ts TokenSource) (*Tree, error)
- func (p *Parser) ParseIncrementalWithTokenSourceFactory(source []byte, oldTree *Tree, factory TokenSourceFactory) (*Tree, error)
- func (p *Parser) ParseIncrementalWithTokenSourceProfiled(source []byte, oldTree *Tree, ts TokenSource) (*Tree, IncrementalParseProfile, error)
- func (p *Parser) ParseNoResultCompatibilityBenchmarkOnly(source []byte) (*Tree, error)
- func (p *Parser) ParseNoTreeBenchmarkOnly(source []byte) (*Tree, error)
- func (p *Parser) ParseNoTreeWithExternalCheckpointsBenchmarkOnly(source []byte) (*Tree, error)
- func (p *Parser) ParseUTF16(source []uint16) (*Tree, error)
- func (p *Parser) ParseUTF16Bytes(source []byte, order UTF16ByteOrder) (*Tree, error)
- func (p *Parser) ParseUTF16BytesWithTokenSourceFactory(source []byte, order UTF16ByteOrder, factory TokenSourceFactory) (*Tree, error)
- func (p *Parser) ParseUTF16WithTokenSourceFactory(source []uint16, factory TokenSourceFactory) (*Tree, error)
- func (p *Parser) ParseWith(source []byte, opts ...ParseOption) (ParseResult, error)
- func (p *Parser) ParseWithTokenSource(source []byte, ts TokenSource) (*Tree, error)
- func (p *Parser) ParseWithTokenSourceFactory(source []byte, factory TokenSourceFactory) (*Tree, error)
- func (p *Parser) SetAmbiguityProfile(profile *AmbiguityProfile)
- func (p *Parser) SetCancellationFlag(flag *uint32)
- func (p *Parser) SetGLRTrace(enabled bool)
- func (p *Parser) SetIncludedRanges(ranges []Range)
- func (p *Parser) SetIncludedUTF16ByteRanges(source []byte, order UTF16ByteOrder, ranges []UTF16Range) error
- func (p *Parser) SetIncludedUTF16Ranges(source []uint16, ranges []UTF16Range) bool
- func (p *Parser) SetLogger(logger ParserLogger)
- func (p *Parser) SetTimeoutMicros(timeoutMicros uint64)
- func (p *Parser) TimeoutMicros() uint64
- type ParserLogType
- type ParserLogger
- type ParserPool
- func (pp *ParserPool) Language() *Language
- func (pp *ParserPool) Parse(source []byte) (*Tree, error)
- func (pp *ParserPool) ParseIncrementalUTF16(source []uint16, oldTree *Tree) (*Tree, error)
- func (pp *ParserPool) ParseIncrementalUTF16Bytes(source []byte, oldTree *Tree, order UTF16ByteOrder) (*Tree, error)
- func (pp *ParserPool) ParseIncrementalUTF16BytesWithTokenSourceFactory(source []byte, oldTree *Tree, order UTF16ByteOrder, factory TokenSourceFactory) (*Tree, error)
- func (pp *ParserPool) ParseIncrementalUTF16WithTokenSourceFactory(source []uint16, oldTree *Tree, factory TokenSourceFactory) (*Tree, error)
- func (pp *ParserPool) ParseNoResultCompatibilityBenchmarkOnly(source []byte) (*Tree, error)
- func (pp *ParserPool) ParseNoTreeBenchmarkOnly(source []byte) (*Tree, error)
- func (pp *ParserPool) ParseNoTreeWithExternalCheckpointsBenchmarkOnly(source []byte) (*Tree, error)
- func (pp *ParserPool) ParseUTF16(source []uint16) (*Tree, error)
- func (pp *ParserPool) ParseUTF16Bytes(source []byte, order UTF16ByteOrder) (*Tree, error)
- func (pp *ParserPool) ParseUTF16BytesWithTokenSourceFactory(source []byte, order UTF16ByteOrder, factory TokenSourceFactory) (*Tree, error)
- func (pp *ParserPool) ParseUTF16WithTokenSourceFactory(source []uint16, factory TokenSourceFactory) (*Tree, error)
- func (pp *ParserPool) ParseWith(source []byte, opts ...ParseOption) (ParseResult, error)
- func (pp *ParserPool) ParseWithTokenSource(source []byte, ts TokenSource) (*Tree, error)
- func (pp *ParserPool) ParseWithTokenSourceFactory(source []byte, factory TokenSourceFactory) (*Tree, error)
- type ParserPoolOption
- func WithParserPoolAmbiguityProfile(profile *AmbiguityProfile) ParserPoolOption
- func WithParserPoolGLRTrace(enabled bool) ParserPoolOption
- func WithParserPoolIncludedRanges(ranges []Range) ParserPoolOption
- func WithParserPoolLogger(logger ParserLogger) ParserPoolOption
- func WithParserPoolTimeoutMicros(timeoutMicros uint64) ParserPoolOption
- type Pattern
- type PendingParentFieldRejectPayloadStats
- type PendingParentFieldRejectStats
- type PendingParentRejectStats
- type PerfCounters
- type Point
- type PointSkippableTokenSource
- type Query
- func (q *Query) CaptureCount() uint32
- func (q *Query) CaptureNameForID(id uint32) (string, bool)
- func (q *Query) CaptureNames() []string
- func (q *Query) DisableCapture(name string)
- func (q *Query) DisablePattern(patternIndex uint32)
- func (q *Query) EndByteForPattern(patternIndex uint32) (uint32, bool)
- func (q *Query) Exec(node *Node, lang *Language, source []byte) *QueryCursor
- func (q *Query) Execute(tree *Tree) []QueryMatch
- func (q *Query) ExecuteInto(tree *Tree, dst []QueryMatch) []QueryMatch
- func (q *Query) ExecuteNode(node *Node, lang *Language, source []byte) []QueryMatch
- func (q *Query) IsPatternGuaranteedAtStep(patternIndex uint32, stepIndex uint32) bool
- func (q *Query) IsPatternNonLocal(patternIndex uint32) bool
- func (q *Query) IsPatternRooted(patternIndex uint32) bool
- func (q *Query) PatternCount() int
- func (q *Query) PredicatesForPattern(patternIndex uint32) ([]QueryPredicate, bool)
- func (q *Query) StartByteForPattern(patternIndex uint32) (uint32, bool)
- func (q *Query) StepIsDefinite(patternIndex uint32, stepIndex uint32) bool
- func (q *Query) StringCount() uint32
- func (q *Query) StringValueForID(id uint32) (string, bool)
- type QueryCapture
- type QueryCursor
- func (c *QueryCursor) DidExceedMatchLimit() bool
- func (c *QueryCursor) NextCapture() (QueryCapture, bool)
- func (c *QueryCursor) NextMatch() (QueryMatch, bool)
- func (c *QueryCursor) SetByteRange(startByte, endByte uint32)
- func (c *QueryCursor) SetMatchLimit(limit uint32)
- func (c *QueryCursor) SetMaxStartDepth(depth uint32)
- func (c *QueryCursor) SetPointRange(startPoint, endPoint Point)
- func (c *QueryCursor) SetUTF16Range(tree *Tree, startCodeUnit, endCodeUnit uint32) bool
- type QueryMatch
- type QueryPredicate
- type QueryStep
- type Range
- type ReduceChainHint
- type ReduceChainTerminalAction
- type ReduceChildPathRuntime
- type Rewriter
- func (r *Rewriter) Apply() (newSource []byte, edits []InputEdit, err error)
- func (r *Rewriter) ApplyToTree(tree *Tree) ([]byte, error)
- func (r *Rewriter) Delete(node *Node)
- func (r *Rewriter) InsertAfter(node *Node, text []byte)
- func (r *Rewriter) InsertBefore(node *Node, text []byte)
- func (r *Rewriter) Replace(node *Node, newText []byte)
- func (r *Rewriter) ReplaceRange(startByte, endByte uint32, newText []byte)
- type StateID
- type Symbol
- type SymbolMetadata
- type Tag
- type Tagger
- func (tg *Tagger) Tag(source []byte) []Tag
- func (tg *Tagger) TagIncremental(source []byte, oldTree *Tree) ([]Tag, *Tree)
- func (tg *Tagger) TagIncrementalUTF16(source []uint16, oldTree *Tree) ([]UTF16Tag, *Tree)
- func (tg *Tagger) TagIncrementalUTF16Bytes(source []byte, oldTree *Tree, order UTF16ByteOrder) ([]UTF16Tag, *Tree, error)
- func (tg *Tagger) TagTree(tree *Tree) []Tag
- func (tg *Tagger) TagTreeUTF16(tree *Tree) []UTF16Tag
- func (tg *Tagger) TagUTF16(source []uint16) []UTF16Tag
- func (tg *Tagger) TagUTF16Bytes(source []byte, order UTF16ByteOrder) ([]UTF16Tag, error)
- type TaggerOption
- type Token
- type TokenSource
- type TokenSourceFactory
- type TokenSourceRebuilder
- type Tree
- func (t *Tree) ArenaBreakdown() (ArenaBreakdown, bool)
- func (t *Tree) ChangedRanges() []Range
- func (t *Tree) Copy() *Tree
- func (t *Tree) DOT(lang *Language) string
- func (t *Tree) DescendantForUTF16Range(startCodeUnit, endCodeUnit uint32) *Node
- func (t *Tree) Edit(edit InputEdit)
- func (t *Tree) EditUTF16(edit UTF16Edit, newSource []uint16) bool
- func (t *Tree) Edits() []InputEdit
- func (t *Tree) InputEditForUTF16(edit UTF16Edit, newSource []uint16) (InputEdit, bool)
- func (t *Tree) Language() *Language
- func (t *Tree) NamedDescendantForUTF16Range(startCodeUnit, endCodeUnit uint32) *Node
- func (t *Tree) ParseRuntime() ParseRuntime
- func (t *Tree) ParseStopReason() ParseStopReason
- func (t *Tree) ParseStoppedEarly() bool
- func (t *Tree) Release()
- func (t *Tree) RootNode() *Node
- func (t *Tree) RootNodeWithOffset(offsetBytes uint32, offsetExtent Point) *Node
- func (t *Tree) Source() []byte
- func (t *Tree) SourceEncoding() InputEncoding
- func (t *Tree) SourceUTF16() []uint16
- func (t *Tree) UTF8ByteForUTF16Offset(offset uint32) (uint32, bool)
- func (t *Tree) UTF16OffsetForByte(offset uint32) (uint32, bool)
- func (t *Tree) UTF16PointForByte(offset uint32) (Point, bool)
- func (t *Tree) UTF16RangeForByteRange(startByte, endByte uint32) (UTF16Range, bool)
- func (t *Tree) UTF16RangeForNode(n *Node) (UTF16Range, bool)
- func (t *Tree) UTF16RangeForRange(r Range) (UTF16Range, bool)
- func (t *Tree) UTF16SourceForNode(n *Node) ([]uint16, bool)
- func (t *Tree) WriteDOT(w io.Writer, lang *Language) error
- type TreeCursor
- func (c *TreeCursor) Copy() *TreeCursor
- func (c *TreeCursor) CurrentFieldID() FieldID
- func (c *TreeCursor) CurrentFieldName() string
- func (c *TreeCursor) CurrentNode() *Node
- func (c *TreeCursor) CurrentNodeIsNamed() bool
- func (c *TreeCursor) CurrentNodeText() string
- func (c *TreeCursor) CurrentNodeType() string
- func (c *TreeCursor) Depth() int
- func (c *TreeCursor) GotoChildByFieldID(fid FieldID) bool
- func (c *TreeCursor) GotoChildByFieldName(name string) bool
- func (c *TreeCursor) GotoFirstChild() bool
- func (c *TreeCursor) GotoFirstChildForByte(targetByte uint32) int64
- func (c *TreeCursor) GotoFirstChildForPoint(targetPoint Point) int64
- func (c *TreeCursor) GotoFirstNamedChild() bool
- func (c *TreeCursor) GotoLastChild() bool
- func (c *TreeCursor) GotoLastNamedChild() bool
- func (c *TreeCursor) GotoNextNamedSibling() bool
- func (c *TreeCursor) GotoNextSibling() bool
- func (c *TreeCursor) GotoParent() bool
- func (c *TreeCursor) GotoPrevNamedSibling() bool
- func (c *TreeCursor) GotoPrevSibling() bool
- func (c *TreeCursor) Reset(node *Node)
- func (c *TreeCursor) ResetTree(tree *Tree)
- type UTF16ByteOrder
- type UTF16Edit
- type UTF16HighlightRange
- type UTF16Injection
- type UTF16InjectionResult
- type UTF16Range
- type UTF16Tag
- type WalkAction
Constants ¶
const (
	// RuntimeLanguageVersion is the maximum tree-sitter language version this
	// runtime is known to support.
	RuntimeLanguageVersion uint32 = 15
	// MinCompatibleLanguageVersion is the minimum accepted language version.
	MinCompatibleLanguageVersion uint32 = 13
)
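The two constants bound the grammar versions the runtime accepts. The index lists Language.CompatibleWithRuntime, but its logic is not shown on this page, so the following is only a sketch that assumes a plain inclusive range test. The lowercase names and the helper compatibleVersion are hypothetical and not part of the package.

```go
package main

import "fmt"

// Values copied from the documented constants; the lowercase names are local
// stand-ins, not the package's exported identifiers.
const (
	runtimeLanguageVersion       uint32 = 15
	minCompatibleLanguageVersion uint32 = 13
)

// compatibleVersion is a hypothetical helper. It reports whether a grammar's
// language version falls inside the inclusive supported range (assumption).
func compatibleVersion(v uint32) bool {
	return v >= minCompatibleLanguageVersion && v <= runtimeLanguageVersion
}

func main() {
	for _, v := range []uint32{12, 13, 15, 16} {
		fmt.Printf("version %d compatible: %v\n", v, compatibleVersion(v))
	}
}
```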
Variables ¶
var (
	// ErrInvalidUTF16ByteLength is returned when a UTF-16 byte source has a
	// dangling trailing byte.
	ErrInvalidUTF16ByteLength = errors.New("utf16: byte source length must be even")
	// ErrInvalidUTF16ByteOrder is returned for an unknown UTF16ByteOrder.
	ErrInvalidUTF16ByteOrder = errors.New("utf16: invalid byte order")
	// ErrInvalidUTF16Range is returned when a UTF-16 range does not align to
	// valid code-point boundaries or has an inverted span.
	ErrInvalidUTF16Range = errors.New("utf16: invalid range")
)
var DebugDFA atomic.Bool
DebugDFA enables trace logging for DFA token production.
Use `DebugDFA.Store(true/false)` to toggle at runtime.
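As a rough illustration of the toggle pattern, the sketch below uses a local atomic.Bool named debugDFA in place of the package variable. The traceToken hook and its output format are hypothetical; the page does not show what the real trace output looks like.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// debugDFA is a local stand-in for the package-level DebugDFA variable.
var debugDFA atomic.Bool

// traceToken is a hypothetical hook: it produces a trace line only while the
// flag is set, so the check costs one atomic load when tracing is off.
func traceToken(kind string, start, end int) string {
	if !debugDFA.Load() {
		return ""
	}
	return fmt.Sprintf("dfa: %s [%d,%d)", kind, start, end)
}

func main() {
	fmt.Printf("%q\n", traceToken("identifier", 0, 3)) // tracing off
	debugDFA.Store(true)
	fmt.Printf("%q\n", traceToken("identifier", 0, 3)) // tracing on
	debugDFA.Store(false)
}
```

Because the flag is an atomic.Bool, it can be flipped from one goroutine while others read it without a data race.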
var ErrNoLanguage = errors.New("parser has no language configured")
ErrNoLanguage is returned when a Parser has no language configured.
var ErrNoTokenSource = errors.New("parser has no token source")
ErrNoTokenSource is returned when a token-source parse is called without a token source.
var ErrNoTokenSourceFactory = errors.New("parser has no token source factory")
ErrNoTokenSourceFactory is returned when a factory-based parse is called without a token source factory.
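Sentinel errors like these are normally matched with errors.Is rather than by comparing strings. The sketch below uses a local errNoLanguage in place of ErrNoLanguage; parseFile and isNoLanguage are hypothetical caller-side helpers, and whether the package itself wraps its sentinels is not stated on this page.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoLanguage is a local stand-in for the package's ErrNoLanguage sentinel.
var errNoLanguage = errors.New("parser has no language configured")

// parseFile is a hypothetical helper that wraps the sentinel with context.
func parseFile(path string, hasLanguage bool) error {
	if !hasLanguage {
		return fmt.Errorf("parse %s: %w", path, errNoLanguage)
	}
	return nil
}

// isNoLanguage reports whether err is, or wraps, the sentinel.
func isNoLanguage(err error) bool {
	return errors.Is(err, errNoLanguage)
}

func main() {
	err := parseFile("main.go", false)
	if isNoLanguage(err) {
		fmt.Println("configure a language before parsing:", err)
	}
}
```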
Functions ¶
func DecodeUTF16Bytes ¶ added in v0.16.0
func DecodeUTF16Bytes(source []byte, order UTF16ByteOrder) ([]uint16, error)
DecodeUTF16Bytes decodes an endian-specific UTF-16 byte source into Go UTF-16 code units.
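To make the conversion concrete, here is a minimal standalone sketch of this kind of decoding. It is not the package's implementation: the byteOrder type, its two values, and the order in which the checks run are assumptions, and the error values are local copies of the documented messages.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// byteOrder is a hypothetical stand-in for the package's UTF16ByteOrder.
type byteOrder int

const (
	littleEndian byteOrder = iota
	bigEndian
)

var (
	errOddLength = errors.New("utf16: byte source length must be even")
	errBadOrder  = errors.New("utf16: invalid byte order")
)

// decodeUTF16Bytes converts raw UTF-16 bytes into 16-bit code units.
// Surrogate pairs are left as two units; no validation is performed.
func decodeUTF16Bytes(src []byte, order byteOrder) ([]uint16, error) {
	if len(src)%2 != 0 {
		return nil, errOddLength
	}
	var bo binary.ByteOrder
	switch order {
	case littleEndian:
		bo = binary.LittleEndian
	case bigEndian:
		bo = binary.BigEndian
	default:
		return nil, errBadOrder
	}
	units := make([]uint16, len(src)/2)
	for i := range units {
		units[i] = bo.Uint16(src[2*i:])
	}
	return units, nil
}

func main() {
	units, err := decodeUTF16Bytes([]byte{0x68, 0x00, 0x69, 0x00}, littleEndian)
	fmt.Println(units, err)
}
```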
func DrainArenaPools ¶ added in v0.14.0
func DrainArenaPools()
DrainArenaPools releases all cached arenas from both incremental and full-parse pools. Arenas held in the pool are strong Go references and are not collected by the GC until explicitly drained or the process exits.
Call this after a large batch scan (e.g. after WalkAndParse returns) to allow the GC to reclaim the arena memory. The next parse will allocate a fresh arena.
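The retention behaviour described above can be pictured with a small pool that keeps released arenas in a slice. This is only a sketch of the general idea; the arena and arenaPool types and their methods are hypothetical and do not reflect the package's internal layout.

```go
package main

import (
	"fmt"
	"sync"
)

// arena is a hypothetical stand-in for a reusable node buffer.
type arena struct{ buf []byte }

// arenaPool keeps released arenas in a slice, so they stay strongly reachable
// (and therefore uncollectable) until drain is called.
type arenaPool struct {
	mu   sync.Mutex
	free []*arena
}

// acquire reuses a cached arena if one exists, otherwise allocates a new one.
// For brevity, a reused arena keeps its original size.
func (p *arenaPool) acquire(size int) *arena {
	p.mu.Lock()
	defer p.mu.Unlock()
	if n := len(p.free); n > 0 {
		a := p.free[n-1]
		p.free[n-1] = nil
		p.free = p.free[:n-1]
		return a
	}
	return &arena{buf: make([]byte, size)}
}

// release returns an arena to the pool for later reuse.
func (p *arenaPool) release(a *arena) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.free = append(p.free, a)
}

// drain drops every cached arena so the GC can reclaim them, and returns how
// many were dropped.
func (p *arenaPool) drain() int {
	p.mu.Lock()
	defer p.mu.Unlock()
	n := len(p.free)
	for i := range p.free {
		p.free[i] = nil
	}
	p.free = nil
	return n
}

func main() {
	var pool arenaPool
	a := pool.acquire(1 << 20)
	pool.release(a)
	fmt.Println("arenas dropped:", pool.drain())
}
```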
func EnableArenaBreakdown ¶ added in v0.18.0
func EnableArenaBreakdown(enabled bool)
EnableArenaBreakdown toggles detailed arena accounting for subsequently acquired arenas. It is intended for diagnostics and benchmark attribution; normal parser paths leave it disabled to avoid perturbing hot allocation paths.
func EnableArenaProfile ¶ added in v0.6.0
func EnableArenaProfile(enabled bool)
EnableArenaProfile toggles arena pool counters. This debug hook is not concurrency-safe and is intended for single-threaded benchmark/profiling runs.
func EnableGLREquivAudit ¶ added in v0.19.0
func EnableGLREquivAudit(enabled bool)
EnableGLREquivAudit toggles lightweight GLR equivalence attribution. This is intended for parser gap diagnostics and avoids the heavier survivor maps used by EnableRuntimeAudit.
func EnableRuntimeAudit ¶ added in v0.7.0
func EnableRuntimeAudit(enabled bool)
EnableRuntimeAudit toggles per-parse survivor instrumentation. This debug hook is intended for single-threaded benchmark/profiling runs.
func RegisterHighlighterInjection ¶ added in v0.7.0
func RegisterHighlighterInjection(parentLanguage string, spec HighlighterInjectionSpec)
RegisterHighlighterInjection registers nested-highlighting configuration for a parent language name (for example "markdown").
func RepairNoLookaheadLexModes ¶ added in v0.9.0
func RepairNoLookaheadLexModes(lang *Language)
RepairNoLookaheadLexModes marks parser states as no-lookahead when they only need EOF-triggered reductions plus external/trivia handling. Tree-sitter's C runtime uses these states to reduce before lexing the next real token.
func ResetArenaProfile ¶ added in v0.6.0
func ResetArenaProfile()
ResetArenaProfile resets arena pool counters. This debug hook is not concurrency-safe and is intended for single-threaded benchmark/profiling runs.
func ResetParseEnvConfigCacheForTests ¶ added in v0.7.0
func ResetParseEnvConfigCacheForTests()
ResetParseEnvConfigCacheForTests clears memoized parser env config.
Tests in this repo mutate env vars between cases; this helper ensures subsequent parses observe the new values in the same process.
func ResetPerfCounters ¶ added in v0.6.0
func ResetPerfCounters()
func RunExternalScanner ¶
func RunExternalScanner(lang *Language, payload any, lexer *ExternalLexer, validSymbols []bool) bool
RunExternalScanner invokes the language's external scanner if present. Returns true if the scanner produced a token, false otherwise.
func SetGLRForestEnabled ¶ added in v0.20.0
func SetGLRForestEnabled(on bool)
SetGLRForestEnabled toggles the GSS-forest path at runtime (tests/benchmarks).
func SetInternLeavesObserveEnabled ¶ added in v0.20.0
func SetInternLeavesObserveEnabled(on bool)
SetInternLeavesObserveEnabled toggles leaf-interning observation at runtime. Tests and benches that want to A/B observation without re-running the test binary set this directly. Not safe to flip while a parse is in flight on another goroutine. Phase 2 scaffolding; the API may change before becoming public.
func SetInternLeavesSubstituteEnabled ¶ added in v0.20.0
func SetInternLeavesSubstituteEnabled(on bool)
SetInternLeavesSubstituteEnabled toggles canonical substitution at runtime. See internLeavesSubstituteEnabled.
Types ¶
type AmbiguityKey ¶ added in v0.17.0
type AmbiguityKey struct {
State StateID
Lookahead Symbol
ActionCount uint8
ShiftCount uint8
ReduceCount uint8
ReduceSymbol Symbol
ChildCount uint8
ProductionID uint16
ReduceChainTerminalState StateID
ReduceChainTerminalActionClass uint8
}
AmbiguityKey identifies one parse-table ambiguity bucket.
type AmbiguityProfile ¶ added in v0.17.0
type AmbiguityProfile struct {
// contains filtered or unexported fields
}
AmbiguityProfile aggregates parser states/lookaheads that contribute to GLR fanout. It is intended for diagnostics and benchmark runs, not normal API use.
func NewAmbiguityProfile ¶ added in v0.17.0
func NewAmbiguityProfile() *AmbiguityProfile
NewAmbiguityProfile creates an empty GLR ambiguity profile.
func (*AmbiguityProfile) Reset ¶ added in v0.17.0
func (p *AmbiguityProfile) Reset()
Reset clears all accumulated ambiguity counters.
func (*AmbiguityProfile) SnapshotReduceChainTotals ¶ added in v0.19.0
func (p *AmbiguityProfile) SnapshotReduceChainTotals() AmbiguityStat
SnapshotReduceChainTotals returns aggregate deterministic reduce-chain run counters across all profiled start states/lookaheads.
func (*AmbiguityProfile) SnapshotTop ¶ added in v0.17.0
func (p *AmbiguityProfile) SnapshotTop(limit int) []AmbiguityStat
SnapshotTop returns the highest-impact ambiguity buckets ordered by stack pressure, then hit count.
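The two-key ordering can be sketched with a stable sort. The stat type below is a trimmed, hypothetical stand-in for AmbiguityStat, and treating StackInTotal as the "stack pressure" measure is an assumption; the page does not say which field the package actually uses. Treating a non-positive limit as "no limit" is likewise an assumption.

```go
package main

import (
	"fmt"
	"sort"
)

// stat is a trimmed, hypothetical stand-in for AmbiguityStat.
type stat struct {
	State        uint16
	StackInTotal uint64 // assumed to represent stack pressure
	Hits         uint64
}

// topStats returns up to limit rows ordered by StackInTotal, then Hits, both
// descending. A limit <= 0 returns every row (assumption).
func topStats(rows []stat, limit int) []stat {
	out := append([]stat(nil), rows...) // copy so the input is untouched
	sort.SliceStable(out, func(i, j int) bool {
		if out[i].StackInTotal != out[j].StackInTotal {
			return out[i].StackInTotal > out[j].StackInTotal
		}
		return out[i].Hits > out[j].Hits
	})
	if limit > 0 && limit < len(out) {
		out = out[:limit]
	}
	return out
}

func main() {
	rows := []stat{{1, 10, 5}, {2, 30, 1}, {3, 10, 9}}
	for _, s := range topStats(rows, 2) {
		fmt.Printf("state=%d stackIn=%d hits=%d\n", s.State, s.StackInTotal, s.Hits)
	}
}
```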
func (*AmbiguityProfile) SnapshotTopMergeStates ¶ added in v0.19.0
func (p *AmbiguityProfile) SnapshotTopMergeStates(limit int) []AmbiguityStat
SnapshotTopMergeStates returns parser states that most often participate in multi-stack merge passes. These rows are keyed by state only, because merge happens before the next lookahead dispatch.
func (*AmbiguityProfile) SnapshotTopReduceChainRuns ¶ added in v0.19.0
func (p *AmbiguityProfile) SnapshotTopReduceChainRuns(limit int) []AmbiguityStat
SnapshotTopReduceChainRuns returns the starting states/lookaheads that begin the most expensive deterministic reduce chains.
func (*AmbiguityProfile) SnapshotTopReduceChains ¶ added in v0.19.0
func (p *AmbiguityProfile) SnapshotTopReduceChains(limit int) []AmbiguityStat
SnapshotTopReduceChains returns the parser states/lookaheads that spent the most time in deterministic reduce-chain fusion.
type AmbiguityStat ¶ added in v0.17.0
type AmbiguityStat struct {
State StateID
Lookahead Symbol
ActionCount uint8
ShiftCount uint8
ReduceCount uint8
ReduceSymbol Symbol
ChildCount uint8
ProductionID uint16
Actions []ParseAction
Hits uint64
Forks uint64
MultiStackHits uint64
StackInTotal uint64
StackInMax int
ReduceChainHits uint64
ReduceChainSteps uint64
ReduceChainMaxLen int
ReduceChainNanos int64
ReduceChainRuns uint64
ReduceChainClassHits uint64
ReduceChainStopNoAction uint64
ReduceChainStopMulti uint64
ReduceChainStopShift uint64
ReduceChainStopAccept uint64
ReduceChainStopDead uint64
ReduceChainStopCycle uint64
ReduceChainStopLimit uint64
ReduceChainTerminalState StateID
ReduceChainTerminalActionClass uint8
ActionNanos int64
ExtraShiftNanos int64
NoActionNanos int64
ConflictChoiceNanos int64
ConflictForkNanos int64
SingleShiftNanos int64
SingleReduceNanos int64
SingleAcceptNanos int64
SingleRecoverNanos int64
SingleOtherNanos int64
MergeCalls uint64
MergeStacksIn uint64
MergeStacksOut uint64
MergeStacksInMax int
MergeStacksOutMax int
}
AmbiguityStat is a snapshot row from AmbiguityProfile.
type ArenaBreakdown ¶ added in v0.18.0
type ArenaBreakdown struct {
NodeStructBytesAllocated int64
NoTreeNodeBytesAllocated int64
CompactFullLeafBytesAllocated int64
PendingParentBytesAllocated int64
PendingChildEntryBytesAllocated int64
FinalChildSidecarBytesAllocated int64
PendingChildEntriesAllocated uint64
PendingChildEntryCapacity uint64
PendingChildEntryWaste uint64
ChildSliceBytesAllocated int64
FieldIDBytesAllocated int64
FieldSourceBytesAllocated int64
MergeScratchBytesAllocated int64
ArenaNodesConstructed uint64
// NodeLiveCount is arena allocation-slot usage, not root-reachable tree
// liveness. It includes parser alternatives and recovery nodes allocated
// during the parse.
NodeLiveCount uint64
NodeCapacityCount uint64
NodeCapacityWaste uint64
PrimaryNodeCapacity uint64
PrimaryNodeUsed uint64
OverflowNodeCapacity uint64
OverflowNodeUsed uint64
OverflowNodeSlabs uint64
LargestNodeSlabUsedFraction float64
LeafNodesConstructed uint64
ParentNodesConstructed uint64
FieldedParentNodesConstructed uint64
UnfieldedParentNodesConstructed uint64
ParentConstructedChildLen0 uint64
ParentConstructedChildLen1 uint64
ParentConstructedChildLen2 uint64
ParentConstructedChildLen3 uint64
ParentConstructedChildLen4Plus uint64
ParentConstructedNoLinks uint64
ParentConstructedWithLinks uint64
ParentConstructedTrackErrors uint64
ParentConstructedFieldSources uint64
ParentReductionVisible uint64
ParentReductionInvisible uint64
ParentReductionVisibleFielded uint64
ParentReductionVisibleUnfielded uint64
ParentReductionInvisibleFielded uint64
ParentReductionInvisibleUnfielded uint64
ParentReductionVisibleChildPtrs uint64
ParentReductionInvisibleChildPtrs uint64
ParentReductionVisibleLen0 uint64
ParentReductionVisibleLen1 uint64
ParentReductionVisibleLen2 uint64
ParentReductionVisibleLen3 uint64
ParentReductionVisibleLen4Plus uint64
ParentReductionInvisibleLen0 uint64
ParentReductionInvisibleLen1 uint64
ParentReductionInvisibleLen2 uint64
ParentReductionInvisibleLen3 uint64
ParentReductionInvisibleLen4Plus uint64
ReduceChildSlicesFastGSS uint64
ReduceChildPointersFastGSS uint64
ReduceChildSlicesAllVisible uint64
ReduceChildPointersAllVisible uint64
ReduceChildSlicesNoAlias uint64
ReduceChildPointersNoAlias uint64
ReduceChildSlicesScratchGeneral uint64
ReduceChildPointersScratchGeneral uint64
ReduceChildSlicesScratchNoAlias uint64
ReduceChildPointersScratchNoAlias uint64
CollapseRawUnaryAttempts uint64
CollapseRawUnarySuccesses uint64
CollapseRawUnaryMissShape uint64
CollapseRawUnaryMissGrammar uint64
CollapseRawUnaryMissChild uint64
CollapseRawUnaryMissRule uint64
CollapseUnaryAttempts uint64
CollapseUnarySuccesses uint64
CollapseUnaryMissShape uint64
CollapseUnaryMissGrammar uint64
CollapseUnaryMissFielded uint64
CollapseUnaryMissChild uint64
CollapseUnaryMissRule uint64
CollapseRuleSameSymbol uint64
CollapseRuleInvisibleWrapper uint64
CollapseRuleNamedLeafAlias uint64
NoTreeReduceNodesConstructed uint64
NoTreeLeafNodesConstructed uint64
NoTreePlaceholderNodesConstructed uint64
OtherNodesConstructed uint64
ExtraNodesConstructed uint64
ErrorSymbolNodesConstructed uint64
HasErrorNodesConstructed uint64
ChildSlicesConstructed uint64
ChildPointersConstructed uint64
ChildSlicesLen1 uint64
ChildSlicesLen2 uint64
ChildSlicesLen3 uint64
ChildSlicesLen4Plus uint64
ParentChildPointersConstructed uint64
ParentChildrenLen0 uint64
ParentChildrenLen1 uint64
ParentChildrenLen2 uint64
ParentChildrenLen3 uint64
ParentChildrenLen4Plus uint64
FieldIDElementsConstructed uint64
FieldSourceElementsConstructed uint64
}
ArenaBreakdown captures optional arena/materialization attribution. It is populated only when EnableArenaBreakdown(true) is set before parsing.
type ArenaProfile ¶ added in v0.6.0
type ArenaProfile struct {
IncrementalAcquire uint64
IncrementalNew uint64
FullAcquire uint64
FullNew uint64
}
ArenaProfile captures node arena allocation statistics. Enable with EnableArenaProfile(true) and retrieve with ArenaProfileSnapshot().
func ArenaProfileSnapshot ¶ added in v0.6.0
func ArenaProfileSnapshot() ArenaProfile
ArenaProfileSnapshot returns current arena pool counters. This debug hook is not concurrency-safe and is intended for single-threaded benchmark/profiling runs.
type BoundTree ¶
type BoundTree struct {
// contains filtered or unexported fields
}
BoundTree pairs a Tree with its Language and source, eliminating the need to pass *Language and []byte to every node method call.
func Bind ¶
Bind creates a BoundTree from a Tree. The Tree must have been created with a Language (via NewTree or a Parser). Returns a BoundTree that delegates to the underlying Tree's Language and Source.
func (*BoundTree) ChildByField ¶
ChildByField returns the first child assigned to the given field name.
func (*BoundTree) NodeType ¶
NodeType returns the node's type name, resolved via the bound language.
func (*BoundTree) Release ¶
func (bt *BoundTree) Release()
Release releases the underlying tree's arena memory.
func (*BoundTree) TreeCursor ¶ added in v0.6.0
func (bt *BoundTree) TreeCursor() *TreeCursor
TreeCursor returns a new TreeCursor starting at the tree's root node.
type ByteSkippableTokenSource ¶
type ByteSkippableTokenSource interface {
TokenSource
SkipToByte(offset uint32) Token
}
ByteSkippableTokenSource can jump to a byte offset and return the first token at or after that position.
type ExternalLexer ¶
type ExternalLexer struct {
// contains filtered or unexported fields
}
ExternalLexer is the scanner-facing lexer API used by external scanners. It mirrors the essential tree-sitter scanner API: lookahead, advance, mark_end, and result_symbol.
func (*ExternalLexer) Advance ¶
func (l *ExternalLexer) Advance(skip bool)
Advance consumes one rune. When skip is true, consumed bytes are excluded from the token span (scanner whitespace skipping behavior).
func (*ExternalLexer) Column ¶ added in v0.6.0
func (l *ExternalLexer) Column() uint32
Column returns the current column (0-based) at the scanner cursor.
func (*ExternalLexer) GetColumn
deprecated
func (l *ExternalLexer) GetColumn() uint32
GetColumn returns the current column (0-based) at the scanner cursor.
Deprecated: use Column.
func (*ExternalLexer) Lookahead ¶
func (l *ExternalLexer) Lookahead() rune
Lookahead returns the current rune or 0 at EOF.
func (*ExternalLexer) MarkEnd ¶
func (l *ExternalLexer) MarkEnd()
MarkEnd marks the current scanner position as the token end.
func (*ExternalLexer) SetResultSymbol ¶
func (l *ExternalLexer) SetResultSymbol(sym Symbol)
SetResultSymbol sets the token symbol to emit when Scan returns true.
type ExternalScanner ¶
type ExternalScanner interface {
Create() any
Destroy(payload any)
Serialize(payload any, buf []byte) int
Deserialize(payload any, buf []byte)
Scan(payload any, lexer *ExternalLexer, validSymbols []bool) bool
}
ExternalScanner is the interface for language-specific external scanners. Languages like Python and JavaScript need these for indent tracking, template literals, regex vs division, etc.
The value returned by Create must be accepted by Destroy/Serialize/Deserialize/Scan for that scanner implementation. Most scanners use a concrete payload pointer type and will panic on mismatched payload types.
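The payload contract above can be sketched without the rest of the runtime. This is a minimal illustration, not code from the package: depthPayload and depthScanner are hypothetical names, the 4-byte little-endian layout is an assumption, and Scan is omitted because it needs the real *ExternalLexer.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// depthPayload is a hypothetical scanner payload holding one counter.
type depthPayload struct {
	depth uint32
}

// depthScanner sketches the state-handling half of an ExternalScanner.
// Scan is left out because it needs the real *ExternalLexer type.
type depthScanner struct{}

// Create returns the concrete payload type the other methods expect.
func (depthScanner) Create() any { return &depthPayload{} }

// Destroy has nothing to free for this payload.
func (depthScanner) Destroy(payload any) {}

// Serialize writes the payload into buf and returns the bytes written.
// It returns 0 when buf is too small to hold the state.
func (depthScanner) Serialize(payload any, buf []byte) int {
	p := payload.(*depthPayload) // panics on a mismatched payload type
	if len(buf) < 4 {
		return 0
	}
	binary.LittleEndian.PutUint32(buf, p.depth)
	return 4
}

// Deserialize restores the payload from buf; a short buffer resets it.
func (depthScanner) Deserialize(payload any, buf []byte) {
	p := payload.(*depthPayload)
	if len(buf) < 4 {
		p.depth = 0
		return
	}
	p.depth = binary.LittleEndian.Uint32(buf)
}

func main() {
	var s depthScanner
	a := s.Create()
	a.(*depthPayload).depth = 3

	buf := make([]byte, 16)
	n := s.Serialize(a, buf)

	b := s.Create()
	s.Deserialize(b, buf[:n])
	fmt.Println(n, b.(*depthPayload).depth)
}
```

The unchecked type assertions are deliberate: they reproduce the "panic on mismatched payload types" behavior described above.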
func AdaptExternalScannerByExternalOrder ¶ added in v0.9.0
func AdaptExternalScannerByExternalOrder(sourceLang, targetLang *Language) (ExternalScanner, bool)
AdaptExternalScannerByExternalOrder builds an ExternalScanner adapter that reuses sourceLang's scanner for targetLang by remapping external symbols.
Mapping strategy:
- If either side has duplicate external names, use index mapping (capped to the shorter list length).
- Otherwise, prefer exact external-symbol-name matches.
- Fill remaining slots by index order (within the shorter dimension).
When source and target have different external symbol counts, name-based matching pairs tokens that exist in both grammars. Target externals with no source match get -1 (the scanner will never produce them). Source externals with no target match are silently ignored.
Returns (nil, false) when adaptation is not possible.
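One possible reading of the mapping strategy, as a standalone sketch. mapExternals and hasDuplicates are hypothetical helpers, not part of the package. The direction of the table (target index to source index) and the rule that index fill only pairs identical, still-unused indices are assumptions.

```go
package main

import "fmt"

// hasDuplicates reports whether any name appears more than once.
func hasDuplicates(names []string) bool {
	seen := make(map[string]bool, len(names))
	for _, n := range names {
		if seen[n] {
			return true
		}
		seen[n] = true
	}
	return false
}

// mapExternals returns, for each target external index, the source external
// index to use, or -1 when there is no match. Hypothetical helper.
func mapExternals(source, target []string) []int {
	mapping := make([]int, len(target))
	for i := range mapping {
		mapping[i] = -1
	}
	shorter := len(source)
	if len(target) < shorter {
		shorter = len(target)
	}

	// Duplicate names make name matching ambiguous: map by index only.
	if hasDuplicates(source) || hasDuplicates(target) {
		for i := 0; i < shorter; i++ {
			mapping[i] = i
		}
		return mapping
	}

	// Prefer exact name matches.
	sourceIndex := make(map[string]int, len(source))
	for i, n := range source {
		sourceIndex[n] = i
	}
	used := make([]bool, len(source))
	for t, n := range target {
		if s, ok := sourceIndex[n]; ok {
			mapping[t] = s
			used[s] = true
		}
	}

	// Fill remaining slots by index order within the shorter list.
	for i := 0; i < shorter; i++ {
		if mapping[i] == -1 && !used[i] {
			mapping[i] = i
			used[i] = true
		}
	}
	return mapping
}

func main() {
	source := []string{"indent", "dedent", "newline"}
	target := []string{"newline", "indent", "string_start", "comment"}
	fmt.Println(mapExternals(source, target))
}
```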
type ExternalScannerState ¶
type ExternalScannerState struct {
Data []byte
}
ExternalScannerState holds serialized state for an external scanner between incremental parse runs.
type ExternalSymbolResolver ¶ added in v0.9.0
type ExternalSymbolResolver struct {
// contains filtered or unexported fields
}
ExternalSymbolResolver maps external token names to their concrete Symbol IDs in a specific Language. This allows external scanners to resolve symbol IDs at runtime rather than using hardcoded constants, making them compatible with any Language that defines the same external tokens (whether from ts2go extraction or grammargen).
func NewExternalSymbolResolver ¶ added in v0.9.0
func NewExternalSymbolResolver(lang *Language) *ExternalSymbolResolver
NewExternalSymbolResolver builds a resolver from a Language's external symbol definitions. Returns nil if the Language has no external symbols.
func (*ExternalSymbolResolver) ByIndex ¶ added in v0.9.0
func (r *ExternalSymbolResolver) ByIndex(idx int) (Symbol, bool)
ByIndex returns the Symbol ID for the given external token index (position in the grammar's externals array). Returns 0, false if the index is out of range.
func (*ExternalSymbolResolver) ByName ¶ added in v0.9.0
func (r *ExternalSymbolResolver) ByName(name string) (Symbol, bool)
ByName returns the Symbol ID for the given external token name. Returns 0, false if the name is not found.
func (*ExternalSymbolResolver) Count ¶ added in v0.9.0
func (r *ExternalSymbolResolver) Count() int
Count returns the number of external tokens.
type ExternalVMInstr ¶
type ExternalVMInstr struct {
Op ExternalVMOp
A int32
B int32
Alt int32
}
ExternalVMInstr is one instruction in an external scanner VM program.
Operands:
- A: primary operand (opcode-specific)
- B: secondary operand (used by range checks)
- Alt: alternate program counter when a condition fails
func VMAdvance ¶
func VMAdvance(skip bool) ExternalVMInstr
VMAdvance constructs an advance instruction. When skip is true, the advanced rune is skipped from the token text.
func VMEmit ¶
func VMEmit(sym Symbol) ExternalVMInstr
VMEmit constructs an emit instruction for the given symbol.
func VMFail ¶
func VMFail() ExternalVMInstr
VMFail constructs a fail instruction that terminates scan with no token.
func VMIfRuneClass ¶
func VMIfRuneClass(class ExternalVMRuneClass, alt int) ExternalVMInstr
VMIfRuneClass constructs a rune-class branch with alternate target on miss.
func VMIfRuneEq ¶
func VMIfRuneEq(r rune, alt int) ExternalVMInstr
VMIfRuneEq constructs a rune-equality branch with alternate target on miss.
func VMIfRuneInRange ¶
func VMIfRuneInRange(start, end rune, alt int) ExternalVMInstr
VMIfRuneInRange constructs a rune-range branch with alternate target on miss.
func VMJump ¶
func VMJump(target int) ExternalVMInstr
VMJump constructs an unconditional branch to the target instruction index.
func VMMarkEnd ¶
func VMMarkEnd() ExternalVMInstr
VMMarkEnd constructs a mark-end instruction for the current token extent.
func VMRequireStateEq ¶
func VMRequireStateEq(state uint32, alt int) ExternalVMInstr
VMRequireStateEq constructs a payload-state guard with alternate branch on miss.
func VMRequireValid ¶
func VMRequireValid(validSymbolIndex, alt int) ExternalVMInstr
VMRequireValid constructs a valid-symbol guard with alternate branch on miss.
func VMSetState ¶
func VMSetState(state uint32) ExternalVMInstr
VMSetState constructs a payload-state assignment instruction.
type ExternalVMOp ¶
type ExternalVMOp uint8
ExternalVMOp is an opcode for the native-Go external scanner VM.
const (
ExternalVMOpFail ExternalVMOp = iota
ExternalVMOpJump
ExternalVMOpRequireValid
ExternalVMOpRequireStateEq
ExternalVMOpSetState
ExternalVMOpIfRuneEq
ExternalVMOpIfRuneInRange
ExternalVMOpIfRuneClass
ExternalVMOpAdvance
ExternalVMOpMarkEnd
ExternalVMOpEmit
)
type ExternalVMProgram ¶
type ExternalVMProgram struct {
Code []ExternalVMInstr
MaxSteps int // <=0 uses a safe default based on program size
}
ExternalVMProgram is a small bytecode program interpreted by ExternalVMScanner.
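To make the instruction model concrete, here is a toy interpreter for a subset of the opcodes. It uses local stand-in types rather than the package's own, so everything in it is an assumption: the branch semantics (continue at the next instruction on a match, jump to Alt on a miss), the default step budget, and the symbol value 7.

```go
package main

import "fmt"

type op uint8

const (
	opFail op = iota
	opIfRuneEq
	opAdvance
	opMarkEnd
	opEmit
)

// instr mirrors the documented shape: an opcode, operand A, and an
// alternate program counter taken when a condition fails.
type instr struct {
	Op  op
	A   int32
	Alt int32
}

// run interprets code against input and returns the emitted symbol and the
// marked token end. ok is false on fail, on running off the program, or
// when the step budget is exhausted.
func run(code []instr, input []rune, maxSteps int) (sym int32, end int, ok bool) {
	if maxSteps <= 0 {
		maxSteps = 4*len(code) + 16 // assumed default, not the library's
	}
	pc, pos := 0, 0
	for step := 0; step < maxSteps && pc >= 0 && pc < len(code); step++ {
		in := code[pc]
		switch in.Op {
		case opFail:
			return 0, 0, false
		case opIfRuneEq:
			if pos < len(input) && input[pos] == rune(in.A) {
				pc++
			} else {
				pc = int(in.Alt)
			}
		case opAdvance:
			if pos < len(input) {
				pos++
			}
			pc++
		case opMarkEnd:
			end = pos
			pc++
		case opEmit:
			return in.A, end, true
		}
	}
	return 0, 0, false
}

func main() {
	// Recognize "=>" as hypothetical symbol 7; otherwise fail.
	code := []instr{
		{Op: opIfRuneEq, A: '=', Alt: 6},
		{Op: opAdvance},
		{Op: opIfRuneEq, A: '>', Alt: 6},
		{Op: opAdvance},
		{Op: opMarkEnd},
		{Op: opEmit, A: 7},
		{Op: opFail},
	}
	fmt.Println(run(code, []rune("=> x"), 0))
	fmt.Println(run(code, []rune("== x"), 0))
}
```

With the real package, the equivalent program would presumably be assembled from the VM* constructors and passed to NewExternalVMScanner, which validates it.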
type ExternalVMRuneClass ¶
type ExternalVMRuneClass uint8
ExternalVMRuneClass is a character class used by ExternalVMOpIfRuneClass.
const (
ExternalVMRuneClassWhitespace ExternalVMRuneClass = iota
ExternalVMRuneClassDigit
ExternalVMRuneClassLetter
ExternalVMRuneClassWord
ExternalVMRuneClassNewline
)
type ExternalVMScanner ¶
type ExternalVMScanner struct {
// contains filtered or unexported fields
}
ExternalVMScanner executes an ExternalVMProgram and implements ExternalScanner.
func MustNewExternalVMScanner ¶
func MustNewExternalVMScanner(program ExternalVMProgram) *ExternalVMScanner
MustNewExternalVMScanner is like NewExternalVMScanner but panics on error. It is intended for package-level initialization where invalid programs are programmer errors.
func NewExternalVMScanner ¶
func NewExternalVMScanner(program ExternalVMProgram) (*ExternalVMScanner, error)
NewExternalVMScanner validates and constructs an ExternalVMScanner.
func (*ExternalVMScanner) Create ¶
func (s *ExternalVMScanner) Create() any
Create allocates scanner payload (currently a single uint32 state slot).
func (*ExternalVMScanner) Deserialize ¶
func (s *ExternalVMScanner) Deserialize(payload any, buf []byte)
Deserialize restores payload state from buf.
func (*ExternalVMScanner) Destroy ¶
func (s *ExternalVMScanner) Destroy(payload any)
Destroy releases scanner payload resources.
func (*ExternalVMScanner) Scan ¶
func (s *ExternalVMScanner) Scan(payload any, lexer *ExternalLexer, validSymbols []bool) bool
Scan executes the scanner program against the current lexer position.
type FieldMapEntry ¶
FieldMapEntry maps a child index to a field name.
type HighlightRange ¶
type HighlightRange struct {
StartByte uint32
EndByte uint32
Capture string // "keyword", "string", "function", etc.
PatternIndex int // query pattern index; later patterns override earlier for identical ranges
}
HighlightRange represents a styled range of source code, mapping a byte span to a capture name from a highlight query. The editor maps capture names (e.g., "keyword", "string", "function") to FSS style classes.
type Highlighter ¶
type Highlighter struct {
// contains filtered or unexported fields
}
Highlighter is a high-level API that takes source code and returns styled ranges. It combines a Parser, a compiled Query, and a Language to provide a single Highlight() call for the editor.
func NewHighlighter ¶
func NewHighlighter(lang *Language, highlightQuery string, opts ...HighlighterOption) (*Highlighter, error)
NewHighlighter creates a Highlighter for the given language and highlight query (in tree-sitter .scm format). Returns an error if the query fails to compile.
func (*Highlighter) Highlight ¶
func (h *Highlighter) Highlight(source []byte) []HighlightRange
Highlight parses the source code and executes the highlight query, returning a slice of HighlightRange sorted by StartByte. When ranges overlap, inner (more specific) captures take priority over outer ones.
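The PatternIndex rule noted on HighlightRange (later patterns override earlier ones for identical ranges) can be illustrated in isolation. This sketch uses a local copy of the struct and a hypothetical dedupeIdentical helper. It does not reproduce the inner-over-outer handling of overlapping ranges and is not the package's internal procedure.

```go
package main

import (
	"fmt"
	"sort"
)

// highlightRange is a local stand-in with the documented fields.
type highlightRange struct {
	StartByte    uint32
	EndByte      uint32
	Capture      string
	PatternIndex int
}

// dedupeIdentical keeps, for each identical (StartByte, EndByte) span, the
// range with the highest PatternIndex, then sorts by StartByte.
func dedupeIdentical(in []highlightRange) []highlightRange {
	type span struct{ start, end uint32 }
	best := make(map[span]highlightRange, len(in))
	for _, r := range in {
		k := span{r.StartByte, r.EndByte}
		if cur, ok := best[k]; !ok || r.PatternIndex >= cur.PatternIndex {
			best[k] = r
		}
	}
	out := make([]highlightRange, 0, len(best))
	for _, r := range best {
		out = append(out, r)
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].StartByte != out[j].StartByte {
			return out[i].StartByte < out[j].StartByte
		}
		return out[i].EndByte > out[j].EndByte // outer ranges first on ties
	})
	return out
}

func main() {
	ranges := []highlightRange{
		{StartByte: 5, EndByte: 9, Capture: "string", PatternIndex: 1},
		{StartByte: 0, EndByte: 4, Capture: "variable", PatternIndex: 0},
		{StartByte: 0, EndByte: 4, Capture: "keyword", PatternIndex: 3},
	}
	for _, r := range dedupeIdentical(ranges) {
		fmt.Println(r.StartByte, r.EndByte, r.Capture)
	}
}
```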
func (*Highlighter) HighlightIncremental ¶
func (h *Highlighter) HighlightIncremental(source []byte, oldTree *Tree) ([]HighlightRange, *Tree)
HighlightIncremental re-highlights source after edits were applied to oldTree. Returns the new highlight ranges and the new parse tree (for use in subsequent incremental calls). Call oldTree.Edit() before calling this.
func (*Highlighter) HighlightIncrementalUTF16 ¶ added in v0.16.0
func (h *Highlighter) HighlightIncrementalUTF16(source []uint16, oldTree *Tree) ([]UTF16HighlightRange, *Tree)
HighlightIncrementalUTF16 re-highlights UTF-16 source after edits were applied to oldTree with Tree.EditUTF16.
func (*Highlighter) HighlightIncrementalUTF16Bytes ¶ added in v0.16.0
func (h *Highlighter) HighlightIncrementalUTF16Bytes(source []byte, oldTree *Tree, order UTF16ByteOrder) ([]UTF16HighlightRange, *Tree, error)
HighlightIncrementalUTF16Bytes is like HighlightIncrementalUTF16 for endian-specific UTF-16 bytes.
func (*Highlighter) HighlightUTF16 ¶ added in v0.16.0
func (h *Highlighter) HighlightUTF16(source []uint16) []UTF16HighlightRange
HighlightUTF16 parses UTF-16 source and returns highlight ranges in UTF-16 code-unit coordinates.
func (*Highlighter) HighlightUTF16Bytes ¶ added in v0.16.0
func (h *Highlighter) HighlightUTF16Bytes(source []byte, order UTF16ByteOrder) ([]UTF16HighlightRange, error)
HighlightUTF16Bytes is like HighlightUTF16 for endian-specific UTF-16 bytes.
type HighlighterInjectionResolver ¶ added in v0.7.0
type HighlighterInjectionResolver func(languageHint string) (lang *Language, highlightQuery string, tokenSourceFactory func(source []byte) TokenSource, ok bool)
HighlighterInjectionResolver maps a language hint (for example "go" from a markdown code fence) to a child language and highlight query.
type HighlighterInjectionSpec ¶ added in v0.7.0
type HighlighterInjectionSpec struct {
Query string
ResolveLanguage HighlighterInjectionResolver
}
HighlighterInjectionSpec configures nested highlighting for a parent language. Query must emit @injection.content and either @injection.language or #set! injection.language metadata.
type HighlighterOption ¶
type HighlighterOption func(*Highlighter)
HighlighterOption configures a Highlighter.
func WithTokenSourceFactory ¶
func WithTokenSourceFactory(factory func(source []byte) TokenSource) HighlighterOption
WithTokenSourceFactory sets a factory function that creates a TokenSource for each Highlight call. This is needed for languages that use a custom lexer bridge (like Go, which uses go/scanner instead of a DFA lexer).
When set, Highlight() calls ParseWithTokenSource instead of Parse.
type ImportExtractResult ¶ added in v0.18.0
type ImportExtractResult struct {
Imports []ImportRef
Status ImportExtractStatus
Reason string
FallbackRecommended bool
}
ImportExtractResult is returned by source-only dependency extraction. When FallbackRecommended is true, callers that need exact tree-sitter behavior should parse the file and use ExtractImports.
func ExtractImportsFromSourceWithReport ¶ added in v0.18.0
func ExtractImportsFromSourceWithReport(lang *Language, source []byte) ImportExtractResult
ExtractImportsFromSourceWithReport returns source-only dependency declarations and a confidence report for fallback policy.
type ImportExtractStatus ¶ added in v0.18.0
type ImportExtractStatus string
ImportExtractStatus describes the confidence of source-only import extraction.
const (
ImportExtractOK ImportExtractStatus = "ok"
ImportExtractUnsupportedConstruct ImportExtractStatus = "unsupported_construct"
ImportExtractScannerError ImportExtractStatus = "scanner_error"
ImportExtractAmbiguous ImportExtractStatus = "ambiguous"
ImportExtractFallbackToTree ImportExtractStatus = "fallback_to_tree"
)
type ImportRef ¶ added in v0.18.0
type ImportRef struct {
Lang string
Kind string
Path string
From string
Name string
Alias string
Static bool
Wildcard bool
Relative int
StartByte uint32
EndByte uint32
}
ImportRef is a compact language-neutral dependency declaration extracted from a syntax tree.
func ExtractImports ¶ added in v0.18.0
ExtractImports returns package/import declarations for the languages used by Gazelle-style dependency extraction. It is intentionally independent from the generic query engine so it can later be backed by compact parser refs.
func ExtractImportsFromSource ¶ added in v0.18.0
ExtractImportsFromSource returns language-neutral dependency declarations directly from source text. It is intended for cold dependency-extraction workflows that do not need a public syntax tree.
type IncrementalParseProfile ¶ added in v0.6.0
type IncrementalParseProfile struct {
ReuseCursorNanos int64
ReparseNanos int64
ReusedSubtrees uint64
ReusedBytes uint64
NewNodesAllocated uint64
ReuseUnsupported bool
ReuseUnsupportedReason string
ReuseRejectDirty uint64
ReuseRejectAncestorDirtyBeforeEdit uint64
ReuseRejectHasError uint64
ReuseRejectInvalidSpan uint64
ReuseRejectOutOfBounds uint64
ReuseRejectRootNonLeafChanged uint64
ReuseRejectLargeNonLeaf uint64
RecoverSearches uint64
RecoverStateChecks uint64
RecoverStateSkips uint64
RecoverSymbolSkips uint64
RecoverLookups uint64
RecoverHits uint64
MaxStacksSeen int
EntryScratchPeak uint64
StopReason ParseStopReason
TokensConsumed uint64
LastTokenEndByte uint32
ExpectedEOFByte uint32
ArenaBytesAllocated int64
ScratchBytesAllocated int64
EntryScratchBytesAllocated int64
GSSBytesAllocated int64
SingleStackIterations int
MultiStackIterations int
SingleStackTokens uint64
MultiStackTokens uint64
SingleStackGSSNodes uint64
MultiStackGSSNodes uint64
GSSNodesAllocated uint64
GSSNodesRetained uint64
GSSNodesDroppedSameToken uint64
ParentNodesAllocated uint64
ParentNodesRetained uint64
ParentNodesDroppedSameToken uint64
LeafNodesAllocated uint64
LeafNodesRetained uint64
LeafNodesDroppedSameToken uint64
MergeStacksIn uint64
MergeStacksOut uint64
MergeSlotsUsed uint64
GlobalCullStacksIn uint64
GlobalCullStacksOut uint64
ParserLoopNanos int64
TokenNextNanos int64
ActionDispatchNanos int64
ActionLookupNanos int64
GLRMergeNanos int64
GLRCullNanos int64
ResultSelectionNanos int64
TransientParentMaterializationNanos int64
ResultTreeBuildNanos int64
TransientChildMaterializationNanos int64
ResultPythonKeywordRepairNanos int64
ResultPythonRootRepairNanos int64
ResultFinalizeRootNanos int64
ResultExtendTrailingNanos int64
ResultNormalizeRootStartNanos int64
ResultCompatibilityNanos int64
ResultParentLinkNanos int64
ReduceRangeNanos int64
ReducePendingParentNanos int64
ReduceChildBuildNanos int64
ReduceParentBuildNanos int64
ReduceSpanNanos int64
ReduceStackPushNanos int64
ReduceNoTreeBuildNanos int64
ActionExtraShiftNanos int64
ActionNoActionNanos int64
ActionNoActionRelexNanos int64
ActionNoActionMissingNanos int64
ActionNoActionRecoverNanos int64
ActionNoActionErrorNanos int64
ActionConflictChoiceNanos int64
ActionConflictForkNanos int64
ActionSingleShiftNanos int64
ActionSingleReduceNanos int64
ActionSingleAcceptNanos int64
ActionSingleRecoverNanos int64
ActionSingleOtherNanos int64
NormalizationNanos int64
}
IncrementalParseProfile attributes incremental parse time into coarse buckets.
ReuseCursorNanos includes reuse-cursor setup and subtree-candidate checks. ReparseNanos includes the remainder of incremental parsing/rebuild work.
type IncrementalReuseExternalScanner ¶ added in v0.7.0
type IncrementalReuseExternalScanner interface {
ExternalScanner
SupportsIncrementalReuse() bool
}
IncrementalReuseExternalScanner is implemented by external scanners that can safely participate in DFA subtree reuse during incremental parses. Scanners with serialized mutable state, such as Python's indentation stack, should leave this unimplemented so edited incremental parses fall back to the conservative full-reparse path.
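The opt-in works through Go's optional-interface pattern: a caller type-asserts for the extended interface and treats absence as "not supported". A minimal sketch with local stand-in interfaces (scanner, reuseScanner, and canReuse are hypothetical names; the runtime's actual check is not shown in this documentation):

```go
package main

import "fmt"

// scanner stands in for ExternalScanner; only the opt-in check matters here.
type scanner interface {
	Name() string
}

// reuseScanner mirrors the shape of IncrementalReuseExternalScanner.
type reuseScanner interface {
	scanner
	SupportsIncrementalReuse() bool
}

// statelessScanner opts in to subtree reuse.
type statelessScanner struct{}

func (statelessScanner) Name() string                   { return "stateless" }
func (statelessScanner) SupportsIncrementalReuse() bool { return true }

// indentScanner keeps mutable state, so it leaves the method unimplemented.
type indentScanner struct{}

func (indentScanner) Name() string { return "indent" }

// canReuse reports whether s opted in to subtree reuse. Scanners that do not
// implement the extended interface are treated as unsafe.
func canReuse(s scanner) bool {
	r, ok := s.(reuseScanner)
	return ok && r.SupportsIncrementalReuse()
}

func main() {
	for _, s := range []scanner{statelessScanner{}, indentScanner{}} {
		fmt.Println(s.Name(), canReuse(s))
	}
}
```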
type IncrementalReuseTokenSource ¶ added in v0.7.0
type IncrementalReuseTokenSource interface {
TokenSource
SupportsIncrementalReuse() bool
}
IncrementalReuseTokenSource is an opt-in marker for custom token sources that are safe for incremental subtree reuse. Implementations must provide stable token boundaries across edits and support deterministic SkipToByte* behavior so reused-tree fast-forwarding remains correct.
type Injection ¶ added in v0.6.0
type Injection struct {
// Language is the detected language name (e.g., "javascript").
Language string
// Tree is the parse tree for this region, or nil if the language
// was not registered.
Tree *Tree
// Ranges are the source ranges this tree covers.
Ranges []Range
// Node is the parent tree node that triggered the injection.
Node *Node
}
Injection is a single embedded language region.
type InjectionParser ¶ added in v0.6.0
type InjectionParser struct {
// contains filtered or unexported fields
}
InjectionParser parses documents with embedded languages.
InjectionParser is not safe for concurrent use. It caches child parsers and mutates shared maps during parse operations.
func NewInjectionParser ¶ added in v0.6.0
func NewInjectionParser() *InjectionParser
NewInjectionParser creates an InjectionParser.
func (*InjectionParser) Parse ¶ added in v0.6.0
func (ip *InjectionParser) Parse(source []byte, parentLang string) (*InjectionResult, error)
Parse parses source as parentLang, then recursively parses injected regions.
func (*InjectionParser) ParseIncremental ¶ added in v0.6.0
func (ip *InjectionParser) ParseIncremental(source []byte, parentLang string, oldResult *InjectionResult) (*InjectionResult, error)
ParseIncremental re-parses after edits, reusing unchanged child trees.
func (*InjectionParser) ParseIncrementalUTF16 ¶ added in v0.16.0
func (ip *InjectionParser) ParseIncrementalUTF16(source []uint16, parentLang string, oldResult *UTF16InjectionResult) (*UTF16InjectionResult, error)
ParseIncrementalUTF16 re-parses UTF-16 source after edits, reusing unchanged child trees. Call oldResult.Tree.EditUTF16 before calling this.
func (*InjectionParser) ParseIncrementalUTF16Bytes ¶ added in v0.16.0
func (ip *InjectionParser) ParseIncrementalUTF16Bytes(source []byte, parentLang string, oldResult *UTF16InjectionResult, order UTF16ByteOrder) (*UTF16InjectionResult, error)
ParseIncrementalUTF16Bytes is like ParseIncrementalUTF16 for endian-specific UTF-16 bytes.
func (*InjectionParser) ParseUTF16 ¶ added in v0.16.0
func (ip *InjectionParser) ParseUTF16(source []uint16, parentLang string) (*UTF16InjectionResult, error)
ParseUTF16 parses UTF-16 source as parentLang, then recursively parses injected regions. The returned injection ranges are in UTF-16 code units.
func (*InjectionParser) ParseUTF16Bytes ¶ added in v0.16.0
func (ip *InjectionParser) ParseUTF16Bytes(source []byte, parentLang string, order UTF16ByteOrder) (*UTF16InjectionResult, error)
ParseUTF16Bytes is like ParseUTF16 for endian-specific UTF-16 bytes.
func (*InjectionParser) RegisterInjectionQuery ¶ added in v0.6.0
func (ip *InjectionParser) RegisterInjectionQuery(parentLang string, query string) error
RegisterInjectionQuery sets the injection query for a parent language. The query should use @injection.content and #set! injection.language conventions. It is compiled against the registered parent language.
func (*InjectionParser) RegisterLanguage ¶ added in v0.6.0
func (ip *InjectionParser) RegisterLanguage(name string, lang *Language)
RegisterLanguage adds a language that can be used as parent or child.
func (*InjectionParser) SetMaxDepth ¶ added in v0.6.0
func (ip *InjectionParser) SetMaxDepth(depth int)
SetMaxDepth overrides the nested injection recursion limit. Depth values <= 0 restore the default limit.
type InjectionResult ¶ added in v0.6.0
type InjectionResult struct {
// Tree is the parent language's parse tree.
Tree *Tree
// Injections contains child language parse results, ordered by position.
Injections []Injection
}
InjectionResult holds parse results for a multi-language document.
type InputEdit ¶
type InputEdit struct {
StartByte uint32
OldEndByte uint32
NewEndByte uint32
StartPoint Point
OldEndPoint Point
NewEndPoint Point
}
InputEdit describes a single edit to the source text. It tells the parser what byte range was replaced and what the new range looks like, so the incremental parser can skip unchanged subtrees.
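For a single byte-range replacement, the six fields can be derived from the old source and the replacement text. The sketch below uses local stand-in types. The Point field names (Row, Column), the byte-based column convention, and the pointAt and editFor helpers are assumptions made for illustration.

```go
package main

import "fmt"

// point is a stand-in for Point; Row/Column naming is assumed.
type point struct {
	Row, Column uint32
}

// inputEdit is a local copy of the documented InputEdit fields.
type inputEdit struct {
	StartByte, OldEndByte, NewEndByte    uint32
	StartPoint, OldEndPoint, NewEndPoint point
}

// pointAt converts a byte offset into a row and a byte-based column.
func pointAt(src []byte, offset int) point {
	var p point
	for _, b := range src[:offset] {
		if b == '\n' {
			p.Row++
			p.Column = 0
		} else {
			p.Column++
		}
	}
	return p
}

// editFor replaces oldSrc[start:oldEnd] with replacement and returns the
// matching edit description together with the new source.
func editFor(oldSrc []byte, start, oldEnd int, replacement []byte) (inputEdit, []byte) {
	newSrc := make([]byte, 0, len(oldSrc)-(oldEnd-start)+len(replacement))
	newSrc = append(newSrc, oldSrc[:start]...)
	newSrc = append(newSrc, replacement...)
	newSrc = append(newSrc, oldSrc[oldEnd:]...)
	newEnd := start + len(replacement)
	return inputEdit{
		StartByte:   uint32(start),
		OldEndByte:  uint32(oldEnd),
		NewEndByte:  uint32(newEnd),
		StartPoint:  pointAt(oldSrc, start),
		OldEndPoint: pointAt(oldSrc, oldEnd),
		NewEndPoint: pointAt(newSrc, newEnd),
	}, newSrc
}

func main() {
	old := []byte("a := 1\nb := 2\n")
	edit, updated := editFor(old, 5, 6, []byte("100"))
	fmt.Printf("%+v\n%q\n", edit, updated)
}
```

Start and old-end positions are measured in the old text; only the new-end position is measured in the edited text.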
type InputEncoding ¶ added in v0.16.0
type InputEncoding uint8
InputEncoding identifies the source encoding used to build a Tree.
const (
InputEncodingUTF8 InputEncoding = iota
InputEncodingUTF16
)
func (InputEncoding) String ¶ added in v0.16.0
func (e InputEncoding) String() string
type InternObservationStats ¶ added in v0.20.0
type InternObservationStats struct {
// Phase 2 counters (parseState-blind observation across ALL leaves).
LeafLookups uint64
LeafHits uint64
LeafMisses uint64
LeafStores uint64
LeafGrowths uint64
// Phase 3 attribution. Shift-path leaves get parseState set per-fork
// so they can't be canonically substituted via the parseState-blind
// measurement; non-shift leaves can. "Safe to substitute" via blind
// measurement = (LeafMisses+LeafHits) - ShiftLeafObserved.
ShiftLeafObserved uint64
// Phase 3 parseState-aware measurement. Same hook as LeafLookups
// but with parseState/preGotoState included in the key. A hit here
// means a truly dedup-safe duplicate; the difference between this
// and the blind hit rate quantifies how much of the blind
// observation was an artifact of ignoring state.
FullLookups uint64
FullHits uint64
FullMisses uint64
}
InternObservationStats is the externally-visible snapshot of leaf-interning observation counters for a single parse. Returned from InternStatsFor.
func InternStatsFor ¶ added in v0.20.0
func InternStatsFor(root *Node) InternObservationStats
InternStatsFor returns a snapshot of the leaf-interning observation counters for the arena that owns the given root node. Returns the zero value if observation is disabled or the root is not arena-backed. Exposed so external benches can read hit rates without grepping internal logs.
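A reader of these counters will usually want the derived figures mentioned in the field comments. The sketch below works on a local copy of the relevant fields. Defining hit rate as hits divided by lookups is an assumption, and safeToSubstitute and rate are hypothetical helpers.

```go
package main

import "fmt"

// internStats is a local copy of the counters used in the arithmetic below.
type internStats struct {
	LeafLookups, LeafHits, LeafMisses uint64
	ShiftLeafObserved                 uint64
	FullLookups, FullHits             uint64
}

// rate returns hits/lookups, or 0 when nothing was looked up.
func rate(hits, lookups uint64) float64 {
	if lookups == 0 {
		return 0
	}
	return float64(hits) / float64(lookups)
}

// safeToSubstitute applies the documented formula
// (LeafMisses+LeafHits) - ShiftLeafObserved, clamped at zero.
func safeToSubstitute(s internStats) uint64 {
	observed := s.LeafMisses + s.LeafHits
	if s.ShiftLeafObserved > observed {
		return 0
	}
	return observed - s.ShiftLeafObserved
}

func main() {
	// Illustrative numbers only, not measurements.
	s := internStats{
		LeafLookups: 100, LeafHits: 50, LeafMisses: 50,
		ShiftLeafObserved: 30,
		FullLookups:       100, FullHits: 25,
	}
	blind := rate(s.LeafHits, s.LeafLookups)
	aware := rate(s.FullHits, s.FullLookups)
	fmt.Println(safeToSubstitute(s), blind, aware, blind-aware)
}
```

The difference between the blind and state-aware rates is the portion of blind hits that came from ignoring parse state.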
type Language ¶
type Language struct {
Name string
// GeneratedByGrammargen is true for languages assembled by grammargen at
// runtime rather than decoded from a checked-in ts2go blob.
GeneratedByGrammargen bool
// LanguageVersion is the tree-sitter language ABI version.
// A value of 0 means "unknown/unspecified" and is treated as compatible.
LanguageVersion uint32
// Counts
SymbolCount uint32
TokenCount uint32
ExternalTokenCount uint32
StateCount uint32
LargeStateCount uint32
FieldCount uint32
ProductionIDCount uint32
// Symbol metadata
SymbolNames []string
SymbolMetadata []SymbolMetadata
FieldNames []string // index 0 is ""
// Parse tables
ParseTable [][]uint16 // dense: [state][symbol] -> action index
SmallParseTable []uint16 // compressed sparse table
SmallParseTableMap []uint32 // state -> offset into SmallParseTable
ParseActions []ParseActionEntry
// ReduceChainHints are optional generated hot-path hints for deterministic
// reduce runs. They are only consumed when reduce-chain hints are enabled.
ReduceChainHints []ReduceChainHint
// Lex tables
LexModes []LexMode
LexStates []LexState // main lexer DFA
KeywordLexStates []LexState // keyword lexer DFA (optional)
KeywordCaptureToken Symbol
// LayoutFallbackLexState is an optional broad DFA start state used only in
// layout-entry parser states. It lets the runtime avoid skipping over
// zero-width external layout markers before the layout scanner fires.
LayoutFallbackLexState uint16
HasLayoutFallbackLexState bool
// Field mapping
FieldMapSlices [][2]uint16 // [production_id] -> (index, length)
FieldMapEntries []FieldMapEntry
// Alias sequences
AliasSequences [][]Symbol // [production_id][child_index] -> alias symbol
// Primary state IDs (for table dedup)
PrimaryStateIDs []StateID
// ABI 15: Reserved words — flat array indexed by
// (reserved_word_set_id * MaxReservedWordSetSize + i), terminated by 0.
ReservedWords []Symbol
MaxReservedWordSetSize uint16
// ABI 15: Supertype hierarchy
SupertypeSymbols []Symbol
SupertypeMapSlices [][2]uint16 // [supertype_symbol] -> (index, length)
SupertypeMapEntries []Symbol
// ABI 15: Grammar semantic version
Metadata LanguageMetadata
// External scanner (nil if not needed)
ExternalScanner ExternalScanner
ExternalSymbols []Symbol // external token index -> symbol
// ImmediateTokens is a bitmask of symbol IDs that are token.immediate() tokens.
// When the lexer matches one of these after consuming whitespace, the match
// should be rejected — immediate tokens must match at the original position.
// nil means no immediate tokens (common for ts2go grammars).
ImmediateTokens []bool
// ZeroWidthTokens is a bitmask of symbol IDs whose DFA terminal pattern can
// intentionally match empty input. nil means this information is unavailable,
// which preserves historical lexer behavior for ts2go blobs.
ZeroWidthTokens []bool
// ExternalLexStates maps external lex state IDs (from LexMode.ExternalLexState)
// to a boolean slice indicating which external tokens are valid. Row 0 is
// always all-false (no external tokens valid). When non-nil, this table is
// used instead of parse-action-table probing to compute validSymbols for the
// external scanner, matching C tree-sitter's ts_external_scanner_states.
ExternalLexStates [][]bool
// InitialState is the parser's start state. In tree-sitter grammars
// this is always 1 (state 0 is reserved for error recovery). For
// hand-built grammars it defaults to 0.
InitialState StateID
// contains filtered or unexported fields
}
Language holds all data needed to parse a specific language. It mirrors tree-sitter's TSLanguage C struct, translated into idiomatic Go types with slice-based tables instead of raw pointers.
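As an example of how the flat tables are addressed, the ReservedWords layout described in the struct comments (index reserved_word_set_id * MaxReservedWordSetSize + i, terminated by 0) can be walked as follows. The symbol type's width and the reservedWordSet helper are assumptions; the package may expose this differently.

```go
package main

import "fmt"

// symbol stands in for Symbol; the underlying width is assumed.
type symbol uint16

// reservedWordSet returns the symbols of one reserved-word set from the flat
// array, stopping at the 0 terminator or at the end of the set's slot.
func reservedWordSet(words []symbol, maxSize, setID uint16) []symbol {
	if maxSize == 0 {
		return nil
	}
	start := int(setID) * int(maxSize)
	if start >= len(words) {
		return nil
	}
	end := start + int(maxSize)
	if end > len(words) {
		end = len(words)
	}
	for i := start; i < end; i++ {
		if words[i] == 0 {
			return words[start:i]
		}
	}
	return words[start:end]
}

func main() {
	// Three sets of up to three symbols each; values are illustrative.
	words := []symbol{
		0, 0, 0, // set 0: empty
		5, 9, 0, // set 1
		7, 0, 0, // set 2
	}
	for id := uint16(0); id < 3; id++ {
		fmt.Println(id, reservedWordSet(words, 3, id))
	}
}
```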
func LoadLanguage ¶ added in v0.9.0
LoadLanguage deserializes a compressed grammar blob into a Language. Blobs are produced by grammargen's GenerateLanguage or the grammar build toolchain. This is the only function needed at runtime to load pre-compiled grammars — no grammargen import required.
func (*Language) CompatibleWithRuntime ¶
CompatibleWithRuntime reports whether this language can be parsed by the current runtime version. Unspecified versions (0) are treated as compatible.
func (*Language) FieldByName ¶
FieldByName returns the field ID for a given name, or (0, false) if not found. Builds an internal map on first call for O(1) subsequent lookups.
func (*Language) IsSupertype ¶ added in v0.6.0
IsSupertype reports whether sym is a supertype symbol.
func (*Language) KeywordLexAsciiTable ¶ added in v0.10.2
KeywordLexAsciiTable returns the ASCII fast-path table for the keyword lexer DFA.
func (*Language) LexAsciiTable ¶ added in v0.10.2
LexAsciiTable returns the pre-built ASCII fast-path transition table for the main lexer DFA. The table is built once per Language. Entry format:
bit 31 set → skip transition (consume and reset token start)
bits 0-30 → next state ID (lexAsciiNoMatch if no transition)
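The entry format is a plain bit-packing scheme and can be decoded with a mask and a flag test. The sketch below assumes 32-bit entries and does not model lexAsciiNoMatch, whose value is not given here. pack and unpack are hypothetical helpers.

```go
package main

import "fmt"

const (
	skipBit   = uint32(1) << 31 // bit 31: skip transition
	stateMask = skipBit - 1     // bits 0-30: next state ID
)

// pack builds a table entry from a next-state ID and a skip flag.
func pack(next uint32, skip bool) uint32 {
	e := next & stateMask
	if skip {
		e |= skipBit
	}
	return e
}

// unpack splits a table entry into its next-state ID and skip flag.
func unpack(e uint32) (next uint32, skip bool) {
	return e & stateMask, e&skipBit != 0
}

func main() {
	e := pack(42, true)
	next, skip := unpack(e)
	fmt.Printf("%#x %d %v\n", e, next, skip)
}
```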
func (*Language) LexModeStarts ¶ added in v0.19.0
func (l *Language) LexModeStarts() []lexModeStart
func (*Language) PublicSymbol ¶ added in v0.7.0
PublicSymbol maps an internal symbol to its canonical public form. Multiple internal symbols may share the same visible name (e.g. HTML's _start_tag_name and _end_tag_name both display as "tag_name"). PublicSymbol returns the first symbol with that name, matching what SymbolByName returns. This ensures query patterns compiled with SymbolByName match nodes regardless of which alias produced them.
func (*Language) PublicSymbolForNamedness ¶ added in v0.19.0
PublicSymbolForNamedness maps an internal symbol to the canonical public symbol with the same display name and requested namedness. This lets query matching distinguish named nodes from anonymous tokens that share text.
func (*Language) SupertypeChildren ¶ added in v0.6.0
SupertypeChildren returns the subtype symbols for a given supertype. Returns nil if sym is not a supertype or has no entries.
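The lookup can be pictured as a slice into SupertypeMapEntries selected by the (index, length) pair in SupertypeMapSlices. This sketch assumes the slices table is indexed directly by symbol ID and uses a local symbol type. It is not the package's implementation.

```go
package main

import "fmt"

// symbol stands in for Symbol; the underlying width is assumed.
type symbol uint16

// supertypeChildren returns the subtype symbols for sym, or nil when sym has
// no entry. slices maps a symbol to (index, length) within entries.
func supertypeChildren(slices [][2]uint16, entries []symbol, sym symbol) []symbol {
	if int(sym) >= len(slices) {
		return nil
	}
	idx, n := int(slices[sym][0]), int(slices[sym][1])
	if n == 0 || idx+n > len(entries) {
		return nil
	}
	return entries[idx : idx+n]
}

func main() {
	// Symbol 2 is a supertype with children 10 and 11; symbol 3 has child 12.
	slices := [][2]uint16{{0, 0}, {0, 0}, {0, 2}, {2, 1}}
	entries := []symbol{10, 11, 12}
	fmt.Println(supertypeChildren(slices, entries, 2))
	fmt.Println(supertypeChildren(slices, entries, 3))
	fmt.Println(supertypeChildren(slices, entries, 1))
}
```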
func (*Language) SymbolByName ¶
SymbolByName returns the symbol ID for a given name, or (0, false) if not found. The "_" wildcard returns (0, true) as a special case. Builds an internal map on first call for O(1) subsequent lookups.
func (*Language) TokenSymbolsByName ¶
TokenSymbolsByName returns all terminal token symbols whose display name matches name. The returned symbols are in grammar order.
type LanguageMetadata ¶ added in v0.6.0
LanguageMetadata holds the grammar's semantic version (ABI 15+).
type LexMode ¶
type LexMode struct {
LexState uint16
ExternalLexState uint16
ReservedWordSetID uint16
AfterWhitespaceLexState uint16 // DFA start state to use after whitespace (0 = same as LexState)
LexStateID uint32 // widened DFA start state for grammargen tables with >64K lexer states
AfterWhitespaceLexStateID uint32
}
LexMode maps a parser state to its lexer configuration.
func (LexMode) AfterWhitespaceLexStateIndex ¶ added in v0.16.0
AfterWhitespaceLexStateIndex returns the alternate DFA start state used after whitespace, or zero when the primary lex state should be used.
func (LexMode) LexStateIndex ¶ added in v0.16.0
LexStateIndex returns the DFA start state for this lex mode. Older grammar blobs only populate the uint16 LexState field; grammargen-generated tables can populate LexStateID when the DFA table exceeds 64K states.
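A plausible shape for this accessor, shown on a local stand-in type: prefer the widened field when it is populated, otherwise fall back to the 16-bit one. Treating a zero LexStateID as "not populated" is an assumption of this sketch; the package's actual rule is not documented here.

```go
package main

import "fmt"

// lexMode is a local stand-in with the two start-state fields.
type lexMode struct {
	LexState   uint16
	LexStateID uint32
}

// lexStateIndex returns the widened start state when set, else the legacy
// 16-bit one. Zero is assumed to mean "unset" for LexStateID.
func (m lexMode) lexStateIndex() uint32 {
	if m.LexStateID != 0 {
		return m.LexStateID
	}
	return uint32(m.LexState)
}

func main() {
	legacy := lexMode{LexState: 12}
	wide := lexMode{LexState: 12, LexStateID: 70000}
	fmt.Println(legacy.lexStateIndex(), wide.lexStateIndex())
}
```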
func (*LexMode) SetAfterWhitespaceLexStateIndex ¶ added in v0.16.0
func (*LexMode) SetLexStateIndex ¶ added in v0.16.0
type LexState ¶
type LexState struct {
AcceptToken Symbol // 0 if this state doesn't accept
AcceptPriority int16 // lower = higher priority (0 for ts2go blobs = longest-match)
Skip bool // true if accepted chars are whitespace
Default int // default next state (-1 if none)
EOF int // state on EOF (-1 if none)
Transitions []LexTransition
}
LexState is one state in the table-driven lexer DFA.
type LexTransition ¶
type LexTransition struct {
Lo, Hi rune // inclusive character range
NextState int
// Skip mirrors tree-sitter's SKIP(state): consume the matched rune
// and continue lexing while resetting token start.
Skip bool
}
LexTransition maps a character range to a next state.
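To illustrate how LexState and LexTransition tables could drive a lexer, here is a small longest-match walk over locally declared mirrors of those types. It is a sketch under simplifying assumptions: the Skip, EOF, and AcceptPriority fields are ignored, and the helpers step, longestMatch, and numberDFA are hypothetical names, not part of the package.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// Local mirrors of the documented shapes, trimmed to what the sketch needs.
type Symbol uint16

type LexTransition struct {
	Lo, Hi    rune // inclusive character range
	NextState int
}

type LexState struct {
	AcceptToken Symbol // 0 if this state doesn't accept
	Default     int    // default next state (-1 if none)
	Transitions []LexTransition
}

// step returns the next state for r, or the state's Default when no range matches.
func step(states []LexState, cur int, r rune) int {
	for _, tr := range states[cur].Transitions {
		if r >= tr.Lo && r <= tr.Hi {
			return tr.NextState
		}
	}
	return states[cur].Default
}

// longestMatch runs the DFA from state 0 and returns the last accepted
// token together with the number of bytes it covers.
func longestMatch(states []LexState, input string) (Symbol, int) {
	cur, sym, end := 0, Symbol(0), 0
	for i := 0; i < len(input); {
		r, size := utf8.DecodeRuneInString(input[i:])
		next := step(states, cur, r)
		if next < 0 {
			break // no transition: stop and report the last accept
		}
		cur = next
		i += size
		if tok := states[cur].AcceptToken; tok != 0 {
			sym, end = tok, i
		}
	}
	return sym, end
}

// numberDFA accepts one or more ASCII digits as token 1.
func numberDFA() []LexState {
	digits := []LexTransition{{Lo: '0', Hi: '9', NextState: 1}}
	return []LexState{
		{Default: -1, Transitions: digits},
		{AcceptToken: 1, Default: -1, Transitions: digits},
	}
}

func main() {
	fmt.Println(longestMatch(numberDFA(), "123+4")) // 1 3
}
```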
type Lexer ¶
type Lexer struct {
// contains filtered or unexported fields
}
Lexer tokenizes source text using a table-driven DFA.
type LookaheadIterator ¶ added in v0.6.0
type LookaheadIterator struct {
// contains filtered or unexported fields
}
LookaheadIterator iterates over valid symbols for a given parse state. It precomputes the full set of symbols that have valid parse actions in the specified state, enabling autocomplete and error diagnostic use cases.
func NewLookaheadIterator ¶ added in v0.6.0
func NewLookaheadIterator(lang *Language, state StateID) (*LookaheadIterator, error)
NewLookaheadIterator creates an iterator over all symbols that have valid parse actions in the given state. Returns an error if the state is out of range for the language's parse tables.
func (*LookaheadIterator) CurrentSymbol ¶ added in v0.6.0
func (it *LookaheadIterator) CurrentSymbol() Symbol
CurrentSymbol returns the symbol at the current iterator position. Must be called after a successful Next().
func (*LookaheadIterator) CurrentSymbolName ¶ added in v0.6.0
func (it *LookaheadIterator) CurrentSymbolName() string
CurrentSymbolName returns the name of the symbol at the current iterator position. Returns "" if the position is invalid or the symbol has no name.
func (*LookaheadIterator) Language ¶ added in v0.6.0
func (it *LookaheadIterator) Language() *Language
Language returns the language associated with this iterator.
func (*LookaheadIterator) Next ¶ added in v0.6.0
func (it *LookaheadIterator) Next() bool
Next advances the iterator to the next valid symbol. Returns false when there are no more symbols.
func (*LookaheadIterator) ResetState ¶ added in v0.6.0
func (it *LookaheadIterator) ResetState(state StateID) error
ResetState resets the iterator to enumerate valid symbols for a different parse state within the same language. Returns an error if the state is out of range.
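The Next/CurrentSymbolName call pattern lends itself to collecting completion candidates. The sketch below exercises that pattern against a local interface and an in-memory fake, so it runs without a grammar; symbolIterator, sliceIterator, and collectNames are hypothetical names introduced for illustration.

```go
package main

import "fmt"

// symbolIterator captures the two LookaheadIterator methods the loop relies on.
type symbolIterator interface {
	Next() bool
	CurrentSymbolName() string
}

// sliceIterator is a hypothetical in-memory fake used to exercise the loop.
type sliceIterator struct {
	names []string
	pos   int // number of successful Next calls so far
}

func (it *sliceIterator) Next() bool {
	if it.pos >= len(it.names) {
		return false
	}
	it.pos++
	return true
}

func (it *sliceIterator) CurrentSymbolName() string {
	if it.pos == 0 {
		return "" // no successful Next yet, so the position is invalid
	}
	return it.names[it.pos-1]
}

// collectNames drains the iterator, reading the current name only after
// a successful Next, as the documentation requires for CurrentSymbol.
func collectNames(it symbolIterator) []string {
	var out []string
	for it.Next() {
		out = append(out, it.CurrentSymbolName())
	}
	return out
}

func main() {
	it := &sliceIterator{names: []string{"identifier", "number"}}
	fmt.Println(collectNames(it)) // [identifier number]
}
```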
type Node ¶
type Node struct {
// contains filtered or unexported fields
}
Node is a syntax tree node.
func NewLeafNode ¶
func NewLeafNode(sym Symbol, named bool, startByte, endByte uint32, startPoint, endPoint Point) *Node
NewLeafNode creates a terminal/leaf node.
func NewParentNode ¶
func NewParentNode(sym Symbol, named bool, children []*Node, fieldIDs []FieldID, productionID uint16) *Node
NewParentNode creates a non-terminal node with children. It sets parent pointers on all children and computes byte/point spans from the first and last children. If any child has an error, the parent is marked as having an error too.
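The bookkeeping described for NewParentNode (parent pointers, spans taken from the first and last children, error propagation) can be sketched on a simplified local type. The node struct and newParent helper here are hypothetical and omit symbols, fields, and point spans.

```go
package main

import "fmt"

// node is a simplified stand-in for a syntax tree node.
type node struct {
	startByte, endByte uint32
	hasError           bool
	parent             *node
	children           []*node
}

// newParent links children to a new parent, derives the parent's byte span
// from the first and last child, and propagates any child error upward.
func newParent(children []*node) *node {
	p := &node{children: children}
	if len(children) > 0 {
		p.startByte = children[0].startByte
		p.endByte = children[len(children)-1].endByte
	}
	for _, c := range children {
		c.parent = p
		if c.hasError {
			p.hasError = true
		}
	}
	return p
}

func main() {
	ok := &node{startByte: 2, endByte: 5}
	bad := &node{startByte: 6, endByte: 9, hasError: true}
	p := newParent([]*node{ok, bad})
	fmt.Println(p.startByte, p.endByte, p.hasError) // 2 9 true
}
```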
func (*Node) ChildByFieldName ¶
ChildByFieldName returns the first child assigned to the given field name, or nil if no child has that field. The Language is needed to resolve field names to IDs. Uses Language.FieldByName for O(1) lookup.
func (*Node) ChildCount ¶
ChildCount returns the number of children (both named and anonymous).
func (*Node) DescendantForByteRange ¶ added in v0.6.0
DescendantForByteRange returns the smallest descendant that fully contains the given byte range, or nil when no such descendant exists.
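A "smallest descendant that fully contains the range" lookup can be pictured as a descent that keeps stepping into whichever child still covers the range. The sketch below uses a hypothetical span type and is only an illustration of the idea; the package's traversal may differ.

```go
package main

import "fmt"

// span is a simplified stand-in for a node with a byte range.
type span struct {
	start, end uint32
	children   []*span
}

// descendantForByteRange returns the deepest node whose range contains
// [start, end], or nil when even the root does not contain it.
func descendantForByteRange(n *span, start, end uint32) *span {
	if n == nil || start < n.start || end > n.end {
		return nil
	}
	for {
		var next *span
		for _, c := range n.children {
			if c.start <= start && end <= c.end {
				next = c
				break
			}
		}
		if next == nil {
			return n // no child covers the range, so n is the smallest
		}
		n = next
	}
}

func main() {
	leaf := &span{start: 6, end: 8}
	root := &span{start: 0, end: 10, children: []*span{
		{start: 0, end: 4},
		{start: 5, end: 10, children: []*span{leaf}},
	}}
	fmt.Println(descendantForByteRange(root, 6, 7) == leaf)  // true
	fmt.Println(descendantForByteRange(root, 3, 6) == root)  // true
	fmt.Println(descendantForByteRange(root, 0, 11) == nil)  // true
}
```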
func (*Node) DescendantForPointRange ¶ added in v0.6.0
DescendantForPointRange returns the smallest descendant that fully contains the given point range, or nil when no such descendant exists.
func (*Node) Edit ¶ added in v0.7.0
Edit adjusts this node's byte/point span for a source edit.
If the node belongs to a larger tree, the edit is applied from the containing root so sibling and ancestor spans remain consistent. Unlike Tree.Edit, this method does not record edit history on a Tree.
func (*Node) FieldNameForChild ¶ added in v0.6.0
FieldNameForChild returns the field name assigned to the i-th child, or an empty string when no field is assigned.
func (*Node) HasChanges ¶ added in v0.6.0
HasChanges reports whether this node was marked dirty by Tree.Edit.
func (*Node) HasError ¶
HasError reports whether this node or any descendant contains a parse error.
func (*Node) IsExtra ¶ added in v0.6.0
IsExtra reports whether this node was marked as extra syntax (e.g. whitespace/comments outside the core parse structure).
func (*Node) IsNamed ¶
IsNamed reports whether this is a named node (as opposed to anonymous syntax like punctuation).
func (*Node) NamedChild ¶
NamedChild returns the i-th named child (skipping anonymous children), or nil if i is out of range.
func (*Node) NamedChildCount ¶
NamedChildCount returns the number of named children.
func (*Node) NamedDescendantForByteRange ¶ added in v0.6.0
NamedDescendantForByteRange returns the smallest named descendant that fully contains the given byte range, or nil when no such descendant exists.
func (*Node) NamedDescendantForPointRange ¶ added in v0.6.0
NamedDescendantForPointRange returns the smallest named descendant that fully contains the given point range, or nil when no such descendant exists.
func (*Node) NextSibling ¶
NextSibling returns the next sibling node, or nil when this is the last child or has no parent.
func (*Node) ParseState ¶
ParseState returns the parser state associated with this node.
func (*Node) PreGotoState ¶ added in v0.6.0
PreGotoState returns the parser state that was on top of the stack before this node was pushed (i.e., the state exposed after popping children during reduce). For non-leaf nodes: lookupGoto(PreGotoState, Symbol) == ParseState.
func (*Node) PrevSibling ¶
PrevSibling returns the previous sibling node, or nil when this is the first child or has no parent.
func (*Node) SExpr ¶ added in v0.6.0
SExpr returns a tree-sitter-style S-expression for this node. It includes only named nodes for stable debug snapshots.
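As an illustration of an S-expression that keeps only named nodes, the following sketch prints a small hand-built tree. The syntaxNode type and sexpr helper are hypothetical, and the exact spacing of the package's output is not confirmed by this documentation.

```go
package main

import (
	"fmt"
	"strings"
)

// syntaxNode is a simplified stand-in: kind is the node type name and
// named distinguishes grammar rules from anonymous tokens like "+".
type syntaxNode struct {
	kind     string
	named    bool
	children []*syntaxNode
}

// sexpr renders n as "(kind child child ...)", skipping anonymous nodes.
func sexpr(n *syntaxNode) string {
	var sb strings.Builder
	writeNamed(&sb, n)
	return sb.String()
}

func writeNamed(sb *strings.Builder, n *syntaxNode) {
	if n == nil || !n.named {
		return
	}
	sb.WriteByte('(')
	sb.WriteString(n.kind)
	for _, c := range n.children {
		if c.named {
			sb.WriteByte(' ')
			writeNamed(sb, c)
		}
	}
	sb.WriteByte(')')
}

func main() {
	tree := &syntaxNode{kind: "binary_expression", named: true, children: []*syntaxNode{
		{kind: "number", named: true},
		{kind: "+"},
		{kind: "number", named: true},
	}}
	fmt.Println(sexpr(tree)) // (binary_expression (number) (number))
}
```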
func (*Node) StartPoint ¶
StartPoint returns the row/column position where this node begins.
type NormalizationPassRuntime ¶ added in v0.20.0
type ParseAction ¶
type ParseAction struct {
Type ParseActionType
State StateID // target state (shift/recover)
Symbol Symbol // reduced symbol (reduce)
ChildCount uint8 // children consumed (reduce)
DynamicPrecedence int16 // precedence (reduce)
ProductionID uint16 // which production (reduce)
Extra bool // is this an extra token (shift)
ExtraChain bool // does this shift enter a nonterminal extra chain
Repetition bool // is this a repetition (shift)
}
ParseAction is a single parser action from the parse table.
type ParseActionEntry ¶
type ParseActionEntry struct {
Reusable bool
Actions []ParseAction
}
ParseActionEntry is a group of actions for a (state, symbol) pair.
type ParseActionTiming ¶ added in v0.19.0
type ParseActionTiming struct {
ExtraShiftNanos int64
NoActionNanos int64
NoActionRelexNanos int64
NoActionMissingNanos int64
NoActionRecoverNanos int64
NoActionErrorNanos int64
ConflictChoiceNanos int64
ConflictForkNanos int64
SingleShiftNanos int64
SingleReduceNanos int64
SingleAcceptNanos int64
SingleRecoverNanos int64
SingleOtherNanos int64
}
type ParseActionType ¶
type ParseActionType uint8
ParseActionType identifies the kind of parse action.
const (
ParseActionShift ParseActionType = iota
ParseActionReduce
ParseActionAccept
ParseActionRecover
)
type ParseEquivStateRuntime ¶ added in v0.19.0
type ParseEquivStateRuntime struct {
State StateID
StackEquivCalls uint64
StackEquivTrue uint64
StackEquivDepthMismatch uint64
StackEquivHashMismatch uint64
StackEquivStateMismatch uint64
StackEquivPayloadMismatch uint64
StackEquivEntryCompares uint64
StackEquivStateMismatchDepthSum uint64
StackEquivStateMismatchMaxDepth uint32
StackEquivStateMismatchDepthBuckets [stackEquivMismatchDepthBucketCount]uint64
StackEquivPayloadMismatchDepthSum uint64
StackEquivPayloadMismatchMaxDepth uint32
StackEquivPayloadMismatchDepthBuckets [stackEquivMismatchDepthBucketCount]uint64
StackEquivPayloadHeaderSigDiff uint64
StackEquivPayloadHeaderSigSame uint64
StackEquivPayloadShallowSigDiff uint64
StackEquivPayloadShallowSigSame uint64
StackEquivPairKeyed uint64
StackEquivPairUnkeyed uint64
StackEquivPairRepeats uint64
StackEquivPairRepeatTrue uint64
StackEquivPairRepeatFalse uint64
StackEquivPairRepeatMismatch uint64
StackEquivPairStores uint64
EquivCacheLookups uint64
EquivCacheHits uint64
EquivCacheStores uint64
EquivCacheMisses uint64
EquivCacheTrueHits uint64
EquivCacheFalseHits uint64
EquivCacheEpochMisses uint64
EquivCacheKeyMisses uint64
EquivCacheVersionMisses uint64
EquivSkipError uint64
EquivSkipLeaf uint64
EquivSkipFieldMismatch uint64
EquivExactCalls uint64
EquivExactTrue uint64
EquivExactPointerTrue uint64
EquivExactNilMismatch uint64
EquivExactHeaderMismatch uint64
EquivExactChildMismatch uint64
EquivExactTerminalCalls uint64
EquivExactTerminalTrue uint64
EquivExactTerminalFalse uint64
EquivFrontierCalls uint64
EquivFrontierTrue uint64
EquivExactChildCompares uint64
EquivFrontierChildScans uint64
EquivFrontierCandidateCompares uint64
}
type ParseOption ¶ added in v0.6.0
type ParseOption func(*parseConfig)
ParseOption configures ParseWith behavior.
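ParseOption follows Go's functional-options pattern: each option is a function that mutates a private config before the parse runs. The sketch below mirrors that shape with a local parseConfig whose fields are invented for illustration; the real parseConfig is unexported and its contents are not documented here.

```go
package main

import "fmt"

// parseConfig is a hypothetical stand-in; the field names are assumptions.
type parseConfig struct {
	profiling bool
	label     string
}

// ParseOption mirrors the documented shape: a function over the config.
type ParseOption func(*parseConfig)

// withProfiling and withLabel are illustrative options, not package API.
func withProfiling() ParseOption {
	return func(c *parseConfig) { c.profiling = true }
}

func withLabel(label string) ParseOption {
	return func(c *parseConfig) { c.label = label }
}

// applyOptions starts from the zero config and applies options in order,
// so a later option overrides an earlier one that sets the same field.
func applyOptions(opts ...ParseOption) parseConfig {
	var cfg parseConfig
	for _, opt := range opts {
		if opt != nil {
			opt(&cfg)
		}
	}
	return cfg
}

func main() {
	cfg := applyOptions(withLabel("a"), withProfiling(), withLabel("b"))
	fmt.Println(cfg.profiling, cfg.label) // true b
}
```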
func WithOldTree ¶ added in v0.6.0
func WithOldTree(oldTree *Tree) ParseOption
WithOldTree enables incremental parsing against an edited prior tree.
func WithProfiling ¶ added in v0.6.0
func WithProfiling() ParseOption
WithProfiling enables incremental parse attribution in ParseResult.Profile.
func WithTokenSource ¶ added in v0.6.0
func WithTokenSource(ts TokenSource) ParseOption
WithTokenSource provides a custom token source for parsing.
type ParseReduceTiming ¶ added in v0.19.0
type ParseResult ¶ added in v0.6.0
type ParseResult struct {
Tree *Tree
// Profile is populated only when ParseWith uses WithProfiling for
// incremental parsing.
Profile IncrementalParseProfile
// ProfileAvailable reports whether Profile contains attribution data.
ProfileAvailable bool
}
ParseResult is returned by ParseWith.
type ParseRuntime ¶ added in v0.6.0
type ParseRuntime struct {
StopReason ParseStopReason
SourceLen uint32
ExpectedEOFByte uint32
RootEndByte uint32
Truncated bool
TokenSourceEOFEarly bool
TokensConsumed uint64
LastTokenEndByte uint32
LastTokenSymbol Symbol
LastTokenWasEOF bool
IterationLimit int
StackDepthLimit int
NodeLimit int
MemoryBudgetBytes int64
Iterations int
NodesAllocated int
ArenaBytesAllocated int64
ScratchBytesAllocated int64
EntryScratchBytesAllocated int64
GSSBytesAllocated int64
PeakStackDepth int
MaxStacksSeen int
SingleStackIterations int
MultiStackIterations int
SingleStackTokens uint64
MultiStackTokens uint64
SingleStackGSSNodes uint64
MultiStackGSSNodes uint64
GSSNodesAllocated uint64
GSSNodesRetained uint64
GSSNodesDroppedSameToken uint64
ParentNodesAllocated uint64
ParentNodesRetained uint64
ParentNodesDroppedSameToken uint64
LeafNodesAllocated uint64
LeafNodesRetained uint64
LeafNodesDroppedSameToken uint64
ChildSlicesAllocated uint64
ChildSlicesRetained uint64
ChildSlicesDroppedSameToken uint64
ChildPointersAllocated uint64
ChildPointersRetained uint64
ChildPointersDroppedSameToken uint64
ReduceChildFastGSS ReduceChildPathRuntime
ReduceChildAllVisible ReduceChildPathRuntime
ReduceChildNoAlias ReduceChildPathRuntime
ReduceChildScratchGeneral ReduceChildPathRuntime
ReduceChildScratchNoAlias ReduceChildPathRuntime
TransientChildSlicesAllocated uint64
TransientChildPointersAllocated uint64
TransientChildSlicesMaterialized uint64
TransientChildPointersMaterialized uint64
TransientParentNodesAllocated uint64
TransientParentNodesMaterialized uint64
FinalNodes uint64
FinalParentNodes uint64
FinalLeafNodes uint64
FinalFieldedParentNodes uint64
FinalUnfieldedParentNodes uint64
FinalVisibleParentNodes uint64
FinalHiddenParentNodes uint64
FinalCheckpointLeafNodes uint64
FinalChildSlices uint64
FinalChildPointers uint64
FinalFieldIDElements uint64
FinalFieldSourceElements uint64
FinalChildRefParents uint64
FinalChildRefs uint64
FinalChildRefMaterializedParents uint64
FinalChildRefMaterializedChildren uint64
FinalChildRefSingleChildAccesses uint64
FinalChildRefSingleChildMaterializedChildren uint64
MergeStacksIn uint64
MergeStacksOut uint64
MergeSlotsUsed uint64
GlobalCullStacksIn uint64
GlobalCullStacksOut uint64
StackEquivCalls uint64
StackEquivTrue uint64
StackEquivDepthMismatch uint64
StackEquivHashMismatch uint64
StackEquivStateMismatch uint64
StackEquivPayloadMismatch uint64
StackEquivEntryCompares uint64
StackEquivStateMismatchDepthSum uint64
StackEquivStateMismatchMaxDepth uint32
StackEquivStateMismatchDepthBuckets [stackEquivMismatchDepthBucketCount]uint64
StackEquivPayloadMismatchDepthSum uint64
StackEquivPayloadMismatchMaxDepth uint32
StackEquivPayloadMismatchDepthBuckets [stackEquivMismatchDepthBucketCount]uint64
StackEquivPayloadHeaderSigDiff uint64
StackEquivPayloadHeaderSigSame uint64
StackEquivPayloadShallowSigDiff uint64
StackEquivPayloadShallowSigSame uint64
StackEquivPairKeyed uint64
StackEquivPairUnkeyed uint64
StackEquivPairRepeats uint64
StackEquivPairRepeatTrue uint64
StackEquivPairRepeatFalse uint64
StackEquivPairRepeatMismatch uint64
StackEquivPairStores uint64
MergeHeaderEqTotal uint64
MergeDeepTrue uint64
MergeDeepFalse uint64
MergeHeaderDeepDivergent uint64
EquivCacheLookups uint64
EquivCacheHits uint64
EquivCacheStores uint64
EquivCacheMisses uint64
EquivCacheTrueHits uint64
EquivCacheFalseHits uint64
EquivCacheEpochMisses uint64
EquivCacheKeyMisses uint64
EquivCacheVersionMisses uint64
EquivSkipError uint64
EquivSkipLeaf uint64
EquivSkipFieldMismatch uint64
EquivExactCalls uint64
EquivExactTrue uint64
EquivExactPointerTrue uint64
EquivExactNilMismatch uint64
EquivExactHeaderMismatch uint64
EquivExactChildMismatch uint64
EquivExactTerminalCalls uint64
EquivExactTerminalTrue uint64
EquivExactTerminalFalse uint64
EquivFrontierCalls uint64
EquivFrontierTrue uint64
EquivExactChildCompares uint64
EquivFrontierChildScans uint64
EquivFrontierCandidateCompares uint64
EquivStateStats []ParseEquivStateRuntime
ParseWallNanos int64
ParserLoopNanos int64
TokenNextNanos int64
ActionDispatchNanos int64
ActionLookupNanos int64
GLRMergeNanos int64
GLRCullNanos int64
ReduceTiming *ParseReduceTiming
ActionTiming *ParseActionTiming
ExternalScannerCheckpointRecords uint64
ExternalScannerCheckpointSlotsAllocated uint64
ExternalScannerCheckpointBytesAllocated int64
ExternalScannerSnapshotBytesAllocated uint64
ExternalScannerCheckpointLeafNodes uint64
CompactFullLeafCreated uint64
CompactFullLeafMaterialized uint64
CompactFullLeafMaterializedForParentReduce uint64
CompactFullLeafMaterializedForParentReject PendingParentRejectStats
CompactFullLeafMaterializedForFinalTree uint64
CompactFullLeafMaterializedForNormalization uint64
CompactFullLeafMaterializedForRecovery uint64
CompactFullLeafMaterializedForQuery uint64
CompactFullLeafMaterializedForCursor uint64
CompactFullLeafMaterializedForParentAPI uint64
CompactFullLeafMaterializedForEdit uint64
CompactFullLeafMaterializedForCheckpointRebuild uint64
CompactFullLeafDropped uint64
CompactFullLeafMaterializedForFieldRejectPayload PendingParentFieldRejectPayloadStats
PendingParentCreated uint64
PendingParentMaterialized uint64
PendingParentMaterializedForParentReduce uint64
PendingParentMaterializedForParentReject PendingParentRejectStats
PendingParentMaterializedForFieldReject PendingParentFieldRejectStats
PendingParentMaterializedForFieldRejectPayload PendingParentFieldRejectPayloadStats
PendingParentMaterializedForFinalTree uint64
PendingParentMaterializedForNormalization uint64
PendingParentMaterializedForRecovery uint64
PendingParentMaterializedForQuery uint64
PendingParentMaterializedForCursor uint64
PendingParentMaterializedForParentAPI uint64
PendingParentMaterializedForEdit uint64
PendingParentMaterializedForCheckpointRebuild uint64
PendingParentDropped uint64
PendingParentsFlattened uint64
PendingChildRefsFlattened uint64
PendingChildEntriesAllocated uint64
PendingChildEntryCapacity uint64
PendingChildEntryWaste uint64
PendingParentCandidates uint64
PendingParentRejectedEmpty uint64
PendingParentRejectedChildLimit uint64
PendingParentRejectedAlias uint64
PendingParentRejectedRawSpan uint64
PendingParentRejectedFields uint64
PendingParentRejectedFieldsParentHidden uint64
PendingParentRejectedFieldsNoIDs uint64
PendingParentRejectedFieldsInherited uint64
PendingParentRejectedFieldsHiddenChild uint64
PendingParentRejectedFieldsHiddenChildPlain uint64
PendingParentRejectedFieldsHiddenChildPlainEmpty uint64
PendingParentRejectedFieldsHiddenChildPlainOne uint64
PendingParentRejectedFieldsHiddenChildPlainMany uint64
PendingParentRejectedFieldsHiddenChildWithFields uint64
PendingParentRejectedFieldsChild uint64
PendingParentRejectedFieldsAllVisibleDirect uint64
PendingParentRejectedChild uint64
PendingParentRejectedSpan uint64
PendingParentRejectedFill uint64
PreMaterializationFieldRejectCandidates uint64
PreMaterializationFieldRejectSameKeyCandidates uint64
PreMaterializationFieldRejectOverflowCandidates uint64
CheckpointLeafFullNodesAvoided uint64
LeafNodesConstructed uint64
ParentNodesConstructed uint64
NoTreeReduceNodesConstructed uint64
NoTreeLeafNodesConstructed uint64
ResultSelectionNanos int64
TransientParentMaterializationNanos int64
ResultTreeBuildNanos int64
TransientChildMaterializationNanos int64
ResultPythonKeywordRepairNanos int64
ResultPythonRootRepairNanos int64
ResultFinalizeRootNanos int64
ResultExtendTrailingNanos int64
ResultNormalizeRootStartNanos int64
ResultCompatibilityNanos int64
ResultParentLinkNanos int64
NormalizationPassesChecked uint64
NormalizationPassesRun uint64
NormalizationNodesVisited uint64
NormalizationNodesRewritten uint64
NormalizationNanos int64
NormalizationPasses *[]NormalizationPassRuntime
}
ParseRuntime captures parser-loop diagnostics for a completed tree.
func (ParseRuntime) Summary ¶ added in v0.6.0
func (rt ParseRuntime) Summary() string
Summary returns a stable one-line diagnostic string for parse-runtime stats.
type ParseStopReason ¶ added in v0.6.0
type ParseStopReason string
ParseStopReason reports why parseInternal terminated.
const (
ParseStopNone ParseStopReason = "none"
ParseStopAccepted ParseStopReason = "accepted"
ParseStopNoStacksAlive ParseStopReason = "no_stacks_alive"
ParseStopTokenSourceEOF ParseStopReason = "token_source_eof"
ParseStopTimeout ParseStopReason = "timeout"
ParseStopCancelled ParseStopReason = "cancelled"
ParseStopIterationLimit ParseStopReason = "iteration_limit"
ParseStopStackDepthLimit ParseStopReason = "stack_depth_limit"
ParseStopNodeLimit ParseStopReason = "node_limit"
ParseStopMemoryBudget ParseStopReason = "memory_budget"
)
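Callers inspecting ParseRuntime.StopReason may want to separate resource-limit stops from other outcomes. The sketch below redeclares the documented constants locally so it runs on its own; the grouping in hitResourceLimit is this example's assumption, not a classification the package defines.

```go
package main

import "fmt"

// Local copies of the documented string constants.
type ParseStopReason string

const (
	ParseStopNone            ParseStopReason = "none"
	ParseStopAccepted        ParseStopReason = "accepted"
	ParseStopNoStacksAlive   ParseStopReason = "no_stacks_alive"
	ParseStopTokenSourceEOF  ParseStopReason = "token_source_eof"
	ParseStopTimeout         ParseStopReason = "timeout"
	ParseStopCancelled       ParseStopReason = "cancelled"
	ParseStopIterationLimit  ParseStopReason = "iteration_limit"
	ParseStopStackDepthLimit ParseStopReason = "stack_depth_limit"
	ParseStopNodeLimit       ParseStopReason = "node_limit"
	ParseStopMemoryBudget    ParseStopReason = "memory_budget"
)

// hitResourceLimit reports whether the parse stopped because a configured
// budget ran out. Which reasons count as "budget" is an assumption here.
func hitResourceLimit(r ParseStopReason) bool {
	switch r {
	case ParseStopTimeout, ParseStopIterationLimit, ParseStopStackDepthLimit,
		ParseStopNodeLimit, ParseStopMemoryBudget:
		return true
	}
	return false
}

func main() {
	fmt.Println(hitResourceLimit(ParseStopNodeLimit)) // true
	fmt.Println(hitResourceLimit(ParseStopAccepted))  // false
}
```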
type Parser ¶
type Parser struct {
// contains filtered or unexported fields
}
Parser reads parse tables from a Language and produces a syntax tree. It supports GLR parsing: when a (state, symbol) pair maps to multiple actions, the parser forks the stack and explores all alternatives in parallel while preserving distinct parse paths. Duplicate stack versions are collapsed and ambiguities are resolved at selection time.
Parser is not safe for concurrent use. Use one parser per goroutine, a ParserPool, or guard shared parser instances with external synchronization.
func (*Parser) CancellationFlag ¶ added in v0.7.0
CancellationFlag returns the parser's current cancellation flag pointer.
func (*Parser) IncludedRanges ¶ added in v0.6.0
IncludedRanges returns a copy of the configured include ranges.
func (*Parser) InferredRootSymbol ¶ added in v0.9.0
InferredRootSymbol returns the root symbol inferred during parser construction, and whether inference succeeded.
func (*Parser) Logger ¶ added in v0.7.0
func (p *Parser) Logger() ParserLogger
Logger returns the currently configured parser debug logger.
func (*Parser) Parse ¶
Parse tokenizes and parses source using the built-in DFA lexer, returning a syntax tree. This works for hand-built grammars that provide LexStates. For real grammars that need a custom lexer, use ParseWithTokenSource. If the input is empty, it returns a tree with a nil root and no error.
func (*Parser) ParseForestExperimental ¶ added in v0.20.0
ParseForestExperimental parses source with the experimental GSS-forest GLR path and returns a releasable tree, or (nil, false) if the parse dies; the forest path has no error recovery yet. It is exported so that out-of-tree benchmarks and validation in packages that attach external scanners (e.g. grammars) can drive it. It is not part of the stable API.
func (*Parser) ParseIncremental ¶
ParseIncremental re-parses source after edits were applied to oldTree. It reuses unchanged subtrees from the old tree for better performance. Call oldTree.Edit() for each edit before calling this method.
func (*Parser) ParseIncrementalProfiled ¶ added in v0.6.0
func (p *Parser) ParseIncrementalProfiled(source []byte, oldTree *Tree) (*Tree, IncrementalParseProfile, error)
ParseIncrementalProfiled is like ParseIncremental and also returns runtime attribution for incremental reuse work vs parse/rebuild work.
func (*Parser) ParseIncrementalUTF16 ¶ added in v0.16.0
ParseIncrementalUTF16 re-parses UTF-16 source after edits were applied to oldTree. oldTree should have been produced by ParseUTF16, and UTF-16 edits can be recorded with Tree.EditUTF16.
func (*Parser) ParseIncrementalUTF16Bytes ¶ added in v0.16.0
func (p *Parser) ParseIncrementalUTF16Bytes(source []byte, oldTree *Tree, order UTF16ByteOrder) (*Tree, error)
ParseIncrementalUTF16Bytes re-parses UTF-16 bytes after edits were applied to oldTree.
func (*Parser) ParseIncrementalUTF16BytesWithTokenSourceFactory ¶ added in v0.16.0
func (p *Parser) ParseIncrementalUTF16BytesWithTokenSourceFactory(source []byte, oldTree *Tree, order UTF16ByteOrder, factory TokenSourceFactory) (*Tree, error)
ParseIncrementalUTF16BytesWithTokenSourceFactory re-parses UTF-16 bytes using a token source built from the parser's canonical UTF-8 source view.
func (*Parser) ParseIncrementalUTF16WithTokenSourceFactory ¶ added in v0.16.0
func (p *Parser) ParseIncrementalUTF16WithTokenSourceFactory(source []uint16, oldTree *Tree, factory TokenSourceFactory) (*Tree, error)
ParseIncrementalUTF16WithTokenSourceFactory re-parses UTF-16 source using a token source built from the parser's canonical UTF-8 source view.
func (*Parser) ParseIncrementalWithTokenSource ¶
func (p *Parser) ParseIncrementalWithTokenSource(source []byte, oldTree *Tree, ts TokenSource) (*Tree, error)
ParseIncrementalWithTokenSource is like ParseIncremental but uses a custom token source.
func (*Parser) ParseIncrementalWithTokenSourceFactory ¶ added in v0.16.0
func (p *Parser) ParseIncrementalWithTokenSourceFactory(source []byte, oldTree *Tree, factory TokenSourceFactory) (*Tree, error)
ParseIncrementalWithTokenSourceFactory is like ParseWithTokenSourceFactory, but re-parses against an edited old tree.
func (*Parser) ParseIncrementalWithTokenSourceProfiled ¶ added in v0.6.0
func (p *Parser) ParseIncrementalWithTokenSourceProfiled(source []byte, oldTree *Tree, ts TokenSource) (*Tree, IncrementalParseProfile, error)
ParseIncrementalWithTokenSourceProfiled is like ParseIncrementalWithTokenSource and also returns runtime attribution for incremental reuse work vs parse/rebuild work.
func (*Parser) ParseNoResultCompatibilityBenchmarkOnly ¶ added in v0.18.0
ParseNoResultCompatibilityBenchmarkOnly parses source while suppressing language-specific result compatibility rewrites. It is intended only for performance attribution; the returned tree is not API-compatible.
func (*Parser) ParseNoTreeBenchmarkOnly ¶ added in v0.17.0
ParseNoTreeBenchmarkOnly parses source while suppressing parent/child tree materialization in reduce actions. It is intended only for parser-loop performance experiments; the returned tree is not API-compatible.
func (*Parser) ParseNoTreeWithExternalCheckpointsBenchmarkOnly ¶ added in v0.18.0
ParseNoTreeWithExternalCheckpointsBenchmarkOnly parses source while suppressing parent/child tree materialization in reduce actions but keeping external-scanner checkpoint capture enabled. It is intended only for parser performance attribution; the returned tree is not API-compatible.
func (*Parser) ParseUTF16 ¶ added in v0.16.0
ParseUTF16 parses UTF-16 source represented as Go UTF-16 code units.
The parser core uses a canonical UTF-8 view internally so existing byte-based APIs remain unchanged. The returned tree retains the original UTF-16 source and can convert node ranges back to UTF-16 code-unit coordinates.
func (*Parser) ParseUTF16Bytes ¶ added in v0.16.0
func (p *Parser) ParseUTF16Bytes(source []byte, order UTF16ByteOrder) (*Tree, error)
ParseUTF16Bytes parses UTF-16 source encoded as bytes with an explicit byte order.
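The path from endian-specific UTF-16 bytes to a UTF-8 view can be reproduced with the standard library alone. The sketch below is not the package's DecodeUTF16Bytes: it takes a plain bool instead of UTF16ByteOrder, and the helper names decodeUTF16Bytes and utf8View are hypothetical.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"unicode/utf16"
)

// decodeUTF16Bytes turns raw bytes into UTF-16 code units using the
// requested byte order. An odd byte count cannot form whole code units.
func decodeUTF16Bytes(src []byte, bigEndian bool) ([]uint16, error) {
	if len(src)%2 != 0 {
		return nil, errors.New("utf16: odd number of bytes")
	}
	var order binary.ByteOrder = binary.LittleEndian
	if bigEndian {
		order = binary.BigEndian
	}
	units := make([]uint16, len(src)/2)
	for i := range units {
		units[i] = order.Uint16(src[2*i:])
	}
	return units, nil
}

// utf8View decodes code units (combining surrogate pairs) and re-encodes
// the resulting runes as UTF-8 bytes.
func utf8View(units []uint16) []byte {
	return []byte(string(utf16.Decode(units)))
}

func main() {
	units, err := decodeUTF16Bytes([]byte{0x68, 0x00, 0x69, 0x00}, false)
	fmt.Println(string(utf8View(units)), err) // hi <nil>
}
```

A surrogate pair occupies two UTF-16 code units but four UTF-8 bytes, which is why node ranges need converting between the two coordinate systems.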
func (*Parser) ParseUTF16BytesWithTokenSourceFactory ¶ added in v0.16.0
func (p *Parser) ParseUTF16BytesWithTokenSourceFactory(source []byte, order UTF16ByteOrder, factory TokenSourceFactory) (*Tree, error)
ParseUTF16BytesWithTokenSourceFactory parses UTF-16 bytes using a token source built from the parser's canonical UTF-8 source view.
func (*Parser) ParseUTF16WithTokenSourceFactory ¶ added in v0.16.0
func (p *Parser) ParseUTF16WithTokenSourceFactory(source []uint16, factory TokenSourceFactory) (*Tree, error)
ParseUTF16WithTokenSourceFactory parses UTF-16 source using a token source built from the parser's canonical UTF-8 source view.
func (*Parser) ParseWith ¶ added in v0.6.0
func (p *Parser) ParseWith(source []byte, opts ...ParseOption) (ParseResult, error)
ParseWith parses source using option-based configuration.
func (*Parser) ParseWithTokenSource ¶
func (p *Parser) ParseWithTokenSource(source []byte, ts TokenSource) (*Tree, error)
ParseWithTokenSource parses source using a custom token source. This is used for real grammars where the lexer DFA isn't available as data tables (e.g., Go grammar using go/scanner as a bridge).
func (*Parser) ParseWithTokenSourceFactory ¶ added in v0.16.0
func (p *Parser) ParseWithTokenSourceFactory(source []byte, factory TokenSourceFactory) (*Tree, error)
ParseWithTokenSourceFactory parses source using a freshly built custom token source. The factory is also retained for recovery reparses.
func (*Parser) SetAmbiguityProfile ¶ added in v0.17.0
func (p *Parser) SetAmbiguityProfile(profile *AmbiguityProfile)
SetAmbiguityProfile installs an optional diagnostic ambiguity profile. The profile receives parser state/lookahead/action counters for GLR-heavy benchmark runs. Pass nil to disable profiling.
func (*Parser) SetCancellationFlag ¶ added in v0.7.0
SetCancellationFlag configures a caller-owned cancellation flag. Parsing stops when the pointed value becomes non-zero.
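A caller-owned flag that stops work once it becomes non-zero is a cooperative-cancellation pattern. The sketch below shows that pattern on a stand-in loop; the *uint32 flag type is an assumption (the method's signature is not shown here), and run and cancel are hypothetical helpers.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// cancel sets the caller-owned flag to a non-zero value.
func cancel(flag *uint32) {
	atomic.StoreUint32(flag, 1)
}

// run performs up to steps units of work, checking the flag before each
// one. It returns how many steps completed and whether it was cancelled.
func run(flag *uint32, steps int, onStep func(i int)) (int, bool) {
	for i := 0; i < steps; i++ {
		if atomic.LoadUint32(flag) != 0 {
			return i, true
		}
		onStep(i)
	}
	return steps, false
}

func main() {
	var flag uint32
	// Request cancellation during step 2; the loop notices before step 3.
	n, stopped := run(&flag, 5, func(i int) {
		if i == 2 {
			cancel(&flag)
		}
	})
	fmt.Println(n, stopped) // 3 true
}
```

In real use the flag would typically be set from another goroutine, which is why the sketch reads and writes it atomically.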
func (*Parser) SetGLRTrace ¶ added in v0.7.0
SetGLRTrace enables verbose GLR stack tracing to stdout (debug only).
func (*Parser) SetIncludedRanges ¶ added in v0.6.0
SetIncludedRanges configures parser include ranges. Tokens outside these ranges are skipped.
func (*Parser) SetIncludedUTF16ByteRanges ¶ added in v0.16.0
func (p *Parser) SetIncludedUTF16ByteRanges(source []byte, order UTF16ByteOrder, ranges []UTF16Range) error
SetIncludedUTF16ByteRanges configures parser include ranges from endian-specific UTF-16 bytes.
func (*Parser) SetIncludedUTF16Ranges ¶ added in v0.16.0
func (p *Parser) SetIncludedUTF16Ranges(source []uint16, ranges []UTF16Range) bool
SetIncludedUTF16Ranges configures parser include ranges from UTF-16 code-unit ranges. Internal parser points are derived from source as UTF-8 columns.
func (*Parser) SetLogger ¶ added in v0.7.0
func (p *Parser) SetLogger(logger ParserLogger)
SetLogger installs a parser debug logger. Pass nil to disable logging.
func (*Parser) SetTimeoutMicros ¶ added in v0.7.0
SetTimeoutMicros configures a per-parse timeout in microseconds. A value of 0 disables timeout checks.
func (*Parser) TimeoutMicros ¶ added in v0.7.0
TimeoutMicros returns the parser timeout in microseconds.
type ParserLogType ¶ added in v0.7.0
type ParserLogType uint8
ParserLogType categorizes parser log messages.
const (
// ParserLogParse emits parser-loop lifecycle and control-flow logs.
ParserLogParse ParserLogType = iota
// ParserLogLex emits token-source and token-consumption logs.
ParserLogLex
)
type ParserLogger ¶ added in v0.7.0
type ParserLogger func(kind ParserLogType, message string)
ParserLogger receives parser debug logs when configured via SetLogger.
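Because ParserLogger is a plain function type, a closure is enough to route messages by kind. The sketch below redeclares the documented types locally so it runs standalone; prefixLogger is a hypothetical helper that collects prefixed lines into a slice.

```go
package main

import "fmt"

// Local copies of the documented declarations.
type ParserLogType uint8

const (
	ParserLogParse ParserLogType = iota
	ParserLogLex
)

type ParserLogger func(kind ParserLogType, message string)

// prefixLogger returns a logger that tags each message with its kind and
// appends it to out. A value like this could be handed to SetLogger.
func prefixLogger(out *[]string) ParserLogger {
	return func(kind ParserLogType, message string) {
		prefix := "parse"
		if kind == ParserLogLex {
			prefix = "lex"
		}
		*out = append(*out, prefix+": "+message)
	}
}

func main() {
	var lines []string
	logger := prefixLogger(&lines)
	logger(ParserLogLex, "consumed identifier")
	logger(ParserLogParse, "shift")
	fmt.Println(lines) // [lex: consumed identifier parse: shift]
}
```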
type ParserPool ¶ added in v0.7.0
type ParserPool struct {
// contains filtered or unexported fields
}
ParserPool provides concurrency-safe parsing by reusing Parser instances.
ParserPool is safe for concurrent use. Each call checks out one parser from an internal sync.Pool, applies configured defaults, runs the parse, and returns the parser to the pool.
Mutable parser state (logger, timeout, cancellation flag, included ranges, GLR trace) is reset on checkout so request-local state cannot bleed across callers.
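The checkout, reset, parse, return cycle can be sketched with sync.Pool directly. This is a minimal illustration with a stand-in parser type carrying a single mutable setting; the names parserPool, newParserPool, and with are hypothetical, and the package's internals may differ.

```go
package main

import (
	"fmt"
	"sync"
)

// parser is a stand-in with one piece of mutable, request-local state.
type parser struct{ timeoutMicros uint64 }

type parserPool struct {
	pool           sync.Pool
	defaultTimeout uint64
}

func newParserPool(defaultTimeout uint64) *parserPool {
	pp := &parserPool{defaultTimeout: defaultTimeout}
	pp.pool.New = func() any { return &parser{} }
	return pp
}

// with checks out a parser, resets its mutable state to the pool default,
// runs fn, and returns the parser to the pool.
func (pp *parserPool) with(fn func(p *parser)) {
	p := pp.pool.Get().(*parser)
	defer pp.pool.Put(p)
	p.timeoutMicros = pp.defaultTimeout // reset on checkout
	fn(p)
}

func main() {
	pp := newParserPool(500)
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// A request-local override that must not leak to later callers.
			pp.with(func(p *parser) { p.timeoutMicros = 1 })
		}()
	}
	wg.Wait()
	pp.with(func(p *parser) { fmt.Println(p.timeoutMicros) }) // 500
}
```

Resetting on checkout rather than on return means a parser is clean even if an earlier caller panicked before cleanup could run.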
func NewParserPool ¶ added in v0.7.0
func NewParserPool(lang *Language, opts ...ParserPoolOption) *ParserPool
NewParserPool creates a concurrency-safe parser pool for lang.
func (*ParserPool) Language ¶ added in v0.7.0
func (pp *ParserPool) Language() *Language
Language returns the pool's configured language.
func (*ParserPool) Parse ¶ added in v0.7.0
func (pp *ParserPool) Parse(source []byte) (*Tree, error)
Parse delegates to a pooled Parser.Parse call.
func (*ParserPool) ParseIncrementalUTF16 ¶ added in v0.16.0
func (pp *ParserPool) ParseIncrementalUTF16(source []uint16, oldTree *Tree) (*Tree, error)
ParseIncrementalUTF16 delegates to a pooled Parser.ParseIncrementalUTF16 call.
func (*ParserPool) ParseIncrementalUTF16Bytes ¶ added in v0.16.0
func (pp *ParserPool) ParseIncrementalUTF16Bytes(source []byte, oldTree *Tree, order UTF16ByteOrder) (*Tree, error)
ParseIncrementalUTF16Bytes delegates to a pooled Parser.ParseIncrementalUTF16Bytes call.
func (*ParserPool) ParseIncrementalUTF16BytesWithTokenSourceFactory ¶ added in v0.16.0
func (pp *ParserPool) ParseIncrementalUTF16BytesWithTokenSourceFactory(source []byte, oldTree *Tree, order UTF16ByteOrder, factory TokenSourceFactory) (*Tree, error)
ParseIncrementalUTF16BytesWithTokenSourceFactory delegates to a pooled Parser.ParseIncrementalUTF16BytesWithTokenSourceFactory call.
func (*ParserPool) ParseIncrementalUTF16WithTokenSourceFactory ¶ added in v0.16.0
func (pp *ParserPool) ParseIncrementalUTF16WithTokenSourceFactory(source []uint16, oldTree *Tree, factory TokenSourceFactory) (*Tree, error)
ParseIncrementalUTF16WithTokenSourceFactory delegates to a pooled Parser.ParseIncrementalUTF16WithTokenSourceFactory call.
func (*ParserPool) ParseNoResultCompatibilityBenchmarkOnly ¶ added in v0.18.0
func (pp *ParserPool) ParseNoResultCompatibilityBenchmarkOnly(source []byte) (*Tree, error)
ParseNoResultCompatibilityBenchmarkOnly delegates to Parser.ParseNoResultCompatibilityBenchmarkOnly. It is intended only for performance attribution of parser/tree construction versus compatibility rewrites; the returned tree is not API-compatible.
func (*ParserPool) ParseNoTreeBenchmarkOnly ¶ added in v0.17.0
func (pp *ParserPool) ParseNoTreeBenchmarkOnly(source []byte) (*Tree, error)
ParseNoTreeBenchmarkOnly delegates to Parser.ParseNoTreeBenchmarkOnly. It is intended only for parser-loop performance experiments; the returned tree is not API-compatible.
func (*ParserPool) ParseNoTreeWithExternalCheckpointsBenchmarkOnly ¶ added in v0.18.0
func (pp *ParserPool) ParseNoTreeWithExternalCheckpointsBenchmarkOnly(source []byte) (*Tree, error)
ParseNoTreeWithExternalCheckpointsBenchmarkOnly delegates to Parser.ParseNoTreeWithExternalCheckpointsBenchmarkOnly. It is intended only for parser performance attribution; the returned tree is not API-compatible.
func (*ParserPool) ParseUTF16 ¶ added in v0.16.0
func (pp *ParserPool) ParseUTF16(source []uint16) (*Tree, error)
ParseUTF16 delegates to a pooled Parser.ParseUTF16 call.
func (*ParserPool) ParseUTF16Bytes ¶ added in v0.16.0
func (pp *ParserPool) ParseUTF16Bytes(source []byte, order UTF16ByteOrder) (*Tree, error)
ParseUTF16Bytes delegates to a pooled Parser.ParseUTF16Bytes call.
func (*ParserPool) ParseUTF16BytesWithTokenSourceFactory ¶ added in v0.16.0
func (pp *ParserPool) ParseUTF16BytesWithTokenSourceFactory(source []byte, order UTF16ByteOrder, factory TokenSourceFactory) (*Tree, error)
ParseUTF16BytesWithTokenSourceFactory delegates to a pooled Parser.ParseUTF16BytesWithTokenSourceFactory call.
func (*ParserPool) ParseUTF16WithTokenSourceFactory ¶ added in v0.16.0
func (pp *ParserPool) ParseUTF16WithTokenSourceFactory(source []uint16, factory TokenSourceFactory) (*Tree, error)
ParseUTF16WithTokenSourceFactory delegates to a pooled Parser.ParseUTF16WithTokenSourceFactory call.
func (*ParserPool) ParseWith ¶ added in v0.7.0
func (pp *ParserPool) ParseWith(source []byte, opts ...ParseOption) (ParseResult, error)
ParseWith delegates to a pooled Parser.ParseWith call.
func (*ParserPool) ParseWithTokenSource ¶ added in v0.7.0
func (pp *ParserPool) ParseWithTokenSource(source []byte, ts TokenSource) (*Tree, error)
ParseWithTokenSource delegates to a pooled Parser.ParseWithTokenSource call.
func (*ParserPool) ParseWithTokenSourceFactory ¶ added in v0.16.0
func (pp *ParserPool) ParseWithTokenSourceFactory(source []byte, factory TokenSourceFactory) (*Tree, error)
ParseWithTokenSourceFactory delegates to a pooled Parser.ParseWithTokenSourceFactory call.
type ParserPoolOption ¶ added in v0.7.0
type ParserPoolOption func(*parserPoolConfig)
ParserPoolOption configures a ParserPool.
func WithParserPoolAmbiguityProfile ¶ added in v0.17.0
func WithParserPoolAmbiguityProfile(profile *AmbiguityProfile) ParserPoolOption
WithParserPoolAmbiguityProfile installs an optional diagnostic ambiguity profile on checked-out parsers.
func WithParserPoolGLRTrace ¶ added in v0.7.0
func WithParserPoolGLRTrace(enabled bool) ParserPoolOption
WithParserPoolGLRTrace toggles GLR trace logs on pooled parser instances.
func WithParserPoolIncludedRanges ¶ added in v0.7.0
func WithParserPoolIncludedRanges(ranges []Range) ParserPoolOption
WithParserPoolIncludedRanges sets default include ranges for pooled parsers.
func WithParserPoolLogger ¶ added in v0.7.0
func WithParserPoolLogger(logger ParserLogger) ParserPoolOption
WithParserPoolLogger sets the logger applied to pooled parser instances.
func WithParserPoolTimeoutMicros ¶ added in v0.7.0
func WithParserPoolTimeoutMicros(timeoutMicros uint64) ParserPoolOption
WithParserPoolTimeoutMicros sets the parse timeout for pooled parsers.
type Pattern ¶
type Pattern struct {
// contains filtered or unexported fields
}
Pattern is a single top-level S-expression pattern in a query.
type PendingParentFieldRejectPayloadStats ¶ added in v0.19.0
type PendingParentFieldRejectStats ¶ added in v0.19.0
type PendingParentFieldRejectStats struct {
Unknown uint64
ParentHidden uint64
NoIDs uint64
Inherited uint64
HiddenChild uint64
HiddenChildPlain uint64
HiddenChildPlainEmpty uint64
HiddenChildPlainOne uint64
HiddenChildPlainMany uint64
HiddenChildWithFields uint64
Child uint64
AllVisibleDirect uint64
}
type PendingParentRejectStats ¶ added in v0.19.0
type PerfCounters ¶ added in v0.6.0
type PerfCounters struct {
MergeCalls uint64
MergeDeadPruned uint64
MergePerKeyOverflow uint64
MergeReplacements uint64
StackEquivalentCalls uint64
StackEquivalentTrue uint64
StackEqHashMissSkips uint64
StackCompareCalls uint64
ConflictRR uint64
ConflictRS uint64
ConflictOther uint64
ForkCount uint64
FirstConflictToken uint64
MaxConcurrentStacks uint64
LexBytes uint64
LexTokens uint64
ReuseNodesVisited uint64
ReuseNodesPushed uint64
ReuseNodesPopped uint64
ReuseCandidatesChecked uint64
ReuseSuccesses uint64
ReuseLeafSuccesses uint64
ReuseNonLeafChecks uint64
ReuseNonLeafSuccesses uint64
ReuseNonLeafBytes uint64
ReuseNonLeafNoGoto uint64
ReuseNonLeafNoGotoTerm uint64
ReuseNonLeafNoGotoNt uint64
ReuseNonLeafStateMiss uint64
ReuseNonLeafStateZero uint64
MergeHashZero uint64
GlobalCapCulls uint64
GlobalCapCullDropped uint64
ReduceChainSteps uint64
ReduceChainMaxLen uint64
ReduceChainBreakMulti uint64
ReduceChainBreakShift uint64
ReduceChainBreakAccept uint64
ReduceChainHintCandidates uint64
ReduceChainHintTaken uint64
ReduceChainHintSteps uint64
ReduceChainHintTerminalOK uint64
ReduceChainHintTerminalMismatch uint64
ReduceChainHintLimit uint64
ReduceChainHintDead uint64
ReduceChainHintUnexpected uint64
ParentChildPointers uint64
ReduceChildrenFastGSS uint64
ReduceChildrenAllVis uint64
ReduceChildrenNoAlias uint64
ReduceChildrenScratch uint64
ReduceScratchNoAlias uint64
ReduceScratchGeneral uint64
ForestReduceCalls uint64
ForestReduceZero uint64
ForestReduceLinearNoExtras uint64
ForestReduceDFS uint64
ForestReduceDFSLinks uint64
ForestReduceDFSMultiLinkSteps uint64
ForestReduceDFSExtraLinks uint64
ForestReduceDFSVisits uint64
ForestReduceDFSPathEntries uint64
ForestReduceGotoHits uint64
ForestReduceGotoMisses uint64
ForestReduceMaxPathLen uint64
ForestReduceMaxChildCount uint64
ForestCoalesceCalls uint64
ForestCoalesceNewNodes uint64
ForestCoalesceLinkAppends uint64
ForestCoalesceDedupHits uint64
ForestCoalesceDedupReplacements uint64
ForestCoalescePreCapDrops uint64
ForestCoalesceCapDrops uint64
ForestCoalesceCapReplacements uint64
ExtraNodes uint64
ErrorNodes uint64
MergeStacksInHist [maxGLRStacks + 2]uint64
MergeAliveHist [maxGLRStacks + 2]uint64
MergeOutHist [maxGLRStacks + 2]uint64
ForkActionsHist [8]uint64
CloneTreeCalls uint64
CloneTreePublicNodes uint64
CloneTreeFinalRefs uint64
CloneTreeCompactCopies uint64
CloneTreeChildRefs uint64
CloneOffsetCalls uint64
CloneOffsetPublicNodes uint64
CloneOffsetCopies uint64
CloneOffsetShifted uint64
NodeEditCalls uint64
NodeEditNoopCalls uint64
NodeEditCompactRefs uint64
NodeEditShifted uint64
NodeEditMarked uint64
DenseMutationCalls uint64
DenseMutationDrains uint64
MutationChildRefCOW uint64
}
func PerfCountersSnapshot ¶ added in v0.6.0
func PerfCountersSnapshot() PerfCounters
type PointSkippableTokenSource ¶
type PointSkippableTokenSource interface {
ByteSkippableTokenSource
SkipToByteWithPoint(offset uint32, pt Point) Token
}
PointSkippableTokenSource extends ByteSkippableTokenSource with a hint-based skip that avoids recomputing row/column from byte offset. During incremental parsing the reused node already carries its endpoint, so passing it directly eliminates the O(n) offset-to-point scan.
type Query ¶
type Query struct {
// contains filtered or unexported fields
}
Query holds compiled patterns parsed from a tree-sitter .scm query file. It can be executed against a syntax tree to find matching nodes and return captured names. Query is safe for concurrent use after construction.
func NewQuery ¶
NewQuery compiles query source (tree-sitter .scm format) against a language. It returns an error if the query syntax is invalid or references unknown node types or field names.
func (*Query) CaptureCount ¶ added in v0.7.0
CaptureCount returns the number of unique capture names in this query.
func (*Query) CaptureNameForID ¶ added in v0.7.0
CaptureNameForID returns the capture name for the given capture id.
func (*Query) CaptureNames ¶
CaptureNames returns the list of unique capture names used in the query.
func (*Query) DisableCapture ¶ added in v0.7.0
DisableCapture removes captures with the given name from future query results. Matching behavior is unchanged; only returned captures are filtered.
func (*Query) DisablePattern ¶ added in v0.7.0
DisablePattern disables a pattern by index.
func (*Query) EndByteForPattern ¶ added in v0.7.0
EndByteForPattern returns the query-source end byte for patternIndex.
func (*Query) Exec ¶
func (q *Query) Exec(node *Node, lang *Language, source []byte) *QueryCursor
Exec creates a streaming cursor over matches rooted at node.
func (*Query) Execute ¶
func (q *Query) Execute(tree *Tree) []QueryMatch
Execute runs the query against a syntax tree and returns all matches.
func (*Query) ExecuteInto ¶ added in v0.10.2
func (q *Query) ExecuteInto(tree *Tree, dst []QueryMatch) []QueryMatch
ExecuteInto runs the query against a syntax tree, appending matches into dst and returning the updated slice. Callers can pre-allocate or reuse dst across calls to eliminate the per-call slice allocation from Execute.
Example:
var buf []QueryMatch
for _, tree := range trees {
	buf = q.ExecuteInto(tree, buf[:0])
	process(buf)
}
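The reuse pattern above relies on `buf[:0]` keeping the backing array while resetting the length. The following is a minimal, self-contained sketch of that mechanic only; `match` and `executeInto` are hypothetical stand-ins, not the package's QueryMatch or Query.ExecuteInto.

```go
package main

import "fmt"

// match is a hypothetical stand-in for QueryMatch.
type match struct {
	PatternIndex int
}

// executeInto is a hypothetical stand-in for Query.ExecuteInto: it appends
// n matches to dst and returns the updated slice.
func executeInto(n int, dst []match) []match {
	for i := 0; i < n; i++ {
		dst = append(dst, match{PatternIndex: i})
	}
	return dst
}

func main() {
	var buf []match
	for _, n := range []int{4, 2, 3} {
		// buf[:0] resets the length but keeps the capacity, so later
		// iterations that need no more room than an earlier one do not
		// allocate a new backing array.
		buf = executeInto(n, buf[:0])
		fmt.Println("len:", len(buf), "cap:", cap(buf))
	}
}
```

Because the backing array is shared across iterations, results from one call should be consumed (or copied) before the next call overwrites them.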
func (*Query) ExecuteNode ¶
func (q *Query) ExecuteNode(node *Node, lang *Language, source []byte) []QueryMatch
ExecuteNode runs the query starting from a specific node.
source is required for text predicates (like #eq? / #match?); pass the originating source bytes for correct predicate evaluation.
func (*Query) IsPatternGuaranteedAtStep ¶ added in v0.7.0
IsPatternGuaranteedAtStep reports whether all steps through stepIndex are definite and non-quantified.
func (*Query) IsPatternNonLocal ¶ added in v0.7.0
IsPatternNonLocal reports whether the pattern can begin at multiple roots.
func (*Query) IsPatternRooted ¶ added in v0.7.0
IsPatternRooted reports whether the pattern has exactly one root step at depth 0. Rooted patterns start matching from a single concrete root.
func (*Query) PatternCount ¶
PatternCount returns the number of patterns in the query.
func (*Query) PredicatesForPattern ¶ added in v0.7.0
func (q *Query) PredicatesForPattern(patternIndex uint32) ([]QueryPredicate, bool)
PredicatesForPattern returns a copy of predicates attached to patternIndex.
func (*Query) StartByteForPattern ¶ added in v0.7.0
StartByteForPattern returns the query-source start byte for patternIndex.
func (*Query) StepIsDefinite ¶ added in v0.7.0
StepIsDefinite reports whether a pattern step matches a definite symbol (i.e. not wildcard).
func (*Query) StringCount ¶ added in v0.7.0
StringCount returns the number of unique string literals in this query.
type QueryCapture ¶
type QueryCapture struct {
Name string
Node *Node
// TextOverride, when non-empty, replaces the node's source text for
// downstream consumers. It is set by the #strip! directive.
TextOverride string
}
QueryCapture is a single captured node within a match.
func (QueryCapture) Text ¶ added in v0.6.0
func (c QueryCapture) Text(source []byte) string
Text returns the effective text for this capture. If TextOverride is set (e.g. by the #strip! directive), it is returned. Otherwise the node's source text is returned.
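The fallback rule described for Text can be sketched as follows. This is an illustration under stated assumptions, not the package's code: `capture` is a hypothetical type that stores the node's byte span directly instead of a *Node.

```go
package main

import "fmt"

// capture is a hypothetical stand-in for QueryCapture. The start and end
// fields replace the node's StartByte/EndByte for the sake of the sketch.
type capture struct {
	start, end   uint32
	textOverride string
}

// text returns the override when one is set, otherwise the source bytes
// covered by the captured span.
func (c capture) text(source []byte) string {
	if c.textOverride != "" {
		return c.textOverride
	}
	return string(source[c.start:c.end])
}

func main() {
	src := []byte(`name = "value"`)
	plain := capture{start: 7, end: 14}
	stripped := capture{start: 7, end: 14, textOverride: "value"}
	fmt.Println(plain.text(src))
	fmt.Println(stripped.text(src))
}
```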
func (QueryCapture) UTF16Range ¶ added in v0.16.0
func (c QueryCapture) UTF16Range(tree *Tree) (UTF16Range, bool)
UTF16Range returns this capture's node range in UTF-16 code-unit coordinates for trees produced by UTF-16 parse APIs.
type QueryCursor ¶
type QueryCursor struct {
// contains filtered or unexported fields
}
QueryCursor incrementally walks a node subtree and yields matches one by one. It is the streaming counterpart to Query.Execute and avoids materializing all matches up front. QueryCursor is not safe for concurrent use.
func (*QueryCursor) DidExceedMatchLimit ¶ added in v0.7.0
func (c *QueryCursor) DidExceedMatchLimit() bool
DidExceedMatchLimit reports whether query execution had additional matches beyond the configured match limit.
func (*QueryCursor) NextCapture ¶
func (c *QueryCursor) NextCapture() (QueryCapture, bool)
NextCapture yields captures in match order by draining NextMatch results. This is a practical first-pass ordering: captures are returned in each match's capture order, then by subsequent matches in DFS match order.
func (*QueryCursor) NextMatch ¶
func (c *QueryCursor) NextMatch() (QueryMatch, bool)
NextMatch yields the next query match from the cursor.
func (*QueryCursor) SetByteRange ¶ added in v0.6.0
func (c *QueryCursor) SetByteRange(startByte, endByte uint32)
SetByteRange restricts matches to nodes that intersect [startByte, endByte).
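One plausible reading of "intersect" for a half-open range is sketched below. The helper `intersects` is hypothetical, and how the package treats zero-width nodes or nodes that only touch the boundary is an assumption here, not something this page specifies.

```go
package main

import "fmt"

// intersects reports whether the node span [nodeStart, nodeEnd) overlaps
// the half-open query range [startByte, endByte). Spans that merely touch
// at a boundary are treated as non-intersecting (an assumption).
func intersects(nodeStart, nodeEnd, startByte, endByte uint32) bool {
	return nodeStart < endByte && nodeEnd > startByte
}

func main() {
	// A node covering bytes [4, 7) checked against several ranges.
	fmt.Println(intersects(4, 7, 0, 4)) // range ends where the node starts
	fmt.Println(intersects(4, 7, 0, 5)) // range covers the node's first byte
	fmt.Println(intersects(4, 7, 7, 9)) // range starts where the node ends
}
```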
func (*QueryCursor) SetMatchLimit ¶ added in v0.7.0
func (c *QueryCursor) SetMatchLimit(limit uint32)
SetMatchLimit sets the maximum number of matches this cursor can return. A limit of 0 means unlimited.
func (*QueryCursor) SetMaxStartDepth ¶ added in v0.7.0
func (c *QueryCursor) SetMaxStartDepth(depth uint32)
SetMaxStartDepth limits the depth at which new matches can begin. Depth 0 means only the starting node passed to Exec.
func (*QueryCursor) SetPointRange ¶ added in v0.6.0
func (c *QueryCursor) SetPointRange(startPoint, endPoint Point)
SetPointRange restricts matches to nodes that intersect [startPoint, endPoint).
func (*QueryCursor) SetUTF16Range ¶ added in v0.16.0
func (c *QueryCursor) SetUTF16Range(tree *Tree, startCodeUnit, endCodeUnit uint32) bool
SetUTF16Range restricts matches to nodes that intersect the given UTF-16 code-unit range. tree must have been produced by a UTF-16 parse API.
type QueryMatch ¶
type QueryMatch struct {
PatternIndex int
Captures []QueryCapture
}
QueryMatch represents a successful pattern match with its captures.
type QueryPredicate ¶
type QueryPredicate struct {
// contains filtered or unexported fields
}
QueryPredicate is a post-match constraint attached to a pattern. Supported forms:
- (#eq? @a @b)
- (#eq? @a "literal")
- (#not-eq? @a @b)
- (#not-eq? @a "literal")
- (#match? @a "regex")
- (#not-match? @a "regex")
- (#lua-match? @a "lua-pattern")
- (#any-of? @a "v1" "v2" ...)
- (#not-any-of? @a "v1" "v2" ...)
- (#any-eq? @a "literal"), (#any-eq? @a @b)
- (#any-not-eq? @a "literal"), (#any-not-eq? @a @b)
- (#any-match? @a "regex")
- (#any-not-match? @a "regex")
- (#has-ancestor? @a type ...)
- (#not-has-ancestor? @a type ...)
- (#has-parent? @a type ...)
- (#not-has-parent? @a type ...)
- (#is? ...), (#is-not? ...)
- (#set! key value), (#offset! @cap ...)
- (#count? @a op value) -- op: >, <, >=, <=, ==, !=
- (#is-exported? @a)
type QueryStep ¶
type QueryStep struct {
// contains filtered or unexported fields
}
QueryStep is one matching instruction within a pattern.
type Range ¶
Range is a span of source text.
func DiffChangedRanges ¶ added in v0.6.0
DiffChangedRanges compares two syntax trees and returns the minimal ranges where syntactic structure differs. The old tree should have been edited (via Tree.Edit) to match the new tree's source positions before reparsing.
This is equivalent to C tree-sitter's ts_tree_get_changed_ranges().
func IncludedRangesForUTF16 ¶ added in v0.16.0
func IncludedRangesForUTF16(source []uint16, ranges []UTF16Range) ([]Range, bool)
IncludedRangesForUTF16 converts UTF-16 included ranges into the parser's internal UTF-8 byte ranges. The returned Range points use UTF-8 columns.
func IncludedRangesForUTF16Bytes ¶ added in v0.16.0
func IncludedRangesForUTF16Bytes(source []byte, order UTF16ByteOrder, ranges []UTF16Range) ([]Range, error)
IncludedRangesForUTF16Bytes converts endian-specific UTF-16 byte ranges into the parser's internal UTF-8 byte ranges. The returned Range points use UTF-8 columns.
type ReduceChainHint ¶ added in v0.19.0
type ReduceChainHint struct {
StartState StateID
Lookahead Symbol
TerminalStates []StateID
TerminalAction ReduceChainTerminalAction
MaxSteps uint16
}
ReduceChainHint describes a terminal-verified parser hot path for a deterministic reduce chain. The runtime still applies normal reduce semantics and stops before the terminal action; this metadata only lets it avoid repeated generic action dispatch for approved state/lookahead pairs.
type ReduceChainTerminalAction ¶ added in v0.19.0
type ReduceChainTerminalAction uint8
ReduceChainTerminalAction describes the action class expected after a generated reduce-chain hint finishes applying deterministic reductions.
const (
	ReduceChainTerminalNoAction ReduceChainTerminalAction = iota
	ReduceChainTerminalSingleReduce
	ReduceChainTerminalSingleShift
	ReduceChainTerminalSingleAccept
	ReduceChainTerminalSingleOther
	ReduceChainTerminalMulti
)
type ReduceChildPathRuntime ¶ added in v0.18.0
type Rewriter ¶ added in v0.6.0
type Rewriter struct {
// contains filtered or unexported fields
}
Rewriter collects source-text edits and applies them atomically. Edits target byte ranges (usually from Node.StartByte/EndByte). Apply returns new source bytes and InputEdit records for incremental reparsing. Rewriter is not safe for concurrent use.
func NewRewriter ¶ added in v0.6.0
NewRewriter creates a Rewriter for the given source text.
func (*Rewriter) Apply ¶ added in v0.6.0
Apply sorts edits, validates that none overlap, applies them, and returns the new source bytes plus InputEdit records for incremental reparsing. It returns an error if any edits overlap.
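The sort, validate, apply sequence can be sketched on plain byte ranges. This is a minimal illustration, not the Rewriter implementation: `edit` and `applyEdits` are hypothetical names, the InputEdit records are omitted, and allowing several insertions at the same offset is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// edit is a hypothetical record replacing bytes [start, end) with text.
type edit struct {
	start, end int
	text       string
}

// applyEdits sorts edits by start offset, rejects overlapping or
// out-of-range edits, and builds the new source in a single pass.
func applyEdits(src []byte, edits []edit) ([]byte, error) {
	sorted := append([]edit(nil), edits...) // leave the caller's slice untouched
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].start < sorted[j].start
	})

	out := make([]byte, 0, len(src))
	pos := 0 // first source byte not yet copied or replaced
	for _, e := range sorted {
		if e.start < pos || e.end < e.start || e.end > len(src) {
			return nil, errors.New("overlapping or out-of-range edit")
		}
		out = append(out, src[pos:e.start]...)
		out = append(out, e.text...)
		pos = e.end
	}
	return append(out, src[pos:]...), nil
}

func main() {
	src := []byte("func foo() {}")
	out, err := applyEdits(src, []edit{
		{start: 5, end: 8, text: "bar"},      // rename foo
		{start: 0, end: 0, text: "// doc\n"}, // insert before the declaration
	})
	fmt.Printf("%q %v\n", out, err)
}
```

Validating before emitting any output means a rejected batch leaves the original source untouched, which matches the "atomic" behavior the type description mentions.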
func (*Rewriter) ApplyToTree ¶ added in v0.6.0
ApplyToTree is a convenience method that calls Apply, then tree.Edit for each edit, and returns the new source, ready for ParseIncremental.
func (*Rewriter) InsertAfter ¶ added in v0.6.0
InsertAfter inserts text immediately after node.
func (*Rewriter) InsertBefore ¶ added in v0.6.0
InsertBefore inserts text immediately before node.
func (*Rewriter) Replace ¶ added in v0.6.0
Replace replaces the source text covered by node with newText.
func (*Rewriter) ReplaceRange ¶ added in v0.6.0
ReplaceRange replaces bytes in [startByte, endByte) with newText.
type StateID ¶
type StateID uint32
StateID is a parser state index. The uint32 width supports grammars with more than 65K states (e.g. COBOL, with 67K states from 1071 rules).
type SymbolMetadata ¶
SymbolMetadata holds display information about a symbol.
type Tag ¶
type Tag struct {
Kind string // e.g. "definition.function", "reference.call"
Name string // the captured symbol text
Range Range // full span of the tagged node
NameRange Range // span of the @name capture
}
Tag represents a tagged symbol in source code, extracted by a Tagger. Kind follows tree-sitter convention: "definition.function", "reference.call", etc. Name is the captured symbol text (e.g., the function name).
type Tagger ¶
type Tagger struct {
// contains filtered or unexported fields
}
Tagger extracts symbol definitions and references from source code using tree-sitter tags queries. It is the tagging counterpart to Highlighter.
Tags queries use a convention where captures follow the pattern:
- @name captures the symbol name (e.g., function identifier)
- @definition.X or @reference.X captures the kind
Example query:
(function_declaration name: (identifier) @name) @definition.function
(call_expression function: (identifier) @name) @reference.call
func NewTagger ¶
func NewTagger(lang *Language, tagsQuery string, opts ...TaggerOption) (*Tagger, error)
NewTagger creates a Tagger for the given language and tags query.
func (*Tagger) TagIncremental ¶
TagIncremental re-tags source after edits to oldTree. Returns the tags and the new tree for subsequent incremental calls.
func (*Tagger) TagIncrementalUTF16 ¶ added in v0.16.0
TagIncrementalUTF16 re-tags UTF-16 source after edits to oldTree. Call oldTree.EditUTF16 before calling this.
func (*Tagger) TagIncrementalUTF16Bytes ¶ added in v0.16.0
func (tg *Tagger) TagIncrementalUTF16Bytes(source []byte, oldTree *Tree, order UTF16ByteOrder) ([]UTF16Tag, *Tree, error)
TagIncrementalUTF16Bytes is like TagIncrementalUTF16 for endian-specific UTF-16 bytes.
func (*Tagger) TagTreeUTF16 ¶ added in v0.16.0
TagTreeUTF16 extracts tags from an already-parsed UTF-16 tree.
func (*Tagger) TagUTF16 ¶ added in v0.16.0
TagUTF16 parses UTF-16 source and returns all tags with UTF-16 ranges.
func (*Tagger) TagUTF16Bytes ¶ added in v0.16.0
func (tg *Tagger) TagUTF16Bytes(source []byte, order UTF16ByteOrder) ([]UTF16Tag, error)
TagUTF16Bytes is like TagUTF16 for endian-specific UTF-16 bytes.
type TaggerOption ¶
type TaggerOption func(*Tagger)
TaggerOption configures a Tagger.
func WithTaggerTokenSourceFactory ¶
func WithTaggerTokenSourceFactory(factory func(source []byte) TokenSource) TaggerOption
WithTaggerTokenSourceFactory sets a factory function that creates a TokenSource for each Tag call.
type Token ¶
type Token struct {
Symbol Symbol
Text string
StartByte uint32
EndByte uint32
StartPoint Point
EndPoint Point
Missing bool
// NoLookahead marks a synthetic EOF used to force EOF-table reductions
// without consuming input, matching tree-sitter's lex_state = -1.
NoLookahead bool
}
Token is a lexed token with position info.
type TokenSource ¶
type TokenSource interface {
// Next returns the next token. It should skip whitespace and comments
// as appropriate for the language. Returns a zero-Symbol token at EOF.
Next() Token
}
TokenSource provides tokens to the parser. This interface abstracts over different lexer implementations: the built-in DFA lexer (for hand-built grammars) or custom bridges like GoTokenSource (for real grammars where we can't extract the C lexer DFA).
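To illustrate the contract (skip whitespace, return a zero-Symbol token at EOF), here is a minimal sketch of a whitespace-splitting token source. All types are local, hypothetical mirrors of the ones documented here: `token` carries only a subset of Token's fields, and the `symbol` width and the symbol value 1 are assumptions.

```go
package main

import "fmt"

// symbol and token are hypothetical, reduced mirrors of Symbol and Token.
type symbol uint16

type token struct {
	Symbol    symbol
	Text      string
	StartByte uint32
	EndByte   uint32
}

// tokenSource mirrors the single-method TokenSource interface.
type tokenSource interface {
	Next() token
}

// wordSource yields one token per whitespace-separated word.
type wordSource struct {
	src []byte
	pos int
}

func isSpace(b byte) bool {
	return b == ' ' || b == '\t' || b == '\n' || b == '\r'
}

// Next skips whitespace and returns the next word, or a zero-Symbol
// token positioned at the end of input.
func (w *wordSource) Next() token {
	for w.pos < len(w.src) && isSpace(w.src[w.pos]) {
		w.pos++
	}
	if w.pos >= len(w.src) {
		return token{StartByte: uint32(w.pos), EndByte: uint32(w.pos)}
	}
	start := w.pos
	for w.pos < len(w.src) && !isSpace(w.src[w.pos]) {
		w.pos++
	}
	return token{
		Symbol:    1, // hypothetical "word" symbol
		Text:      string(w.src[start:w.pos]),
		StartByte: uint32(start),
		EndByte:   uint32(w.pos),
	}
}

func main() {
	var ts tokenSource = &wordSource{src: []byte("let  x = 1")}
	for tok := ts.Next(); tok.Symbol != 0; tok = ts.Next() {
		fmt.Printf("%q [%d,%d)\n", tok.Text, tok.StartByte, tok.EndByte)
	}
}
```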
type TokenSourceFactory ¶ added in v0.16.0
type TokenSourceFactory func(source []byte) (TokenSource, error)
TokenSourceFactory builds a token source for parser source bytes.
type TokenSourceRebuilder ¶ added in v0.7.0
type TokenSourceRebuilder interface {
RebuildTokenSource(source []byte, lang *Language) (TokenSource, error)
}
TokenSourceRebuilder is an optional extension for token sources that can build a fresh equivalent token source for another source buffer. Result normalization uses this to reparse isolated fragments with the same lexer backend as the original parse.
type Tree ¶
type Tree struct {
// contains filtered or unexported fields
}
Tree holds a complete syntax tree along with its source text and language. Tree is safe for concurrent reads after construction. Edit and Release are not safe for concurrent use.
func (*Tree) ArenaBreakdown ¶ added in v0.18.0
func (t *Tree) ArenaBreakdown() (ArenaBreakdown, bool)
ArenaBreakdown returns optional arena/materialization attribution captured when EnableArenaBreakdown(true) was set before parsing.
func (*Tree) ChangedRanges ¶ added in v0.6.0
ChangedRanges converts this tree's recorded edits into changed source ranges. Overlapping ranges are coalesced.
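Coalescing overlapping ranges is a standard sort-and-merge pass, sketched below on byte offsets only. `byteRange` and `coalesce` are hypothetical names, and merging ranges that merely touch is an assumption rather than documented behavior.

```go
package main

import (
	"fmt"
	"sort"
)

// byteRange is a hypothetical half-open byte span [start, end).
type byteRange struct {
	start, end uint32
}

// coalesce sorts ranges by start offset and merges any range that
// overlaps or touches the previously emitted one.
func coalesce(ranges []byteRange) []byteRange {
	if len(ranges) == 0 {
		return nil
	}
	sorted := append([]byteRange(nil), ranges...)
	sort.Slice(sorted, func(i, j int) bool {
		return sorted[i].start < sorted[j].start
	})

	out := []byteRange{sorted[0]}
	for _, r := range sorted[1:] {
		last := &out[len(out)-1]
		if r.start <= last.end {
			// Overlapping or adjacent: extend the previous range.
			if r.end > last.end {
				last.end = r.end
			}
			continue
		}
		out = append(out, r)
	}
	return out
}

func main() {
	fmt.Println(coalesce([]byteRange{{10, 20}, {0, 5}, {15, 30}, {40, 45}}))
}
```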
func (*Tree) Copy ¶ added in v0.7.0
Copy returns an independent copy of this tree.
The copied tree has distinct node objects, so subsequent Tree.Edit calls on either tree do not mutate the other's spans/dirty bits. Source bytes and language pointer are shared (read-only).
func (*Tree) DescendantForUTF16Range ¶ added in v0.16.0
DescendantForUTF16Range returns the smallest descendant that fully contains the given UTF-16 code-unit range, or nil when no such descendant exists.
func (*Tree) Edit ¶
Edit records an edit on this tree. Call this before ParseIncremental to inform the parser which regions changed. The edit adjusts byte offsets and marks overlapping nodes as dirty so the incremental parser knows what to re-parse.
func (*Tree) EditUTF16 ¶ added in v0.16.0
EditUTF16 records a UTF-16 code-unit edit on a UTF-16 tree.
newSource is the full source after the edit; it is used to derive the internal UTF-8 endpoint for NewEndCodeUnit.
func (*Tree) InputEditForUTF16 ¶ added in v0.16.0
InputEditForUTF16 converts a UTF-16 code-unit edit into the parser's internal UTF-8 byte-coordinate edit. The tree must have been produced by ParseUTF16.
func (*Tree) NamedDescendantForUTF16Range ¶ added in v0.16.0
NamedDescendantForUTF16Range returns the smallest named descendant that fully contains the given UTF-16 code-unit range, or nil when no such descendant exists.
func (*Tree) ParseRuntime ¶ added in v0.6.0
func (t *Tree) ParseRuntime() ParseRuntime
ParseRuntime returns parser-loop diagnostics captured when this tree was built.
func (*Tree) ParseStopReason ¶ added in v0.6.0
func (t *Tree) ParseStopReason() ParseStopReason
ParseStopReason reports why parsing terminated.
func (*Tree) ParseStoppedEarly ¶ added in v0.6.0
ParseStoppedEarly reports whether parsing hit an early-stop condition.
func (*Tree) Release ¶
func (t *Tree) Release()
Release decrements arena references held by this tree. After Release, the tree should be treated as invalid and not reused.
func (*Tree) RootNodeWithOffset ¶ added in v0.7.0
RootNodeWithOffset returns a copy of the root node with all spans shifted by the provided byte and point offsets.
This mirrors tree-sitter C's root-node-with-offset behavior for callers that need to embed a parsed tree at a larger document offset.
func (*Tree) SourceEncoding ¶ added in v0.16.0
func (t *Tree) SourceEncoding() InputEncoding
SourceEncoding returns the encoding used by the caller that produced this tree.
For UTF-16 parses, Source still returns the parser's canonical UTF-8 copy. Use SourceUTF16 and UTF16RangeForNode when caller-facing UTF-16 coordinates are needed.
func (*Tree) SourceUTF16 ¶ added in v0.16.0
SourceUTF16 returns the original UTF-16 source for trees produced by ParseUTF16. It returns nil for ordinary UTF-8 parses.
func (*Tree) UTF8ByteForUTF16Offset ¶ added in v0.16.0
UTF8ByteForUTF16Offset converts a UTF-16 code-unit offset to the parser's canonical UTF-8 byte offset for trees produced by ParseUTF16.
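The coordinate conversion itself can be sketched with the standard library alone. This is an illustration of the arithmetic, not the Tree method: `utf8ByteForUTF16Offset` is a hypothetical free function that rescans the source on every call, replaces lone surrogates with U+FFFD, and rejects offsets that split a surrogate pair (all assumptions).

```go
package main

import (
	"fmt"
	"unicode/utf16"
	"unicode/utf8"
)

// utf8ByteForUTF16Offset returns the UTF-8 byte offset that corresponds
// to a UTF-16 code-unit offset into src. It reports false when offset is
// out of range or falls between the two halves of a surrogate pair.
func utf8ByteForUTF16Offset(src []uint16, offset int) (int, bool) {
	if offset < 0 || offset > len(src) {
		return 0, false
	}
	n := 0
	for i := 0; i < offset; {
		r := rune(src[i])
		width := 1
		if utf16.IsSurrogate(r) {
			r = utf8.RuneError // lone surrogate unless paired below
			if i+1 < len(src) {
				if dec := utf16.DecodeRune(rune(src[i]), rune(src[i+1])); dec != utf8.RuneError {
					r, width = dec, 2
				}
			}
		}
		if i+width > offset {
			return 0, false // offset splits a surrogate pair
		}
		n += utf8.RuneLen(r)
		i += width
	}
	return n, true
}

func main() {
	// "a", U+20AC (3 UTF-8 bytes), U+1F600 (surrogate pair, 4 UTF-8 bytes), "b".
	src := []uint16{'a', 0x20AC, 0xD83D, 0xDE00, 'b'}
	for off := 0; off <= len(src); off++ {
		b, ok := utf8ByteForUTF16Offset(src, off)
		fmt.Println(off, b, ok)
	}
}
```

A linear scan like this is O(n) per lookup; a production implementation would more likely keep an index so repeated conversions stay cheap.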
func (*Tree) UTF16OffsetForByte ¶ added in v0.16.0
UTF16OffsetForByte converts a parser UTF-8 byte offset to a UTF-16 code-unit offset for trees produced by ParseUTF16.
func (*Tree) UTF16PointForByte ¶ added in v0.16.0
UTF16PointForByte converts a parser UTF-8 byte offset to a UTF-16 point.
func (*Tree) UTF16RangeForByteRange ¶ added in v0.16.0
func (t *Tree) UTF16RangeForByteRange(startByte, endByte uint32) (UTF16Range, bool)
UTF16RangeForByteRange converts a canonical UTF-8 byte range into UTF-16 code-unit coordinates.
func (*Tree) UTF16RangeForNode ¶ added in v0.16.0
func (t *Tree) UTF16RangeForNode(n *Node) (UTF16Range, bool)
UTF16RangeForNode returns a node range in UTF-16 code-unit coordinates.
func (*Tree) UTF16RangeForRange ¶ added in v0.16.0
func (t *Tree) UTF16RangeForRange(r Range) (UTF16Range, bool)
UTF16RangeForRange converts a canonical UTF-8 Range into UTF-16 code-unit coordinates.
func (*Tree) UTF16SourceForNode ¶ added in v0.16.0
UTF16SourceForNode returns the original UTF-16 code units covered by n.
type TreeCursor ¶ added in v0.6.0
type TreeCursor struct {
// contains filtered or unexported fields
}
TreeCursor provides stateful, O(1) tree navigation. It maintains a stack of (node, childIndex) frames enabling efficient parent, child, and sibling movement without scanning.
The cursor holds pointers to Nodes. If the underlying Tree is released, edited, or replaced via incremental reparse, the cursor should be recreated.
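The frame-stack idea can be sketched in a few lines. This is a simplified model, not the TreeCursor implementation: `node`, `frame`, and `cursor` are hypothetical types, and only three of the navigation methods are shown.

```go
package main

import "fmt"

// node is a hypothetical tree node.
type node struct {
	name     string
	children []*node
}

// frame records a node and its index within its parent's children.
type frame struct {
	n          *node
	childIndex int
}

// cursor keeps one frame per level from the root to the current node.
type cursor struct {
	stack []frame
}

func newCursor(root *node) *cursor {
	return &cursor{stack: []frame{{n: root, childIndex: -1}}}
}

func (c *cursor) current() *node { return c.stack[len(c.stack)-1].n }

// gotoFirstChild pushes a frame for child 0, if any.
func (c *cursor) gotoFirstChild() bool {
	cur := c.current()
	if len(cur.children) == 0 {
		return false
	}
	c.stack = append(c.stack, frame{n: cur.children[0], childIndex: 0})
	return true
}

// gotoNextSibling uses the stored child index, so no scan of the
// parent's children is needed.
func (c *cursor) gotoNextSibling() bool {
	if len(c.stack) < 2 {
		return false // at the root
	}
	top := &c.stack[len(c.stack)-1]
	parent := c.stack[len(c.stack)-2].n
	next := top.childIndex + 1
	if next >= len(parent.children) {
		return false
	}
	top.n, top.childIndex = parent.children[next], next
	return true
}

// gotoParent pops the top frame.
func (c *cursor) gotoParent() bool {
	if len(c.stack) < 2 {
		return false
	}
	c.stack = c.stack[:len(c.stack)-1]
	return true
}

func main() {
	root := &node{name: "root", children: []*node{
		{name: "a", children: []*node{{name: "a1"}}},
		{name: "b"},
	}}
	c := newCursor(root)
	c.gotoFirstChild()
	c.gotoNextSibling()
	fmt.Println(c.current().name, len(c.stack)-1)
}
```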
func NewTreeCursor ¶ added in v0.6.0
func NewTreeCursor(node *Node, tree *Tree) *TreeCursor
NewTreeCursor creates a cursor starting at the given node. The optional tree reference enables field name resolution and text extraction.
func NewTreeCursorFromTree ¶ added in v0.6.0
func NewTreeCursorFromTree(tree *Tree) *TreeCursor
NewTreeCursorFromTree creates a cursor starting at the tree's root node.
func (*TreeCursor) Copy ¶ added in v0.6.0
func (c *TreeCursor) Copy() *TreeCursor
Copy returns an independent copy of the cursor. The copy shares the same tree reference but has its own navigation stack.
func (*TreeCursor) CurrentFieldID ¶ added in v0.6.0
func (c *TreeCursor) CurrentFieldID() FieldID
CurrentFieldID returns the field ID of the current node within its parent. Returns 0 if the cursor is at the root or the node has no field assignment.
func (*TreeCursor) CurrentFieldName ¶ added in v0.6.0
func (c *TreeCursor) CurrentFieldName() string
CurrentFieldName returns the field name of the current node within its parent. Returns "" if no tree is associated, the cursor is at the root, or the node has no field assignment.
func (*TreeCursor) CurrentNode ¶ added in v0.6.0
func (c *TreeCursor) CurrentNode() *Node
CurrentNode returns the node the cursor is currently pointing to.
func (*TreeCursor) CurrentNodeIsNamed ¶ added in v0.6.0
func (c *TreeCursor) CurrentNodeIsNamed() bool
CurrentNodeIsNamed returns whether the current node is a named node.
func (*TreeCursor) CurrentNodeText ¶ added in v0.6.0
func (c *TreeCursor) CurrentNodeText() string
CurrentNodeText returns the source text of the current node. Requires a tree with source to be associated.
func (*TreeCursor) CurrentNodeType ¶ added in v0.6.0
func (c *TreeCursor) CurrentNodeType() string
CurrentNodeType returns the type name of the current node. Requires a tree with a language to be associated.
func (*TreeCursor) Depth ¶ added in v0.6.0
func (c *TreeCursor) Depth() int
Depth returns the cursor's current depth (0 at the root).
func (*TreeCursor) GotoChildByFieldID ¶ added in v0.6.0
func (c *TreeCursor) GotoChildByFieldID(fid FieldID) bool
GotoChildByFieldID moves the cursor to the first child with the given field ID. Returns false if no child has that field.
func (*TreeCursor) GotoChildByFieldName ¶ added in v0.6.0
func (c *TreeCursor) GotoChildByFieldName(name string) bool
GotoChildByFieldName moves the cursor to the first child with the given field name. Returns false if the tree has no language, the field name is unknown, or no child has that field.
func (*TreeCursor) GotoFirstChild ¶ added in v0.6.0
func (c *TreeCursor) GotoFirstChild() bool
GotoFirstChild moves the cursor to the first child of the current node. Returns false if the current node has no children.
func (*TreeCursor) GotoFirstChildForByte ¶ added in v0.6.0
func (c *TreeCursor) GotoFirstChildForByte(targetByte uint32) int64
GotoFirstChildForByte moves the cursor to the first child whose byte range contains targetByte (i.e., first child where endByte > targetByte). Returns the child index, or -1 when no child contains the byte.
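The selection rule (first child where endByte > targetByte) can be sketched as a scan over child spans. `span` and `firstChildForByte` are hypothetical names, and the linear scan is an assumption about strategy, not a description of the cursor's internals.

```go
package main

import "fmt"

// span is a hypothetical child byte range [start, end).
type span struct {
	start, end uint32
}

// firstChildForByte returns the index of the first child whose end byte
// lies beyond target, or -1 when no child qualifies.
func firstChildForByte(children []span, target uint32) int64 {
	for i, c := range children {
		if c.end > target {
			return int64(i)
		}
	}
	return -1
}

func main() {
	children := []span{{0, 3}, {4, 7}, {8, 10}}
	fmt.Println(firstChildForByte(children, 5))
	fmt.Println(firstChildForByte(children, 3))
	fmt.Println(firstChildForByte(children, 10))
}
```

Note that under this rule a target that falls in a gap between children (byte 3 above) selects the next child, even though that child starts after the target.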
func (*TreeCursor) GotoFirstChildForPoint ¶ added in v0.6.0
func (c *TreeCursor) GotoFirstChildForPoint(targetPoint Point) int64
GotoFirstChildForPoint moves the cursor to the first child whose point range contains targetPoint (i.e., first child where endPoint > targetPoint). Returns the child index, or -1 when no child contains the point.
func (*TreeCursor) GotoFirstNamedChild ¶ added in v0.6.0
func (c *TreeCursor) GotoFirstNamedChild() bool
GotoFirstNamedChild moves the cursor to the first named child of the current node, skipping anonymous nodes. Returns false if no named child exists.
func (*TreeCursor) GotoLastChild ¶ added in v0.6.0
func (c *TreeCursor) GotoLastChild() bool
GotoLastChild moves the cursor to the last child of the current node. Returns false if the current node has no children.
func (*TreeCursor) GotoLastNamedChild ¶ added in v0.6.0
func (c *TreeCursor) GotoLastNamedChild() bool
GotoLastNamedChild moves the cursor to the last named child of the current node, skipping anonymous nodes. Returns false if no named child exists.
func (*TreeCursor) GotoNextNamedSibling ¶ added in v0.6.0
func (c *TreeCursor) GotoNextNamedSibling() bool
GotoNextNamedSibling moves the cursor to the next named sibling, skipping anonymous nodes. Returns false if no named sibling follows.
func (*TreeCursor) GotoNextSibling ¶ added in v0.6.0
func (c *TreeCursor) GotoNextSibling() bool
GotoNextSibling moves the cursor to the next sibling. Returns false if the cursor is at the root or the last sibling.
func (*TreeCursor) GotoParent ¶ added in v0.6.0
func (c *TreeCursor) GotoParent() bool
GotoParent moves the cursor to the parent of the current node. Returns false if the cursor is at the root.
func (*TreeCursor) GotoPrevNamedSibling ¶ added in v0.6.0
func (c *TreeCursor) GotoPrevNamedSibling() bool
GotoPrevNamedSibling moves the cursor to the previous named sibling, skipping anonymous nodes. Returns false if no named sibling precedes.
func (*TreeCursor) GotoPrevSibling ¶ added in v0.6.0
func (c *TreeCursor) GotoPrevSibling() bool
GotoPrevSibling moves the cursor to the previous sibling. Returns false if the cursor is at the root or the first sibling.
func (*TreeCursor) Reset ¶ added in v0.6.0
func (c *TreeCursor) Reset(node *Node)
Reset resets the cursor to a new root node, clearing the navigation stack.
func (*TreeCursor) ResetTree ¶ added in v0.6.0
func (c *TreeCursor) ResetTree(tree *Tree)
ResetTree resets the cursor to the root of a new tree.
type UTF16ByteOrder ¶ added in v0.16.0
type UTF16ByteOrder uint8
UTF16ByteOrder identifies the byte order used by a UTF-16 byte source.
const (
	UTF16LittleEndian UTF16ByteOrder = iota
	UTF16BigEndian
)
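What the byte order selects can be shown with encoding/binary. The sketch below is not DecodeUTF16Bytes itself: `byteOrder`, its constants, and `decodeUTF16Bytes` are local, hypothetical mirrors, and rejecting odd-length input is an assumption.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// byteOrder is a hypothetical mirror of UTF16ByteOrder.
type byteOrder uint8

const (
	littleEndian byteOrder = iota
	bigEndian
)

// decodeUTF16Bytes reads src two bytes at a time in the given order and
// returns the resulting UTF-16 code units.
func decodeUTF16Bytes(src []byte, order byteOrder) ([]uint16, error) {
	if len(src)%2 != 0 {
		return nil, errors.New("UTF-16 byte source has odd length")
	}
	var bo binary.ByteOrder = binary.LittleEndian
	if order == bigEndian {
		bo = binary.BigEndian
	}
	out := make([]uint16, len(src)/2)
	for i := range out {
		out[i] = bo.Uint16(src[2*i:])
	}
	return out, nil
}

func main() {
	// The same two code units ('A', U+20AC) in both byte orders.
	le, _ := decodeUTF16Bytes([]byte{0x41, 0x00, 0xAC, 0x20}, littleEndian)
	be, _ := decodeUTF16Bytes([]byte{0x00, 0x41, 0x20, 0xAC}, bigEndian)
	fmt.Printf("%#04x %#04x\n", le, be)
}
```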
func (UTF16ByteOrder) String ¶ added in v0.16.0
func (o UTF16ByteOrder) String() string
type UTF16HighlightRange ¶ added in v0.16.0
type UTF16HighlightRange struct {
StartCodeUnit uint32
EndCodeUnit uint32
StartPoint Point
EndPoint Point
Capture string
PatternIndex int
}
UTF16HighlightRange is a styled source range in UTF-16 code-unit coordinates.
type UTF16Injection ¶ added in v0.16.0
type UTF16Injection struct {
// Language is the detected language name (e.g., "javascript").
Language string
// Tree is the parse tree for this region, or nil if the language
// was not registered.
Tree *Tree
// Ranges are the source ranges this tree covers in UTF-16 code units.
Ranges []UTF16Range
// Node is the parent tree node that triggered the injection.
Node *Node
}
UTF16Injection is a single embedded language region with ranges in UTF-16 code-unit coordinates.
type UTF16InjectionResult ¶ added in v0.16.0
type UTF16InjectionResult struct {
// Tree is the parent language's parse tree.
Tree *Tree
// Injections contains child language parse results, ordered by position.
Injections []UTF16Injection
// contains filtered or unexported fields
}
UTF16InjectionResult holds parse results for a UTF-16 multi-language document. Injection ranges are expressed in UTF-16 code units.
type UTF16Range ¶ added in v0.16.0
UTF16Range is a source range in UTF-16 code units.
StartPoint and EndPoint use UTF-16 code-unit columns, matching the coordinate system used by many editors and LSP clients.
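UTF-16 code-unit columns differ from byte columns whenever a line contains non-ASCII text. The following sketch, which uses only the standard library, converts a byte offset within a UTF-8 line into a UTF-16 column. The helper names `utf16Len` and `utf16Column` are hypothetical and are not part of gotreesitter.

```go
package main

import (
	"fmt"
	"strings"
)

// utf16Len reports how many UTF-16 code units encode the UTF-8 string s.
// Runes outside the Basic Multilingual Plane need a surrogate pair (2 units).
func utf16Len(s string) int {
	n := 0
	for _, r := range s {
		if r >= 0x10000 {
			n += 2
		} else {
			n++
		}
	}
	return n
}

// utf16Column converts a byte offset within line to a UTF-16 column.
// byteCol must fall on a rune boundary inside line.
func utf16Column(line string, byteCol int) int {
	return utf16Len(line[:byteCol])
}

func main() {
	line := "a😀b é"

	b := strings.Index(line, "b")
	fmt.Println(b, utf16Column(line, b)) // 5 3

	e := strings.Index(line, "é")
	fmt.Println(e, utf16Column(line, e)) // 7 5

	fmt.Println(len(line), utf16Len(line)) // 9 6
}
```

The emoji occupies four bytes in UTF-8 but two code units in UTF-16, which is why the two column systems diverge after it.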
type UTF16Tag ¶ added in v0.16.0
type UTF16Tag struct {
Kind string
Name string
Range UTF16Range
NameRange UTF16Range
}
UTF16Tag represents a tagged symbol with ranges in UTF-16 code-unit coordinates.
type WalkAction ¶
type WalkAction int
WalkAction controls the tree walk behavior.
const (
	// WalkContinue continues the walk to children and siblings.
	WalkContinue WalkAction = iota
	// WalkSkipChildren skips the current node's children but continues to siblings.
	WalkSkipChildren
	// WalkStop terminates the walk entirely.
	WalkStop
)
Source Files
¶
- ambiguity_profile.go
- arena.go
- bound_tree.go
- cursor.go
- external.go
- external_lexer.go
- external_scanner_adapter.go
- external_scanner_checkpoints.go
- external_symbol_resolver.go
- external_vm.go
- glr.go
- glr_forest.go
- glr_gss.go
- highlight.go
- highlight_injections.go
- highlight_injections_exec.go
- imports.go
- included_ranges.go
- incremental.go
- incremental_leaf_fastpath.go
- injection.go
- intern.go
- language.go
- lex_mode_repair.go
- lexer.go
- load_language.go
- lookahead.go
- materialization_reason.go
- no_tree_node.go
- parse_dispatch.go
- parser.go
- parser_api.go
- parser_config.go
- parser_dfa_token_source.go
- parser_error_tree.go
- parser_incremental_support.go
- parser_limits.go
- parser_normalization_stats.go
- parser_pool.go
- parser_recovery.go
- parser_reduce.go
- parser_result.go
- parser_result_bash.go
- parser_result_c.go
- parser_result_cobol.go
- parser_result_collapsed_helpers.go
- parser_result_collapsed_text.go
- parser_result_compat.go
- parser_result_csharp.go
- parser_result_csharp_attribute.go
- parser_result_csharp_expr.go
- parser_result_csharp_helpers.go
- parser_result_csharp_invocation.go
- parser_result_csharp_lambda.go
- parser_result_csharp_property.go
- parser_result_csharp_query.go
- parser_result_csharp_statement.go
- parser_result_csharp_string.go
- parser_result_csharp_type_body.go
- parser_result_d.go
- parser_result_dart.go
- parser_result_elixir.go
- parser_result_erlang.go
- parser_result_go.go
- parser_result_go_compat.go
- parser_result_haskell.go
- parser_result_hcl.go
- parser_result_helpers.go
- parser_result_html.go
- parser_result_ini.go
- parser_result_java.go
- parser_result_javascript_typescript.go
- parser_result_kotlin.go
- parser_result_lua.go
- parser_result_make.go
- parser_result_misc_spans.go
- parser_result_node_helpers.go
- parser_result_perl.go
- parser_result_php.go
- parser_result_powershell.go
- parser_result_powershell_command.go
- parser_result_powershell_expr.go
- parser_result_python.go
- parser_result_python_recovery.go
- parser_result_root_build.go
- parser_result_ruby.go
- parser_result_rust_closure.go
- parser_result_rust_expression.go
- parser_result_rust_items.go
- parser_result_rust_recovery.go
- parser_result_rust_struct.go
- parser_result_scala_compilation.go
- parser_result_scala_template.go
- parser_result_scala_top_level.go
- parser_result_sql.go
- parser_result_svelte.go
- parser_result_swift.go
- parser_result_trivia_helpers.go
- parser_result_typescript.go
- parser_result_yaml.go
- parser_result_zig.go
- parser_retry.go
- parser_scratch.go
- parser_tables.go
- pending_parent.go
- perf_counters.go
- perf_metrics_stub.go
- query.go
- query_alternation_index.go
- query_compile.go
- query_compile_helpers.go
- query_compile_predicates.go
- query_matcher.go
- query_predicates.go
- rewrite.go
- runtime_audit.go
- tagger.go
- transient_children.go
- transient_parents.go
- tree.go
- utf16.go
- walk.go
Directories
¶
| Path | Synopsis |
|---|---|
| cgo_harness (module) | |
| cmd | |
| benchgate (command) | |
| benchmatrix (command) | |
| gen_linguist (command) | Command gen_linguist generates grammars/linguist_gen.go by matching gotreesitter grammar names to GitHub Linguist's languages.yml. |
| gen_subset_blob_embeds (command) | Command gen_subset_blob_embeds generates the per-language z_subset_blob_embed_<lang>.go files that power embedded grammar_subset builds (issue #88: per-language compile-time grammar selection). |
| grammar_update_guard (command) | Command grammar_update_guard checks lock-update reports for scanner-facing changes that require hand-written scanner review before grammar blobs move. |
| grammar_updater (command) | Command grammar_updater refreshes pinned grammar commits in grammars/languages.lock and emits a machine-readable update report. |
| grammarblobprobe (command) | Command grammarblobprobe is a minimal binary that blank-imports the grammars package so that whatever grammar blobs are embedded by the active build tags are linked into the binary. |
| grammargen (command) | Command grammargen generates tree-sitter parser artifacts from grammar definitions. |
| harnessgate (command) | |
| parity_report (command) | |
| ts2go (command) | Command ts2go reads a tree-sitter generated parser.c file and outputs a Go source file containing a function that returns a populated *gotreesitter.Language with all extracted parse tables. |
| tsquery (command) | Command tsquery generates type-safe Go code from tree-sitter .scm query files. |
| grammargen | Package grammargen implements a pure-Go grammar generator for gotreesitter. |
| grammars | Package grammars provides built-in and extension tree-sitter grammars with lazy loading. |
| grep | Package grep provides structural code search, match, and rewrite using tree-sitter parse trees. |
| taproot | Package taproot is the common front-end harness shared by M31 DSLs that use the gotreesitter runtime. |
| diag | Package diag provides a generic structured diagnostic type and a source-quoting renderer. |
| wasm | |
| grammargen (command) | |
| runtime (command) | |