ast

package
v0.1.3
Published: May 27, 2026 License: GPL-3.0 Imports: 7 Imported by: 0

Documentation

Overview

Package ast - node.go defines the Node interface that abstracts over the concrete *sitter.Node type from github.com/smacker/go-tree-sitter.

Design rationale

With 308+ usages of *sitter.Node spread across pkg/tracer, pkg/semantic, and pkg/sources, a full migration to the interface in a single pass would be high-risk. The chosen approach is therefore incremental:

  1. The Node interface is defined here, covering every method that pkg/ast itself calls on tree-sitter nodes.
  2. pkg/ast public function signatures accept Node (or *sitter.Node where the return type forces it — see "Pragmatic exceptions" below).
  3. pkg/tracer and pkg/semantic continue to use *sitter.Node for now. Callers that migrate to Node can use Wrap() to convert.

Pragmatic exceptions

Child, NamedChild, and Parent return *sitter.Node in the upstream library. The interface preserves these concrete return types so that call-sites do not need to type-assert after every child traversal. When a future release of the upstream library (or a fork) returns a Node interface, these signatures can be updated without changing the rest of the codebase.
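
The trade-off described above can be sketched with stand-in types. This is a minimal illustration, not the package's code: rawNode is a hypothetical stand-in for sitter.Node, and node is a trimmed-down stand-in for the Node interface. Because Child returns the concrete pointer, and that pointer itself satisfies the interface, a traversal can recurse without a type assertion.

```go
package main

import "fmt"

// rawNode is a hypothetical stand-in for the concrete tree-sitter node type.
type rawNode struct {
	kind     string
	children []*rawNode
}

// The methods use value receivers, so *rawNode has them as well.
func (n rawNode) Type() string       { return n.kind }
func (n rawNode) ChildCount() uint32 { return uint32(len(n.children)) }
func (n rawNode) Child(idx int) *rawNode {
	if idx < 0 || idx >= len(n.children) {
		return nil
	}
	return n.children[idx]
}

// node is a trimmed-down stand-in for the interface. Child keeps the
// concrete return type instead of returning the interface.
type node interface {
	Type() string
	ChildCount() uint32
	Child(idx int) *rawNode
}

// countNodes walks the subtree rooted at n. The *rawNode returned by Child
// is passed straight back in as a node, with no type assertion.
func countNodes(n node) int {
	total := 1
	for i := 0; i < int(n.ChildCount()); i++ {
		if c := n.Child(i); c != nil {
			total += countNodes(c)
		}
	}
	return total
}

func main() {
	tree := &rawNode{kind: "program", children: []*rawNode{
		{kind: "assignment_expression", children: []*rawNode{{kind: "name"}}},
		{kind: "name"},
	}}
	fmt.Println(countNodes(tree))
}
```

The cost of this shape is that a test double must also return the concrete pointer type from Child, so it can only model leaf nodes or hand back real nodes.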

Gradual migration plan

Future work: update pkg/tracer/propagation.go, pkg/tracer/scope.go, and the pkg/semantic/analyzer/* files to accept ast.Node instead of *sitter.Node. The Wrap helper makes that transition zero-cost: Wrap(n) is a no-op because *sitter.Node already satisfies the interface.

Package ast - node_types.go provides centralized AST node type patterns. All AST node type definitions should be referenced from here. This is language-agnostic structural knowledge about how ASTs are shaped — it belongs in pkg/ast, not in pkg/sources (input-detection logic).

Package ast - text_extractor.go provides language-aware extraction of assignment targets and values from Tree-Sitter AST nodes.

This logic belongs in pkg/ast (language-agnostic AST facts) rather than in pkg/tracer (orchestration), because it is pure structural knowledge about how each language's grammar encodes assignments.

Index

Constants

This section is empty.

Variables

var LanguageASTNodeTypes = map[string]ASTNodeTypes{
	"php": {
		FunctionTypes: []string{
			"function_definition",
			"method_declaration",
			"arrow_function",
		},
		ScopeTypes: []string{
			"function_definition",
			"method_declaration",
			"class_declaration",
			"program",
		},
		AssignmentTypes: []string{
			"assignment_expression",
			"augmented_assignment_expression",
		},
		CallTypes: []string{
			"function_call_expression",
			"member_call_expression",
			"scoped_call_expression",
		},
		IdentifierTypes: []string{
			"variable_name",
			"name",
		},
	},
	"javascript": {
		FunctionTypes: []string{
			"function_declaration",
			"function_expression",
			"arrow_function",
			"method_definition",
		},
		ScopeTypes: []string{
			"function_declaration",
			"function_expression",
			"arrow_function",
			"method_definition",
			"class_declaration",
			"program",
		},
		AssignmentTypes: []string{
			"assignment_expression",
			"augmented_assignment_expression",
			"variable_declarator",
		},
		CallTypes: []string{
			"call_expression",
			"new_expression",
		},
		IdentifierTypes: []string{
			"identifier",
			"property_identifier",
		},
	},
	"typescript": {
		FunctionTypes: []string{
			"function_declaration",
			"function_expression",
			"arrow_function",
			"method_definition",
		},
		ScopeTypes: []string{
			"function_declaration",
			"function_expression",
			"arrow_function",
			"method_definition",
			"class_declaration",
			"program",
		},
		AssignmentTypes: []string{
			"assignment_expression",
			"augmented_assignment_expression",
			"variable_declarator",
		},
		CallTypes: []string{
			"call_expression",
			"new_expression",
		},
		IdentifierTypes: []string{
			"identifier",
			"property_identifier",
		},
	},
	"tsx": {
		FunctionTypes: []string{
			"function_declaration",
			"function_expression",
			"arrow_function",
			"method_definition",
		},
		ScopeTypes: []string{
			"function_declaration",
			"function_expression",
			"arrow_function",
			"method_definition",
			"class_declaration",
			"program",
		},
		AssignmentTypes: []string{
			"assignment_expression",
			"augmented_assignment_expression",
			"variable_declarator",
		},
		CallTypes: []string{
			"call_expression",
			"new_expression",
		},
		IdentifierTypes: []string{
			"identifier",
			"property_identifier",
		},
	},
	"python": {
		FunctionTypes: []string{
			"function_definition",
			"lambda",
		},
		ScopeTypes: []string{
			"function_definition",
			"class_definition",
			"module",
		},
		AssignmentTypes: []string{
			"assignment",
			"augmented_assignment",
			"named_expression",
		},
		CallTypes: []string{
			"call",
		},
		IdentifierTypes: []string{
			"identifier",
			"attribute",
		},
	},
	"go": {
		FunctionTypes: []string{
			"function_declaration",
			"method_declaration",
			"func_literal",
		},
		ScopeTypes: []string{
			"function_declaration",
			"method_declaration",
			"source_file",
		},
		AssignmentTypes: []string{
			"short_var_declaration",
			"assignment_statement",
		},
		CallTypes: []string{
			"call_expression",
		},
		IdentifierTypes: []string{
			"identifier",
			"selector_expression",
		},
	},
	"java": {
		FunctionTypes: []string{
			"method_declaration",
			"constructor_declaration",
			"lambda_expression",
		},
		ScopeTypes: []string{
			"method_declaration",
			"constructor_declaration",
			"class_declaration",
			"interface_declaration",
			"program",
		},
		AssignmentTypes: []string{
			"assignment_expression",
			"variable_declarator",
		},
		CallTypes: []string{
			"method_invocation",
			"object_creation_expression",
		},
		IdentifierTypes: []string{
			"identifier",
		},
	},
	"c": {
		FunctionTypes: []string{
			"function_definition",
		},
		ScopeTypes: []string{
			"function_definition",
			"translation_unit",
		},
		AssignmentTypes: []string{
			"assignment_expression",
			"init_declarator",
		},
		CallTypes: []string{
			"call_expression",
		},
		IdentifierTypes: []string{
			"identifier",
		},
	},
	"cpp": {
		FunctionTypes: []string{
			"function_definition",
			"lambda_expression",
		},
		ScopeTypes: []string{
			"function_definition",
			"class_specifier",
			"translation_unit",
		},
		AssignmentTypes: []string{
			"assignment_expression",
			"init_declarator",
		},
		CallTypes: []string{
			"call_expression",
		},
		IdentifierTypes: []string{
			"identifier",
		},
	},
	"c_sharp": {
		FunctionTypes: []string{
			"method_declaration",
			"constructor_declaration",
			"lambda_expression",
		},
		ScopeTypes: []string{
			"method_declaration",
			"constructor_declaration",
			"class_declaration",
			"interface_declaration",
			"compilation_unit",
		},
		AssignmentTypes: []string{
			"assignment_expression",
			"variable_declarator",
		},
		CallTypes: []string{
			"invocation_expression",
			"object_creation_expression",
		},
		IdentifierTypes: []string{
			"identifier",
		},
	},
	"ruby": {
		FunctionTypes: []string{
			"method",
			"singleton_method",
			"lambda",
			"block",
		},
		ScopeTypes: []string{
			"method",
			"singleton_method",
			"class",
			"module",
			"program",
		},
		AssignmentTypes: []string{
			"assignment",
			"operator_assignment",
		},
		CallTypes: []string{
			"call",
			"method_call",
		},
		IdentifierTypes: []string{
			"identifier",
			"constant",
		},
	},
	"rust": {
		FunctionTypes: []string{
			"function_item",
			"closure_expression",
		},
		ScopeTypes: []string{
			"function_item",
			"impl_item",
			"mod_item",
			"source_file",
		},
		AssignmentTypes: []string{
			"assignment_expression",
			"let_declaration",
		},
		CallTypes: []string{
			"call_expression",
			"method_call_expression",
		},
		IdentifierTypes: []string{
			"identifier",
		},
	},
}

LanguageASTNodeTypes maps a language name (for example "php", "cpp", or "c_sharp") to the AST node types specific to that language.

var UniversalASTNodeTypes = ASTNodeTypes{
	FunctionTypes: []string{
		"function_definition",
		"function_declaration",
		"method_definition",
		"method_declaration",
		"function_item",
		"arrow_function",
		"function_expression",
		"lambda",
		"def",
		"fn_item",
	},
	ScopeTypes: []string{
		"function_definition",
		"function_declaration",
		"method_definition",
		"method_declaration",
		"class_definition",
		"class_declaration",
		"module",
		"program",
		"source_file",
	},
	AssignmentTypes: []string{
		"assignment_expression",
		"assignment_statement",
		"augmented_assignment",
		"variable_declarator",
		"short_var_declaration",
	},
	CallTypes: []string{
		"call_expression",
		"function_call_expression",
		"member_call_expression",
		"method_invocation",
	},
	IdentifierTypes: []string{
		"identifier",
		"variable_name",
		"name",
		"property_identifier",
		"attribute",
		"constant",
	},
}

UniversalASTNodeTypes contains AST patterns that work across languages. Used by findContainingFunction() and getCurrentScope() in pkg/tracer/propagation.go.

Functions

func ExtractAssignmentParts

func ExtractAssignmentParts(node Node, src []byte, language string) (target, value string)

ExtractAssignmentParts extracts the target variable name and the value expression text from an assignment AST node for the given language. Returns ("", "") when the node is not a recognisable assignment.

The node parameter accepts the ast.Node interface; callers that hold a *sitter.Node can pass it directly because *sitter.Node satisfies ast.Node.

func GetAssignmentTypes

func GetAssignmentTypes() []string

GetAssignmentTypes returns the list of assignment node types

func GetAssignmentTypesForLanguage

func GetAssignmentTypesForLanguage(language string) []string

GetAssignmentTypesForLanguage returns assignment node types for a specific language

func GetCallTypes

func GetCallTypes() []string

GetCallTypes returns the list of call node types

func GetCallTypesForLanguage

func GetCallTypesForLanguage(language string) []string

GetCallTypesForLanguage returns call node types for a specific language

func GetFunctionTypes

func GetFunctionTypes() []string

GetFunctionTypes returns the list of function node types

func GetIdentifierTypes

func GetIdentifierTypes() []string

GetIdentifierTypes returns the list of identifier node types

func GetIdentifierTypesForLanguage

func GetIdentifierTypesForLanguage(language string) []string

GetIdentifierTypesForLanguage returns identifier node types for a specific language

func GetScopeTypes

func GetScopeTypes() []string

GetScopeTypes returns the list of scope node types

func IsAssignmentNode

func IsAssignmentNode(nodeType string) bool

IsAssignmentNode checks if a node type is an assignment

func IsAssignmentNodeForLanguage

func IsAssignmentNodeForLanguage(nodeType, language string) bool

IsAssignmentNodeForLanguage checks if a node type is an assignment for a specific language

func IsCallNode

func IsCallNode(nodeType string) bool

IsCallNode checks if a node type is a function/method call

func IsCallNodeForLanguage

func IsCallNodeForLanguage(nodeType, language string) bool

IsCallNodeForLanguage checks if a node type is a call for a specific language

func IsFunctionNode

func IsFunctionNode(nodeType string) bool

IsFunctionNode checks if a node type represents a function definition

func IsFunctionNodeForLanguage

func IsFunctionNodeForLanguage(nodeType, language string) bool

IsFunctionNodeForLanguage checks if a node type is a function for a specific language

func IsIdentifierNode

func IsIdentifierNode(nodeType string) bool

IsIdentifierNode checks if a node type is an identifier

func IsIdentifierNodeForLanguage

func IsIdentifierNodeForLanguage(nodeType, language string) bool

IsIdentifierNodeForLanguage checks if a node type is an identifier for a specific language

func IsScopeNode

func IsScopeNode(nodeType string) bool

IsScopeNode checks if a node type defines a scope

func IsScopeNodeForLanguage

func IsScopeNodeForLanguage(nodeType, language string) bool

IsScopeNodeForLanguage checks if a node type defines a scope for a specific language
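
The universal and per-language predicates can disagree, because the tables above differ: "assignment" (Python, Ruby) and "let_declaration" (Rust) appear in LanguageASTNodeTypes but not in UniversalASTNodeTypes.AssignmentTypes. The sketch below illustrates that gap. It assumes the predicates are plain membership tests, which the documentation does not confirm; isAssignment, isAssignmentFor, and contains are hypothetical stand-ins, and the data is an excerpt of the tables above.

```go
package main

import "fmt"

// universalAssignments is copied from UniversalASTNodeTypes.AssignmentTypes.
var universalAssignments = []string{
	"assignment_expression",
	"assignment_statement",
	"augmented_assignment",
	"variable_declarator",
	"short_var_declaration",
}

// languageAssignments is an excerpt of LanguageASTNodeTypes
// (AssignmentTypes only, two languages).
var languageAssignments = map[string][]string{
	"python": {"assignment", "augmented_assignment", "named_expression"},
	"rust":   {"assignment_expression", "let_declaration"},
}

// contains reports whether want is an element of list.
func contains(list []string, want string) bool {
	for _, s := range list {
		if s == want {
			return true
		}
	}
	return false
}

// isAssignment is a hypothetical stand-in for IsAssignmentNode.
func isAssignment(nodeType string) bool {
	return contains(universalAssignments, nodeType)
}

// isAssignmentFor is a hypothetical stand-in for IsAssignmentNodeForLanguage.
// Unknown languages yield false here; the real behavior is not documented.
func isAssignmentFor(nodeType, language string) bool {
	return contains(languageAssignments[language], nodeType)
}

func main() {
	fmt.Println(isAssignment("assignment"))                 // false
	fmt.Println(isAssignmentFor("assignment", "python"))    // true
	fmt.Println(isAssignmentFor("let_declaration", "rust")) // true
}
```

Under that assumption, callers that know the source language would get more complete results from the ForLanguage variants.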

func IterateChildren

func IterateChildren(node *sitter.Node, fn func(child *sitter.Node))

IterateChildren calls fn for each non-punctuation child of node. It is the canonical way to walk argument lists and parameter lists in Tree-Sitter ASTs, avoiding the repeated inline loop with punctuation guards scattered across the codebase.
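
A minimal sketch of the pattern this helper centralizes, using a stand-in node type so that it runs without tree-sitter. tsNode, iterateChildren, argumentKinds, and the punctuation set are assumptions for illustration; the set of token types the real function skips is not documented here.

```go
package main

import "fmt"

// tsNode is a hypothetical stand-in for *sitter.Node.
type tsNode struct {
	kind     string
	children []*tsNode
}

// punctuation lists the token types this sketch skips. The set skipped by
// the real IterateChildren is not documented; this one is an assumption.
var punctuation = map[string]bool{
	"(": true, ")": true, ",": true,
	"[": true, "]": true, ";": true,
}

// iterateChildren calls fn for every direct child that is not punctuation.
func iterateChildren(n *tsNode, fn func(child *tsNode)) {
	if n == nil {
		return
	}
	for _, c := range n.children {
		if c == nil || punctuation[c.kind] {
			continue
		}
		fn(c)
	}
}

// argumentKinds collects the types of the non-punctuation children.
func argumentKinds(n *tsNode) []string {
	var kinds []string
	iterateChildren(n, func(c *tsNode) { kinds = append(kinds, c.kind) })
	return kinds
}

func main() {
	args := &tsNode{kind: "arguments", children: []*tsNode{
		{kind: "("}, {kind: "variable_name"}, {kind: ","}, {kind: "string"}, {kind: ")"},
	}}
	fmt.Println(argumentKinds(args)) // [variable_name string]
}
```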

func RegisterAll

func RegisterAll(r *Registry)

RegisterAll registers all language extractors with the registry

Types

type ASTNodeCategory

type ASTNodeCategory string

ASTNodeCategory represents categories of AST nodes

const (
	ASTCategoryFunction   ASTNodeCategory = "function"
	ASTCategoryScope      ASTNodeCategory = "scope"
	ASTCategoryAssignment ASTNodeCategory = "assignment"
	ASTCategoryCall       ASTNodeCategory = "call"
)

type ASTNodeTypes

type ASTNodeTypes struct {
	// FunctionTypes are node types that represent function/method definitions
	FunctionTypes []string

	// ScopeTypes are node types that define variable scopes
	ScopeTypes []string

	// AssignmentTypes are node types for assignment operations
	AssignmentTypes []string

	// CallTypes are node types for function/method calls
	CallTypes []string

	// IdentifierTypes are node types for variable/identifier names
	IdentifierTypes []string
}

ASTNodeTypes holds node type patterns for different categories

func GetASTNodeTypesForLanguage

func GetASTNodeTypesForLanguage(language string) ASTNodeTypes

GetASTNodeTypesForLanguage returns the complete ASTNodeTypes for a language
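
The documentation does not say what this function returns for a language that has no entry in LanguageASTNodeTypes. One plausible behavior is a fallback to UniversalASTNodeTypes. The sketch below shows that lookup under that assumption; nodeTypes and typesFor are hypothetical names, and the data is an excerpt of the tables above.

```go
package main

import "fmt"

// nodeTypes is a trimmed-down stand-in for ASTNodeTypes.
type nodeTypes struct {
	FunctionTypes []string
	CallTypes     []string
}

// universal is an excerpt of UniversalASTNodeTypes.
var universal = nodeTypes{
	FunctionTypes: []string{
		"function_definition", "function_declaration", "method_definition",
		"method_declaration", "function_item", "arrow_function",
		"function_expression", "lambda", "def", "fn_item",
	},
	CallTypes: []string{
		"call_expression", "function_call_expression",
		"member_call_expression", "method_invocation",
	},
}

// byLanguage is an excerpt of LanguageASTNodeTypes.
var byLanguage = map[string]nodeTypes{
	"go": {
		FunctionTypes: []string{"function_declaration", "method_declaration", "func_literal"},
		CallTypes:     []string{"call_expression"},
	},
}

// typesFor returns the language-specific entry when one exists and the
// universal set otherwise. The fallback is an assumption of this sketch.
func typesFor(language string) nodeTypes {
	if t, ok := byLanguage[language]; ok {
		return t
	}
	return universal
}

func main() {
	fmt.Println(typesFor("go").FunctionTypes)
	fmt.Println(len(typesFor("kotlin").CallTypes))
}
```

Note that the keys in the table are exact strings such as "c_sharp" and "cpp", so a lookup with a different spelling would miss the language-specific entry.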

type Assignment

type Assignment struct {
	LHS       string
	RHS       *sitter.Node
	RHSText   string
	Scope     string
	Line      int
	Column    int
	EndLine   int
	EndColumn int
	Snippet   string
}

Assignment represents an assignment operation in code

type BaseExtractor

type BaseExtractor struct {
	// contains filtered or unexported fields
}

BaseExtractor provides common functionality for AST extraction

func NewBaseExtractor

func NewBaseExtractor(language string, assignmentTypes, callTypes, identifierTypes []string) *BaseExtractor

NewBaseExtractor creates a new base extractor

func (*BaseExtractor) ExpressionContains

func (e *BaseExtractor) ExpressionContains(node Node, varName string, src []byte) bool

ExpressionContains checks if an expression contains a variable. Uses boundary-aware matching to avoid substring false positives (e.g. "$order" must not match "$order_id").
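
Boundary-aware matching can be sketched as a substring search that accepts a hit only when the characters on either side cannot extend the identifier. This is one possible approach, not necessarily the one the package uses; containsVar and isIdentByte are hypothetical names, and the sketch treats only ASCII letters, digits, and underscore as identifier characters.

```go
package main

import (
	"fmt"
	"strings"
)

// isIdentByte reports whether b can be part of an identifier.
// ASCII only; this is a simplifying assumption of the sketch.
func isIdentByte(b byte) bool {
	return b == '_' ||
		(b >= '0' && b <= '9') ||
		(b >= 'a' && b <= 'z') ||
		(b >= 'A' && b <= 'Z')
}

// containsVar reports whether name occurs in expr as a whole identifier,
// that is, not immediately preceded or followed by an identifier byte.
func containsVar(expr, name string) bool {
	if name == "" {
		return false
	}
	for from := 0; from <= len(expr)-len(name); {
		i := strings.Index(expr[from:], name)
		if i < 0 {
			return false
		}
		start := from + i
		end := start + len(name)
		leftOK := start == 0 || !isIdentByte(expr[start-1])
		rightOK := end == len(expr) || !isIdentByte(expr[end])
		if leftOK && rightOK {
			return true
		}
		from = start + 1 // keep searching after a rejected hit
	}
	return false
}

func main() {
	fmt.Println(containsVar("$order_id + 1", "$order"))      // false
	fmt.Println(containsVar("$order_id + $order", "$order")) // true
}
```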

func (*BaseExtractor) ExtractAssignments

func (e *BaseExtractor) ExtractAssignments(root Node, src []byte) []Assignment

ExtractAssignments extracts all assignments from the AST

func (*BaseExtractor) ExtractCalls

func (e *BaseExtractor) ExtractCalls(root Node, src []byte) []FunctionCall

ExtractCalls extracts all function calls from the AST

func (*BaseExtractor) Language

func (e *BaseExtractor) Language() string

Language returns the language this extractor handles

type CallArgument

type CallArgument struct {
	Name  string
	Node  *sitter.Node
	Index int
}

CallArgument represents an argument in a function call

type Extractor

type Extractor interface {
	Language() string
	ExtractAssignments(root Node, src []byte) []Assignment
	ExtractCalls(root Node, src []byte) []FunctionCall
	ExpressionContains(node Node, varName string, src []byte) bool
}

Extractor interface for language-specific AST extraction. Parameters that were previously *sitter.Node now accept the ast.Node interface. Callers that hold a *sitter.Node can pass it directly because *sitter.Node satisfies ast.Node (see node.go).

type FunctionCall

type FunctionCall struct {
	Name      string
	Arguments []CallArgument
	Line      int
	Column    int
	EndLine   int
	EndColumn int
	Scope     string
}

FunctionCall represents a function call in code

type Node

type Node interface {
	// Positional information
	StartByte() uint32
	EndByte() uint32
	StartPoint() sitter.Point
	EndPoint() sitter.Point

	// Type returns the grammar node type (e.g. "assignment_expression").
	Type() string

	// Content returns the source text spanned by the node.
	Content(input []byte) string

	// Tree navigation — pragmatically return *sitter.Node (see package doc).
	ChildCount() uint32
	Child(idx int) *sitter.Node
	NamedChildCount() uint32
	NamedChild(idx int) *sitter.Node
	Parent() *sitter.Node
}

Node is the abstract syntax tree node interface used throughout pkg/ast. It is satisfied by *sitter.Node and by any test double that implements the same surface area, making pkg/ast independently testable without a live tree-sitter parser.

All methods mirror the tree-sitter Go binding signatures exactly so that existing *sitter.Node values can be passed without wrapping or conversion.

func Wrap

func Wrap(n *sitter.Node) Node

Wrap converts a concrete *sitter.Node to ast.Node. No adapter type or copy is involved: *sitter.Node already satisfies the Node interface because all required methods are defined on sitter.Node with value receivers, and the method set of a pointer type in Go includes the methods declared with value receivers.

Use Wrap at call-sites that will gradually migrate from *sitter.Node to ast.Node, or in tests that need to pass a concrete node to a function that accepts ast.Node.
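
One Go detail matters for any helper of this shape: converting a nil pointer to an interface yields a non-nil interface value. Whether Wrap guards against nil input is not documented here, so the sketch below uses stand-ins (rawNode, node, wrap, wrapOrNil are hypothetical) to show the plain conversion next to a nil-safe variant.

```go
package main

import "fmt"

// rawNode is a hypothetical stand-in for sitter.Node.
type rawNode struct{ kind string }

// Type has a value receiver, so it is also in the method set of *rawNode.
func (n rawNode) Type() string { return n.kind }

// node is a trimmed-down stand-in for the Node interface.
type node interface{ Type() string }

// wrap has the same shape as Wrap: a plain interface conversion.
func wrap(n *rawNode) node { return n }

// wrapOrNil is a nil-safe variant: a nil pointer becomes a nil interface.
func wrapOrNil(n *rawNode) node {
	if n == nil {
		return nil
	}
	return n
}

func main() {
	var missing *rawNode
	fmt.Println(wrap(missing) == nil)      // false: typed nil inside the interface
	fmt.Println(wrapOrNil(missing) == nil) // true
	fmt.Println(wrap(&rawNode{kind: "identifier"}).Type())
}
```

With value receivers, calling a method through an interface that holds a nil pointer panics, so callers that may hold a nil node should check the pointer before converting it.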

type Registry

type Registry struct {
	// contains filtered or unexported fields
}

Registry manages AST extractors for all languages

func NewRegistry

func NewRegistry() *Registry

NewRegistry creates a new AST registry

func (*Registry) GetExtractor

func (r *Registry) GetExtractor(language string) Extractor

GetExtractor returns the extractor for a language

func (*Registry) Register

func (r *Registry) Register(extractor Extractor)

Register registers an extractor for a language
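
Since Register takes only an Extractor, the registry presumably keys each entry by the value of its Language method. The sketch below shows that shape with a map. The internal layout, the nil result for an unknown language, and the replace-on-duplicate behavior are all assumptions; extractor, registry, and stubExtractor are hypothetical stand-ins.

```go
package main

import "fmt"

// extractor is a trimmed-down stand-in for the Extractor interface.
type extractor interface {
	Language() string
}

// registry is a hypothetical stand-in for Registry.
type registry struct {
	extractors map[string]extractor
}

func newRegistry() *registry {
	return &registry{extractors: make(map[string]extractor)}
}

// register stores e under the language it reports. A later registration
// for the same language replaces the earlier one in this sketch.
func (r *registry) register(e extractor) {
	r.extractors[e.Language()] = e
}

// getExtractor returns the extractor for language, or nil if none exists.
func (r *registry) getExtractor(language string) extractor {
	return r.extractors[language]
}

// stubExtractor is a minimal extractor used only for the demonstration.
type stubExtractor struct{ lang string }

func (s stubExtractor) Language() string { return s.lang }

func main() {
	r := newRegistry()
	r.register(stubExtractor{lang: "php"})
	r.register(stubExtractor{lang: "python"})

	fmt.Println(r.getExtractor("php").Language())
	fmt.Println(r.getExtractor("cobol") == nil)
}
```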
