Documentation ¶
Overview ¶
Package extraction provides AST-based code extraction utilities for Python source code.
This package uses tree-sitter to extract program statements from Python source files, converting AST nodes into structured Statement objects for analysis.
Example:
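	// functionNode is the *sitter.Node of the target function definition,
	// for example one found in the tree returned by ParsePythonFile.
	statements, err := extraction.ExtractStatements(filePath, sourceCode, functionNode)
	if err != nil {
		log.Fatal(err)
	}
	for _, stmt := range statements {
		fmt.Printf("Statement type: %s\n", stmt.Type)
	}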
Index ¶
- func ExtractClassAttributes(filePath string, sourceCode []byte, modulePath string, ...) error
- func ExtractStatements(filePath string, sourceCode []byte, functionNode *sitter.Node) ([]*core.Statement, error)
- func ExtractVariableAssignments(filePath string, sourceCode []byte, typeEngine *resolution.TypeInferenceEngine, ...) error
- func ParsePythonFile(sourceCode []byte) (*sitter.Tree, error)
- type AttributeAssignment
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func ExtractClassAttributes ¶
func ExtractClassAttributes(
	filePath string,
	sourceCode []byte,
	modulePath string,
	typeEngine *resolution.TypeInferenceEngine,
	attrRegistry *registry.AttributeRegistry,
) error
ExtractClassAttributes extracts all class attributes from a Python file. It covers Pass 1 and Pass 2 of the attribute extraction algorithm:
- Pass 1: Extract class metadata (FQN, methods, file path)
- Pass 2: Extract attribute assignments (self.attr = value)
Algorithm:
- Parse file with tree-sitter
- Find all class definitions
- For each class:
  a. Create ClassAttributes entry
  b. Collect method names
  c. Scan for self.attr assignments
  d. Infer types using 6 strategies
Parameters:
- filePath: absolute path to Python file
- sourceCode: file contents
- modulePath: fully qualified module path (e.g., "myapp.models")
- typeEngine: type inference engine with return types and variables
- attrRegistry: attribute registry to populate
Returns:
- error if parsing fails
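The class scan described above can be sketched without tree-sitter. This is a minimal illustration, not the package's implementation: Node, ClassInfo, collectClasses, and collectSelfAttrs are hypothetical names, and Node is a simplified stand-in for *sitter.Node. Type inference (step d) is left out.

```go
package main

import (
	"fmt"
	"strings"
)

// Node is a hypothetical, simplified stand-in for *sitter.Node.
type Node struct {
	Kind     string // "module", "class", "method", or "assign"
	Name     string // class/method name, or the assignment target (e.g. "self.user")
	Children []*Node
}

// ClassInfo is a hypothetical record of what the two passes collect per class.
type ClassInfo struct {
	FQN        string
	Methods    []string
	Attributes []string
}

// collectClasses finds every class node and records its methods and self.attr targets.
func collectClasses(root *Node, modulePath string) []ClassInfo {
	var out []ClassInfo
	var walk func(n *Node)
	walk = func(n *Node) {
		if n.Kind == "class" {
			info := ClassInfo{FQN: modulePath + "." + n.Name}
			for _, child := range n.Children {
				if child.Kind != "method" {
					continue
				}
				info.Methods = append(info.Methods, child.Name) // class metadata
				collectSelfAttrs(child, &info)                  // self.attr scan
			}
			out = append(out, info)
		}
		for _, child := range n.Children {
			walk(child)
		}
	}
	walk(root)
	return out
}

// collectSelfAttrs records each distinct "self.<attr>" assignment target under n.
// It does not descend into nested classes, which own their attributes.
func collectSelfAttrs(n *Node, info *ClassInfo) {
	if n.Kind == "class" {
		return
	}
	if n.Kind == "assign" && strings.HasPrefix(n.Name, "self.") {
		attr := strings.TrimPrefix(n.Name, "self.")
		seen := false
		for _, a := range info.Attributes {
			if a == attr {
				seen = true
				break
			}
		}
		if !seen {
			info.Attributes = append(info.Attributes, attr)
		}
	}
	for _, child := range n.Children {
		collectSelfAttrs(child, info)
	}
}

func main() {
	root := &Node{Kind: "module", Children: []*Node{
		{Kind: "class", Name: "User", Children: []*Node{
			{Kind: "method", Name: "__init__", Children: []*Node{
				{Kind: "assign", Name: "self.name"},
				{Kind: "assign", Name: "tmp"}, // local variable, ignored
			}},
		}},
	}}
	for _, c := range collectClasses(root, "myapp.models") {
		fmt.Println(c.FQN, c.Methods, c.Attributes)
	}
}
```

In the real package the walk would run over tree-sitter nodes and write into the attribute registry; the toy tree only shows the order of the steps.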
func ExtractStatements ¶
func ExtractStatements(filePath string, sourceCode []byte, functionNode *sitter.Node) ([]*core.Statement, error)
ExtractStatements extracts all statements from a Python function body. It processes assignments, calls, and returns to build def-use chains. Returns a slice of Statement objects or an error if parsing fails.
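To make "def-use chains" concrete, here is a minimal sketch of how extracted statements could be linked. It assumes straight-line code with no branches or loops. Stmt, DefUse, and buildDefUse are hypothetical names for illustration and are not the fields or functions of core.Statement.

```go
package main

import "fmt"

// Stmt is a hypothetical, reduced statement: the variable it defines
// (empty if none) and the variables it reads.
type Stmt struct {
	Line int
	Def  string
	Uses []string
}

// DefUse links one definition to the lines that read it before it is redefined.
type DefUse struct {
	Var      string
	DefLine  int
	UseLines []int
}

// buildDefUse walks statements in order and attaches each use to the most
// recent definition of that variable.
func buildDefUse(stmts []Stmt) []DefUse {
	var chains []DefUse
	latest := map[string]int{} // variable name -> index of its latest chain
	for _, s := range stmts {
		// Resolve uses first so that "x = x + 1" reads the previous x.
		for _, u := range s.Uses {
			if i, ok := latest[u]; ok {
				chains[i].UseLines = append(chains[i].UseLines, s.Line)
			}
		}
		if s.Def != "" {
			chains = append(chains, DefUse{Var: s.Def, DefLine: s.Line})
			latest[s.Def] = len(chains) - 1
		}
	}
	return chains
}

func main() {
	// Models the Python body:
	//   x = 1
	//   y = foo(x)
	//   x = x + y
	//   return x
	stmts := []Stmt{
		{Line: 1, Def: "x"},
		{Line: 2, Def: "y", Uses: []string{"x"}},
		{Line: 3, Def: "x", Uses: []string{"x", "y"}},
		{Line: 4, Uses: []string{"x"}},
	}
	for _, c := range buildDefUse(stmts) {
		fmt.Printf("%s defined at line %d, used at %v\n", c.Var, c.DefLine, c.UseLines)
	}
}
```

Handling branches would need a set of reaching definitions per variable rather than a single latest one; the sketch omits that on purpose.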
func ExtractVariableAssignments ¶
func ExtractVariableAssignments(
	filePath string,
	sourceCode []byte,
	typeEngine *resolution.TypeInferenceEngine,
	registry *core.ModuleRegistry,
	builtinRegistry *registry.BuiltinRegistry,
) error
ExtractVariableAssignments extracts variable assignments from a Python file and populates the type inference engine with inferred types.
Algorithm:
- Parse source code with tree-sitter Python parser
- Traverse AST to find assignment statements
- For each assignment:
  - Extract variable name
  - Infer type from RHS (literal, function call, or method call)
  - Create VariableBinding with inferred type
  - Add binding to function scope
Parameters:
- filePath: absolute path to the Python file
- sourceCode: contents of the file as a byte slice
- typeEngine: type inference engine to populate
- registry: module registry for resolving module paths
- builtinRegistry: builtin types registry for literal inference
Returns:
- error: if parsing fails
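The literal case of the "infer type from RHS" step can be sketched as follows. This is an illustration under stated assumptions, not the package's logic: inferLiteralType is a hypothetical helper, the "builtins.*" type names are assumed, and function and method calls are not handled (they return an empty string here).

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// inferLiteralType is a hypothetical helper that maps the source text of a
// Python literal to an assumed builtin type name. It returns "" when the
// text is not a literal it recognizes.
func inferLiteralType(rhs string) string {
	rhs = strings.TrimSpace(rhs)
	if rhs == "" {
		return ""
	}
	switch rhs {
	case "True", "False":
		return "builtins.bool"
	case "None":
		return "builtins.NoneType"
	}
	first := rhs[0]
	switch {
	case first == '"' || first == '\'':
		return "builtins.str"
	case first == '[':
		return "builtins.list"
	case first >= '0' && first <= '9':
		// Requiring a leading digit keeps identifiers such as "inf" or "nan"
		// from being parsed as floats.
		if _, err := strconv.ParseInt(rhs, 10, 64); err == nil {
			return "builtins.int"
		}
		if _, err := strconv.ParseFloat(rhs, 64); err == nil {
			return "builtins.float"
		}
	}
	return ""
}

func main() {
	// scope is a stand-in for the function scope that receives bindings.
	scope := map[string]string{}
	assignments := [][2]string{
		{"name", `"alice"`},
		{"count", "42"},
		{"ratio", "0.5"},
		{"user", "get_user()"}, // a call: not resolved by this sketch
	}
	for _, a := range assignments {
		if t := inferLiteralType(a[1]); t != "" {
			scope[a[0]] = t
		}
	}
	fmt.Println(scope)
}
```

Resolving calls would need the return types held by the type inference engine and the module registry, which is why the sketch leaves them unbound.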
Types ¶
type AttributeAssignment ¶
type AttributeAssignment struct {
	AttributeName string       // Name of the attribute (e.g., "value", "user")
	RightSide     *sitter.Node // AST node of the right-hand side expression
	Node          *sitter.Node // Full assignment node
}
AttributeAssignment represents a self.attr = value assignment.