php-parser

module
v0.1.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Sep 3, 2025 License: MIT

README ΒΆ

PHP Parser

Go Report Card Go Version

A high-performance, comprehensive PHP parser implementation in Go with full PHP 8.4 syntax support. This parser provides complete lexical analysis, syntax parsing, and AST generation capabilities for PHP code.

δΈ­ζ–‡ζ–‡ζ‘£ | Documentation | Examples

Features

πŸš€ Core Capabilities
  • Full PHP 8.4 Compatibility: Complete syntax support including modern PHP features
  • High Performance: Optimized lexer and parser with benchmark support
  • Complete AST: Rich Abstract Syntax Tree with 150+ node types
  • Error Recovery: Robust error handling with partial parsing capabilities
  • Position Tracking: Precise line/column information for all tokens and nodes
  • Visitor Pattern: Comprehensive AST traversal and transformation utilities
πŸ“¦ Components
  • Lexer: State-machine based tokenizer with 11 parsing states
  • Parser: Recursive descent parser with Pratt parsing for expressions
  • AST: Interface-based node system with visitor pattern support
  • CLI Tool: Feature-rich command-line interface for parsing and analysis
  • Examples: 5 comprehensive examples demonstrating different use cases
🎯 Use Cases
  • Static Analysis: Code quality tools, linters, security scanners
  • Development Tools: IDEs, language servers, code formatters
  • Documentation: API documentation generators, code visualization
  • Refactoring: Automated code transformation and migration tools
  • Testing: Code coverage analysis, mutation testing
  • Transpilation: PHP-to-PHP transformations, version compatibility

Quick Start

Installation
go get github.com/wudi/php-parser
Basic Usage
package main

import (
    "fmt"
    "github.com/wudi/php-parser/lexer"
    "github.com/wudi/php-parser/parser"
)

func main() {
    code := `<?php
    function hello($name) {
        return "Hello, " . $name . "!";
    }
    echo hello("World");
    ?>`

    l := lexer.New(code)
    p := parser.New(l)
    program := p.ParseProgram()

    if len(p.Errors()) > 0 {
        for _, err := range p.Errors() {
            fmt.Printf("Error: %s\n", err)
        }
        return
    }

    fmt.Printf("Parsed %d statements\n", len(program.Body))
    fmt.Printf("AST: %s\n", program.String())
}
Command Line Tool

Build and use the CLI tool:

# Build the parser
go build -o php-parser ./cmd/php-parser

# Parse a PHP file
./php-parser example.php

# Show tokens and AST
./php-parser -tokens -ast example.php

# Output as JSON
./php-parser -format json example.php

# Parse from stdin
echo '<?php echo "Hello"; ?>' | ./php-parser

Architecture

Project Structure
php-parser/
β”œβ”€β”€ ast/            # Abstract Syntax Tree implementation
β”‚   β”œβ”€β”€ node.go     # 150+ AST node types (6K+ lines)
β”‚   β”œβ”€β”€ kind.go     # AST node type constants  
β”‚   β”œβ”€β”€ visitor.go  # Visitor pattern utilities
β”‚   └── builder.go  # AST construction helpers
β”œβ”€β”€ lexer/          # Lexical analyzer  
β”‚   β”œβ”€β”€ lexer.go    # Main lexer with state machine (1.5K+ lines)
β”‚   β”œβ”€β”€ token.go    # PHP token definitions (150+ tokens)
β”‚   └── states.go   # Lexer state management
β”œβ”€β”€ parser/         # Syntax parser
β”‚   β”œβ”€β”€ parser.go   # Recursive descent parser (7K+ lines)
β”‚   β”œβ”€β”€ pool.go     # Parser pooling for concurrency
β”‚   └── testdata/   # Test cases and fixtures
β”œβ”€β”€ cmd/            # Command-line interface
β”‚   └── php-parser/ # CLI implementation (244 lines)
β”œβ”€β”€ examples/       # Usage examples and tutorials
β”‚   β”œβ”€β”€ basic-parsing/      # Fundamental parsing concepts
β”‚   β”œβ”€β”€ ast-visitor/        # Visitor pattern examples
β”‚   β”œβ”€β”€ token-extraction/   # Lexical analysis
β”‚   β”œβ”€β”€ error-handling/     # Error recovery examples
β”‚   └── code-analysis/      # Static analysis tools
β”œβ”€β”€ errors/         # Error handling utilities  
└── scripts/        # Development and testing scripts

Code Statistics: 18,500+ lines of Go code across 16 source files with 29 test files

Core Components
Lexer (Tokenizer)
  • 11 Parsing States: Including ST_IN_SCRIPTING, ST_DOUBLE_QUOTES, ST_HEREDOC
  • 150+ Token Types: Complete PHP 8.4 token compatibility
  • State Machine: Handles complex PHP syntax like string interpolation
  • Shebang Support: Recognizes executable PHP files
  • Position Tracking: Line, column, and offset information
Parser (Syntax Analyzer)
  • Recursive Descent: Clean, maintainable parsing architecture
  • Pratt Parsing: Elegant operator precedence handling (14 levels)
  • Error Recovery: Continues parsing after errors to find multiple issues
  • 50+ Parse Functions: Comprehensive PHP syntax coverage
  • Alternative Syntax: Full support for if:...endif; style constructs
AST (Abstract Syntax Tree)
  • Interface-Based: Clean separation between node types
  • 150+ Node Types: Matching PHP's official zend_ast.h constants
  • Visitor Pattern: Easy traversal and transformation
  • JSON Serialization: Export AST for external tools
  • Position Preservation: All nodes retain source location

PHP 8.4 Language Support

Operators
  • Arithmetic: +, -, *, /, %, ** (power)
  • Assignment: =, +=, -=, *=, /=, %=, **=, .=, ??=, etc.
  • Comparison: ==, ===, !=, !==, <, <=, >, >=, <=> (spaceship)
  • Logical: &&, ||, !, and, or, xor
  • Bitwise: &, |, ^, ~, <<, >>
  • Null Coalescing: ??, ??=
Language Constructs
  • Control Structures: if, else, elseif, while, for, foreach, switch
  • Alternative Syntax: if:...endif;, while:...endwhile;, switch:...endswitch;
  • Functions: Parameters, return types, references, variadic
  • Classes: Properties, methods, constants, inheritance, interfaces
  • Namespaces: Full namespace support with use statements
  • Special: __halt_compiler(), match expressions, attributes
Modern PHP Features
  • Typed Properties: private int $id, public ?string $name
  • Union Types: int|string, Foo|Bar|null
  • Intersection Types: Foo&Bar
  • Match Expressions: Pattern matching with match()
  • Attributes: #[Route('/api')] syntax
  • Nullsafe Operator: $user?->getProfile()?->getName()
  • Class Constants with Visibility: private const SECRET = 'value'

Examples

The examples/ directory contains comprehensive demonstrations:

1. Basic Parsing

Learn fundamental parsing concepts and AST examination.

cd examples/basic-parsing && go run main.go
2. AST Visitor Pattern

Implement custom visitors for code analysis and traversal.

cd examples/ast-visitor && go run main.go
3. Token Extraction

Explore lexical analysis and token statistics.

cd examples/token-extraction && go run main.go
4. Error Handling

Understand parser error recovery and reporting.

cd examples/error-handling && go run main.go  
5. Code Analysis

Build static analysis tools with metrics and quality assessment.

cd examples/code-analysis && go run main.go

Each example includes:

  • Runnable Code: Complete working examples
  • Documentation: Detailed README with explanations
  • Real PHP Samples: Realistic code for testing
  • Progressive Complexity: From beginner to advanced

Testing

Running Tests
# Run all tests
go test ./...

# Run with verbose output
go test ./... -v

# Run specific component tests
go test ./lexer -v
go test ./parser -v  
go test ./ast -v

# Run benchmarks
go test ./parser -bench=.
go test ./parser -bench=. -benchmem

# Run specific test cases
go test ./parser -run=TestParsing_TryCatchWithStatements
go test ./parser -run=TestParsing_ClassMethodsWithVisibility
Test Coverage

The project maintains comprehensive test coverage with:

  • 29 Test Files: Covering all major components
  • 200+ Test Cases: Including edge cases and error conditions
  • Real-world Testing: Compatibility tests against major PHP frameworks
  • Benchmark Tests: Performance validation and optimization
Compatibility Testing

Test against popular PHP codebases:

# Test with WordPress
go run scripts/test_folder.go /path/to/wordpress-develop

# Test with Laravel  
go run scripts/test_folder.go /path/to/framework

# Test with Symfony
go run scripts/test_folder.go /path/to/symfony

Performance

Benchmarks
  • Simple Statements: ~1.6ΞΌs per parse
  • Complex Expressions: ~4ΞΌs per parse
  • Large Files: Efficient memory usage with streaming
  • Concurrent Parsing: Parser pool support for parallel processing
Memory Usage
  • Low Footprint: Minimal memory allocation per parse
  • Streaming: Large file support without loading entire content
  • Pool Pattern: Reusable parser instances for server applications

Development

Requirements
  • Go 1.21+
  • Optional: PHP 8.4 for compatibility testing
Development Commands
# Build CLI tool
go build -o php-parser ./cmd/php-parser

# Run all tests
go test ./...

# Run performance benchmarks
go test ./parser -bench=. -run=^$

# Test with memory profiling
go test ./parser -bench=. -benchmem

# Compatibility testing
go run scripts/test_folder.go /path/to/php/codebase

# Clean build artifacts
go clean
rm -f php-parser main

Contributing

We welcome contributions! Please see our contributing guidelines:

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature-name
  3. Write tests for your changes
  4. Ensure all tests pass: go test ./...
  5. Run linting: go fmt ./...
  6. Submit a pull request
Development Guidelines
  • Follow Go coding standards (gofmt, effective Go)
  • Maintain PHP compatibility with official implementation
  • Add comprehensive tests for new features
  • Reference PHP's official grammar at /php-src/Zend/zend_language_parser.y
  • Update documentation for new features

Use Cases in Production

Static Analysis Tools
  • Code Quality: Detect anti-patterns, complexity metrics
  • Security Scanning: Find vulnerabilities, injection risks
  • Style Checking: Enforce coding standards and conventions
Development Tools
  • IDEs: Syntax highlighting, auto-completion, error detection
  • Language Servers: LSP implementation for editor support
  • Code Formatters: Consistent code styling and formatting
Documentation Tools
  • API Generators: Extract documentation from code and comments
  • Code Visualization: Generate class diagrams, call graphs
  • Dependency Analysis: Track code relationships and coupling
Migration and Refactoring
  • Version Upgrades: PHP version compatibility transformations
  • Framework Migrations: Automated code pattern updates
  • Code Modernization: Apply modern PHP practices and syntax

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • PHP Team: For the official PHP language specification
  • Go Community: For excellent tooling and ecosystem
  • Contributors: Everyone who has contributed to this project

Built with ❀️ in Go | PHP 8.4 Compatible | Production Ready

Directories ΒΆ

Path Synopsis
cmd
php-parser command
examples
ast-visitor command
basic-parsing command
code-analysis command
error-handling command
Package parser with object pooling to reduce allocations.
Package parser with object pooling to reduce allocations.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL