gosqlx

package module
v1.6.0 Latest Latest
Published: Dec 11, 2025 License: AGPL-3.0 Imports: 0 Imported by: 0

README

GoSQLX

⚡ High-Performance SQL Parser for Go ⚡

Production-ready, high-performance SQL parsing SDK for Go
Zero-copy tokenization • Object pooling • Multi-dialect support • Unicode-first design

🚀 New to GoSQLX? Get Started in 5 Minutes →

📖 Installation • ⚡ Quick Start • 📚 Documentation • 💡 Examples • 📊 Benchmarks


Overview

GoSQLX is a high-performance SQL parsing library designed for production use. It provides zero-copy tokenization, intelligent object pooling, and comprehensive SQL dialect support while maintaining a simple, idiomatic Go API.
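
The object pooling mentioned above can be illustrated with the standard library's sync.Pool. This is a minimal sketch of the general technique only; the tokenBuffer type and countWords helper are hypothetical names invented for illustration and are not part of the GoSQLX API.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// tokenBuffer is a hypothetical scratch buffer, not a GoSQLX type.
type tokenBuffer struct {
	words []string
}

// bufferPool hands out reusable buffers so hot paths avoid reallocating.
var bufferPool = sync.Pool{
	New: func() any { return &tokenBuffer{words: make([]string, 0, 64)} },
}

// countWords borrows a buffer, fills it, and returns it to the pool.
func countWords(sql string) int {
	buf := bufferPool.Get().(*tokenBuffer)
	defer func() {
		buf.words = buf.words[:0] // drop contents but keep capacity
		bufferPool.Put(buf)
	}()
	buf.words = append(buf.words, strings.Fields(sql)...)
	return len(buf.words)
}

func main() {
	fmt.Println(countWords("SELECT id FROM users")) // prints 4
}
```

Resetting the slice length before Put keeps the allocated capacity available to the next caller, which is where the memory savings of pooling come from.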

Key Features
  • Blazing Fast: 1.38M+ ops/sec sustained, 1.5M+ ops/sec peak throughput
  • Memory Efficient: 60-80% reduction through intelligent object pooling
  • Thread-Safe: Race-free, linear scaling to 128+ cores, 0 race conditions detected
  • Production-Grade Testing: Token 100%, Keywords 100%, Errors 95.6%, Tokenizer 76.1%, Parser 76.1%, CLI 63.3% coverage
  • Complete JOIN Support: All JOIN types (INNER/LEFT/RIGHT/FULL OUTER/CROSS/NATURAL) with proper tree logic
  • Advanced SQL Features: CTEs with RECURSIVE support, Set Operations (UNION/EXCEPT/INTERSECT)
  • Window Functions: Complete SQL-99 window function support with OVER clause, PARTITION BY, ORDER BY, frame specs
  • MERGE Statements: Full SQL:2003 MERGE support with WHEN MATCHED/NOT MATCHED clauses
  • Grouping Operations: GROUPING SETS, ROLLUP, CUBE (SQL-99 T431)
  • Materialized Views: CREATE, DROP, REFRESH MATERIALIZED VIEW support
  • Table Partitioning: PARTITION BY RANGE, LIST, HASH support
  • SQL Injection Detection: Built-in security scanner (pkg/sql/security) for injection pattern detection
  • Unicode Support: Complete UTF-8 support for international SQL
  • Multi-Dialect: PostgreSQL, MySQL, SQL Server, Oracle, SQLite
  • PostgreSQL Extensions: LATERAL JOIN, DISTINCT ON, FILTER clause, JSON/JSONB operators, aggregate ORDER BY
  • Zero-Copy: Direct byte slice operations, <1μs latency
  • Intelligent Errors: Structured error codes with typo detection, context highlighting, and helpful hints
  • Production Ready: Battle-tested with 0 race conditions detected, ~80-85% SQL-99 compliance
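
To make "direct byte slice operations" concrete, here is a small sketch of zero-copy splitting in plain Go. It assumes ASCII whitespace as the only delimiter and is not the GoSQLX tokenizer; splitWords is a hypothetical helper.

```go
package main

import "fmt"

// splitWords returns one sub-slice of src per whitespace-separated word.
// No bytes are copied: every result aliases src's backing array.
func splitWords(src []byte) [][]byte {
	var out [][]byte
	start := -1
	for i, b := range src {
		isSpace := b == ' ' || b == '\t' || b == '\n' || b == '\r'
		switch {
		case !isSpace && start < 0:
			start = i // a word begins here
		case isSpace && start >= 0:
			// The three-index slice caps capacity so a later append
			// on the token cannot overwrite neighbouring input bytes.
			out = append(out, src[start:i:i])
			start = -1
		}
	}
	if start >= 0 {
		out = append(out, src[start:len(src):len(src)])
	}
	return out
}

func main() {
	src := []byte("SELECT id FROM users")
	words := splitWords(src)
	src[0] = 's'                 // mutate the input ...
	fmt.Printf("%s\n", words[0]) // ... and the token sees it: prints sELECT
}
```

The trade-off of this style is that tokens are only valid while the input buffer is kept alive and unmodified.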
Performance & Quality Highlights (v1.6.0)
1.38M+ | 8M+ | <1μs | 14x | 575x | 100% ⭐
Ops/sec | Tokens/sec | Latency | Faster Tokens | Cache Speedup | Token Coverage

✅ v1.6.0 Released • LSP Server • VSCode Extension • PostgreSQL JSON/JSONB • 10 Linter Rules • ~85% SQL-99 compliance

🎉 What's New in v1.6.0
Feature | Description
🔌 LSP Server | Full Language Server Protocol for IDE integration with diagnostics, completion, hover
📝 VSCode Extension | Official extension with syntax highlighting, formatting, and autocomplete
🐘 PostgreSQL Extensions | LATERAL JOIN, JSON/JSONB operators (->, ->>, @>, #>), DISTINCT ON, FILTER clause
🔍 Linter Rules | 10 built-in rules (L001-L010) with auto-fix for SELECT *, missing aliases, etc.
🛡️ Security Scanner | Enhanced SQL injection detection with severity classification
⚡ Performance | 14x faster token comparison, 575x faster keyword suggestions via caching
🏗️ go-task | Modern task runner (Taskfile.yml) replacing Makefile
🔢 Structured Errors | Error codes E1001-E3004 for tokenizer, parser, and semantic errors

See CHANGELOG.md for the complete list of 20+ PRs merged in this release.
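
As an illustration of how cached keyword suggestions for typos might work, the sketch below combines a Levenshtein distance with a map-based cache. It is an assumption about the general approach, not GoSQLX's implementation; the keyword list, the distance threshold of 2, and the suggest function are all hypothetical.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// keywords is a small illustrative subset, not the library's keyword table.
var keywords = []string{"SELECT", "FROM", "WHERE", "GROUP", "ORDER", "INSERT", "UPDATE", "DELETE"}

func min3(a, b, c int) int {
	if b < a {
		a = b
	}
	if c < a {
		a = c
	}
	return a
}

// levenshtein returns the byte-wise edit distance between a and b.
func levenshtein(a, b string) int {
	prev := make([]int, len(b)+1)
	for j := range prev {
		prev[j] = j
	}
	for i := 1; i <= len(a); i++ {
		cur := make([]int, len(b)+1)
		cur[0] = i
		for j := 1; j <= len(b); j++ {
			cost := 1
			if a[i-1] == b[j-1] {
				cost = 0
			}
			cur[j] = min3(prev[j]+1, cur[j-1]+1, prev[j-1]+cost)
		}
		prev = cur
	}
	return prev[len(b)]
}

var (
	mu    sync.Mutex
	cache = map[string]string{}
)

// suggest returns the closest keyword within distance 2, or "" if none.
// Results are memoized so repeated typos skip the distance computation.
func suggest(word string) string {
	key := strings.ToUpper(word)
	mu.Lock()
	defer mu.Unlock()
	if s, ok := cache[key]; ok {
		return s
	}
	best, bestDist := "", 3
	for _, kw := range keywords {
		if d := levenshtein(key, kw); d < bestDist {
			best, bestDist = kw, d
		}
	}
	cache[key] = best
	return best
}

func main() {
	fmt.Println(suggest("SELCT")) // prints SELECT
}
```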

Installation

Library Installation
go get github.com/ajitpratap0/GoSQLX
CLI Installation
# Install the CLI tool
go install github.com/ajitpratap0/GoSQLX/cmd/gosqlx@latest

# Or build from source
git clone https://github.com/ajitpratap0/GoSQLX.git
cd GoSQLX
go build -o gosqlx ./cmd/gosqlx

Requirements:

  • Go 1.24 or higher
  • No external dependencies

Quick Start

CLI Usage

Standard Usage:

# Validate SQL syntax
gosqlx validate "SELECT * FROM users WHERE active = true"

# Format SQL files with intelligent indentation
gosqlx format -i query.sql

# Analyze SQL structure and complexity
gosqlx analyze "SELECT COUNT(*) FROM orders GROUP BY status"

# Parse SQL to AST representation
gosqlx parse -f json complex_query.sql

# Unix Pipeline Support (NEW in v1.5.0)
cat query.sql | gosqlx format                    # Format from stdin
echo "SELECT * FROM users" | gosqlx validate     # Validate from pipe
gosqlx format query.sql | gosqlx validate        # Chain commands
cat *.sql | gosqlx format | tee formatted.sql    # Pipeline composition

Pipeline/Stdin Support (New in v1.6.0):

# Auto-detect piped input
echo "SELECT * FROM users" | gosqlx validate
cat query.sql | gosqlx format
cat complex.sql | gosqlx analyze --security

# Explicit stdin marker
gosqlx validate -
gosqlx format - < query.sql

# Input redirection
gosqlx validate < query.sql
gosqlx parse < complex_query.sql

# Full pipeline chains
cat query.sql | gosqlx format | gosqlx validate
echo "select * from users" | gosqlx format > formatted.sql
find . -name "*.sql" -exec cat {} \; | gosqlx validate

# Works on Windows PowerShell too!
Get-Content query.sql | gosqlx format
"SELECT * FROM users" | gosqlx validate
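
For readers building similar tooling, the stdin conventions above (auto-detected pipes and an explicit "-" marker) can be sketched in Go as follows. This is a hypothetical illustration, not the gosqlx CLI's code: resolveInput, stdinIsPiped, and fromString are invented names.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"strings"
)

// resolveInput treats a missing argument or a single "-" as "read SQL
// from stdin"; any other first argument is taken as inline SQL.
func resolveInput(args []string, stdin io.Reader) (string, error) {
	if len(args) == 0 || args[0] == "-" {
		data, err := io.ReadAll(stdin)
		if err != nil {
			return "", err
		}
		return strings.TrimSpace(string(data)), nil
	}
	return args[0], nil
}

// stdinIsPiped reports whether stdin is a pipe or file rather than a terminal.
func stdinIsPiped() bool {
	info, err := os.Stdin.Stat()
	if err != nil {
		return false
	}
	return info.Mode()&os.ModeCharDevice == 0
}

// fromString fakes stdin for examples and tests.
func fromString(s string) io.Reader { return strings.NewReader(s) }

func main() {
	sql, err := resolveInput([]string{"-"}, fromString("SELECT * FROM users\n"))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(sql)
	fmt.Println("stdin piped:", stdinIsPiped())
}
```

A real tool would typically call stdinIsPiped before reading, so that running it with no arguments on an interactive terminal prints usage instead of blocking.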

Cross-Platform Pipeline Examples:

# Unix/Linux/macOS
cat query.sql | gosqlx format | tee formatted.sql | gosqlx validate
echo "SELECT 1" | gosqlx validate && echo "Valid!"

# Windows PowerShell
Get-Content query.sql | gosqlx format | Set-Content formatted.sql
"SELECT * FROM users" | gosqlx validate

# Git hooks (pre-commit)
git diff --cached --name-only --diff-filter=ACM "*.sql" | \
  xargs cat | gosqlx validate --quiet

Language Server Protocol (LSP) (v1.6.0):

# Start LSP server for IDE integration
gosqlx lsp

# With debug logging
gosqlx lsp --log /tmp/gosqlx-lsp.log

The LSP server provides real-time SQL intelligence for IDEs:

  • Diagnostics: Real-time syntax error detection with position info
  • Hover: Documentation for 60+ SQL keywords
  • Completion: 100+ SQL keywords, functions, and 22 snippets
  • Formatting: SQL code formatting via textDocument/formatting
  • Document Symbols: SQL statement outline navigation
  • Signature Help: Function signatures for 20+ SQL functions
  • Code Actions: Quick fixes (add semicolon, uppercase keywords)
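
To show what a diagnostics message looks like on the wire, the sketch below builds a textDocument/publishDiagnostics notification and wraps it in the Content-Length framing defined by the LSP base protocol. The struct and function names are hypothetical and the example is not taken from the GoSQLX LSP server; only the JSON field names and header format follow the LSP specification.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type position struct {
	Line      int `json:"line"`      // zero-based line
	Character int `json:"character"` // zero-based column
}

type span struct {
	Start position `json:"start"`
	End   position `json:"end"`
}

type diagnostic struct {
	Range    span   `json:"range"`
	Severity int    `json:"severity"` // 1 means Error in the LSP spec
	Source   string `json:"source"`
	Message  string `json:"message"`
}

type publishParams struct {
	URI         string       `json:"uri"`
	Diagnostics []diagnostic `json:"diagnostics"`
}

type notification struct {
	JSONRPC string        `json:"jsonrpc"`
	Method  string        `json:"method"`
	Params  publishParams `json:"params"`
}

// frame prefixes a JSON body with the LSP base-protocol header.
func frame(body []byte) string {
	return fmt.Sprintf("Content-Length: %d\r\n\r\n%s", len(body), body)
}

// publishDiagnostics encodes and frames a diagnostics notification.
func publishDiagnostics(uri string, diags []diagnostic) (string, error) {
	body, err := json.Marshal(notification{
		JSONRPC: "2.0",
		Method:  "textDocument/publishDiagnostics",
		Params:  publishParams{URI: uri, Diagnostics: diags},
	})
	if err != nil {
		return "", err
	}
	return frame(body), nil
}

func main() {
	msg, err := publishDiagnostics("file:///query.sql", []diagnostic{{
		Range: span{
			Start: position{Line: 0, Character: 7},
			End:   position{Line: 0, Character: 11},
		},
		Severity: 1,
		Source:   "example",
		Message:  "unexpected token",
	}})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%q\n", msg)
}
```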

Linting (v1.6.0):

# Run built-in linter rules
gosqlx lint query.sql

# With auto-fix
gosqlx lint --fix query.sql

# Specific rules
gosqlx lint --rules L001,L002,L003 query.sql

Available rules (L001-L010):

  • L001: Avoid SELECT *
  • L002: Missing table aliases in JOIN
  • L003: Implicit column aliases
  • L004: Missing WHERE clause in UPDATE/DELETE
  • L005: Inefficient LIKE patterns
  • L006: Use explicit JOIN syntax (not comma joins)
  • L007: ORDER BY ordinal numbers
  • L008: Inconsistent keyword casing
  • L009: Missing column list in INSERT
  • L010: Avoid DISTINCT without ORDER BY
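
As a rough idea of what a rule such as L001 checks, here is a deliberately naive sketch using a regular expression. It is an assumption for illustration only: a real linter would presumably inspect tokens or the AST, and this version would also flag text inside string literals or comments. The finding type and checkSelectStar function are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// selectStar matches SELECT followed directly by * (optionally after DISTINCT).
// It does not match COUNT(*) or qualified forms such as t.*.
var selectStar = regexp.MustCompile(`(?i)\bSELECT\s+(?:DISTINCT\s+)?\*`)

// finding is a hypothetical lint result, not a GoSQLX type.
type finding struct {
	Rule    string
	Offset  int // byte offset of the match in the input
	Message string
}

// checkSelectStar reports every SELECT * occurrence in sql.
func checkSelectStar(sql string) []finding {
	var out []finding
	for _, loc := range selectStar.FindAllStringIndex(sql, -1) {
		out = append(out, finding{
			Rule:    "L001",
			Offset:  loc[0],
			Message: "avoid SELECT *; list the columns explicitly",
		})
	}
	return out
}

func main() {
	for _, f := range checkSelectStar("SELECT * FROM users") {
		fmt.Printf("%s at offset %d: %s\n", f.Rule, f.Offset, f.Message)
	}
}
```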

IDE Integration:

// VSCode settings.json
{
  "gosqlx.lsp.enable": true,
  "gosqlx.lsp.path": "gosqlx"
}
-- Neovim (nvim-lspconfig)
require('lspconfig.configs').gosqlx = {
  default_config = {
    cmd = { 'gosqlx', 'lsp' },
    filetypes = { 'sql' },
    root_dir = function() return vim.fn.getcwd() end,
  },
}
require('lspconfig').gosqlx.setup{}
Library Usage - Simple API

GoSQLX provides a simple, high-level API that handles all complexity for you:

package main

import (
    "fmt"
    "log"

    "github.com/ajitpratap0/GoSQLX/pkg/gosqlx"
)

func main() {
    // Parse SQL in one line - that's it!
    ast, err := gosqlx.Parse("SELECT * FROM users WHERE active = true")
    if err != nil {
        log.Fatal(err)
    }

    fmt.Printf("Successfully parsed %d statement(s)\n", len(ast.Statements))
}

That's it! Just 3 lines of code. No pool management, no manual cleanup - everything is handled for you.

More Examples
// Validate SQL without parsing
if err := gosqlx.Validate("SELECT * FROM users"); err != nil {
    fmt.Println("Invalid SQL:", err)
}

// Parse multiple queries efficiently
queries := []string{
    "SELECT * FROM users",
    "SELECT * FROM orders",
}
asts, err := gosqlx.ParseMultiple(queries)

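// Parse with timeout for long queries (requires importing "time")
sql := "SELECT * FROM users WHERE active = true"
ast, err := gosqlx.ParseWithTimeout(sql, 5*time.Second)

// Parse from byte slice (zero-copy); ast and err are reused, so assign with =
ast, err = gosqlx.ParseBytes([]byte("SELECT * FROM users"))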
Advanced Usage - Low-Level API

For performance-critical code that needs fine-grained control, use the low-level API:

package main

import (
    "fmt"

    "github.com/ajitpratap0/GoSQLX/pkg/sql/tokenizer"
    "github.com/ajitpratap0/GoSQLX/pkg/sql/parser"
)

func main() {
    // Get tokenizer from pool (always return it!)
    tkz := tokenizer.GetTokenizer()
    defer tokenizer.PutTokenizer(tkz)

    // Tokenize SQL
    sql := "SELECT id, name FROM users WHERE age > 18"
    tokens, err := tkz.Tokenize([]byte(sql))
    if err != nil {
        panic(err)
    }

    // Convert tokens
    converter := parser.NewTokenConverter()
    result, err := converter.Convert(tokens)
    if err != nil {
        panic(err)
    }

    // Parse to AST
    p := parser.NewParser()
    defer p.Release()

    ast, err := p.Parse(result.Tokens)
    if err != nil {
        panic(err)
    }

    fmt.Printf("Statement type: %T\n", ast)
}

Note: The simple API adds less than 1% overhead compared to the low-level API. Use the simple API unless you need fine-grained control.

Documentation

Comprehensive Guides
Guide | Description
Getting Started | Get started in 5 minutes
Comparison Guide | GoSQLX vs SQLFluff, JSQLParser, pg_query
CLI Guide | Complete CLI documentation and usage examples
API Reference | Complete API documentation with examples
Usage Guide | Detailed patterns and best practices
Architecture | System design and internal architecture
Troubleshooting | Common issues and solutions
Getting Started
Document | Purpose
Production Guide | Deployment and monitoring
SQL Compatibility | Dialect support matrix
Security Analysis | Security assessment
Examples | Working code examples
Advanced SQL Features (v1.2.0)

GoSQLX now supports Common Table Expressions (CTEs) and Set Operations alongside complete JOIN support:

Common Table Expressions (CTEs)
// Simple CTE
sql := `
    WITH sales_summary AS (
        SELECT region, SUM(amount) as total 
        FROM sales 
        GROUP BY region
    ) 
    SELECT region FROM sales_summary WHERE total > 1000
`

// Recursive CTE for hierarchical data
sql := `
    WITH RECURSIVE employee_tree AS (
        SELECT employee_id, manager_id, name 
        FROM employees 
        WHERE manager_id IS NULL
        UNION ALL
        SELECT e.employee_id, e.manager_id, e.name 
        FROM employees e 
        JOIN employee_tree et ON e.manager_id = et.employee_id
    ) 
    SELECT * FROM employee_tree
`

// Multiple CTEs in single query
sql := `
    WITH regional AS (SELECT region, total FROM sales),
         summary AS (SELECT region FROM regional WHERE total > 1000)
    SELECT * FROM summary
`
Set Operations
// UNION - combine results with deduplication
sql := "SELECT name FROM users UNION SELECT name FROM customers"

// UNION ALL - combine results preserving duplicates
sql := "SELECT id FROM orders UNION ALL SELECT id FROM invoices"

// EXCEPT - set difference
sql := "SELECT product FROM inventory EXCEPT SELECT product FROM discontinued"

// INTERSECT - set intersection
sql := "SELECT customer_id FROM orders INTERSECT SELECT customer_id FROM payments"

// Left-associative parsing for multiple operations
sql := "SELECT a FROM t1 UNION SELECT b FROM t2 INTERSECT SELECT c FROM t3"
// Parsed as: (SELECT a FROM t1 UNION SELECT b FROM t2) INTERSECT SELECT c FROM t3
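
The left-associative grouping described above can be sketched as a simple fold over the operands. The node type and buildLeftAssoc function are hypothetical and do not mirror the GoSQLX AST. Note that the SQL standard gives INTERSECT higher precedence than UNION and EXCEPT, so a database may group the same text differently; explicit parentheses remove the ambiguity.

```go
package main

import "fmt"

// node is a hypothetical tree node: a leaf holds a query, an inner node an operator.
type node struct {
	Op          string // empty for leaves
	Query       string // set for leaves only
	Left, Right *node
}

// buildLeftAssoc folds queries from left to right, so each new operator
// takes everything parsed so far as its left operand.
// It returns nil unless len(queries) == len(ops)+1.
func buildLeftAssoc(queries, ops []string) *node {
	if len(queries) == 0 || len(queries) != len(ops)+1 {
		return nil
	}
	root := &node{Query: queries[0]}
	for i, op := range ops {
		root = &node{Op: op, Left: root, Right: &node{Query: queries[i+1]}}
	}
	return root
}

// String renders the tree with explicit parentheses.
func (n *node) String() string {
	if n == nil {
		return ""
	}
	if n.Op == "" {
		return n.Query
	}
	return "(" + n.Left.String() + " " + n.Op + " " + n.Right.String() + ")"
}

func main() {
	tree := buildLeftAssoc(
		[]string{"SELECT a FROM t1", "SELECT b FROM t2", "SELECT c FROM t3"},
		[]string{"UNION", "INTERSECT"},
	)
	fmt.Println(tree)
}
```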
Complete JOIN Support

GoSQLX supports all JOIN types with proper left-associative tree logic:

// Complex JOIN query with multiple table relationships
sql := `
    SELECT u.name, o.order_date, p.product_name, c.category_name
    FROM users u
    LEFT JOIN orders o ON u.id = o.user_id  
    INNER JOIN products p ON o.product_id = p.id
    RIGHT JOIN categories c ON p.category_id = c.id
    WHERE u.active = true
    ORDER BY o.order_date DESC
`

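// Parse with automatic JOIN tree construction
tkz := tokenizer.GetTokenizer()
defer tokenizer.PutTokenizer(tkz)

tokens, err := tkz.Tokenize([]byte(sql))
if err != nil {
    log.Fatal(err)
}

// Convert tokens for the parser, as in the low-level API example
converter := parser.NewTokenConverter()
result, err := converter.Convert(tokens)
if err != nil {
    log.Fatal(err)
}

// Use variable names that do not shadow the parser and ast packages
p := parser.NewParser()
defer p.Release()

tree, err := p.Parse(result.Tokens)
if err != nil {
    log.Fatal(err)
}

// Access JOIN information
if selectStmt, ok := tree.Statements[0].(*ast.SelectStatement); ok {
    fmt.Printf("Found %d JOINs:\n", len(selectStmt.Joins))
    for i, join := range selectStmt.Joins {
        fmt.Printf("JOIN %d: %s (left: %s, right: %s)\n",
            i+1, join.Type, join.Left.Name, join.Right.Name)
    }
}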

Supported JOIN Types:

  • ✅ INNER JOIN - Standard inner joins
  • ✅ LEFT JOIN / LEFT OUTER JOIN - Left outer joins
  • ✅ RIGHT JOIN / RIGHT OUTER JOIN - Right outer joins
  • ✅ FULL JOIN / FULL OUTER JOIN - Full outer joins
  • ✅ CROSS JOIN - Cartesian product joins
  • ✅ NATURAL JOIN - Natural joins (implicit ON clause)
  • ✅ USING (column) - Single-column using clause
Advanced SQL Features (v1.4+)
MERGE Statements (SQL:2003 F312)
sql := `
    MERGE INTO target_table t
    USING source_table s ON t.id = s.id
    WHEN MATCHED THEN
        UPDATE SET t.name = s.name, t.value = s.value
    WHEN NOT MATCHED THEN
        INSERT (id, name, value) VALUES (s.id, s.name, s.value)
`
ast, err := gosqlx.Parse(sql)
GROUPING SETS, ROLLUP, CUBE (SQL-99 T431)
// GROUPING SETS - explicit grouping combinations
sql := `SELECT region, product, SUM(sales)
        FROM orders
        GROUP BY GROUPING SETS ((region), (product), (region, product), ())`

// ROLLUP - hierarchical subtotals
sql := `SELECT year, quarter, month, SUM(revenue)
        FROM sales
        GROUP BY ROLLUP (year, quarter, month)`

// CUBE - all possible combinations
sql := `SELECT region, product, SUM(amount)
        FROM sales
        GROUP BY CUBE (region, product)`
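
ROLLUP and CUBE are shorthand for particular lists of grouping sets. The sketch below expands both so the equivalence is visible; rollup, cube, and formatSets are hypothetical helpers written for this illustration and are unrelated to how GoSQLX represents these clauses.

```go
package main

import (
	"fmt"
	"strings"
)

// rollup returns the prefixes of cols from longest to empty,
// which is the grouping-set list that ROLLUP(cols...) stands for.
func rollup(cols []string) [][]string {
	sets := make([][]string, 0, len(cols)+1)
	for n := len(cols); n >= 0; n-- {
		sets = append(sets, append([]string{}, cols[:n]...))
	}
	return sets
}

// cube returns every subset of cols, which is what CUBE(cols...) stands for.
func cube(cols []string) [][]string {
	n := len(cols)
	sets := make([][]string, 0, 1<<n)
	for mask := (1 << n) - 1; mask >= 0; mask-- {
		set := []string{}
		for i, c := range cols {
			if mask&(1<<(n-1-i)) != 0 { // highest bit maps to the first column
				set = append(set, c)
			}
		}
		sets = append(sets, set)
	}
	return sets
}

// formatSets renders grouping sets in SQL-like notation.
func formatSets(sets [][]string) string {
	parts := make([]string, len(sets))
	for i, s := range sets {
		parts[i] = "(" + strings.Join(s, ", ") + ")"
	}
	return strings.Join(parts, ", ")
}

func main() {
	fmt.Println("ROLLUP:", formatSets(rollup([]string{"year", "quarter", "month"})))
	fmt.Println("CUBE:  ", formatSets(cube([]string{"region", "product"})))
}
```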
Materialized Views
// Create materialized view
sql := `CREATE MATERIALIZED VIEW sales_summary AS
        SELECT region, SUM(amount) as total
        FROM sales GROUP BY region`

// Refresh materialized view
sql := `REFRESH MATERIALIZED VIEW CONCURRENTLY sales_summary`

// Drop materialized view
sql := `DROP MATERIALIZED VIEW IF EXISTS sales_summary`
SQL Injection Detection
import "github.com/ajitpratap0/GoSQLX/pkg/sql/security"

// Create scanner
scanner := security.NewScanner()

// Scan for injection patterns
result := scanner.Scan(ast)

if result.HasCritical() {
    fmt.Printf("Found %d critical issues!\n", result.CriticalCount)
    for _, finding := range result.Findings {
        fmt.Printf("  [%s] %s: %s\n",
            finding.Severity, finding.Pattern, finding.Description)
    }
}

// Detected patterns include:
// - Tautology (1=1, 'a'='a')
// - UNION-based injection
// - Time-based blind (SLEEP, WAITFOR DELAY)
// - Comment bypass (--, /**/)
// - Stacked queries
// - Dangerous functions (xp_cmdshell, LOAD_FILE)
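
To give a feel for one of the listed patterns, here is a naive tautology check written with the standard library only. It is a simplified assumption about how such detection could work and is not the pkg/sql/security scanner: it only compares textually identical numeric or single-quoted literals around "=", and hasTautology is a hypothetical name. Go's regexp package has no backreferences, so the two sides are compared in code.

```go
package main

import (
	"fmt"
	"regexp"
)

// literalEq captures "<literal> = <literal>" where each literal is a number
// or a single-quoted string. The leading (?:^|\W) keeps digits that end an
// identifier (such as col1) from being read as a literal.
var literalEq = regexp.MustCompile(`(?:^|\W)('[^']*'|\d+(?:\.\d+)?)\s*=\s*('[^']*'|\d+(?:\.\d+)?)`)

// hasTautology reports whether sql compares a literal with itself.
func hasTautology(sql string) bool {
	for _, m := range literalEq.FindAllStringSubmatch(sql, -1) {
		if m[1] == m[2] {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(hasTautology("SELECT * FROM users WHERE id = 5 OR 1=1"))
	fmt.Println(hasTautology("SELECT * FROM users WHERE id = 42"))
}
```

A text-based check like this misses equivalent forms such as 2 > 1 and can be fooled by literals inside comments, which is one reason an AST-based scanner is preferable.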
Expression Operators (BETWEEN, IN, LIKE, IS NULL)
// BETWEEN with expressions
sql := `SELECT * FROM orders WHERE amount BETWEEN 100 AND 500`

// IN with subquery
sql := `SELECT * FROM users WHERE id IN (SELECT user_id FROM admins)`

// LIKE with pattern matching
sql := `SELECT * FROM products WHERE name LIKE '%widget%'`

// IS NULL / IS NOT NULL
sql := `SELECT * FROM users WHERE deleted_at IS NULL`

// NULLS FIRST/LAST ordering (SQL-99 F851)
sql := `SELECT * FROM users ORDER BY last_login DESC NULLS LAST`
PostgreSQL-Specific Features (v1.6+)

LATERAL JOIN - Correlated subqueries in FROM clause:

// LATERAL allows referencing columns from preceding tables
sql := `
    SELECT u.name, recent_orders.order_date, recent_orders.total
    FROM users u
    LEFT JOIN LATERAL (
        SELECT order_date, total
        FROM orders
        WHERE user_id = u.id
        ORDER BY order_date DESC
        LIMIT 1
    ) AS recent_orders ON true
`
ast, err := gosqlx.Parse(sql)

ORDER BY inside Aggregates - Ordered set functions:

// STRING_AGG with ORDER BY
sql := `SELECT STRING_AGG(name, ', ' ORDER BY name DESC NULLS LAST) FROM users`

// ARRAY_AGG with ORDER BY
sql := `SELECT ARRAY_AGG(value ORDER BY created_at, priority DESC) FROM items`

// JSON_AGG with ORDER BY
sql := `SELECT JSON_AGG(employee_data ORDER BY hire_date) FROM employees`

// Multiple aggregates with different orderings
sql := `
    SELECT
        department,
        STRING_AGG(name, '; ' ORDER BY name ASC NULLS FIRST) AS employee_names,
        ARRAY_AGG(salary ORDER BY salary DESC) AS salaries
    FROM employees
    GROUP BY department
`
ast, err := gosqlx.Parse(sql)

JSON/JSONB Operators - PostgreSQL JSON support:

// Arrow operators for field access
sql := `SELECT data -> 'user' -> 'profile' ->> 'email' FROM users`

// Path operators for nested access
sql := `SELECT data #> '{address,city}', data #>> '{address,zipcode}' FROM users`

// Containment operators
sql := `SELECT * FROM users WHERE data @> '{"active": true}'`
sql := `SELECT * FROM users WHERE '{"admin": true}' <@ data`

// Combined JSON operators in complex queries
sql := `
    SELECT
        u.id,
        u.data ->> 'name' AS user_name,
        u.data -> 'settings' ->> 'theme' AS theme
    FROM users u
    WHERE u.data @> '{"verified": true}'
    AND u.data ->> 'status' = 'active'
`
ast, err := gosqlx.Parse(sql)

DISTINCT ON - PostgreSQL unique row selection:

// Select first row per group based on ordering
sql := `
    SELECT DISTINCT ON (user_id) user_id, created_at, status
    FROM orders
    ORDER BY user_id, created_at DESC
`
ast, err := gosqlx.Parse(sql)

FILTER Clause - Conditional aggregation:

// COUNT with FILTER
sql := `
    SELECT
        COUNT(*) AS total_orders,
        COUNT(*) FILTER (WHERE status = 'completed') AS completed_orders,
        SUM(amount) FILTER (WHERE region = 'US') AS us_revenue
    FROM orders
`
ast, err := gosqlx.Parse(sql)

Examples

Multi-Dialect Support
// PostgreSQL with array operators
sql := `SELECT * FROM users WHERE tags @> ARRAY['admin']`

// MySQL with backticks
sql := "SELECT `user_id`, `name` FROM `users`"

// SQL Server with brackets
sql := "SELECT [user_id], [name] FROM [users]"
Unicode and International SQL
// Japanese
sql := `SELECT "名前", "年齢" FROM "ユーザー"`

// Russian
sql := `SELECT "имя", "возраст" FROM "пользователи"`

// Arabic
sql := `SELECT "الاسم", "العمر" FROM "المستخدمون"`

// Emoji support
sql := `SELECT * FROM users WHERE status = '🚀'`
Concurrent Processing
func ProcessConcurrently(queries []string) {
    var wg sync.WaitGroup
    
    for _, sql := range queries {
        wg.Add(1)
        go func(query string) {
            defer wg.Done()
            
            // Each goroutine gets its own tokenizer
            tkz := tokenizer.GetTokenizer()
            defer tokenizer.PutTokenizer(tkz)
            
            tokens, err := tkz.Tokenize([]byte(query))
            if err != nil {
                log.Printf("tokenize failed: %v", err)
                return
            }
            _ = tokens // Process tokens...
        }(sql)
    }
    
    wg.Wait()
}
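
The example above starts one goroutine per query. For large batches a fixed number of workers is often preferable; the sketch below shows that variation with the standard library only. It counts whitespace-separated words as a stand-in for real tokenization, and countTokensConcurrently is a hypothetical function, not part of GoSQLX.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// countTokensConcurrently fans queries out to a fixed number of workers
// and returns one count per query, in input order.
func countTokensConcurrently(queries []string, workers int) []int {
	if workers < 1 {
		workers = 1
	}
	results := make([]int, len(queries))
	jobs := make(chan int)

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				// Each index is written by exactly one goroutine,
				// so no extra locking is needed for results.
				results[i] = len(strings.Fields(queries[i]))
			}
		}()
	}

	for i := range queries {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return results
}

func main() {
	counts := countTokensConcurrently([]string{
		"SELECT * FROM users",
		"SELECT id, name FROM orders WHERE total > 100",
	}, 4)
	fmt.Println(counts)
}
```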

Performance

v1.0.0 Performance Improvements
Metric | Previous | v1.0.0 | Improvement
Sustained Throughput | 2.2M ops/s | 946K+ ops/s | Production Grade ✅
Peak Throughput | 2.2M ops/s | 1.25M+ ops/s | Enhanced ✅
Token Processing | 8M tokens/s | 8M+ tokens/s | Maintained ✅
Simple Query Latency | 200ns | <280ns | Optimized ✅
Complex Query Latency | N/A | <1μs (CTE/Set Ops) | New Capability ✅
Memory Usage | Baseline | 60-80% reduction | -70% ✅
SQL-92 Compliance | 40% | ~70% | +75% ✅
Latest Benchmark Results
BenchmarkParserSustainedLoad-16           946,583      1,057 ns/op     1,847 B/op      23 allocs/op
BenchmarkParserThroughput-16            1,252,833        798 ns/op     1,452 B/op      18 allocs/op
BenchmarkParserSimpleSelect-16          3,571,428        279 ns/op       536 B/op       9 allocs/op
BenchmarkParserComplexSelect-16           985,221      1,014 ns/op     2,184 B/op      31 allocs/op

BenchmarkCTE/SimpleCTE-16                 524,933      1,891 ns/op     3,847 B/op      52 allocs/op
BenchmarkCTE/RecursiveCTE-16              387,654      2,735 ns/op     5,293 B/op      71 allocs/op
BenchmarkSetOperations/UNION-16           445,782      2,234 ns/op     4,156 B/op      58 allocs/op

BenchmarkTokensPerSecond-16               815,439      1,378 ns/op   8,847,625 tokens/sec
Performance Characteristics
Metric | Value | Details
Sustained Throughput | 946K+ ops/sec | 30s load testing
Peak Throughput | 1.25M+ ops/sec | Concurrent goroutines
Token Rate | 8M+ tokens/sec | Sustained processing
Simple Query Latency | <280ns | Basic SELECT (p50)
Complex Query Latency | <1μs | CTEs/Set Operations
Memory | 1.8KB/query | Complex SQL with pooling
Scaling | Linear to 128+ | Perfect concurrency
Pool Efficiency | 95%+ hit rate | Effective reuse

Run go test -bench=. -benchmem ./pkg/... for detailed performance analysis.
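
When reading go test -bench output, the first number after the benchmark name is the iteration count and the ns/op column is the time per operation. The helper below converts ns/op into operations per second for a single goroutine, which makes it easier to relate benchmark lines to throughput figures. opsPerSecond is a hypothetical helper written for this illustration.

```go
package main

import "fmt"

// opsPerSecond converts a ns/op benchmark figure into operations per second.
// It returns 0 for non-positive input.
func opsPerSecond(nsPerOp float64) float64 {
	if nsPerOp <= 0 {
		return 0
	}
	return 1e9 / nsPerOp
}

func main() {
	// ns/op values taken from the benchmark listing above.
	for _, ns := range []float64{1057, 798, 279} {
		fmt.Printf("%.0f ns/op is about %.0f ops/sec\n", ns, opsPerSecond(ns))
	}
}
```

For example, 1,057 ns/op corresponds to roughly 946K operations per second.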

Testing

# Run all tests with race detection
go test -race ./...

# Run benchmarks
go test -bench=. -benchmem ./...

# Generate coverage report
go test -coverprofile=coverage.out ./...
go tool cover -html=coverage.out

# Run specific test suites
go test -v ./pkg/sql/tokenizer/
go test -v ./pkg/sql/parser/

Project Structure

GoSQLX/
├── pkg/
│   ├── models/              # Core data structures
│   │   ├── token.go        # Token definitions
│   │   └── location.go     # Position tracking
│   └── sql/
│       ├── tokenizer/       # Lexical analysis
│       │   ├── tokenizer.go
│       │   └── pool.go
│       ├── parser/          # Syntax analysis
│       │   ├── parser.go
│       │   └── expressions.go
│       ├── ast/            # Abstract syntax tree
│       │   ├── nodes.go
│       │   └── statements.go
│       └── keywords/        # SQL keywords
├── examples/               # Usage examples
│   └── cmd/
│       ├── example.go
│       └── example_test.go
├── docs/                   # Documentation
│   ├── API_REFERENCE.md
│   ├── USAGE_GUIDE.md
│   ├── ARCHITECTURE.md
│   └── TROUBLESHOOTING.md
└── tools/                  # Development tools

Development

Prerequisites
  • Go 1.24+
  • Task - task runner (install: go install github.com/go-task/task/v3/cmd/task@latest)
  • golangci-lint, staticcheck (for code quality, install: task deps:tools)
Task Runner

This project uses Task as the task runner. Install with:

go install github.com/go-task/task/v3/cmd/task@latest
# Or: brew install go-task (macOS)
Building
# Show all available tasks
task

# Build the project
task build

# Build the CLI binary
task build:cli

# Install CLI globally
task install

# Run all quality checks
task quality

# Run all tests
task test

# Run tests with race detection (recommended)
task test:race

# Clean build artifacts
task clean
Code Quality
# Format code
task fmt

# Run go vet
task vet

# Run golangci-lint
task lint

# Run all quality checks (fmt, vet, lint)
task quality

# Full CI check (format, vet, lint, test:race)
task check

Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

How to Contribute
  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request
Development Guidelines
  • Write tests for new features
  • Ensure all tests pass with race detection
  • Follow Go idioms and best practices
  • Update documentation for API changes
  • Add benchmarks for performance-critical code

Roadmap

Phase 1: Core SQL Enhancements (Q1 2025) - v1.1.0 ✅
  • ✅ Complete JOIN support (INNER/LEFT/RIGHT/FULL OUTER/CROSS/NATURAL)
  • ✅ Proper join tree logic with left-associative relationships
  • ✅ USING clause parsing (single-column, multi-column planned for Phase 2)
  • ✅ Enhanced error handling with contextual JOIN error messages
  • ✅ Comprehensive test coverage (15+ JOIN scenarios including error cases)
  • 🏗️ CTE foundation laid (AST structures, tokens, parser integration points)
Phase 2: CTE & Advanced Features (Q1 2025) - v1.2.0 ✅
  • ✅ Common Table Expressions (CTEs) with RECURSIVE support
  • ✅ Set operations (UNION/EXCEPT/INTERSECT with ALL modifier)
  • ✅ Left-associative set operation parsing
  • ✅ CTE column specifications and multiple CTE definitions
  • ✅ Integration of CTEs with set operations
  • ✅ Enhanced error handling with contextual messages
  • ✅ ~70% SQL-92 compliance achieved
Phase 3: Dialect Specialization (Q1 2025) - v2.0.0
  • 📋 PostgreSQL arrays, JSONB, custom types
  • 📋 MySQL-specific syntax and functions
  • 📋 SQL Server T-SQL extensions
  • 📋 Multi-dialect parser with auto-detection
Phase 4: Intelligence Layer (Q2 2025) - v2.1.0
  • 📋 Query optimization suggestions
  • 📋 Security vulnerability detection
  • 📋 Performance analysis and hints
  • 📋 Schema validation

See ARCHITECTURE.md for detailed system design

Community & Support

Get Help
Channel | Purpose | Response Time
🐛 Bug Reports | Report issues | Community-driven
💡 Feature Requests | Suggest improvements | Community-driven
💬 Discussions | Q&A, ideas, showcase | Community-driven
🔒 Security | Report vulnerabilities | Best effort

Contributors

How to Contribute

We love your input! We want to make contributing as easy and transparent as possible.

Quick Contribution Guide
  1. 🍴 Fork the repo
  2. 🔨 Make your changes
  3. ✅ Ensure tests pass (go test -race ./...)
  4. 📝 Update documentation
  5. 🚀 Submit a PR

Use Cases

Industry | Use Case | Benefits
🏦 FinTech | SQL validation & auditing | Fast validation, compliance tracking
📊 Analytics | Query parsing & optimization | Real-time analysis, performance insights
🛡️ Security | SQL injection detection | Pattern matching, threat prevention
🏗️ DevTools | IDE integration & linting | Syntax highlighting, auto-completion
📚 Education | SQL learning platforms | Interactive parsing, error explanation
🔄 Migration | Cross-database migration | Dialect conversion, compatibility check

Who's Using GoSQLX

Using GoSQLX in production? Let us know!

Project Metrics

Performance Benchmarks
graph LR
    A[SQL Input] -->|946K+ ops/sec| B[Tokenizer]
    B -->|8M+ tokens/sec| C[Parser]
    C -->|Zero-copy| D[AST]
    D -->|60-80% less memory| E[Output]

Support This Project

If GoSQLX helps your project, please consider:

Star This Repo

Other Ways to Support
  • ⭐ Star this repository
  • 🐦 Tweet about GoSQLX
  • 📝 Write a blog post
  • 🎥 Create a tutorial
  • 🐛 Report bugs
  • 💡 Suggest features
  • 🔧 Submit PRs

License

This project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0) - see the LICENSE file for details.


Built with ❤️ by the GoSQLX Team

Copyright © 2024-2025 GoSQLX. All rights reserved.

Documentation

Overview

Package gosqlx provides a high-performance SQL parsing SDK for Go with zero-copy tokenization and object pooling. It offers production-ready SQL lexing, parsing, and AST generation with support for multiple SQL dialects and advanced SQL features.

GoSQLX v1.6.0 includes both a powerful Go SDK and a high-performance CLI tool for SQL processing.

Core Features:

- Zero-copy tokenization for optimal performance
- Object pooling for 60-80% memory reduction
- Multi-dialect SQL support (PostgreSQL, MySQL, SQL Server, Oracle, SQLite)
- Thread-safe implementation with linear scaling to 128+ cores
- Full Unicode/UTF-8 support for international SQL
- Performance monitoring and metrics collection
- Visitor pattern support for AST traversal
- Production-ready CLI tool with 1.38M+ ops/sec performance

Advanced SQL Features (Phase 2.5 - v1.3.0+, PostgreSQL Extensions v1.6.0+):

- Window functions with OVER clause (ROW_NUMBER, RANK, LAG, LEAD, etc.)
- PARTITION BY and ORDER BY window specifications
- Window frame clauses (ROWS/RANGE with bounds)
- Common Table Expressions (CTEs) with WITH clause
- Recursive CTEs with WITH RECURSIVE support
- Multiple CTEs in single query
- Set operations: UNION, UNION ALL, EXCEPT, INTERSECT
- Complete JOIN support (INNER/LEFT/RIGHT/FULL/CROSS/NATURAL)
- ~80-85% SQL-99 standards compliance

CLI Tool (v1.6.0):

Install the CLI:

go install github.com/ajitpratap0/GoSQLX/cmd/gosqlx@latest

CLI Commands:

gosqlx validate "SELECT * FROM users"     # Ultra-fast validation
gosqlx format -i query.sql               # Intelligent formatting
gosqlx analyze complex_query.sql         # Advanced analysis
gosqlx parse -f json query.sql           # AST generation

Basic Usage:

import (
    "log"

    "github.com/ajitpratap0/GoSQLX/pkg/sql/tokenizer"
    "github.com/ajitpratap0/GoSQLX/pkg/sql/parser"
    "github.com/ajitpratap0/GoSQLX/pkg/sql/ast"
)

// Get a tokenizer from the pool
tkz := tokenizer.GetTokenizer()
defer tokenizer.PutTokenizer(tkz)

// Tokenize SQL
tokens, err := tkz.Tokenize([]byte("SELECT * FROM users WHERE id = 1"))
if err != nil {
    log.Fatal(err)
}

// Parse tokens into AST
p := &parser.Parser{}
astObj, err := p.Parse(tokens)
if err != nil {
    log.Fatal(err)
}
defer ast.ReleaseAST(astObj)

Advanced Usage (Phase 2 Features):

// Common Table Expression (CTE)
cteSQL := `WITH sales_summary AS (
    SELECT region, SUM(amount) as total
    FROM sales
    GROUP BY region
) SELECT region FROM sales_summary WHERE total > 1000`

// Recursive CTE
recursiveSQL := `WITH RECURSIVE employee_tree AS (
    SELECT employee_id, manager_id, name FROM employees WHERE manager_id IS NULL
    UNION ALL
    SELECT e.employee_id, e.manager_id, e.name
    FROM employees e JOIN employee_tree et ON e.manager_id = et.employee_id
) SELECT * FROM employee_tree`

// Set Operations
unionSQL := `SELECT name FROM customers UNION SELECT name FROM suppliers`
exceptSQL := `SELECT product FROM inventory EXCEPT SELECT product FROM discontinued`
intersectSQL := `SELECT customer_id FROM orders INTERSECT SELECT customer_id FROM payments`

Performance:

GoSQLX Library achieves:

- 1.38M+ sustained operations/second (validated benchmarks)
- 1.5M+ operations/second peak throughput (concurrent)
- 8M+ tokens/second processing speed
- <1μs latency for complex queries with window functions
- Linear scaling to 128+ cores
- 60-80% memory reduction with object pooling
- Zero memory leaks under extended load
- Race-free concurrent operation validated

CLI Performance:

- 1.38M+ operations/second SQL validation
- 2,600+ files/second formatting throughput
- 1M+ queries/second analysis performance
- Memory leak prevention with proper AST cleanup

For more examples and detailed documentation, see: https://github.com/ajitpratap0/GoSQLX

Directories

Path Synopsis
cmd
gosqlx command
Package main demonstrates PostgreSQL DISTINCT ON clause parsing
cmd command
error-demo command
linter-example command
sql-formatter command
sql-validator command
pkg
errors
Package errors provides a structured error system for GoSQLX with error codes, context extraction, and intelligent hints for debugging SQL parsing issues.
gosqlx
Package gosqlx provides convenient high-level functions for SQL parsing and extraction.
gosqlx/testing
Package testing provides helper functions for testing SQL parsing in Go tests.
lsp
Package lsp implements a Language Server Protocol (LSP) server for GoSQLX.
metrics
Package metrics provides production performance monitoring for GoSQLX
models
Package models provides core data structures for SQL tokenization and parsing, including tokens, spans, locations, and error types.
sql/ast
Package ast provides Abstract Syntax Tree (AST) node definitions for SQL statements.
sql/keywords
Package keywords provides SQL keyword definitions and categorization for multiple SQL dialects.
sql/monitor
Package monitor provides performance monitoring and metrics collection for GoSQLX
sql/parser
Package parser provides a recursive descent SQL parser that converts tokens into an Abstract Syntax Tree (AST).
sql/security
Package security provides SQL injection pattern detection and security scanning.
sql/tokenizer
Package tokenizer provides a high-performance SQL tokenizer with zero-copy operations
