summarize

package
v0.1.0
Published: Apr 24, 2026 License: Apache-2.0 Imports: 5 Imported by: 0

Documentation

Overview

Package summarize produces a structured, AST-only description of a SQL statement: the operation, the tables it touches, the WHERE predicates, the joins, the columns it mutates, and a delegated risk level from the risk package.

It is the deterministic "what does this statement do?" companion to the risk package's "what is dangerous about it?". Like risk, it works purely from the cockroachdb-parser AST and never connects to a cluster.

Example:

s, err := summarize.Summarize("DELETE FROM orders WHERE status='x'")
if err != nil {
	return err // parse error: no partial summaries are produced
}
// s[0].Operation  == summarize.OpDelete
// s[0].Tables     == []string{"orders"}
// s[0].Predicates == []string{"status = 'x'"}
// s[0].RiskLevel  == risk.SeverityInfo

Types

type Join

type Join struct {
	Type      string `json:"type"`
	Left      string `json:"left,omitempty"`
	Right     string `json:"right,omitempty"`
	Condition string `json:"condition,omitempty"`
}

Join describes one JOIN clause inside a statement.

Left and Right are best-effort table names: they hold the bare table name (or alias when present) when the side resolves to an AliasedTableExpr backed by a TableName, and are empty for nested joins or subquery sources.

Condition holds the rendered ON expression, "USING (col1, col2)", "NATURAL", or empty for a CROSS join.
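As a rough illustration of how the omitempty tags shape the wire format, the sketch below redeclares Join locally (fields and tags copied from the definition above) so that it compiles without the real package. The Type strings and table names are assumptions made up for the example, and encode is a hypothetical helper, not part of summarize.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Join mirrors the documented struct so the sketch compiles on its own;
// the field names and JSON tags are copied from the package docs.
type Join struct {
	Type      string `json:"type"`
	Left      string `json:"left,omitempty"`
	Right     string `json:"right,omitempty"`
	Condition string `json:"condition,omitempty"`
}

// encode is a hypothetical helper that renders a Join as encoding/json
// would put it on the wire. It returns "" if marshalling fails.
func encode(j Join) string {
	b, err := json.Marshal(j)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	// Assumed values: the exact Type strings are not specified above.
	cross := Join{Type: "CROSS", Left: "a", Right: "b"}
	nested := Join{Type: "LEFT", Condition: "USING (id)"}

	fmt.Println(encode(cross))  // the "condition" key is omitted
	fmt.Println(encode(nested)) // the "left" and "right" keys are omitted
}
```

A consumer should therefore treat a missing left, right, or condition key as "not resolved" or "not applicable" rather than as an error.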

type Operation

type Operation string

Operation classifies the top-level kind of a statement. The values are part of the wire format: adding new ones is fine, but renaming an existing one is a breaking change.

const (
	OpSelect Operation = "SELECT"
	OpInsert Operation = "INSERT"
	OpUpsert Operation = "UPSERT"
	OpUpdate Operation = "UPDATE"
	OpDelete Operation = "DELETE"
	OpOther  Operation = "OTHER"
)

Operation values. OpOther is the catch-all for statement kinds that summarize does not structurally decompose; the StatementTag is surfaced via Summary.Tag so agents can still distinguish e.g. DROP TABLE from CREATE TABLE.
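Because new Operation values may be added over time, a consumer may want a default branch that degrades gracefully. The following is a minimal sketch of that idea, not code from the package: the type and constants are redeclared locally, describe is a hypothetical helper, and "MERGE" stands in for an imagined future value.

```go
package main

import "fmt"

// Operation and its values are copied from the documented constants.
type Operation string

const (
	OpSelect Operation = "SELECT"
	OpInsert Operation = "INSERT"
	OpUpsert Operation = "UPSERT"
	OpUpdate Operation = "UPDATE"
	OpDelete Operation = "DELETE"
	OpOther  Operation = "OTHER"
)

// describe is a hypothetical consumer-side helper. It labels the known
// operations and falls back to the statement tag for everything else,
// including values emitted by a newer producer.
func describe(op Operation, tag string) string {
	switch op {
	case OpSelect:
		return "read"
	case OpInsert, OpUpsert, OpUpdate, OpDelete:
		return "write"
	case OpOther:
		return "other: " + tag
	default:
		// Unknown value from a newer version: keep going, surface the tag.
		return "unrecognized (" + string(op) + "): " + tag
	}
}

func main() {
	fmt.Println(describe(OpDelete, "DELETE"))
	fmt.Println(describe(OpOther, "DROP TABLE"))
	fmt.Println(describe(Operation("MERGE"), "MERGE")) // imagined future value
}
```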

type Summary

type Summary struct {
	Operation         Operation     `json:"operation"`
	Tag               string        `json:"tag"`
	Tables            []string      `json:"tables"`
	Predicates        []string      `json:"predicates"`
	Joins             []Join        `json:"joins"`
	AffectedColumns   []string      `json:"affected_columns"`
	ReferencedColumns []string      `json:"referenced_columns"`
	SelectStar        bool          `json:"select_star"`
	RiskLevel         risk.Severity `json:"risk_level"`
	Position          risk.Position `json:"position"`
}

Summary is the per-statement result returned by Summarize. It is the JSON-serializable shape embedded in both the CLI envelope's Data field and any future MCP tool result.

Field discipline:

  • Tables, Predicates, Joins, AffectedColumns, ReferencedColumns are emitted as empty JSON arrays rather than null when there are no entries, so consumers can iterate without nil checks.
  • AffectedColumns contains only columns mutated by DML: the INSERT explicit column list, the UPDATE SET targets, and (for INSERT ... ON CONFLICT DO UPDATE) the conflict-resolution SET targets. It is empty for DELETE and SELECT.
  • ReferencedColumns is the full read-and-write footprint: every column the statement names in any expression position (SELECT projection, WHERE, JOIN ON, GROUP BY, HAVING, ORDER BY, RETURNING, ON CONFLICT body, plus subquery and CTE bodies) unioned with AffectedColumns. It is therefore a superset of AffectedColumns whenever any mutated columns are known. Known gap: JOIN USING (col) names are stored as a NameList, not an Expr, and are not surfaced.
  • SelectStar is true when the statement's outermost projection uses "*" or "t.*" (and for INSERT ... SELECT, when the embedded SELECT does). When set, ReferencedColumns is a lower bound: summarize never expands a star against a catalog because it has no schema. Function-argument stars like count(*) do not set this flag — they don't introduce an unenumerated footprint.
  • RiskLevel is the highest severity reported by risk.Analyze for the statement; risk.SeverityInfo is the baseline meaning "no risks detected", not "an info-level risk was detected".
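From the consumer side, the field discipline above might be exercised roughly as follows. This is a sketch under stated assumptions: summaryView is a local mirror of a subset of Summary (RiskLevel and Position are left out because their encoding belongs to the risk package), the JSON documents are hand-written samples rather than captured output, and coversAffected, footprintComplete, and decode are hypothetical helpers.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// summaryView mirrors a subset of the documented Summary fields and tags.
type summaryView struct {
	Operation         string   `json:"operation"`
	Tables            []string `json:"tables"`
	AffectedColumns   []string `json:"affected_columns"`
	ReferencedColumns []string `json:"referenced_columns"`
	SelectStar        bool     `json:"select_star"`
}

// coversAffected reports whether every affected column also appears in
// the referenced set, i.e. the documented superset relationship.
func coversAffected(s summaryView) bool {
	seen := make(map[string]bool, len(s.ReferencedColumns))
	for _, c := range s.ReferencedColumns {
		seen[c] = true
	}
	for _, c := range s.AffectedColumns {
		if !seen[c] {
			return false
		}
	}
	return true
}

// footprintComplete reports whether ReferencedColumns can be treated as
// exhaustive; with a star projection it is only a lower bound.
func footprintComplete(s summaryView) bool {
	return !s.SelectStar
}

// decode parses one summary object.
func decode(doc string) (summaryView, error) {
	var s summaryView
	err := json.Unmarshal([]byte(doc), &s)
	return s, err
}

func main() {
	// Hand-written sample for an UPDATE, not captured output.
	s, err := decode(`{"operation":"UPDATE","tables":["orders"],
		"affected_columns":["status"],"referenced_columns":["id","status"],
		"select_star":false}`)
	if err != nil {
		fmt.Println("decode:", err)
		return
	}
	fmt.Println(coversAffected(s), footprintComplete(s))

	// An empty JSON array decodes to a non-nil, zero-length slice, so
	// ranging over it needs no nil check.
	for _, c := range s.AffectedColumns {
		fmt.Println("mutated:", c)
	}
}
```

When footprintComplete returns false, a cautious caller would treat the statement as potentially touching every column of the listed tables.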

func Parsed

func Parsed(stmts statements.Statements, sql string) []Summary

Parsed produces summaries for already-parsed statements. sql is the original source text; it is required so that the unexported positionFromSQL helper can locate each statement's line and column. Parsed is exposed so that callers that have already invoked parser.Parse (e.g. to also run version.Inspect on the same AST) can reuse the parsed output rather than reparsing. This mirrors the Classify/ClassifyParsed split in package sqlparse; the shorter name avoids stuttering with the package name.
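To see why the source text has to travel alongside the parsed statements, consider how a line and column could be derived from a byte offset. The sketch below is not the package's positionFromSQL, whose implementation is not shown here; lineCol is a hypothetical stand-in that assumes 1-based positions with columns counted in bytes.

```go
package main

import (
	"fmt"
	"strings"
)

// lineCol converts a byte offset into sql to a 1-based line and column.
// Columns are counted in bytes; out-of-range offsets are clamped.
func lineCol(sql string, offset int) (line, col int) {
	if offset > len(sql) {
		offset = len(sql)
	}
	if offset < 0 {
		offset = 0
	}
	prefix := sql[:offset]
	// Every newline before the offset starts a new line.
	line = strings.Count(prefix, "\n") + 1
	// The column is the distance from the byte after the last newline.
	col = offset - (strings.LastIndex(prefix, "\n") + 1) + 1
	return line, col
}

func main() {
	sql := "SELECT 1;\nDELETE FROM orders;"
	off := strings.Index(sql, "DELETE")
	line, col := lineCol(sql, off)
	fmt.Println(line, col)
}
```

None of this can be computed from the AST alone, which is presumably why Parsed takes sql as a second argument.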

func Summarize

func Summarize(sql string) ([]Summary, error)

Summarize parses sql and returns one Summary per statement, in source order. Parse errors are returned to the caller; partial summaries are not produced.
