diag

package

v0.1.0 (latest) Published: Apr 24, 2026 License: Apache-2.0 Imports: 10 Imported by: 0

Repository

github.com/spilchen/sql-ai-tools

Documentation ¶

Overview ¶

Package diag converts CockroachDB parser errors into the structured output.Error type that the CLI envelope exposes to agents. The enrichment has two layers:

Position extraction (position.go): parses the pgerror Detail caret string to compute a 1-based line/column/byte_offset relative to the full SQL input.
Error mapping (this file): extracts the SQLSTATE code, severity, and human-readable message from the parser's error chain via pgerror, then attaches the position.

Index ¶

Constants
func AdjustPosition(pos *output.Position, originalSQL string, translate func(int) int) *output.Position
func CategoryForCode(code string) string
func CharIndexToByteOffset(sql string, charIdx int) int
func ExtractPosition(detail, fullSQL string) *output.Position
func FromClusterError(err error, fullSQL string) output.Error
func FromParseError(err error, fullSQL string) output.Error
func FromTypeError(err error, exprText string, fullSQL string) output.Error
func IsKeyword(token string) bool
func PositionFromByteOffset(fullSQL string, byteOffset int) *output.Position
func Suggest(misspelled string, candidates []string, pos *output.Position) []output.Suggestion
func SuggestKeyword(token string, pos *output.Position) []output.Suggestion

Constants ¶

const (
	CategorySyntaxError        = "syntax_error"
	CategoryTypeMismatch       = "type_mismatch"
	CategoryUnknownColumn      = "unknown_column"
	CategoryUnknownTable       = "unknown_table"
	CategoryUnknownFunction    = "unknown_function"
	CategoryAmbiguousReference = "ambiguous_reference"
	CategoryPermissionDenied   = "permission_denied"
	CategoryConnectionError    = "connection_error"
	CategoryQueryCanceled      = "query_canceled"
)

Category constants match the design doc (search "Error categorization") and are the wire-format strings agents receive in the JSON envelope.

const (
	ReasonLevenshtein        = "levenshtein_distance"
	ReasonDamerauLevenshtein = "damerau_levenshtein_distance"
)

ReasonLevenshtein and ReasonDamerauLevenshtein are the metric prefixes embedded in the output.Suggestion.Reason field. Each emitted Reason has the shape "<prefix>_<distance>", for example "levenshtein_distance_1" or "damerau_levenshtein_distance_2", so callers branching on the metric should compare with strings.HasPrefix(s.Reason, ReasonLevenshtein+"_") rather than testing for equality. The trailing distance integer is comparable only within the same metric: an adjacent transposition counts as one edit under Damerau-Levenshtein but as two under Levenshtein.
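The prefix comparison described above might look like the following on the caller side. This is a minimal sketch: splitReason is a hypothetical helper, not part of this package, and the two constants are local copies of the documented values so the example compiles on its own.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Local copies of the documented constant values.
const (
	reasonLevenshtein        = "levenshtein_distance"
	reasonDamerauLevenshtein = "damerau_levenshtein_distance"
)

// splitReason (hypothetical) separates a Reason string of the shape
// "<prefix>_<distance>" into its metric prefix and integer distance.
func splitReason(reason string) (metric string, distance int, ok bool) {
	for _, prefix := range []string{reasonLevenshtein, reasonDamerauLevenshtein} {
		if !strings.HasPrefix(reason, prefix+"_") {
			continue
		}
		d, err := strconv.Atoi(strings.TrimPrefix(reason, prefix+"_"))
		if err != nil {
			return "", 0, false
		}
		return prefix, d, true
	}
	return "", 0, false
}

func main() {
	for _, r := range []string{"levenshtein_distance_1", "damerau_levenshtein_distance_2", "other"} {
		metric, d, ok := splitReason(r)
		fmt.Println(metric, d, ok)
	}
}
```

Because "damerau_levenshtein_distance_2" does not start with "levenshtein_distance_", the order in which the two prefixes are tried does not matter.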

Variables ¶

This section is empty.

Functions ¶

func AdjustPosition ¶

func AdjustPosition(
	pos *output.Position, originalSQL string, translate func(int) int,
) *output.Position

AdjustPosition translates a Position computed against a stripped SQL buffer back to coordinates in the original input. originalSQL is the pre-strip text; translate maps stripped byte offsets to original byte offsets (typically sqlformat.StripResult.Translate). Returns nil when pos is nil so callers can chain the call unconditionally.

Line and Column are recomputed against originalSQL via the same 1-based / byte-counting convention used by the rest of this package (see lineColumn). Pass a no-op identity translate (or simply skip the call) when the stripper did not modify the input — every other field then re-derives to the same values.
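The translation can be pictured with a small stand-alone sketch. Everything here is an assumption made for illustration: the position struct only mirrors the three fields named in this documentation, lineColumn is a guess at the 1-based, byte-counting helper, and the translate closure stands in for something like sqlformat.StripResult.Translate.

```go
package main

import (
	"fmt"
	"strings"
)

// position is a stand-in for output.Position with only the fields used here.
type position struct {
	Line, Column, ByteOffset int
}

// lineColumn returns the 1-based line and byte-counted column of off in sql.
func lineColumn(sql string, off int) (line, col int) {
	if off < 0 {
		off = 0
	}
	if off > len(sql) {
		off = len(sql)
	}
	line = 1 + strings.Count(sql[:off], "\n")
	// LastIndex returns -1 when there is no earlier newline, giving col = off+1.
	col = off - strings.LastIndex(sql[:off], "\n")
	return line, col
}

// adjustPosition maps a position in the stripped buffer back to originalSQL.
func adjustPosition(pos *position, originalSQL string, translate func(int) int) *position {
	if pos == nil {
		return nil
	}
	off := translate(pos.ByteOffset)
	line, col := lineColumn(originalSQL, off)
	return &position{Line: line, Column: col, ByteOffset: off}
}

func main() {
	original := "-- note\nSELECT x"
	// Pretend the stripper removed the 8-byte comment line.
	translate := func(stripped int) int { return stripped + 8 }
	stripped := &position{Line: 1, Column: 8, ByteOffset: 7}
	fmt.Printf("%+v\n", *adjustPosition(stripped, original, translate))
}
```

In this example the error sits on line 1 of the stripped buffer but on line 2 of the original input, which is why Line and Column have to be recomputed rather than copied.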

func CategoryForCode ¶

func CategoryForCode(code string) string

CategoryForCode returns the agent-facing category string for the given SQLSTATE code. It tries an exact 5-character match first, then falls back to a 2-character class-level match. If neither matches, it returns the empty string (the Category field in output.Error is omitempty, so unmapped codes simply omit the field from JSON output).
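The two-step lookup can be sketched as follows. The table contents are assumptions chosen for illustration: the SQLSTATE codes are standard PostgreSQL codes, but which codes this package actually maps to which category is not shown in this documentation.

```go
package main

import "fmt"

// Illustrative tables only; the package's real mapping may differ.
var exactCategories = map[string]string{
	"42601": "syntax_error",
	"42703": "unknown_column",
	"42P01": "unknown_table",
	"57014": "query_canceled",
}

var classCategories = map[string]string{
	"08": "connection_error", // SQLSTATE class 08: connection exception
}

// categoryForCode tries the full 5-character code, then the 2-character class.
func categoryForCode(code string) string {
	if c, ok := exactCategories[code]; ok {
		return c
	}
	if len(code) >= 2 {
		if c, ok := classCategories[code[:2]]; ok {
			return c
		}
	}
	return "" // unmapped: an omitempty field would drop out of the JSON
}

func main() {
	for _, code := range []string{"42601", "08006", "XX000"} {
		fmt.Printf("%s -> %q\n", code, categoryForCode(code))
	}
}
```

The class-level fallback means a newly introduced code within an already-mapped class still receives a category without any table change.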

func CharIndexToByteOffset ¶

func CharIndexToByteOffset(sql string, charIdx int) int

CharIndexToByteOffset converts a 0-based UTF-8 character (rune) index within sql into a 0-based byte offset. Used to translate the pgwire protocol's character-based Position field into the byte offset the rest of this package operates on. Negative indices clamp to 0; indices past the rune count clamp to len(sql).
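A conversion with the documented clamping behavior could be written as below. This is a sketch of the technique rather than the package's source; it relies on the fact that ranging over a Go string yields the byte offset at which each rune starts.

```go
package main

import "fmt"

// charIndexToByteOffset converts a 0-based rune index into a 0-based byte
// offset, clamping to [0, len(sql)].
func charIndexToByteOffset(sql string, charIdx int) int {
	if charIdx <= 0 {
		return 0
	}
	n := 0
	for byteOff := range sql { // byteOff is the start of each rune
		if n == charIdx {
			return byteOff
		}
		n++
	}
	return len(sql)
}

func main() {
	sql := "SELECT 'é', x"
	// 'é' occupies two bytes, so rune index 12 ('x') is byte offset 13.
	fmt.Println(charIndexToByteOffset(sql, 12))
	fmt.Println(charIndexToByteOffset(sql, -5), charIndexToByteOffset(sql, 99))
}
```

Since FromClusterError describes the pgwire Position as 1-based, a caller would presumably subtract one before passing it to a 0-based conversion like this one; that step is an inference from the two descriptions, not something stated here.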

func ExtractPosition ¶

func ExtractPosition(detail, fullSQL string) *output.Position

ExtractPosition parses a pgerror Detail string produced by the CockroachDB parser's PopulateErrorDetails and computes a 1-based line/column Position relative to fullSQL.

The Detail format is:

source SQL:\n<stmt_sql_up_to_error_line>\n<spaces>^

For multi-statement input the Detail contains only the failing statement's SQL fragment. ExtractPosition locates that fragment within fullSQL via strings.LastIndex and adjusts the offset. LastIndex is used because the parser processes statements left-to-right and fails on the last attempted one; if the same fragment appears earlier (e.g. inside a valid statement), the last occurrence is the correct match.

Returns nil when detail is empty, lacks the expected prefix, or does not contain a caret line.

func FromClusterError ¶

func FromClusterError(err error, fullSQL string) output.Error

FromClusterError converts a cluster-side error into a structured output.Error.

When err's chain contains a *pgconn.PgError (a pgwire protocol error from the server), Code, Severity, Message, Category, and Position (when fullSQL is supplied and the server reported a position) are populated from it. Any error whose chain does not contain a *pgconn.PgError falls back to the generic internal_error shape so callers always get a single envelope schema.

fullSQL is the originating statement; pass "" when no statement is associated with the call. The pgwire Position field is a 1-based character index into the original query, so it is only meaningful when fullSQL is provided.

func FromParseError ¶

func FromParseError(err error, fullSQL string) output.Error

FromParseError converts a Go error returned by parser.Parse into a structured output.Error with SQLSTATE code, severity, message, and source position. fullSQL is the complete SQL input; it is used to compute position relative to the full input when the parser operated on a per-statement fragment.

The returned Error always has Code, Severity, and Message populated. Position is nil when the error lacks the expected pgerror Detail format (e.g. non-parser errors passed by mistake).

func FromTypeError ¶

func FromTypeError(err error, exprText string, fullSQL string) output.Error

FromTypeError converts a type-check error into a structured output.Error. exprText is the formatted expression that failed (used to locate the error position within fullSQL via substring match). Position is nil when exprText cannot be found.

func IsKeyword ¶

func IsKeyword(token string) bool

IsKeyword reports whether token is a recognized SQL keyword, ignoring case. lexbase.KeywordsCategories keys are stored lower-cased, so token is lower-cased before the lookup.

func PositionFromByteOffset ¶

func PositionFromByteOffset(fullSQL string, byteOffset int) *output.Position

PositionFromByteOffset converts a 0-based byte offset within fullSQL into a 1-based line/column Position. Returns nil if byteOffset is negative.

func Suggest ¶

func Suggest(misspelled string, candidates []string, pos *output.Position) []output.Suggestion

Suggest returns up to three "did you mean?" fix suggestions for the misspelled token, ranked by Levenshtein edit distance against candidates. Suggestions carry a Reason prefix of ReasonLevenshtein.

Suggestions are filtered by a length-scaled threshold: short names permit fewer edits (for a two-character name such as "id", a threshold of 2 would match every other two-character name, which is too loose to be useful), while longer names tolerate up to three. The full rule lives in maxDistance.

Returns nil when:

misspelled is empty;
pos is nil (no byte range can be computed, so the suggestion cannot be applied programmatically by the agent);
candidates is empty;
no candidate is within the length-scaled threshold.

Per-candidate filtering: empty strings and case-insensitive exact matches against misspelled are skipped (they don't represent a useful fix). If every candidate is filtered, the function returns nil along the "no candidate within threshold" path.

The returned Range covers [pos.ByteOffset, pos.ByteOffset + len(misspelled)) in the original SQL input. Confidence is in [0, 1] and rounded to two decimals.

candidates is read but not retained, and Suggestion.Replacement borrows the underlying string from candidates (Go strings are immutable, so this is safe for typical []string callers).
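The ranking and filtering rules can be sketched with a classic two-row Levenshtein computation. This is not the package's code: the threshold is passed in directly because the maxDistance rule is not reproduced in this documentation, distances are computed over bytes (fine for ASCII identifiers), and the Confidence value is omitted because its formula is not given.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// levenshtein returns the byte-wise edit distance between a and b.
func levenshtein(a, b string) int {
	prev := make([]int, len(b)+1)
	cur := make([]int, len(b)+1)
	for j := range prev {
		prev[j] = j
	}
	for i := 1; i <= len(a); i++ {
		cur[0] = i
		for j := 1; j <= len(b); j++ {
			best := prev[j-1] // substitution, or a match at no cost
			if a[i-1] != b[j-1] {
				best++
			}
			if v := prev[j] + 1; v < best { // deletion
				best = v
			}
			if v := cur[j-1] + 1; v < best { // insertion
				best = v
			}
			cur[j] = best
		}
		prev, cur = cur, prev
	}
	return prev[len(b)]
}

// suggestion is a stand-in for output.Suggestion.
type suggestion struct {
	Replacement string
	Distance    int
	Start, End  int // half-open byte range [Start, End)
}

// suggest returns up to three candidates within threshold, closest first.
func suggest(misspelled string, candidates []string, byteOffset, threshold int) []suggestion {
	if misspelled == "" {
		return nil
	}
	var out []suggestion
	for _, c := range candidates {
		if c == "" || strings.EqualFold(c, misspelled) {
			continue // empty strings and exact matches are not useful fixes
		}
		if d := levenshtein(misspelled, c); d <= threshold {
			out = append(out, suggestion{c, d, byteOffset, byteOffset + len(misspelled)})
		}
	}
	sort.SliceStable(out, func(i, j int) bool { return out[i].Distance < out[j].Distance })
	if len(out) > 3 {
		out = out[:3]
	}
	return out
}

func main() {
	fmt.Printf("%+v\n", suggest("nmae", []string{"name", "id", "email", "NMAE"}, 7, 2))
}
```

Note that the transposed "nmae" is two edits away from "name" under this metric, which is the difference from Damerau-Levenshtein that the Reason prefixes exist to signal.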

func SuggestKeyword ¶

func SuggestKeyword(token string, pos *output.Position) []output.Suggestion

SuggestKeyword returns up to three "did you mean?" fix suggestions when token is close to a recognized SQL keyword under Damerau-Levenshtein distance. The cap is min(2, maxDistance(len(token))), so tokens up to 3 characters are capped at 1 edit and longer tokens at 2 edits — see keywordSuggestionDistanceCap for the rationale, and maxDistance for the per-length scaling. Suggestions carry a Reason prefix of ReasonDamerauLevenshtein, so callers branching on Reason can tell this metric apart from the classic Levenshtein suggestions emitted by Suggest.

SuggestKeyword is the keyword-typo equivalent of Suggest, used by FromParseError to enrich syntax errors.

Returns nil under the same conditions as Suggest (empty token, nil pos, no candidate within distance). It additionally returns nil when token is itself a recognized SQL keyword: the error is then a misuse-of-keyword situation (wrong keyword for this grammatical slot, e.g. `INSERT FROM t` flags FROM because INSERT requires INTO), not a typo, so a "did you mean?" would either echo the same word back or fire on a coincidentally-close keyword.

Token preconditions: SuggestKeyword does not validate that token looks like an identifier; passing "3abc" or "(*)" still runs the edit-distance computation against the full keyword candidate list and may return a coincidental hit. The intended caller is FromParseError, which uses identifierAt to filter out non-identifier offsets before invoking it; standalone callers should apply equivalent filtering.
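To make the keyword path concrete, here is a sketch using the optimal string alignment form of Damerau-Levenshtein, in which an adjacent transposition costs one edit. Which Damerau-Levenshtein variant the package uses is not stated, so that choice is an assumption, as are the tiny keyword list, the closestKeyword helper, and the requirement that token is already lower-cased.

```go
package main

import "fmt"

// osaDistance is the optimal-string-alignment edit distance over bytes.
func osaDistance(a, b string) int {
	d := make([][]int, len(a)+1)
	for i := range d {
		d[i] = make([]int, len(b)+1)
		d[i][0] = i
	}
	for j := 0; j <= len(b); j++ {
		d[0][j] = j
	}
	for i := 1; i <= len(a); i++ {
		for j := 1; j <= len(b); j++ {
			best := d[i-1][j-1] // substitution, or a match at no cost
			if a[i-1] != b[j-1] {
				best++
			}
			if v := d[i-1][j] + 1; v < best { // deletion
				best = v
			}
			if v := d[i][j-1] + 1; v < best { // insertion
				best = v
			}
			// Adjacent transposition: "ce" versus "ec" costs one edit.
			if i > 1 && j > 1 && a[i-1] == b[j-2] && a[i-2] == b[j-1] {
				if v := d[i-2][j-2] + 1; v < best {
					best = v
				}
			}
			d[i][j] = best
		}
	}
	return d[len(a)][len(b)]
}

// Illustrative keyword list; the real candidates come from the SQL lexer.
var keywords = []string{"select", "from", "where", "insert", "into"}

// closestKeyword returns the nearest keyword within limit edits, or "" when
// token is itself a keyword or nothing is close enough.
func closestKeyword(token string, limit int) string {
	best, bestDist := "", limit+1
	for _, kw := range keywords {
		if kw == token {
			return "" // misuse of a real keyword, not a typo
		}
		if d := osaDistance(token, kw); d < bestDist {
			best, bestDist = kw, d
		}
	}
	return best
}

func main() {
	fmt.Println(osaDistance("selcet", "select"))
	fmt.Printf("%q %q\n", closestKeyword("selcet", 2), closestKeyword("from", 2))
}
```

The limit argument corresponds to the documented cap of min(2, maxDistance(len(token))); it is passed in here because maxDistance itself is not reproduced.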

Types ¶

This section is empty.
