spark

package
v3.8.2 Latest
Warning: This package is not in the latest version of its module.
Published: May 1, 2026 License: Apache-2.0 Imports: 5 Imported by: 0

Documentation

Overview

Package spark implements the Apache Spark SQL dialect for cel2sql.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Dialect

type Dialect struct{}

Dialect implements dialect.Dialect for Apache Spark SQL.

func New

func New() *Dialect

New creates a new Spark SQL dialect.

func (*Dialect) ConvertRegex

func (d *Dialect) ConvertRegex(re2Pattern string) (string, bool, error)

ConvertRegex converts an RE2 regex pattern to Spark/Java regex format. Spark uses java.util.regex.Pattern, which is largely a superset of RE2 for the safe patterns cel2sql accepts.

func (*Dialect) MaxIdentifierLength

func (d *Dialect) MaxIdentifierLength() int

MaxIdentifierLength returns 128 for Spark (Hive-derived limit).

func (*Dialect) Name

func (d *Dialect) Name() dialect.Name

Name returns the dialect name.

func (*Dialect) ReservedKeywords

func (d *Dialect) ReservedKeywords() map[string]bool

ReservedKeywords returns the set of reserved SQL keywords for Spark.

func (*Dialect) SupportsIndexAnalysis

func (d *Dialect) SupportsIndexAnalysis() bool

SupportsIndexAnalysis returns false. Spark indexing is highly storage-layer-specific (Delta Z-order vs Iceberg sort vs plain Parquet) and out of scope for v1.

func (*Dialect) SupportsJSONB

func (d *Dialect) SupportsJSONB() bool

SupportsJSONB returns false as Spark has no separate JSONB type.

func (*Dialect) SupportsNativeArrays

func (d *Dialect) SupportsNativeArrays() bool

SupportsNativeArrays returns true as Spark has native ARRAY<T> types.

func (*Dialect) SupportsRegex

func (d *Dialect) SupportsRegex() bool

SupportsRegex returns true as Spark supports regex via RLIKE.

func (*Dialect) ValidateFieldName

func (d *Dialect) ValidateFieldName(name string) error

ValidateFieldName validates a field name against Spark naming rules.

func (*Dialect) WriteArrayLength

func (d *Dialect) WriteArrayLength(w *strings.Builder, dimension int, writeExpr func() error) error

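WriteArrayLength writes COALESCE(size(expr), 0) for Spark. With default settings Spark's size() returns -1 for NULL input (it returns NULL when spark.sql.legacy.sizeOfNull is false or ANSI mode is enabled); the COALESCE wrapper is intended to match cel2sql semantics, where size(null) should be 0.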
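The callback-based writer signature used throughout this type can be illustrated with a minimal standalone sketch. The helpers `writeArrayLength` and `arrayLengthSQL` are hypothetical and only mirror the documented output shape; they are not the package's implementation, and the `dimension` argument is left out because its handling is not described here.

```go
package main

import (
	"fmt"
	"strings"
)

// writeArrayLength is a hypothetical stand-in that mirrors the documented
// output shape COALESCE(size(expr), 0). writeExpr emits the array expression
// into the same builder.
func writeArrayLength(w *strings.Builder, writeExpr func() error) error {
	w.WriteString("COALESCE(size(")
	if err := writeExpr(); err != nil {
		return err
	}
	w.WriteString("), 0)")
	return nil
}

// arrayLengthSQL renders the length expression for a column name, which is
// assumed to be an already validated identifier.
func arrayLengthSQL(column string) (string, error) {
	var w strings.Builder
	err := writeArrayLength(&w, func() error {
		w.WriteString(column)
		return nil
	})
	return w.String(), err
}

func main() {
	sql, err := arrayLengthSQL("tags")
	if err != nil {
		panic(err)
	}
	fmt.Println(sql) // COALESCE(size(tags), 0)
}
```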

func (*Dialect) WriteArrayLiteralClose

func (d *Dialect) WriteArrayLiteralClose(w *strings.Builder)

WriteArrayLiteralClose writes the Spark array literal closing.

func (*Dialect) WriteArrayLiteralOpen

func (d *Dialect) WriteArrayLiteralOpen(w *strings.Builder)

WriteArrayLiteralOpen writes the Spark array literal opening, `array(`.

func (*Dialect) WriteArrayMembership

func (d *Dialect) WriteArrayMembership(w *strings.Builder, writeElem, writeArray func() error) error

WriteArrayMembership writes a Spark array membership test using array_contains().

func (*Dialect) WriteArraySubqueryExprClose

func (d *Dialect) WriteArraySubqueryExprClose(w *strings.Builder)

WriteArraySubqueryExprClose closes the collect_list() argument list.

func (*Dialect) WriteArraySubqueryOpen

func (d *Dialect) WriteArraySubqueryOpen(w *strings.Builder)

WriteArraySubqueryOpen writes the Spark array-building subquery prefix. Spark has no ARRAY(SELECT ...) constructor; collect_list() inside a subquery is the closest equivalent.

func (*Dialect) WriteBytesLiteral

func (d *Dialect) WriteBytesLiteral(w *strings.Builder, value []byte) error

WriteBytesLiteral writes a Spark SQL byte literal as X'HEX'.
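As a rough illustration of the X'HEX' form, the sketch below hex-encodes a byte slice. The helper `bytesLiteral` is hypothetical, and the choice of uppercase hex digits is an assumption rather than something stated above.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"strings"
)

// bytesLiteral is a hypothetical helper that renders value in the X'HEX'
// form. Uppercasing the hex digits is an assumption of this sketch.
func bytesLiteral(value []byte) string {
	return "X'" + strings.ToUpper(hex.EncodeToString(value)) + "'"
}

func main() {
	fmt.Println(bytesLiteral([]byte{0xDE, 0xAD, 0xBE, 0xEF})) // X'DEADBEEF'
}
```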

func (*Dialect) WriteCastToNumeric

func (d *Dialect) WriteCastToNumeric(w *strings.Builder)

WriteCastToNumeric writes a Spark numeric coercion suffix. Spark does not support PostgreSQL-style `::TYPE` postfix casts, so we use the arithmetic coercion `+ 0` (same convention MySQL and SQLite use): `'5' + 0` evaluates as a number in Spark, ensuring JSON text extractions are compared numerically rather than lexicographically.
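The difference between lexicographic and numeric comparison is easy to reproduce in plain Go. The sketch below is only an analogy for what the `+ 0` suffix is meant to achieve; `lexLess`, `numLess`, and `withNumericCoercion` are hypothetical names, and the exact spacing of the emitted suffix is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
)

// lexLess compares two values as text, the way uncoerced JSON text
// extractions would be compared.
func lexLess(a, b string) bool { return a < b }

// numLess parses both values and compares them as numbers.
func numLess(a, b string) (bool, error) {
	x, err := strconv.ParseFloat(a, 64)
	if err != nil {
		return false, err
	}
	y, err := strconv.ParseFloat(b, 64)
	if err != nil {
		return false, err
	}
	return x < y, nil
}

// withNumericCoercion appends the arithmetic coercion suffix to an
// expression (hypothetical helper; spacing is an assumption).
func withNumericCoercion(expr string) string { return expr + " + 0" }

func main() {
	fmt.Println(lexLess("10", "9")) // true: "1" sorts before "9" as text

	less, err := numLess("10", "9")
	if err != nil {
		panic(err)
	}
	fmt.Println(less) // false: 10 is not less than 9

	fmt.Println(withNumericCoercion("get_json_object(doc, '$.age')") + " > 9")
}
```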

func (*Dialect) WriteContains

func (d *Dialect) WriteContains(w *strings.Builder, writeHaystack, writeNeedle func() error) error

WriteContains writes Spark string contains using LOCATE() > 0. LOCATE(substr, str) returns 1-based position or 0 when not found.
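The LOCATE contract described above (1-based position, 0 when absent) can be mimicked in Go to show why `> 0` works as a containment test. The functions `locate` and `contains` are hypothetical and model only the behavior stated here, not Spark's full semantics.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// locate mimics the documented LOCATE(substr, str) contract: the 1-based
// character position of the first match, or 0 when there is no match.
func locate(substr, str string) int {
	idx := strings.Index(str, substr) // byte offset, or -1 when not found
	if idx < 0 {
		return 0
	}
	// Convert the byte offset to a 1-based character position.
	return utf8.RuneCountInString(str[:idx]) + 1
}

// contains applies the "LOCATE(...) > 0" test.
func contains(haystack, needle string) bool {
	return locate(needle, haystack) > 0
}

func main() {
	fmt.Println(locate("park", "spark"), contains("spark", "park")) // 2 true
	fmt.Println(locate("hive", "spark"), contains("spark", "hive")) // 0 false
}
```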

func (*Dialect) WriteDuration

func (d *Dialect) WriteDuration(w *strings.Builder, value int64, unit string)

WriteDuration writes a Spark INTERVAL literal.

func (*Dialect) WriteEmptyTypedArray

func (d *Dialect) WriteEmptyTypedArray(w *strings.Builder, typeName string)

WriteEmptyTypedArray writes an empty Spark typed array.

func (*Dialect) WriteEpochExtract

func (d *Dialect) WriteEpochExtract(w *strings.Builder, writeExpr func() error) error

WriteEpochExtract writes UNIX_TIMESTAMP(expr) for Spark.

func (*Dialect) WriteExtract

func (d *Dialect) WriteExtract(w *strings.Builder, part string, writeExpr func() error, writeTZ func() error) error

WriteExtract writes a Spark EXTRACT expression. Spark's dayofweek() returns 1=Sunday..7=Saturday, whereas the CEL convention is 0=Sunday..6=Saturday.
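The two day-of-week numberings differ by a constant offset, which the following sketch makes concrete. It assumes the adjustment is a plain subtraction of 1; how the dialect actually emits that adjustment in SQL is not stated here. The helper names are hypothetical. Go's `time.Weekday` happens to use the same 0=Sunday..6=Saturday numbering as CEL.

```go
package main

import (
	"fmt"
	"time"
)

// sparkDayOfWeek models the 1=Sunday..7=Saturday numbering documented for
// Spark's dayofweek(). time.Weekday is 0=Sunday..6=Saturday.
func sparkDayOfWeek(t time.Time) int {
	return int(t.Weekday()) + 1
}

// celDayOfWeek shifts a Spark day number to the 0=Sunday..6=Saturday
// convention (assumed adjustment: subtract 1).
func celDayOfWeek(sparkDOW int) int {
	return sparkDOW - 1
}

// dayNumbers returns both numberings for a calendar date in UTC.
func dayNumbers(year, month, day int) (spark, cel int) {
	t := time.Date(year, time.Month(month), day, 0, 0, 0, 0, time.UTC)
	spark = sparkDayOfWeek(t)
	return spark, celDayOfWeek(spark)
}

func main() {
	fmt.Println(dayNumbers(2024, 1, 7))  // Sunday: 1 0
	fmt.Println(dayNumbers(2024, 1, 13)) // Saturday: 7 6
}
```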

func (*Dialect) WriteInterval

func (d *Dialect) WriteInterval(w *strings.Builder, writeValue func() error, unit string) error

WriteInterval writes a Spark INTERVAL expression.

func (*Dialect) WriteJSONArrayElements

func (d *Dialect) WriteJSONArrayElements(w *strings.Builder, _, _ bool, writeExpr func() error) error

WriteJSONArrayElements writes Spark JSON array expansion as EXPLODE(from_json(...)). The converter uses this in `FROM <here> AS iter`, so the result must be a set-returning expression. EXPLODE turns the parsed array into a relation of element rows. Element type is fixed to STRING in v1; comparisons coerce via arithmetic context (see WriteCastToNumeric).

func (*Dialect) WriteJSONArrayLength

func (d *Dialect) WriteJSONArrayLength(w *strings.Builder, writeExpr func() error) error

WriteJSONArrayLength writes COALESCE(size(from_json(expr, 'ARRAY<STRING>')), 0) for Spark.

func (*Dialect) WriteJSONArrayMembership

func (d *Dialect) WriteJSONArrayMembership(w *strings.Builder, _ string, writeExpr func() error) error

WriteJSONArrayMembership writes Spark JSON array membership as a scalar subquery that scans elements. The converter writes `lhs = ` before this, so the result is `lhs = (SELECT col FROM (SELECT EXPLODE(from_json(rhs, 'ARRAY<STRING>')) AS col) t)`. This mirrors SQLite's `lhs = (SELECT value FROM json_each(...))` pattern; both dialects rely on the subquery returning at most one match for the comparison to succeed.
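The split between the converter's `lhs = ` prefix and the dialect's subquery can be sketched as two string-building steps. Both helpers are hypothetical, and the column names `name` and `tags_json` are made up for illustration; only the overall SQL shape is taken from the description above.

```go
package main

import "fmt"

// jsonArrayMembershipSubquery builds the scalar subquery part that the
// dialect is described as writing (hypothetical helper).
func jsonArrayMembershipSubquery(rhs string) string {
	return "(SELECT col FROM (SELECT EXPLODE(from_json(" + rhs +
		", 'ARRAY<STRING>')) AS col) t)"
}

// membershipPredicate prepends the "lhs = " prefix that the converter is
// described as writing before the subquery.
func membershipPredicate(lhs, rhs string) string {
	return lhs + " = " + jsonArrayMembershipSubquery(rhs)
}

func main() {
	fmt.Println(membershipPredicate("name", "tags_json"))
}
```

Running the sketch prints the documented shape with `name` as the left-hand side and `tags_json` as the JSON array expression.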

func (*Dialect) WriteJSONExistence

func (d *Dialect) WriteJSONExistence(w *strings.Builder, _ bool, fieldName string, writeBase func() error) error

WriteJSONExistence writes a Spark JSON key existence check.

func (*Dialect) WriteJSONExtractPath

func (d *Dialect) WriteJSONExtractPath(w *strings.Builder, pathSegments []string, writeRoot func() error) error

WriteJSONExtractPath writes a Spark JSON path existence check using get_json_object.

func (*Dialect) WriteJSONFieldAccess

func (d *Dialect) WriteJSONFieldAccess(w *strings.Builder, writeBase func() error, fieldName string, _ bool) error

WriteJSONFieldAccess writes Spark JSON field access using get_json_object. Spark's get_json_object always returns a string; the same function is used for both intermediate and final access (Spark has no JSON_QUERY equivalent).
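Because the same function serves intermediate and final access, nested fields can be pictured as nested get_json_object calls. This is a sketch under that assumption; the helper `jsonFieldAccess` is hypothetical, the field names are assumed to need no escaping, and the actual nesting or path layout the dialect emits may differ.

```go
package main

import "fmt"

// jsonFieldAccess wraps base in a get_json_object call for one field.
// Field names are assumed to be simple identifiers that need no escaping.
func jsonFieldAccess(base, field string) string {
	return "get_json_object(" + base + ", '$." + field + "')"
}

func main() {
	// Intermediate access (user) and final access (name) use the same call.
	user := jsonFieldAccess("data", "user")
	fmt.Println(jsonFieldAccess(user, "name"))
	// get_json_object(get_json_object(data, '$.user'), '$.name')
}
```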

func (*Dialect) WriteJoin

func (d *Dialect) WriteJoin(w *strings.Builder, writeArray, writeDelim func() error) error

WriteJoin writes Spark array join using array_join().

func (*Dialect) WriteLikeEscape

func (d *Dialect) WriteLikeEscape(w *strings.Builder)

WriteLikeEscape writes the Spark SQL LIKE escape clause.

func (*Dialect) WriteListIndex

func (d *Dialect) WriteListIndex(w *strings.Builder, writeArray, writeIndex func() error) error

WriteListIndex writes Spark 0-indexed array access.

func (*Dialect) WriteListIndexConst

func (d *Dialect) WriteListIndexConst(w *strings.Builder, writeArray func() error, index int64) error

WriteListIndexConst writes a Spark constant array index (0-indexed).

func (*Dialect) WriteNestedJSONArrayMembership

func (d *Dialect) WriteNestedJSONArrayMembership(w *strings.Builder, writeExpr func() error) error

WriteNestedJSONArrayMembership writes Spark nested JSON array membership.

func (*Dialect) WriteParamPlaceholder

func (d *Dialect) WriteParamPlaceholder(w *strings.Builder, _ int)

WriteParamPlaceholder writes a Spark SQL positional parameter (?). The paramIndex argument is intentionally unused: Spark JDBC uses positional ? placeholders, so the converter relies on parameter order.
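Relying on parameter order means the SQL text and the argument slice have to be built in lockstep. A minimal sketch of that idea follows; the `query` type and its `param` method are hypothetical and are not part of this package.

```go
package main

import (
	"fmt"
	"strings"
)

// query is a hypothetical accumulator pairing SQL text with its arguments.
type query struct {
	sql  strings.Builder
	args []any
}

// param writes a positional "?" and records the value in the same order.
func (q *query) param(v any) {
	q.sql.WriteString("?")
	q.args = append(q.args, v)
}

// build assembles a two-parameter predicate; the i-th "?" in the text
// corresponds to args[i].
func build() (string, []any) {
	var q query
	q.sql.WriteString("age > ")
	q.param(21)
	q.sql.WriteString(" AND name = ")
	q.param("alice")
	return q.sql.String(), q.args
}

func main() {
	sql, args := build()
	fmt.Println(sql)  // age > ? AND name = ?
	fmt.Println(args) // [21 alice]
}
```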

func (*Dialect) WriteRegexMatch

func (d *Dialect) WriteRegexMatch(w *strings.Builder, writeTarget func() error, pattern string, _ bool) error

WriteRegexMatch writes a Spark SQL regex match using RLIKE. Spark regex uses Java pattern syntax, which supports the (?i) inline flag, so case-insensitivity is folded into the pattern by ConvertRegex rather than handled here.
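Folding case-insensitivity into the pattern can be sketched with Go's own regexp package, which uses RE2 syntax and also accepts the (?i) inline flag. The helpers below are hypothetical; the pattern is assumed to contain no single quotes, and the exact text the dialect emits around RLIKE is an assumption.

```go
package main

import (
	"fmt"
	"regexp"
)

// foldCaseInsensitive prefixes the (?i) inline flag when requested.
func foldCaseInsensitive(pattern string, caseInsensitive bool) string {
	if caseInsensitive {
		return "(?i)" + pattern
	}
	return pattern
}

// rlike renders an RLIKE predicate. The pattern is assumed to contain no
// single quotes, so no literal escaping is attempted in this sketch.
func rlike(target, pattern string) string {
	return target + " RLIKE '" + pattern + "'"
}

// matches checks the folded pattern with Go's RE2-based regexp engine.
func matches(pattern, s string) bool {
	return regexp.MustCompile(pattern).MatchString(s)
}

func main() {
	p := foldCaseInsensitive("^spark", true)
	fmt.Println(rlike("name", p))       // name RLIKE '(?i)^spark'
	fmt.Println(matches(p, "SparkSQL")) // true
}
```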

func (*Dialect) WriteSplit

func (d *Dialect) WriteSplit(w *strings.Builder, writeStr, writeDelim func() error) error

WriteSplit writes Spark string split using split().

func (*Dialect) WriteSplitWithLimit

func (d *Dialect) WriteSplitWithLimit(w *strings.Builder, writeStr, writeDelim func() error, limit int64) error

WriteSplitWithLimit writes Spark string split with limit (3-arg split, Spark 3.x+).

func (*Dialect) WriteStringConcat

func (d *Dialect) WriteStringConcat(w *strings.Builder, writeLHS, writeRHS func() error) error

WriteStringConcat writes Spark string concatenation using the concat() function. concat() works in all Spark versions; the || operator was added in 3.0+.

func (*Dialect) WriteStringLiteral

func (d *Dialect) WriteStringLiteral(w *strings.Builder, value string)

WriteStringLiteral writes a Spark SQL string literal with '' escaping (each single quote is doubled).
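Escaping by doubling single quotes can be sketched in a few lines. The helper `stringLiteral` is hypothetical; backslash handling is not described here, so the sketch leaves backslashes untouched.

```go
package main

import (
	"fmt"
	"strings"
)

// stringLiteral wraps value in single quotes and doubles any embedded
// single quote. Other characters, including backslashes, pass through
// unchanged in this sketch.
func stringLiteral(value string) string {
	return "'" + strings.ReplaceAll(value, "'", "''") + "'"
}

func main() {
	fmt.Println(stringLiteral("O'Reilly")) // 'O''Reilly'
	fmt.Println(stringLiteral("plain"))    // 'plain'
}
```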

func (*Dialect) WriteStructClose

func (d *Dialect) WriteStructClose(w *strings.Builder)

WriteStructClose writes the Spark struct literal closing.

func (*Dialect) WriteStructOpen

func (d *Dialect) WriteStructOpen(w *strings.Builder)

WriteStructOpen writes the Spark struct literal opening using struct().

func (*Dialect) WriteTimestampArithmetic

func (d *Dialect) WriteTimestampArithmetic(w *strings.Builder, op string, writeTS, writeDur func() error) error

WriteTimestampArithmetic writes Spark timestamp arithmetic.

func (*Dialect) WriteTimestampCast

func (d *Dialect) WriteTimestampCast(w *strings.Builder, writeExpr func() error) error

WriteTimestampCast writes a Spark CAST to TIMESTAMP.

func (*Dialect) WriteTypeName

func (d *Dialect) WriteTypeName(w *strings.Builder, celTypeName string)

WriteTypeName writes a Spark SQL type name for CAST expressions.

func (*Dialect) WriteUnnest

func (d *Dialect) WriteUnnest(w *strings.Builder, writeSource func() error) error

WriteUnnest writes Spark explode-style unnesting as a replacement for UNNEST. Note: cel2sql wraps this in an ARRAY-building subquery, whereas Spark normally uses array higher-order functions (transform/filter/exists/forall), which do not need UNNEST. For the SELECT FROM UNNEST() pattern, Spark would require a lateral view; the dialect instead emits EXPLODE() and relies on the converter's subquery scaffolding.
