Documentation ¶
Overview ¶
Package sema implements the scope-analysis pass for openCypher queries. It operates on a parsed github.com/FlavioCFOliveira/GoGraph/cypher/ast.Query and enforces variable scoping rules: WITH boundaries, UNWIND introduction, undefined references, and redeclaration within the same scope.
Concurrency: Analyse is a pure function; the returned slice of errors is safe for concurrent reads after the call returns. Input AST nodes are treated as immutable (see github.com/FlavioCFOliveira/GoGraph/cypher/ast package documentation).
Index ¶
- Constants
- Variables
- func CheckParams(inferred map[string]expr.Kind, params map[string]expr.Value) error
- func InferParamTypes(plan ir.LogicalPlan) map[string]expr.Kind
- func InferParamTypesWithResolver(plan ir.LogicalPlan, resolve PropTypeResolver) map[string]expr.Kind
- type CypherType
- type ErrorKind
- type ParamTypeError
- type PropTypeResolver
- type SchemaError
- type Scope
- type ScopeError
- type SemanticError
- type Symbol
- type TypeError
Constants ¶
const (
	// CategorySyntaxError matches the TCK "a SyntaxError should be raised"
	// step. Used for scope violations such as UndefinedVariable and
	// VariableTypeConflict.
	CategorySyntaxError = "SyntaxError"

	// CategoryTypeError matches the TCK "a TypeError should be raised" step.
	// Reserved for static type mismatches surfaced by future passes.
	CategoryTypeError = "TypeError"

	// SubTypeUndefinedVariable is the canonical TCK sub-type for references
	// to variables that are not in scope. Produced from KindUndefinedVar and
	// KindScopeLeak (both surface as "variable not visible here").
	SubTypeUndefinedVariable = "UndefinedVariable"

	// SubTypeVariableTypeConflict is the TCK sub-type for re-introductions
	// of a name with an incompatible type within the same scope. Produced
	// from KindRedeclaration.
	SubTypeVariableTypeConflict = "VariableTypeConflict"

	// SubTypeInvalidArgumentType is the TCK sub-type for operator/function
	// argument type mismatches detected at compile time. Reserved for
	// [TypeError] use.
	SubTypeInvalidArgumentType = "InvalidArgumentType"

	// SubTypeInvalidAggregation is the TCK sub-type for an aggregation
	// used in ORDER BY when the projection does not itself aggregate.
	SubTypeInvalidAggregation = "InvalidAggregation"

	// SubTypeVariableAlreadyBound is the TCK sub-type for CREATE on a
	// previously bound variable with new labels or properties.
	SubTypeVariableAlreadyBound = "VariableAlreadyBound"

	// SubTypeColumnNameConflict is the TCK sub-type for a RETURN/WITH
	// projection that declares duplicate output column names.
	SubTypeColumnNameConflict = "ColumnNameConflict"

	// SubTypeUnknownFunction is the canonical TCK sub-type for
	// references to functions the engine does not implement.
	SubTypeUnknownFunction = "UnknownFunction"

	// SubTypeRelationshipUniqueness is the canonical TCK sub-type for
	// a relationship variable introduced more than once in the same
	// path pattern.
	SubTypeRelationshipUniqueness = "RelationshipUniquenessViolation"

	// SubTypeNegativeIntegerArgument is the canonical TCK sub-type for
	// a negative integer literal supplied to SKIP or LIMIT.
	SubTypeNegativeIntegerArgument = "NegativeIntegerArgument"

	// SubTypeAmbiguousAggregationExpression is the canonical TCK
	// sub-type for non-grouped references appearing outside an
	// aggregate call in an aggregating projection item.
	SubTypeAmbiguousAggregationExpression = "AmbiguousAggregationExpression"

	// SubTypeNoVariablesInScope is the canonical TCK sub-type for a
	// star projection with no in-scope variables.
	SubTypeNoVariablesInScope = "NoVariablesInScope"

	// SubTypeNoExpressionAlias is the canonical TCK sub-type for a
	// WITH item that is not a bare Variable and lacks an explicit
	// alias.
	SubTypeNoExpressionAlias = "NoExpressionAlias"
)
Bolt-compatible error category / sub-type strings raised at compile time by the semantic-analysis pass. They mirror the openCypher TCK expectations:
"a <Category> should be raised at compile time: <SubType>"
See cypher/tck/features/**/*.feature for the full enumeration. Only the subset emitted by Analyse is defined here.
Variables ¶
var IsKnownFunction func(qualifiedLowerName string) bool
IsKnownFunction is an optional hook consulted by Analyse to decide whether a scalar function-call expression refers to a known function. When non-nil, it reports whether the lower-cased qualified function name is registered. The argument is the lower-cased qualified name (namespace components joined to the function name with '.', e.g. "duration.between").
The hook is intentionally a package-level variable rather than an argument so existing call sites do not need to change. It is set by cypher/api.init() to a closure that consults the engine's function registry. When the hook is nil, sema fails open: it reports no UnknownFunction errors, which preserves the pre-hook behaviour.
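The hook pattern described above can be sketched in isolation. This is a minimal, self-contained model and not the package's code: isKnownFunction, qualifiedLower, and reportUnknown are hypothetical names, and the registry is a plain map standing in for the engine's function registry.

```go
package main

import (
	"fmt"
	"strings"
)

// isKnownFunction models the documented hook. nil means no registry is wired in.
var isKnownFunction func(qualifiedLowerName string) bool

// qualifiedLower joins the namespace components and the function name with '.'
// and lower-cases the result, e.g. ("duration", "Between") -> "duration.between".
func qualifiedLower(namespace []string, name string) string {
	parts := append(append([]string{}, namespace...), name)
	return strings.ToLower(strings.Join(parts, "."))
}

// reportUnknown says whether this sketch would raise an unknown-function
// violation. While the hook is nil it never reports anything.
func reportUnknown(namespace []string, name string) bool {
	if isKnownFunction == nil {
		return false
	}
	return !isKnownFunction(qualifiedLower(namespace, name))
}

func main() {
	fmt.Println(reportUnknown(nil, "nope")) // hook is nil: nothing reported

	// Hypothetical registry; the real one lives in the engine.
	registry := map[string]bool{"duration.between": true}
	isKnownFunction = func(n string) bool { return registry[n] }

	fmt.Println(reportUnknown([]string{"duration"}, "Between")) // registered
	fmt.Println(reportUnknown(nil, "nope"))                     // not registered
}
```

Keeping the nil check inside the caller means code that never sets the hook sees no behaviour change, which is the trade-off the paragraph above describes.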
Functions ¶
func CheckParams ¶
func CheckParams(inferred map[string]expr.Kind, params map[string]expr.Value) error
CheckParams validates that every parameter in inferred also appears in params with a compatible Kind. It returns a *ParamTypeError for the first mismatch found, or nil when all checked parameters are type-compatible.
Parameters present in params but absent from inferred are silently accepted (they may be referenced in positions the pass does not yet analyse).
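The contract above can be illustrated with a small stand-alone model. Everything here is an assumption made for illustration: Kind is a stand-in for expr.Kind, the supplied values are reduced to their kinds, "compatible" is simplified to equality, and a missing parameter is reported as a plain error. The sketch also sorts parameter names so that "first mismatch" is deterministic, which the real function may not do.

```go
package main

import (
	"fmt"
	"sort"
)

// Kind is a hypothetical stand-in for expr.Kind.
type Kind int

const (
	KindString Kind = iota
	KindInt
)

// paramTypeError mirrors the documented ParamTypeError fields.
type paramTypeError struct {
	Name     string
	Expected Kind
	Got      Kind
}

func (e *paramTypeError) Error() string {
	return fmt.Sprintf("parameter $%s: expected kind %d, got kind %d", e.Name, e.Expected, e.Got)
}

// checkParams checks every inferred parameter against the supplied kinds.
// Parameters that were supplied but not inferred are ignored.
func checkParams(inferred, supplied map[string]Kind) error {
	names := make([]string, 0, len(inferred))
	for name := range inferred {
		names = append(names, name)
	}
	sort.Strings(names) // deterministic order (assumption of this sketch)

	for _, name := range names {
		got, ok := supplied[name]
		if !ok {
			// Assumption: a missing value is an error of a different type.
			return fmt.Errorf("parameter $%s: no value supplied", name)
		}
		if want := inferred[name]; got != want {
			return &paramTypeError{Name: name, Expected: want, Got: got}
		}
	}
	return nil
}

func main() {
	inferred := map[string]Kind{"name": KindString}
	fmt.Println(checkParams(inferred, map[string]Kind{"name": KindString, "extra": KindInt}))
	fmt.Println(checkParams(inferred, map[string]Kind{"name": KindInt}))
}
```

Returning the untyped nil on success, rather than a nil *paramTypeError, keeps the usual `err != nil` check reliable.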
func InferParamTypes ¶
func InferParamTypes(plan ir.LogicalPlan) map[string]expr.Kind
InferParamTypes walks plan looking for Selection nodes whose predicate is an equality comparison involving a parameter reference ($name) and a property access (n.prop). It returns a map from parameter name (without $) to the expected expr.Kind, defaulting to KindString for every property-vs-parameter equality.
It is equivalent to InferParamTypesWithResolver(plan, nil) and is retained for callers (and tests) that have no index information to offer.
func InferParamTypesWithResolver ¶
func InferParamTypesWithResolver(plan ir.LogicalPlan, resolve PropTypeResolver) map[string]expr.Kind
InferParamTypesWithResolver behaves like InferParamTypes but consults resolve to determine the expected kind of a parameter compared against a property. When resolve reports a known kind for the (scanLabel, prop) pair it is used; otherwise the kind falls back to KindString. A nil resolve always falls back.
When the same parameter appears in multiple incompatible contexts, the first one encountered wins. Parameters used in non-inferrable positions are omitted.
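The fallback and first-wins rules can be sketched without the planner. In this hypothetical model a predicate struct stands in for one "n.prop = $param" equality found in the plan, and the resolver signature (label, prop) -> (Kind, bool) is an assumption based on the description of PropTypeResolver, not a confirmed declaration.

```go
package main

import "fmt"

// Kind is a hypothetical stand-in for expr.Kind.
type Kind string

const (
	KindString Kind = "string"
	KindInt    Kind = "int"
)

// resolver is an assumed shape for a property-type resolver.
type resolver func(label, prop string) (Kind, bool)

// predicate models one property-vs-parameter equality.
type predicate struct{ Label, Prop, Param string }

// inferParamKinds applies the documented rules: use the resolver's answer when
// it has one, fall back to KindString otherwise, and keep the first kind seen
// for a parameter that appears more than once.
func inferParamKinds(preds []predicate, resolve resolver) map[string]Kind {
	out := make(map[string]Kind)
	for _, p := range preds {
		if _, seen := out[p.Param]; seen {
			continue // first encountered wins
		}
		kind := KindString // conservative default
		if resolve != nil {
			if k, ok := resolve(p.Label, p.Prop); ok {
				kind = k
			}
		}
		out[p.Param] = kind
	}
	return out
}

func main() {
	preds := []predicate{
		{Label: "Person", Prop: "age", Param: "a"},
		{Label: "Person", Prop: "name", Param: "a"}, // same parameter, later context
		{Label: "", Prop: "nick", Param: "n"},       // unlabelled scan
	}
	// Hypothetical index information: only Person.age has a known key type.
	byIndex := func(label, prop string) (Kind, bool) {
		if label == "Person" && prop == "age" {
			return KindInt, true
		}
		return "", false
	}
	fmt.Println(inferParamKinds(preds, byIndex))
	fmt.Println(inferParamKinds(preds, nil))
}
```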
Types ¶
type CypherType ¶
type CypherType uint8
CypherType enumerates the static types that can be inferred for a Cypher expression. The zero value is TypeAny, which is used when no more specific type can be determined (e.g. for variables and parameters in the absence of scope context).
const (
	// TypeAny is the top type: any value is assignable to it. Used for
	// variables, parameters, and expressions whose type cannot be statically
	// narrowed.
	TypeAny CypherType = iota
	// TypeNull is the type of the literal null value.
	TypeNull
	// TypeBoolean is the type of boolean expressions.
	TypeBoolean
	// TypeInteger is the type of integer literals and integer-valued functions.
	TypeInteger
	// TypeFloat is the type of floating-point literals and float-valued
	// functions, as well as the result of mixed Int+Float arithmetic.
	TypeFloat
	// TypeString is the type of string literals and string-valued functions.
	TypeString
	// TypeNode is the type of graph node entities.
	TypeNode
	// TypeRelationship is the type of graph relationship entities.
	TypeRelationship
	// TypePath is the type of graph path values.
	TypePath
	// TypeList is the type of list values. The element type is not tracked at
	// this level of inference; use TypeList for all list expressions.
	TypeList
	// TypeMap is the type of map literals and map-valued expressions.
	TypeMap
)
func InferType ¶
func InferType(expr ast.Expression) (CypherType, error)
InferType infers the static CypherType of expr according to openCypher 9 type rules. It returns a non-nil error only when a type violation is detected (e.g. unsupported operand combination for an operator).
For expressions whose type cannot be statically determined (Variable, Parameter, Property, CaseExpression, etc.) the function returns TypeAny and a nil error.
InferType is a pure function; it does not mutate the AST.
Concurrency: safe for concurrent use — no shared mutable state.
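To make the inference rules concrete, here is a toy version over a hypothetical four-node expression tree. It is a sketch under stated assumptions, not the package's algorithm: it covers only the "+" operator, uses its own cypherType and node types, and treats a Boolean operand of "+" as the single type violation.

```go
package main

import (
	"errors"
	"fmt"
)

// cypherType is a reduced, hypothetical version of CypherType.
type cypherType uint8

const (
	typeAny cypherType = iota // zero value: nothing more specific is known
	typeBoolean
	typeInteger
	typeFloat
)

// Hypothetical AST nodes for this sketch only.
type (
	intLit   struct{ V int64 }
	floatLit struct{ V float64 }
	boolLit  struct{ V bool }
	variable struct{ Name string }
	add      struct{ L, R interface{} }
)

// inferType returns the static type of e, or an error on a type violation.
func inferType(e interface{}) (cypherType, error) {
	switch n := e.(type) {
	case intLit:
		return typeInteger, nil
	case floatLit:
		return typeFloat, nil
	case boolLit:
		return typeBoolean, nil
	case add:
		l, err := inferType(n.L)
		if err != nil {
			return typeAny, err
		}
		r, err := inferType(n.R)
		if err != nil {
			return typeAny, err
		}
		switch {
		case l == typeBoolean || r == typeBoolean:
			return typeAny, errors.New("type error: '+' is not defined for Boolean operands")
		case l == typeAny || r == typeAny:
			return typeAny, nil // cannot be narrowed statically
		case l == typeFloat || r == typeFloat:
			return typeFloat, nil // mixed Int+Float arithmetic yields Float
		default:
			return typeInteger, nil
		}
	default:
		return typeAny, nil // variables and anything else unknown
	}
}

func main() {
	fmt.Println(inferType(add{intLit{1}, floatLit{2.5}}))
	fmt.Println(inferType(add{variable{"n"}, intLit{1}}))
	fmt.Println(inferType(add{boolLit{true}, intLit{1}}))
}
```

The function only reads its argument and keeps no state, which is the property that makes this style of inference safe to call from several goroutines.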
func (CypherType) String ¶
func (t CypherType) String() string
String returns the Cypher type name as it would appear in an error message.
type ErrorKind ¶
type ErrorKind string
ErrorKind classifies a scope-analysis violation.
const (
	// KindUndefinedVar is reported when an expression references a variable
	// that has not been introduced by any preceding clause in the current scope.
	KindUndefinedVar ErrorKind = "UNDEFINED_VAR"

	// KindRedeclaration is reported when a variable is introduced a second time
	// within the same scope without a WITH boundary that would shadow it.
	KindRedeclaration ErrorKind = "REDECLARATION"

	// KindScopeLeak is reported when a variable introduced inside a sub-scope
	// (e.g. a list comprehension) is referenced outside that scope.
	KindScopeLeak ErrorKind = "SCOPE_LEAK"

	// KindInvalidArgumentType is reported when a literal expression of a
	// statically known non-boolean type is used as the operand of a logical
	// operator (AND / OR / XOR / NOT). Variables and other expressions whose
	// type is only known at runtime are not flagged.
	KindInvalidArgumentType ErrorKind = "INVALID_ARGUMENT_TYPE"

	// KindInvalidAggregation is reported when an aggregation function call
	// appears in an ORDER BY item but the surrounding projection does not
	// itself contain any aggregation. Aggregations only fold over groups
	// defined by the projection; standing alone in an ORDER BY is illegal.
	KindInvalidAggregation ErrorKind = "INVALID_AGGREGATION"

	// KindVariableAlreadyBound is reported when CREATE re-uses a previously
	// bound variable AND attempts to add new labels or properties to it.
	// openCypher 9 §3.5.1: a bound node may be referenced from CREATE (so
	// the pattern can describe edges around it) but cannot have its labels
	// or properties augmented.
	KindVariableAlreadyBound ErrorKind = "VARIABLE_ALREADY_BOUND"

	// KindColumnNameConflict is reported when a RETURN or WITH projection
	// declares two columns with the same output name (e.g.
	// `RETURN 1 AS a, 2 AS a`). openCypher 9 §3.3.3 rejects this at
	// compile time because the downstream consumer cannot disambiguate
	// the columns.
	KindColumnNameConflict ErrorKind = "COLUMN_NAME_CONFLICT"

	// KindUnknownFunction is reported when a function-call expression
	// names a function that is not registered in the engine's function
	// registry and is not a recognised aggregate (count, sum, avg, min,
	// max, collect, stdev, stdevp, percentileCont, percentileDisc).
	// openCypher 9 §6.1 requires compile-time rejection of unknown
	// function calls.
	KindUnknownFunction ErrorKind = "UNKNOWN_FUNCTION"

	// KindRelationshipUniqueness is reported when a relationship
	// variable is introduced more than once within the same path
	// pattern (e.g. `MATCH (a)-[r]->()-[r]->(a)`). openCypher 9
	// §3.3.1.2 forbids this: a single path pattern cannot bind two
	// distinct edges to the same relationship name.
	KindRelationshipUniqueness ErrorKind = "RELATIONSHIP_UNIQUENESS"

	// KindNegativeIntegerArgument is reported when a SKIP or LIMIT
	// clause is given a negative integer literal. openCypher 9 §3.6
	// requires the argument to be a non-negative INTEGER.
	KindNegativeIntegerArgument ErrorKind = "NEGATIVE_INTEGER_ARGUMENT"

	// KindAmbiguousAggregationExpression is reported when a projection
	// item contains an aggregating sub-expression nested inside a larger
	// expression (e.g. `me.age + count(you.age)`) and a Variable or
	// Property reference appearing OUTSIDE the aggregate call does not
	// match any standalone "simple" grouping-key projection item.
	// openCypher 9 §5.3.3 rejects this at compile time because the
	// runtime cannot decide which row of the group should supply the
	// non-grouped reference.
	KindAmbiguousAggregationExpression ErrorKind = "AMBIGUOUS_AGGREGATION_EXPRESSION"

	// KindNoVariablesInScope is reported when `RETURN *` or `WITH *`
	// appears in a projection but no variables are in scope at that
	// point (openCypher 9 §3.3.2 forbids a star projection with an
	// empty scope).
	KindNoVariablesInScope ErrorKind = "NO_VARIABLES_IN_SCOPE"

	// KindNoExpressionAlias is reported when a WITH projection item is
	// neither a bare Variable nor aliased via AS. openCypher 9 §5.1.2
	// requires every non-Variable WITH item to have an explicit alias
	// so the downstream scope can name it.
	KindNoExpressionAlias ErrorKind = "NO_EXPRESSION_ALIAS"
)
type ParamTypeError ¶
type ParamTypeError struct {
// Name is the Cypher parameter name (without the leading $).
Name string
// Expected is the Kind inferred from the expression context.
Expected expr.Kind
// Got is the Kind of the value provided by the caller.
Got expr.Kind
}
ParamTypeError is returned by CheckParams when a parameter value's Kind does not match the expected Kind inferred from the query context.
func (*ParamTypeError) Error ¶
func (e *ParamTypeError) Error() string
Error implements the error interface.
type PropTypeResolver ¶
PropTypeResolver returns the declared expr.Kind of a (nodeLabel, property) pair when an authoritative type signal exists for it — in practice, an index whose key type is known. label is empty when the property is read from an unlabelled scan. ok is false when no type is known, in which case the caller keeps its conservative default.
A resolver must be a pure read-only lookup; InferParamTypesWithResolver may call it once per inferrable predicate.
type SchemaError ¶
type SchemaError struct {
// PropertyName is the property key that triggered the mismatch.
PropertyName string
// DeclaredKind is the kind registered in the schema.
DeclaredKind lpg.PropertyKind
// UsedAs is the CypherType of the literal on the other side of the
// comparison.
UsedAs CypherType
// Pos is the source position of the BinaryOp that holds the mismatch.
Pos ast.Position
// Hint is a human-readable suggestion to help the caller fix the query.
Hint string
}
SchemaError is reported by CheckSchema when a property access is statically compared against a literal whose type contradicts the schema-declared kind for that property key.
func CheckSchema ¶
func CheckSchema(q ast.Query, sch *schema.Schema) []SchemaError
CheckSchema validates property accesses in q against the declared schema. If sch is nil, this is a no-op and an empty slice is returned.
The check is applied to every ast.BinaryOp node that has:
- one side being an ast.Property access, AND
- the other side being a literal whose static type can be determined.
When the declared PropertyKind for the property key is incompatible with the literal's type, a SchemaError is appended to the result.
If a property key is not registered in the schema the access is silently skipped (warning-only / partial-schema policy).
CheckSchema is a pure function and safe for concurrent use.
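The partial-schema policy can be modelled in a few lines. This sketch is hypothetical throughout: comparison stands in for a BinaryOp with a property on one side and a literal on the other, the schema is a plain map from property key to declared type name, and "incompatible" is reduced to inequality of those names.

```go
package main

import "fmt"

// comparison models "n.<Property> <op> <literal>" with the literal's static type.
type comparison struct {
	Property string
	Literal  string // e.g. "Integer", "String"
}

// checkSchema reports one message per comparison whose literal type differs
// from the declared type. A nil schema disables the check, and property keys
// missing from the schema are skipped.
func checkSchema(cmps []comparison, declared map[string]string) []string {
	var out []string
	if declared == nil {
		return out // no schema: nothing to check
	}
	for _, c := range cmps {
		want, ok := declared[c.Property]
		if !ok {
			continue // unregistered key: silently skipped
		}
		if want != c.Literal {
			out = append(out, fmt.Sprintf("property %q is declared %s but compared with %s", c.Property, want, c.Literal))
		}
	}
	return out
}

func main() {
	schema := map[string]string{"age": "Integer"}
	cmps := []comparison{
		{Property: "age", Literal: "String"},   // mismatch
		{Property: "age", Literal: "Integer"},  // fine
		{Property: "nick", Literal: "Integer"}, // not in schema
	}
	for _, msg := range checkSchema(cmps, schema) {
		fmt.Println(msg)
	}
	fmt.Println(len(checkSchema(cmps, nil)))
}
```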
func (*SchemaError) Error ¶
func (e *SchemaError) Error() string
Error implements the error interface.
type Scope ¶
type Scope struct {
// contains filtered or unexported fields
}
Scope is a single layer of the variable-scope stack. Scopes form a parent chain: child scopes inherit visibility of all symbols defined in their ancestors unless a WITH boundary resets the chain.
Scope is not safe for concurrent use; callers must synchronise externally.
func (*Scope) Child ¶
Child creates a child scope that inherits the current scope's visibility. The child shares read access to the parent chain but has its own symbol table, so definitions in the child do not pollute the parent.
func (*Scope) Define ¶
Define introduces a new symbol in this scope. It returns a KindRedeclaration error if the name is already defined in this exact scope (shadowing a parent-scope symbol is permitted and does not error).
func (*Scope) Lookup ¶
Lookup searches for name starting in the current scope and walking up the parent chain. It returns the Symbol and true when found, nil and false otherwise.
func (*Scope) LookupLocal ¶
LookupLocal checks only this scope (no parent walk).
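The four Scope methods fit together as a parent chain with one symbol table per layer. The sketch below is a stand-alone model with hypothetical lower-case names and a simplified symbol type; unlike the documented Lookup it returns the symbol by value, and its error is a plain formatted error instead of a ScopeError.

```go
package main

import "fmt"

// symbol is a reduced, hypothetical version of Symbol.
type symbol struct {
	Name string
	Line int
}

// scope is one layer of the stack; parent is nil for the outermost layer.
type scope struct {
	parent  *scope
	symbols map[string]symbol
}

func newScope(parent *scope) *scope {
	return &scope{parent: parent, symbols: make(map[string]symbol)}
}

// child creates a layer with its own table that can still see its ancestors.
func (s *scope) child() *scope { return newScope(s) }

// define fails only when name already exists in this exact layer, so a child
// may shadow a name from a parent.
func (s *scope) define(name string, line int) error {
	if _, dup := s.symbols[name]; dup {
		return fmt.Errorf("REDECLARATION: variable %q already defined in this scope", name)
	}
	s.symbols[name] = symbol{Name: name, Line: line}
	return nil
}

// lookupLocal checks this layer only.
func (s *scope) lookupLocal(name string) (symbol, bool) {
	sym, ok := s.symbols[name]
	return sym, ok
}

// lookup walks from this layer up the parent chain.
func (s *scope) lookup(name string) (symbol, bool) {
	for cur := s; cur != nil; cur = cur.parent {
		if sym, ok := cur.symbols[name]; ok {
			return sym, true
		}
	}
	return symbol{}, false
}

func main() {
	root := newScope(nil)
	_ = root.define("n", 1)
	fmt.Println(root.define("n", 2)) // same layer: redeclaration error

	inner := root.child()
	fmt.Println(inner.define("n", 3)) // shadowing a parent symbol is allowed
	_ = inner.define("x", 3)

	_, visible := root.lookup("x")
	fmt.Println(visible) // child definitions do not reach the parent
}
```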
type ScopeError ¶
type ScopeError struct {
// Kind classifies the violation; one of the Kind* constants.
Kind ErrorKind
// Pos is the source position of the offending token or node.
Pos ast.Position
// Message is a human-readable description.
Message string
}
ScopeError is the error type produced by the scope-analysis pass. It implements the standard error interface.
func Analyse ¶
func Analyse(q ast.Query) []ScopeError
Analyse runs the scope-analysis pass on q and returns all scope violations found. An empty (or nil) slice means the query is scope-clean.
Rules enforced:
- MATCH / OPTIONAL MATCH: NodePattern and RelationshipPattern variables are introduced into the current scope. Duplicate variable names within the same scope are an error.
- WHERE: references must be defined in the current scope; no new variables.
- UNWIND … AS x: introduces x into the current scope.
- WITH: acts as a scope boundary — after WITH only the projected names survive. AS aliases create new names; bare variable references must be in scope before the WITH.
- RETURN: each projected expression must reference only defined variables.
- CREATE / MERGE: pattern variables may be new (introduction) or previously defined (re-use). Re-use of an already-defined variable in CREATE is permitted (bound-node reuse); introducing a duplicate in the same scope is an error.
- SET / REMOVE / DELETE: references must be in scope.
- CALL … YIELD: each yielded item introduces a new variable.
- List comprehension / pattern comprehension: variable binding is local to the comprehension; using it outside is a scope leak.
- EXISTS { } / COUNT { } with a full subquery: analysed in an isolated child scope; outer variables are visible inside but inner variables do not leak out.
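Of the rules above, the WITH boundary is the one that most often surprises readers, so a toy model may help. It is a deliberately small sketch: scopes are sets of names, each WITH item is a single variable with an optional alias, and withItem and applyWith are hypothetical names that do not exist in the package.

```go
package main

import "fmt"

// withItem models "WITH <Var> [AS <Alias>]"; Alias is empty for a bare variable.
type withItem struct{ Var, Alias string }

// applyWith returns the scope that survives the WITH clause. Only projected
// names are carried over, and a reference to a name that is not in scope
// before the WITH is reported as undefined.
func applyWith(inScope map[string]bool, items []withItem) (map[string]bool, []string) {
	next := make(map[string]bool)
	var errs []string
	for _, it := range items {
		if !inScope[it.Var] {
			errs = append(errs, fmt.Sprintf("undefined variable %q", it.Var))
			continue
		}
		name := it.Var
		if it.Alias != "" {
			name = it.Alias // AS creates a new name
		}
		next[name] = true
	}
	return next, errs
}

func main() {
	// Scope after something like MATCH (a)-->(b).
	before := map[string]bool{"a": true, "b": true}

	// WITH a AS x: only x survives; a and b are no longer visible.
	after, errs := applyWith(before, []withItem{{Var: "a", Alias: "x"}})
	fmt.Println(after["x"], after["a"], after["b"], len(errs))

	// WITH c: c was never introduced.
	_, errs = applyWith(before, []withItem{{Var: "c"}})
	fmt.Println(errs)
}
```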
Example ¶
ExampleAnalyse shows a scope-clean query: Analyse returns an empty slice.
package main

import (
	"fmt"

	"github.com/FlavioCFOliveira/GoGraph/cypher/parser"
	"github.com/FlavioCFOliveira/GoGraph/cypher/sema"
)

func main() {
	q, err := parser.Parse("MATCH (n:Person) RETURN n.name")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	errs := sema.Analyse(q)
	fmt.Println("scope errors:", len(errs))
}

Output:

scope errors: 0
Example (UndefinedVariable) ¶
ExampleAnalyse_undefinedVariable shows a query that references a variable never introduced by any clause. Analyse reports it as an UNDEFINED_VAR violation.
package main

import (
	"fmt"

	"github.com/FlavioCFOliveira/GoGraph/cypher/parser"
	"github.com/FlavioCFOliveira/GoGraph/cypher/sema"
)

func main() {
	q, err := parser.Parse("MATCH (n) RETURN m")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	errs := sema.Analyse(q)
	if len(errs) == 0 {
		fmt.Println("query is scope-clean")
		return
	}
	fmt.Println("kind:", errs[0].Kind)
	fmt.Println("message:", errs[0].Message)
}

Output:

kind: UNDEFINED_VAR
message: undefined variable "m"
func ScopeLeakError ¶
func ScopeLeakError(name string, pos ast.Position) *ScopeError
ScopeLeakError constructs a KindScopeLeak ScopeError, returned when a variable introduced inside a sub-scope is referenced outside that scope. It is exported so that callers and future analysis passes can build KindScopeLeak errors with a consistent message format.
func (*ScopeError) Error ¶
func (e *ScopeError) Error() string
Error implements the error interface.
type SemanticError ¶
type SemanticError struct {
// Category is the Bolt error category ("SyntaxError" or "TypeError").
Category string
// SubType is the Bolt error sub-type (e.g. "UndefinedVariable").
SubType string
// Errors holds every scope violation reported by [Analyse] in source
// order. Always non-empty when SemanticError is non-nil.
Errors []ScopeError
}
SemanticError is the engine-facing wrapper around one or more ScopeError values. It carries the Bolt-compatible Category/SubType strings expected by the TCK error assertions and exposes the first underlying ScopeError through Unwrap, so callers can recover the source position via errors.As.
SemanticError implements the error interface; its message is the message of the first wrapped ScopeError, prefixed with the Bolt category and sub-type.
Concurrency: SemanticError values are immutable after construction; safe for concurrent reads.
func MapToBolt ¶
func MapToBolt(errs []ScopeError) *SemanticError
MapToBolt converts a slice of ScopeError values into a single *SemanticError tagged with the Bolt category/sub-type the TCK expects. It returns nil when errs is empty.
When the slice contains multiple kinds, the precedence in kindMappings decides which (Category, SubType) pair labels the wrapper; the full error slice is preserved in SemanticError.Errors regardless of which mapping was chosen, so callers retain visibility into every violation.
Unknown kinds fall back to ("SyntaxError", "SemanticError") so the engine never returns an unmapped sema failure.
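The selection logic can be sketched as a walk over an ordered table. The three kind-to-sub-type pairs below come from the constant documentation on this page, but the order of the table is an assumption made for illustration: the actual precedence in kindMappings is not shown here. The names mapping, precedence, and labelFor are hypothetical.

```go
package main

import "fmt"

// mapping ties one error kind to its Bolt category and sub-type.
type mapping struct{ Kind, Category, SubType string }

// precedence is a hypothetical ordering; earlier entries win when several
// kinds are present in the same error slice.
var precedence = []mapping{
	{"UNDEFINED_VAR", "SyntaxError", "UndefinedVariable"},
	{"SCOPE_LEAK", "SyntaxError", "UndefinedVariable"},
	{"REDECLARATION", "SyntaxError", "VariableTypeConflict"},
}

// labelFor picks the (category, sub-type) pair for a set of kinds. ok is false
// for an empty input, which corresponds to MapToBolt returning nil.
func labelFor(kinds []string) (category, subType string, ok bool) {
	if len(kinds) == 0 {
		return "", "", false
	}
	present := make(map[string]bool, len(kinds))
	for _, k := range kinds {
		present[k] = true
	}
	for _, m := range precedence {
		if present[m.Kind] {
			return m.Category, m.SubType, true
		}
	}
	// Documented fallback for kinds without a mapping.
	return "SyntaxError", "SemanticError", true
}

func main() {
	fmt.Println(labelFor([]string{"REDECLARATION", "UNDEFINED_VAR"}))
	fmt.Println(labelFor([]string{"SOMETHING_NEW"}))
	fmt.Println(labelFor(nil))
}
```

Driving the choice from a table keeps the label independent of the order in which violations were found, while the caller can still keep the full slice alongside it.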
func (*SemanticError) Error ¶
func (e *SemanticError) Error() string
Error implements the error interface. The format is:
"cypher: <Category>.<SubType>: <first underlying ScopeError message>"
func (*SemanticError) Unwrap ¶
func (e *SemanticError) Unwrap() error
Unwrap returns the first underlying ScopeError so errors.As can recover it. Only the first error is exposed because errors.Unwrap is single-valued; callers needing the full set should read SemanticError.Errors directly.
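How Unwrap and errors.As cooperate can be shown with stand-in types. This is a sketch, not the package's source: scopeError and semanticError are hypothetical copies with simplified fields (a Line integer replaces ast.Position), and firstScopeError is a helper invented for the example. Only the message format is taken from the documentation of Error above.

```go
package main

import (
	"errors"
	"fmt"
)

// scopeError is a simplified stand-in for ScopeError.
type scopeError struct {
	Kind    string
	Line    int
	Message string
}

func (e *scopeError) Error() string { return e.Message }

// semanticError is a simplified stand-in for SemanticError.
type semanticError struct {
	Category string
	SubType  string
	Errors   []scopeError // assumed non-empty, as documented
}

func (e *semanticError) Error() string {
	return "cypher: " + e.Category + "." + e.SubType + ": " + e.Errors[0].Message
}

// Unwrap exposes only the first violation, since an error chain is linear.
func (e *semanticError) Unwrap() error { return &e.Errors[0] }

// firstScopeError walks the chain with errors.As to find a *scopeError.
func firstScopeError(err error) (*scopeError, bool) {
	var se *scopeError
	ok := errors.As(err, &se)
	return se, ok
}

func main() {
	sem := &semanticError{
		Category: "SyntaxError",
		SubType:  "UndefinedVariable",
		Errors: []scopeError{
			{Kind: "UNDEFINED_VAR", Line: 1, Message: `undefined variable "m"`},
			{Kind: "REDECLARATION", Line: 2, Message: `variable "n" already declared`},
		},
	}
	// Extra wrapping by a caller does not hide the scope error.
	wrapped := fmt.Errorf("compile failed: %w", sem)

	if se, ok := firstScopeError(wrapped); ok {
		fmt.Println(se.Kind, "at line", se.Line)
	}
	fmt.Println(len(sem.Errors), "violations in total")
	fmt.Println(sem)
}
```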
type Symbol ¶
type Symbol struct {
// Name is the variable name exactly as written in the query.
Name string
// Pos is the source position where the variable was first introduced.
Pos ast.Position
// Type is a coarse type hint populated by later analysis passes (e.g.
// "node", "relationship", "path", "any"). The scope pass uses the empty
// string as a catch-all; callers must not depend on this value being set.
Type string
}
Symbol records the introduction point of a variable within a scope.
type TypeError ¶
type TypeError struct {
// Op is the operator string (e.g. "+", "AND", "NOT").
Op string
// Left is the inferred type of the left operand. For unary operators it
// holds the operand type and Right is TypeAny.
Left CypherType
// Right is the inferred type of the right operand. Zero (TypeAny) for
// unary operators.
Right CypherType
// Pos is the source position of the operator or expression.
Pos ast.Position
}
TypeError is returned when a binary or unary operator is applied to operand type(s) that are incompatible with the operator under Cypher 9 type rules.