Documentation ¶
Overview ¶
Package opt contains the Cockroach SQL optimizer. The optimizer transforms the AST of a SQL query into a physical query plan for execution. Naive execution of a SQL query can be prohibitively expensive, because SQL specifies the desired results and not how to achieve them. A given SQL query can have thousands of equivalent query plans with vastly different execution times. The Cockroach optimizer is cost-based, meaning that it enumerates some or all of these alternate plans and chooses the one with the lowest estimated cost.
SQL query planning is often described in terms of 8 modules:
1. Properties
2. Stats
3. Cost Model
4. Memo
5. Transforms
6. Prep
7. Rewrite
8. Search
Note: Prep, Rewrite and Search could be considered phases, though this document will refer to all 8 uniformly as modules. Memo is a technique for compactly representing the forest of trees generated during Search. Stats, Properties, Cost Model and Transforms are modules that power the Prep, Rewrite and Search phases.
                  SQL query text
                        |
                  +-----v-----+  - parse SQL text according to grammar
                  |   Parse   |  - report syntax errors
                  +-----+-----+
                        |
                      (ast)
                        |
                  +-----v-----+  - fold constants, check types, resolve
                  |  Analyze  |    names
                  +-----+-----+  - annotate tree with semantic info
                        |        - report semantic errors
                      (ast+)
                        |
   +-------+      +-----v-----+  - normalize tree with cost-agnostic
   | Stats +----->|   Prep    |    transforms (placeholders present)
   +-------+      +-----+-----+  - compute initial properties
                        |        - retrieve and attach stats
                      (expr)     - done once per PREPARE
                        |
                  +-----v-----+  - capture placeholder values / timestamps
               +->|  Rewrite  |  - normalize tree with cost-agnostic
+------------+ |  +-----+-----+    transforms (placeholders not present)
| Transforms |-+        |        - done once per EXECUTE
+------------+ |      (expr)
               |        |
               +->+-----v-----+  - generate equivalent expression trees
 +------------+   |  Search   |  - find lowest cost physical plan
 | Cost Model +-->+-----+-----+  - includes DistSQL physical planning
 +------------+         |
                 (physical plan)
                        |
                  +-----v-----+
                  | Execution |
                  +-----------+
The opt-related packages implement portions of these modules, while other parts are implemented elsewhere. For example, other sql packages are used to perform name resolution and type checking, which are part of the Analyze phase.
Parse ¶
The parse phase is not discussed in this document. It transforms the SQL query text into an abstract syntax tree (AST).
Analyze ¶
The analyze phase ensures that the AST obeys all SQL semantic rules, and annotates the AST with information that will be used by later phases. In addition, some simple transforms are applied to the AST in order to simplify handling in later phases. Semantic rules are many and varied; this document describes a few major categories of semantic checks and rewrites.
"Name resolution" binds table, column, and other references. Each name must be matched to the appropriate schema object, and an error reported if no matching object can be found. Name binding can result in AST annotations that make it easy for other components to find the target object, or rewrites that replace unbound name nodes with new nodes that are easier to handle (e.g. IndexedVar).
"Constant folding" rewrites expressions that have constant inputs. For example, 1+1 would be folded to 2. Cockroach's typing rules assume that constants have been folded, as there are some expressions that would otherwise produce a semantic error if they are not first folded.
"Type inference" automatically determines the return data type of various SQL expressions, based on the types of inputs, as well as the context in which the expression is used. The AST is annotated with the resolved types for later use.
"Type checking" ensures that all inputs to SQL expressions and statements have legal static types. For example, the CONCAT function only accepts zero or more arguments that are statically typed as strings. Violation of the typing rules produces a semantic error.
Properties ¶
Properties are meta-information computed (and sometimes stored) for each node in an expression tree. Properties power transformations and optimization.
"Logical properties" describe the structure and content of data returned by an expression, such as whether relational output columns can contain nulls, or the data type of a scalar expression. Two expressions which are logically equivalent according to the rules of the relational algebra will return the same set of rows and columns, and will have the same set of logical properties. However, the order of the rows, naming of the columns, and other presentational aspects of the result are not governed by the logical properties.
"Physical properties" are interesting characteristics of an expression that impact its layout, presentation, or location, but not its logical content. Examples include row order, column naming, and data distribution (physical location of data ranges). Physical properties exist outside of the relational algebra, and arise from both the SQL query itself (e.g. the non-relational ORDER BY operator) and by the selection of specific implementations during optimization (e.g. a merge join requires the inputs to be sorted in a particular order).
Properties can be "required" or "derived". A required property is one specified by the SQL query text. For example, a DISTINCT clause is a required property on the set of columns of the corresponding projection -- that the tuple of columns forms a key (unique values) in the results. A derived property is one derived by the optimizer for an expression based on the properties of the child expressions. For example:
SELECT k+1 FROM kv
Once the ordering of "k" is known from kv's descriptor, the same ordering property can be derived for k+1. During optimization, for each expression with required properties, the optimizer will look at child expressions to check whether their actual properties (which can be derived) match the requirement. If they don't, the optimizer must introduce an "enforcer" operator in the plan that provides the required property. For example, an ORDER BY clause creates a required ordering property that can cause the optimizer to add a Sort operator as an enforcer of that property.
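To make the enforcer mechanism concrete, here is a minimal Go sketch of the required-vs-provided check. All types here (Ordering, expr, enforce) are invented for illustration; the real optimizer works with richer ordering specifications and memo groups.

package main

import "fmt"

// Ordering is a toy ordering: a sequence of column names.
type Ordering []string

// expr is a toy expression along with the ordering it naturally provides.
type expr struct {
	name     string
	provides Ordering
}

// satisfies reports whether o provides the required ordering as a prefix.
func (o Ordering) satisfies(required Ordering) bool {
	if len(o) < len(required) {
		return false
	}
	for i, col := range required {
		if o[i] != col {
			return false
		}
	}
	return true
}

// enforce returns the child unchanged if it already provides the required
// ordering; otherwise it wraps the child in a Sort enforcer that does.
func enforce(child expr, required Ordering) expr {
	if child.provides.satisfies(required) {
		return child
	}
	return expr{name: "Sort(" + child.name + ")", provides: required}
}

func main() {
	scan := expr{name: "Scan kv", provides: Ordering{"k"}}
	fmt.Println(enforce(scan, Ordering{"k"}).name) // Scan kv: no enforcer needed
	fmt.Println(enforce(scan, Ordering{"v"}).name) // Sort(Scan kv): enforcer added
}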
Stats ¶
Table statistics power both the cost model and the search of alternate query plans. A simple example of where statistics guide the search of alternate query plans is in join ordering:
SELECT * FROM a JOIN b
In the absence of other opportunities, this might be implemented as a hash join. With a hash join, we want to load the smaller set of rows (either from a or b) into the hash table and then query that table while looping through the larger set of rows. How do we know whether a or b is larger? We keep statistics about the cardinality of a and b, i.e. the (approximate) number of different values.
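The decision itself is trivial once the cardinality estimates exist, as in this toy sketch (the chooseBuildSide helper is hypothetical, not part of the package):

package main

import "fmt"

// chooseBuildSide picks the smaller input as the hash table ("build") side
// and the larger input as the probe side, based on estimated row counts.
func chooseBuildSide(rowsA, rowsB int64) (build, probe string) {
	if rowsA <= rowsB {
		return "a", "b"
	}
	return "b", "a"
}

func main() {
	build, probe := chooseBuildSide(1000000, 500)
	fmt.Printf("build hash table on %s, probe with rows of %s\n", build, probe)
}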
Simple table cardinality is sufficient for the above query but fails in other queries. Consider:
SELECT * FROM a JOIN b ON a.x = b.x WHERE a.y > 10
Table statistics might indicate that a contains 10x more data than b, but the predicate a.y > 10 filters out a chunk of the table. What we care about is whether the scan of a *after* filtering returns more rows than the scan of b. This can be accomplished by estimating the selectivity of the predicate a.y > 10 (the fraction of rows that satisfy it) and then multiplying that selectivity by the cardinality of a. The common technique for estimating selectivity is to collect a histogram on a.y.
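The following Go sketch shows the arithmetic under a deliberately simplified histogram model. The bucket type is invented for illustration, and the bucket that straddles the predicate boundary is ignored rather than interpolated, unlike a real implementation.

package main

import (
	"fmt"
	"math"
)

// bucket covers values in (lower, upperBound], where lower is the previous
// bucket's upper bound, and holds rowCount rows. Real histograms also track
// distinct counts and interpolate within the straddling bucket.
type bucket struct {
	upperBound float64
	rowCount   float64
}

// selectivityGT estimates the fraction of rows satisfying "col > v" by
// counting only buckets that lie entirely above v.
func selectivityGT(hist []bucket, v float64) float64 {
	var total, matching float64
	lower := math.Inf(-1)
	for _, b := range hist {
		total += b.rowCount
		if lower >= v {
			matching += b.rowCount
		}
		lower = b.upperBound
	}
	return matching / total
}

func main() {
	// A histogram on a.y with 1000 rows in total.
	hist := []bucket{{5, 400}, {10, 350}, {20, 150}, {100, 100}}
	sel := selectivityGT(hist, 10)
	fmt.Printf("selectivity(a.y > 10) ~= %.2f\n", sel)      // 0.25
	fmt.Printf("estimated rows from a ~= %.0f\n", sel*1000) // 250
}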
The collection of table statistics occurs prior to receiving the query. As such, the statistics are necessarily out of date and may be inaccurate. The system may bound the inaccuracy by recomputing the stats based on how fast a table is being modified. Or the system may notice when stat estimations are inaccurate during query execution.
Cost Model ¶
The cost model takes an expression as input and computes an estimated "cost" to execute that expression. The unit of "cost" can be arbitrary, though it is desirable if it has some real world meaning such as expected execution time. What is required is for the costs of different query plans to be comparable. The optimizer seeks to find the shortest expected execution time for a query and uses cost as a proxy for execution time.
Cost is roughly calculated by estimating how much time each node in the expression tree will use to process all results and modeling how data flows through the expression tree. Table statistics are used to power cardinality estimates of base relations, which in turn power cardinality estimates of intermediate relations. This is accomplished by propagating histograms of column values from base relations up through intermediate nodes (e.g. combining histograms from the two join inputs into a single histogram). Operator-specific computations model the network, disk and CPU costs. The cost model should also take into account data layout and the specific operating environment. For example, network RTT in one cluster might be vastly different than in another.
Because the cost for a query plan is an estimate, there is an associated error. This error might be implicit in the cost, or it could be explicitly tracked. One advantage to explicitly tracking the expected error is that it allows selecting a higher-cost but lower-expected-error plan over a lower-cost but higher-expected-error plan. Where does the error come from? One source is the innate inaccuracy of stats: selectivity estimation might be wildly off due to an outlier value. Another source is the accumulation of estimation errors as they propagate up the query tree. Lastly, the cost model estimates the execution time of operations such as a network RTT, and those estimates can also be wildly inaccurate due to bursts of activity.
Search finds the lowest cost plan using dynamic programming. That imposes a restriction on the cost model: it must exhibit optimal substructure, meaning an optimal solution can be constructed from optimal solutions of its sub-problems.
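A small sketch of why this works: a parent expression's cost depends on its children only through their best (lowest) costs, so best sub-plans compose into the best overall plan. All names and costs below are illustrative.

package main

import "fmt"

func main() {
	// bestCost holds the lowest cost already found for each child group.
	bestCost := map[string]float64{"scan a": 100, "scan b": 10}

	// combine computes a parent expression's cost from its own work plus
	// the best costs of its children, and nothing else.
	combine := func(opCost float64, children ...string) float64 {
		total := opCost
		for _, c := range children {
			total += bestCost[c]
		}
		return total
	}

	hashJoin := combine(50, "scan a", "scan b")
	mergeJoin := combine(80, "scan a", "scan b") // e.g. pays extra for sorting
	fmt.Println(hashJoin, mergeJoin)             // the optimizer keeps the cheaper: 160
}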
Memo ¶
Memo is a data structure for efficiently storing a forest of query plans. Conceptually, the memo is composed of a numbered set of equivalency classes called groups where each group contains a set of logically equivalent expressions. The different expressions in a single group are called memo expressions (memo-ized expressions). A memo expression has a list of child groups as its children rather than a list of individual expressions. The forest is composed of every possible combination of parent expression with its children, recursively applied.
Memo expressions can be relational (e.g. join) or scalar (e.g. <). Operators are always both logical (specify results) and physical (specify results and a particular implementation). This means that even a "raw" unoptimized expression tree can be executed (naively). Both relational and scalar operators are uniformly represented as nodes in memo expression trees, which facilitates tree pattern matching and replacement.
Because memo groups contain logically equivalent expressions, all the memo expressions in a group share the same logical properties. However, it's possible for two logically equivalent expressions to be placed in different memo groups. This occurs because determining logical equivalency of two relational expressions is too complex to perform 100% correctly. A correctness failure (i.e. considering two expressions logically equivalent when they are not) results in invalid transformations and invalid plans. But placing two logically equivalent expressions in different groups has a much gentler failure mode: the memo and transformations are less efficient. Expressions within the memo may have different physical properties. For example, a memo group might contain both hash join and merge join expressions which produce the same set of output rows, but produce them in different orders.
Expressions are inserted into the memo by the factory, which ensures that expressions have been fully normalized before insertion (see the Prep section for more details). A new group is created only when unique normalized expressions are created by the factory during construction or rewrite of the tree. Uniqueness is determined by computing the fingerprint for a memo expression, which is simply the expression operator and its list of child groups. For example, consider this query:
SELECT * FROM a, b WHERE a.x = b.x
After insertion into the memo, the memo would contain these six groups:
6: [inner-join [1 2 5]]
5: [eq [3 4]]
4: [variable b.x]
3: [variable a.x]
2: [scan b]
1: [scan a]
The fingerprint for the inner-join expression is [inner-join [1 2 5]]. The memo maintains a map from expression fingerprint to memo group which allows quick determination of whether the normalized form of an expression already exists in the memo.
The normalizing factory will never add more than one expression to a memo group. But the explorer (see the Search section for more details) does add denormalized expressions to existing memo groups, since oftentimes one of these equivalent but denormalized expressions will have a lower cost than the initial normalized expression added by the factory. For example, the join commutativity transformation expands the memo like this:
6: [inner-join [1 2 5]] [inner-join [2 1 5]]
5: [eq [3 4]]
4: [variable b.x]
3: [variable a.x]
2: [scan b]
1: [scan a]
Notice that there are now two expressions in memo group 6. The coster (see Cost Model section for more details) will estimate the execution cost of each expression, and the optimizer will select the lowest cost alternative.
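The following Go sketch mimics the interning scheme described above using toy types (memoExpr, intern); the real memo tracks typed expressions, logical properties, and best expressions per group.

package main

import "fmt"

// GroupID references a memo group; groups are numbered from 1 as in the
// example above.
type GroupID int

// memoExpr is an operator plus child group ids, mirroring the fingerprint
// scheme described in this section.
type memoExpr struct {
	op       string
	children []GroupID
}

func (e memoExpr) fingerprint() string {
	return fmt.Sprint(e.op, e.children)
}

type memo struct {
	groups   [][]memoExpr       // group id - 1 -> expressions in that group
	interned map[string]GroupID // fingerprint -> group that contains it
}

// intern returns the existing group for the expression's fingerprint, or
// creates a new group holding it; re-constructing an already-interned
// normalized expression is therefore a no-op.
func (m *memo) intern(e memoExpr) GroupID {
	fp := e.fingerprint()
	if g, ok := m.interned[fp]; ok {
		return g
	}
	m.groups = append(m.groups, []memoExpr{e})
	g := GroupID(len(m.groups))
	m.interned[fp] = g
	return g
}

func main() {
	m := &memo{interned: map[string]GroupID{}}
	a := m.intern(memoExpr{op: "scan a"})        // group 1
	b := m.intern(memoExpr{op: "scan b"})        // group 2
	ax := m.intern(memoExpr{op: "variable a.x"}) // group 3
	bx := m.intern(memoExpr{op: "variable b.x"}) // group 4
	eq := m.intern(memoExpr{op: "eq", children: []GroupID{ax, bx}})
	join := m.intern(memoExpr{op: "inner-join", children: []GroupID{a, b, eq}})
	fmt.Println(join) // 6

	// The explorer later appends the commuted join to the same group.
	m.groups[join-1] = append(m.groups[join-1],
		memoExpr{op: "inner-join", children: []GroupID{b, a, eq}})
	fmt.Println(len(m.groups[join-1])) // 2 expressions in group 6
}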
Transforms ¶
Transforms convert an input expression tree into zero or more logically equivalent trees. Transforms consist of two parts: a "match pattern" and a "replace pattern". Together, the match pattern and replace pattern are called a "rule". Transform rules are categorized as "normalization" or "exploration" rules.
If an expression in the tree matches the match pattern, then a new expression will be constructed according to the replace pattern. Note that "replace" means the new expression is a logical replacement for the existing expression, not that the existing expression needs to physically be replaced. Depending on the context, the existing expression may be discarded, or it may be retained side-by-side with the new expression in the memo group.
Normalization rules are cost-agnostic, as they are always considered to be beneficial. All normalization rules are implemented by the normalizing factory, which does its best to map all logically equivalent expression trees to a single canonical form from which searches can branch out. See the Prep section for more details.
Exploration rules generate equivalent expression trees that must be costed in order to determine the lowest cost alternative. All exploration rules are implemented by the explorer, which is optimized to efficiently enumerate all possible expression tree combinations in the memo in order to look for rule matches. When it finds a match, the explorer applies the rule and adds an equivalent expression to the existing memo group. See the Search section for more details.
Some examples of transforms:
Join commutativity
  Swaps the order of the inputs to an inner join.
    SELECT * FROM a, b  =>  SELECT * FROM b, a

Join associativity
  Reorders the children of a parent and child join.
    SELECT * FROM (SELECT * FROM a, b), c
    =>
    SELECT * FROM (SELECT * FROM a, c), b

Predicate pushdown
  Moves predicates below joins.
    SELECT * FROM a, b USING (x) WHERE a.x < 10
    =>
    SELECT * FROM (SELECT * FROM a WHERE a.x < 10), b USING (x)

Join elimination
  Removes unnecessary joins based on projected columns and foreign keys.
    SELECT a.x FROM a, b USING (x)  =>  SELECT a.x FROM a

Distinct/group-by elimination
  Removes unnecessary distinct/group-by operations based on keys.
    SELECT DISTINCT a.x FROM a  =>  SELECT a.x FROM a

Predicate inference
  Adds predicates based on filter conditions.
    SELECT * FROM a, b USING (x)
    =>
    SELECT * FROM a, b USING (x) WHERE a.x IS NOT NULL AND b.x IS NOT NULL

Decorrelation
  Replaces correlated subqueries with semi-join, anti-join and apply ops.

Scan to index scan
  Transforms scan operator into one or more index scans on covering indexes.

Inner join to merge join
  Generates alternate merge-join operator from default inner-join operator.
Much of the optimizer's rule matching and application code is generated by a tool called Optgen, short for "optimizer generator". Optgen is a domain-specific language (DSL) that provides an intuitive syntax for defining transform rules. Here is an example:
[NormalizeEq]
(Eq
  $left:^(Variable)
  $right:(Variable)
)
=>
(Eq $right $left)
The expression above the arrow is the match pattern and the expression below the arrow is the replace pattern. This example rule will match Eq expressions which have a left input which is not a Variable operator and a right input which is a Variable operator. The replace pattern will trigger a replacement that reverses the two inputs. In addition, custom match and replace functions can be defined in order to run arbitrary Go code.
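For illustration, here is roughly the shape of factory code such a rule could expand to. This is a hand-written sketch with invented types, not the actual Optgen output.

package main

import (
	"fmt"
	"strings"
)

// GroupID references a memo group holding an expression.
type GroupID int

// Factory is a toy stand-in for the normalizing factory; expressions are
// stored as debug strings only.
type Factory struct {
	memo []string
}

func (f *Factory) add(s string) GroupID {
	f.memo = append(f.memo, s)
	return GroupID(len(f.memo) - 1)
}

func (f *Factory) isVariable(g GroupID) bool {
	return strings.HasPrefix(f.memo[g], "var")
}

// ConstructEq applies NormalizeEq before building anything: if the left
// input is not a Variable but the right one is, construct the flipped
// expression instead, so the non-normalized form never exists at all.
func (f *Factory) ConstructEq(left, right GroupID) GroupID {
	if !f.isVariable(left) && f.isVariable(right) {
		return f.ConstructEq(right, left)
	}
	return f.add(fmt.Sprintf("eq(%s, %s)", f.memo[left], f.memo[right]))
}

func main() {
	f := &Factory{}
	one := f.add("const 1")
	x := f.add("var x")
	fmt.Println(f.memo[f.ConstructEq(one, x)]) // eq(var x, const 1)
}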
Prep ¶
Prep (short for "prepare") is the first phase of query optimization, in which the annotated AST is transformed into a single normalized "expression tree". The optimizer directly creates the expression tree in the memo data structure rather than first constructing an intermediate data structure. A forest of equivalent trees will be generated in later phases, but at the end of the prep phase, the memo contains just one normalized tree that is logically equivalent to the SQL query.
During the prep phase, placeholder values are not yet known, so normalization cannot go as far as it can during later phases. However, this also means that the resulting expression tree can be cached in response to a PREPARE statement, and then be reused as a starting point each time an EXECUTE statement provides new placeholder values.
The memo expression tree is constructed by the normalizing factory, which does its best to map all logically equivalent expression trees to a single canonical form from which searches can branch out. The factory has an interface similar to this:
ConstructConst(value PrivateID) GroupID
ConstructAnd(conditions ListID) GroupID
ConstructInnerJoin(left GroupID, right GroupID, on GroupID) GroupID
The factory methods construct a memo expression tree bottom-up, with each memo group becoming an input to operators higher in the tree.
As each expression is constructed by the factory, it transitively applies normalization rules defined for that expression type. This may result in the construction of a different type of expression than what was requested. If, after normalization, the expression is already part of the memo, then construction is a no-op. Otherwise, a new memo group is created, with the normalized expression as its first and only expression.
By applying normalization rules as the expression tree is constructed, the factory can avoid creating intermediate expressions; often, "replacement" of an existing expression means it's never created to begin with.
During Prep, all columns used by the SQL query are given a numeric index that is unique across the query. Column numbering involves assigning every base column and non-trivial projection in a query a unique, query-specific index. Giving each column a unique index allows the expression nodes mentioned above to track input and output columns, or really any set of columns during Prep and later phases, using a bitmap (FastIntSet). The bitmap representation allows fast determination of compatibility between expression nodes and is utilized by transforms to determine the legality of such operations.
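A minimal sketch of such a bitmap set, assuming column ids fit in a single machine word; the real util.FastIntSet falls back to a sparse representation for larger ids.

package main

import (
	"fmt"
	"math/bits"
)

// colSet is a toy bitmap column set: bit i is set iff column i is a member.
type colSet uint64

func (s *colSet) Add(col uint)            { *s |= 1 << col }
func (s colSet) Contains(col uint) bool   { return s&(1<<col) != 0 }
func (s colSet) SubsetOf(rhs colSet) bool { return s&^rhs == 0 }
func (s colSet) Len() int                 { return bits.OnesCount64(uint64(s)) }

func main() {
	var produced, needed colSet
	produced.Add(1) // a.x
	produced.Add(2) // a.y
	needed.Add(1)   // a filter that references only a.x
	// Legality checks like "can this filter be pushed below this operator?"
	// reduce to cheap bitwise tests.
	fmt.Println(needed.SubsetOf(produced)) // true
}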
The Prep phase also computes logical properties, such as the input and output columns of each (sub-)expression, equivalent columns, not-null columns and functional dependencies. These properties are computed bottom-up as part of constructing the expression tree.
Rewrite ¶
Rewrite is the second phase of query optimization. Placeholder values are available starting at this phase, so new normalization rules will typically match once constant values are substituted for placeholders. As mentioned in the previous section, the expression tree produced by the Prep phase can be cached and serve as the starting point for the Rewrite phase. In addition, the Rewrite phase takes a set of physical properties that are required from the result, such as row ordering and column naming.
The Rewrite and Search phases have significant overlap. Both phases perform transformations on the expression tree. However, Search preserves the matched expression side-by-side with the new expression, while Rewrite simply discards the matched expression, since the new expression is assumed to always be better. In addition, the application of exploration rules may trigger additional normalization rules, which may in turn trigger additional exploration rules.
Together, the Rewrite and Search phases are responsible for finding the expression that can provide the required set of physical properties at the lowest possible execution cost. That mandate is recursively applied; in other words, each subtree is also optimized with respect to a set of physical properties required by its parent, and the goal is to find the lowest cost equivalent expression. An example of an "interior" optimization goal is a merge join that requires its inner child to return its rows in a specific order. The same group can be (and sometimes is) optimized multiple times, but with different required properties each time.
Search ¶
Search is the final phase of optimization. Search begins with a single normalized tree that was created by the earlier phases. For each group, the "explorer" component generates alternative expressions that are logically equivalent to the normalized expression, but which may have very different execution plans. The "coster" component computes the estimated cost for each alternate expression. The optimizer remembers the "best expression" for each group, for each set of physical properties required of that group.
Optimization of a group proceeds in two phases:
1. Compute the cost of any previously generated expressions. That set initially contains only the group's normalized expression, but exploration may yield additional expressions. Costing a parent expression requires that the children first be costed, so costing triggers a recursive traversal of the memo groups.
2. Invoke the explorer to generate new equivalent expressions for the group. Those new expressions are costed once the optimizer loops back to the first phase.
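These two phases can be sketched as a memoized recursion over groups. The types below are invented for illustration, and exploration is elided (each group's candidate expressions are fixed up front).

package main

import "fmt"

// expr is a candidate expression: its own cost plus child group ids.
type expr struct {
	opCost   float64
	children []int
}

type memo struct {
	groups [][]expr
	best   map[int]float64
}

// optimizeGroup returns the lowest cost for a group, costing children
// recursively and memoizing the result.
func (m *memo) optimizeGroup(g int) float64 {
	if c, ok := m.best[g]; ok {
		return c
	}
	best := -1.0
	for _, e := range m.groups[g] {
		cost := e.opCost
		for _, child := range e.children {
			cost += m.optimizeGroup(child)
		}
		if best < 0 || cost < best {
			best = cost
		}
	}
	m.best[g] = best
	return best
}

func main() {
	m := &memo{
		groups: [][]expr{
			0: {{opCost: 100}}, // scan a
			1: {{opCost: 10}},  // scan b
			2: { // join group with two alternatives
				{opCost: 50, children: []int{0, 1}},
				{opCost: 30, children: []int{0, 1}},
			},
		},
		best: map[int]float64{},
	}
	fmt.Println(m.optimizeGroup(2)) // 140: cheaper join + best children
}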
In order to avoid a combinatorial explosion in the number of expression trees, the optimizer utilizes the memo structure. Due to the large number of possible plans for some queries, the optimizer cannot always explore all of them. Therefore, it proceeds in multiple iterative "passes", until either it hits some configured time or resource limit, or until an exhaustive search is complete. As long as the search is allowed to complete, the best plan will be found, just as in Volcano and Cascades.
The optimizer uses several techniques to maximize the chance that it finds the best plan early on:
- As with Cascades, the search is highly directed, interleaving exploration with costing in order to prune parts of the tree that cannot yield a better plan. This contrasts with Volcano, which first generates all possible plans in one global phase (exploration), and then determines the lowest cost plan in another global phase (costing).
- The optimizer uses a simple hill climbing heuristic to make greedy progress towards the best plan. During a given pass, the optimizer visits each group and performs costing and exploration for that group. As long as doing so yields a lower cost expression for the group, the optimizer will repeat those steps. This finds a local optimum for each group during the current pass.
In order to avoid costing or exploring parts of the search space that cannot yield a better plan, the optimizer performs aggressive "branch and bound pruning". Each group expression is optimized with respect to a "budget" parameter. As soon as this budget is exceeded, optimization of that expression terminates. It's not uncommon for large sections of the search space to never be costed or explored due to this pruning. Example:
innerJoin
  left:  cost = 50
  right: cost = 75
  on:    cost = 25
If the current best expression for the group has a cost of 100, then the optimizer does not need to cost or explore the "on" child of the join, and does not need to cost the join itself. This is because the combined cost of the left and right children already exceeds 100.
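A sketch of the pruning logic under these assumptions (illustrative names; in the real optimizer, costing is interleaved with exploration and recursion over child groups):

package main

import "fmt"

// costJoin costs a join's children one at a time, abandoning the
// expression as soon as the accumulated cost exceeds the budget, i.e. the
// cost of the best expression found so far.
func costJoin(costChild func(name string) float64, budget float64) (float64, bool) {
	total := 0.0
	for _, child := range []string{"left", "right", "on"} {
		total += costChild(child)
		if total > budget {
			// The partial cost already exceeds the best known plan;
			// skip costing the remaining children and the join itself.
			return 0, false
		}
	}
	return total + 10 /* the join's own work */, true
}

func main() {
	costs := map[string]float64{"left": 50, "right": 75, "on": 25}
	// The current best expression costs 100, so that is the budget.
	if _, ok := costJoin(func(n string) float64 { return costs[n] }, 100); !ok {
		fmt.Println("pruned: left + right already exceeds the budget")
	}
}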
Index ¶
- Constants
- Variables
- func AggregateIgnoresDuplicates(op Operator) bool
- func AggregateIgnoresNulls(op Operator) bool
- func AggregateIsNeverNull(op Operator) bool
- func AggregateIsNeverNullOnNonNullInput(op Operator) bool
- func AggregateIsNullOnEmpty(op Operator) bool
- func AggregatesCanMerge(inner, outer Operator) bool
- func BoolOperatorRequiresNotNullArgs(op Operator) bool
- func IsAggregateOp(e Expr) bool
- func IsBinaryOp(e Expr) bool
- func IsBoolOp(e Expr) bool
- func IsComparisonOp(e Expr) bool
- func IsCompositeInsensitiveOp(e Expr) bool
- func IsConstValueOp(e Expr) bool
- func IsDDLOp(e Expr) bool
- func IsEnforcerOp(e Expr) bool
- func IsFloatOp(e Expr) bool
- func IsGroupingOp(e Expr) bool
- func IsIntOp(e Expr) bool
- func IsJoinApplyOp(e Expr) bool
- func IsJoinNonApplyOp(e Expr) bool
- func IsJoinOp(e Expr) bool
- func IsListItemOp(e Expr) bool
- func IsListOp(e Expr) bool
- func IsMutationOp(e Expr) bool
- func IsPrivateOp(e Expr) bool
- func IsRelationalOp(e Expr) bool
- func IsScalarOp(e Expr) bool
- func IsScalarPropsOp(e Expr) bool
- func IsSetOp(e Expr) bool
- func IsTelemetryOp(e Expr) bool
- func IsUnaryOp(e Expr) bool
- func IsWindowOp(e Expr) bool
- func IsWithBindingOp(e Expr) bool
- func JoinTypeToUseCounter(op Operator) telemetry.Counter
- func ScalarOperatorTransmitsNulls(op Operator) bool
- type AliasedColumn
- type ColList
- type ColMap
- type ColSet
- func (s *ColSet) Add(col ColumnID)
- func (s ColSet) Contains(col ColumnID) bool
- func (s ColSet) Copy() ColSet
- func (s ColSet) Difference(rhs ColSet) ColSet
- func (s *ColSet) DifferenceWith(rhs ColSet)
- func (s ColSet) Empty() bool
- func (s ColSet) Equals(rhs ColSet) bool
- func (s ColSet) ForEach(f func(col ColumnID))
- func (s ColSet) Intersection(rhs ColSet) ColSet
- func (s *ColSet) IntersectionWith(rhs ColSet)
- func (s ColSet) Intersects(rhs ColSet) bool
- func (s ColSet) Len() int
- func (s ColSet) Next(startVal ColumnID) (ColumnID, bool)
- func (s *ColSet) Remove(col ColumnID)
- func (s ColSet) SingleColumn() ColumnID
- func (s ColSet) String() string
- func (s ColSet) SubsetOf(rhs ColSet) bool
- func (s ColSet) ToList() ColList
- func (s ColSet) Union(rhs ColSet) ColSet
- func (s *ColSet) UnionWith(rhs ColSet)
- type ColumnID
- type ColumnMeta
- type Expr
- type MDDepName
- type Metadata
- func (md *Metadata) AddColumn(alias string, typ *types.T) ColumnID
- func (md *Metadata) AddDependency(name MDDepName, ds cat.DataSource, priv privilege.Kind)
- func (md *Metadata) AddSchema(sch cat.Schema) SchemaID
- func (md *Metadata) AddSequence(seq cat.Sequence) SequenceID
- func (md *Metadata) AddTable(tab cat.Table, alias *tree.TableName) TableID
- func (md *Metadata) AddUserDefinedType(typ *types.T)
- func (md *Metadata) AddView(v cat.View)
- func (md *Metadata) AddWithBinding(id WithID, expr Expr)
- func (md *Metadata) AllDataSourceNames(fullyQualifiedName func(ds cat.DataSource) (cat.DataSourceName, error)) (tables, sequences, views []tree.TableName, _ error)
- func (md *Metadata) AllTables() []TableMeta
- func (md *Metadata) AllUserDefinedTypes() []*types.T
- func (md *Metadata) AllViews() []cat.View
- func (md *Metadata) CheckDependencies(ctx context.Context, catalog cat.Catalog) (upToDate bool, err error)
- func (md *Metadata) ColumnMeta(colID ColumnID) *ColumnMeta
- func (md *Metadata) CopyFrom(from *Metadata, copyScalarFn func(Expr) Expr)
- func (md *Metadata) DuplicateTable(tabID TableID, remapColumnIDs func(e ScalarExpr, colMap ColMap) ScalarExpr) TableID
- func (md *Metadata) Init()
- func (md *Metadata) NextUniqueID() UniqueID
- func (md *Metadata) NumColumns() int
- func (md *Metadata) QualifiedAlias(colID ColumnID, fullyQualify bool, catalog cat.Catalog) string
- func (md *Metadata) Schema(schID SchemaID) cat.Schema
- func (md *Metadata) Sequence(seqID SequenceID) cat.Sequence
- func (md *Metadata) SetTableAnnotation(tabID TableID, tabAnnID TableAnnID, ann interface{})
- func (md *Metadata) Table(tabID TableID) cat.Table
- func (md *Metadata) TableAnnotation(tabID TableID, annID TableAnnID) interface{}
- func (md *Metadata) TableMeta(tabID TableID) *TableMeta
- func (md *Metadata) UpdateTableMeta(tables map[cat.StableID]cat.Table)
- func (md *Metadata) WithBinding(id WithID) Expr
- type MutableExpr
- type OpaqueMetadata
- type Operator
- type OptionalColList
- type Ordering
- type OrderingColumn
- type RuleName
- type ScalarExpr
- type ScalarRank
- type SchemaID
- type SequenceID
- type TableAnnID
- type TableID
- type TableMeta
- func (tm *TableMeta) AddCheckConstraintsStats(colID ColumnID, colStats interface{})
- func (tm *TableMeta) AddComputedCol(colID ColumnID, computedCol ScalarExpr)
- func (tm *TableMeta) AddIndexPartitionLocality(ord cat.IndexOrdinal, ps *partition.PrefixSorter)
- func (tm *TableMeta) AddPartialIndexPredicate(ord cat.IndexOrdinal, pred ScalarExpr)
- func (tm *TableMeta) ComputedColExpr(id ColumnID) (_ ScalarExpr, ok bool)
- func (tm *TableMeta) IndexColumns(indexOrd int) ColSet
- func (tm *TableMeta) IndexColumnsMapInverted(indexOrd int) ColSet
- func (tm *TableMeta) IndexKeyColumns(indexOrd int) ColSet
- func (tm *TableMeta) IndexKeyColumnsMapInverted(indexOrd int) ColSet
- func (tm *TableMeta) IndexPartitionLocality(ord cat.IndexOrdinal, index cat.Index, evalCtx *tree.EvalContext) (ps *partition.PrefixSorter, ok bool)
- func (tm *TableMeta) OrigCheckConstraintsStats(colID ColumnID) (origColumnStatistic interface{}, ok bool)
- func (tm *TableMeta) PartialIndexPredicate(ord cat.IndexOrdinal) (pred ScalarExpr, ok bool)
- func (tm *TableMeta) PartialIndexPredicatesUnsafe() map[cat.IndexOrdinal]ScalarExpr
- func (tm *TableMeta) SetConstraints(constraints ScalarExpr)
- func (tm *TableMeta) VirtualComputedColumns() ColSet
- type UniqueID
- type ViewDep
- type ViewDeps
- type ViewTypeDeps
- type WithID
Constants ¶
const DefaultJoinOrderLimit = 8
DefaultJoinOrderLimit denotes the default limit on the number of joins to reorder.
const MaxReorderJoinsLimit = 63
MaxReorderJoinsLimit is the maximum number of joins which can be reordered.
const SaveTablesDatabase = "savetables"
SaveTablesDatabase is the name of the database where tables created by the saveTableNode are stored.
Variables ¶
var AggregateOpReverseMap = map[Operator]string{
ArrayAggOp: "array_agg",
AvgOp: "avg",
BitAndAggOp: "bit_and",
BitOrAggOp: "bit_or",
BoolAndOp: "bool_and",
BoolOrOp: "bool_or",
ConcatAggOp: "concat_agg",
CountOp: "count",
CorrOp: "corr",
CountRowsOp: "count_rows",
CovarPopOp: "covar_pop",
CovarSampOp: "covar_samp",
RegressionAvgXOp: "regr_avgx",
RegressionAvgYOp: "regr_avgy",
RegressionInterceptOp: "regr_intercept",
RegressionR2Op: "regr_r2",
RegressionSlopeOp: "regr_slope",
RegressionSXXOp: "regr_sxx",
RegressionSXYOp: "regr_sxy",
RegressionSYYOp: "regr_syy",
RegressionCountOp: "regr_count",
MaxOp: "max",
MinOp: "min",
SumIntOp: "sum_int",
SumOp: "sum",
SqrDiffOp: "sqrdiff",
VarianceOp: "variance",
StdDevOp: "stddev",
XorAggOp: "xor_agg",
JsonAggOp: "json_agg",
JsonbAggOp: "jsonb_agg",
JsonObjectAggOp: "json_object_agg",
JsonbObjectAggOp: "jsonb_object_agg",
StringAggOp: "string_agg",
ConstAggOp: "any_not_null",
ConstNotNullAggOp: "any_not_null",
AnyNotNullAggOp: "any_not_null",
PercentileDiscOp: "percentile_disc_impl",
PercentileContOp: "percentile_cont_impl",
VarPopOp: "var_pop",
StdDevPopOp: "stddev_pop",
STMakeLineOp: "st_makeline",
STUnionOp: "st_union",
STCollectOp: "st_collect",
STExtentOp: "st_extent",
}
AggregateOpReverseMap maps from an optimizer operator type to the name of an aggregation function.
var AggregateOperators = [...]Operator{ AnyNotNullAggOp, ArrayAggOp, AvgOp, BitAndAggOp, BitOrAggOp, BoolAndOp, BoolOrOp, ConcatAggOp, ConstAggOp, ConstNotNullAggOp, CorrOp, CountOp, CountRowsOp, CovarPopOp, CovarSampOp, FirstAggOp, JsonAggOp, JsonObjectAggOp, JsonbAggOp, JsonbObjectAggOp, MaxOp, MinOp, PercentileContOp, PercentileDiscOp, RegressionAvgXOp, RegressionAvgYOp, RegressionCountOp, RegressionInterceptOp, RegressionR2Op, RegressionSXXOp, RegressionSXYOp, RegressionSYYOp, RegressionSlopeOp, STCollectOp, STExtentOp, STMakeLineOp, STUnionOp, SqrDiffOp, StdDevOp, StdDevPopOp, StringAggOp, SumOp, SumIntOp, VarPopOp, VarianceOp, XorAggOp, }
var BinaryOpReverseMap = map[Operator]treebin.BinaryOperatorSymbol{ BitandOp: treebin.Bitand, BitorOp: treebin.Bitor, BitxorOp: treebin.Bitxor, PlusOp: treebin.Plus, MinusOp: treebin.Minus, MultOp: treebin.Mult, DivOp: treebin.Div, FloorDivOp: treebin.FloorDiv, ModOp: treebin.Mod, PowOp: treebin.Pow, ConcatOp: treebin.Concat, LShiftOp: treebin.LShift, RShiftOp: treebin.RShift, FetchValOp: treebin.JSONFetchVal, FetchTextOp: treebin.JSONFetchText, FetchValPathOp: treebin.JSONFetchValPath, FetchTextPathOp: treebin.JSONFetchTextPath, }
BinaryOpReverseMap maps from an optimizer operator type to a semantic tree binary operator type.
var BinaryOperators = [...]Operator{ BitandOp, BitorOp, BitxorOp, ConcatOp, DivOp, FetchTextOp, FetchTextPathOp, FetchValOp, FetchValPathOp, FloorDivOp, LShiftOp, MinusOp, ModOp, MultOp, PlusOp, PowOp, RShiftOp, }
var BoolOperators = [...]Operator{ AndOp, AnyOp, AnyScalarOp, BBoxCoversOp, BBoxIntersectsOp, ContainedByOp, ContainsOp, EqOp, ExistsOp, FalseOp, FiltersOp, FiltersItemOp, GeOp, GtOp, ILikeOp, InOp, IsOp, IsNotOp, IsTupleNotNullOp, IsTupleNullOp, JsonAllExistsOp, JsonExistsOp, JsonSomeExistsOp, LeOp, LikeOp, LtOp, NeOp, NotOp, NotILikeOp, NotInOp, NotLikeOp, NotRegIMatchOp, NotRegMatchOp, NotSimilarToOp, OrOp, OverlapsOp, RangeOp, RegIMatchOp, RegMatchOp, SimilarToOp, TrueOp, }
var ComparisonOpMap [treecmp.NumComparisonOperatorSymbols]Operator
ComparisonOpMap maps from a semantic tree comparison operator type to an optimizer operator type.
var ComparisonOpReverseMap = map[Operator]treecmp.ComparisonOperatorSymbol{ EqOp: treecmp.EQ, LtOp: treecmp.LT, GtOp: treecmp.GT, LeOp: treecmp.LE, GeOp: treecmp.GE, NeOp: treecmp.NE, InOp: treecmp.In, NotInOp: treecmp.NotIn, LikeOp: treecmp.Like, NotLikeOp: treecmp.NotLike, ILikeOp: treecmp.ILike, NotILikeOp: treecmp.NotILike, SimilarToOp: treecmp.SimilarTo, NotSimilarToOp: treecmp.NotSimilarTo, RegMatchOp: treecmp.RegMatch, NotRegMatchOp: treecmp.NotRegMatch, RegIMatchOp: treecmp.RegIMatch, NotRegIMatchOp: treecmp.NotRegIMatch, IsOp: treecmp.IsNotDistinctFrom, IsNotOp: treecmp.IsDistinctFrom, ContainsOp: treecmp.Contains, ContainedByOp: treecmp.ContainedBy, JsonExistsOp: treecmp.JSONExists, JsonSomeExistsOp: treecmp.JSONSomeExists, JsonAllExistsOp: treecmp.JSONAllExists, OverlapsOp: treecmp.Overlaps, BBoxCoversOp: treecmp.RegMatch, BBoxIntersectsOp: treecmp.Overlaps, }
ComparisonOpReverseMap maps from an optimizer operator type to a semantic tree comparison operator type.
var ComparisonOperators = [...]Operator{ BBoxCoversOp, BBoxIntersectsOp, ContainedByOp, ContainsOp, EqOp, GeOp, GtOp, ILikeOp, InOp, IsOp, IsNotOp, JsonAllExistsOp, JsonExistsOp, JsonSomeExistsOp, LeOp, LikeOp, LtOp, NeOp, NotILikeOp, NotInOp, NotLikeOp, NotRegIMatchOp, NotRegMatchOp, NotSimilarToOp, OverlapsOp, RegIMatchOp, RegMatchOp, SimilarToOp, }
var CompositeInsensitiveOperators = [...]Operator{ ContainedByOp, ContainsOp, DivOp, EqOp, FloorDivOp, GeOp, GtOp, InOp, IsOp, IsNotOp, LeOp, LtOp, MinusOp, ModOp, MultOp, NeOp, NotInOp, PlusOp, PowOp, ScalarListOp, UnaryCbrtOp, UnaryMinusOp, UnaryPlusOp, UnarySqrtOp, VariableOp, }
var DDLOperators = [...]Operator{ CreateTableOp, CreateViewOp, OpaqueDDLOp, }
var EnforcerOperators = [...]Operator{ DistributeOp, SortOp, }
var FloatOperators = [...]Operator{ CumeDistOp, PercentRankOp, }
var GroupingOperators = [...]Operator{ DistinctOnOp, EnsureDistinctOnOp, EnsureUpsertDistinctOnOp, GroupByOp, ScalarGroupByOp, UpsertDistinctOnOp, }
var IntOperators = [...]Operator{ CountRowsOp, DenseRankOp, NtileOp, RankOp, RowNumberOp, }
var JoinApplyOperators = [...]Operator{ AntiJoinApplyOp, InnerJoinApplyOp, LeftJoinApplyOp, SemiJoinApplyOp, }
var JoinNonApplyOperators = [...]Operator{ AntiJoinOp, FullJoinOp, InnerJoinOp, LeftJoinOp, RightJoinOp, SemiJoinOp, }
var JoinOperators = [...]Operator{ AntiJoinOp, AntiJoinApplyOp, FullJoinOp, InnerJoinOp, InnerJoinApplyOp, LeftJoinOp, LeftJoinApplyOp, RightJoinOp, SemiJoinOp, SemiJoinApplyOp, }
var ListItemOperators = [...]Operator{ AggregationsItemOp, FKChecksItemOp, FiltersItemOp, KVOptionsItemOp, ProjectionsItemOp, UniqueChecksItemOp, WindowsItemOp, ZipItemOp, }
var ListOperators = [...]Operator{ AggregationsOp, FKChecksOp, FiltersOp, KVOptionsOp, ProjectionsOp, ScalarListOp, UniqueChecksOp, WindowsOp, ZipOp, }
var MutationOperators = [...]Operator{ AlterRangeRelocateOp, AlterTableRelocateOp, AlterTableSplitOp, AlterTableUnsplitOp, AlterTableUnsplitAllOp, CreateTableOp, CreateViewOp, DeleteOp, InsertOp, OpaqueDDLOp, OpaqueMutationOp, UpdateOp, UpsertOp, }
var NegateOpMap = map[Operator]Operator{ EqOp: NeOp, LtOp: GeOp, GtOp: LeOp, LeOp: GtOp, GeOp: LtOp, NeOp: EqOp, InOp: NotInOp, NotInOp: InOp, LikeOp: NotLikeOp, NotLikeOp: LikeOp, ILikeOp: NotILikeOp, NotILikeOp: ILikeOp, SimilarToOp: NotSimilarToOp, NotSimilarToOp: SimilarToOp, RegMatchOp: NotRegMatchOp, NotRegMatchOp: RegMatchOp, RegIMatchOp: NotRegIMatchOp, NotRegIMatchOp: RegIMatchOp, IsOp: IsNotOp, IsNotOp: IsOp, }
NegateOpMap maps from a comparison operator type to its negated operator type, as if the Not operator was applied to it. Some comparison operators, like Contains and JsonExists, do not have negated versions.
var OpTelemetryCounters [NumOperators]telemetry.Counter
OpTelemetryCounters stores telemetry counters for operators marked with the "Telemetry" tag. All other operators have nil values.
var PrivateOperators = [...]Operator{ AlterRangeRelocatePrivateOp, AlterTableRelocatePrivateOp, AlterTableSplitPrivateOp, CancelPrivateOp, ControlJobsPrivateOp, ControlSchedulesPrivateOp, CreateStatisticsPrivateOp, CreateTablePrivateOp, CreateViewPrivateOp, ExplainPrivateOp, ExportPrivateOp, FKChecksItemPrivateOp, FakeRelPrivateOp, FunctionPrivateOp, GroupingPrivateOp, IndexJoinPrivateOp, InvertedFilterPrivateOp, InvertedJoinPrivateOp, JoinPrivateOp, LookupJoinPrivateOp, MergeJoinPrivateOp, MutationPrivateOp, OpaqueRelPrivateOp, OrdinalityPrivateOp, RecursiveCTEPrivateOp, ScanPrivateOp, SequenceSelectPrivateOp, SetPrivateOp, ShowTracePrivateOp, SubqueryPrivateOp, TopKPrivateOp, UniqueChecksItemPrivateOp, ValuesPrivateOp, WindowPrivateOp, WindowsItemPrivateOp, WithPrivateOp, WithScanPrivateOp, ZigzagJoinPrivateOp, }
var RelationalOperators = [...]Operator{ AlterRangeRelocateOp, AlterTableRelocateOp, AlterTableSplitOp, AlterTableUnsplitOp, AlterTableUnsplitAllOp, AntiJoinOp, AntiJoinApplyOp, CancelQueriesOp, CancelSessionsOp, ControlJobsOp, ControlSchedulesOp, CreateStatisticsOp, CreateTableOp, CreateViewOp, DeleteOp, DistinctOnOp, EnsureDistinctOnOp, EnsureUpsertDistinctOnOp, ExceptOp, ExceptAllOp, ExplainOp, ExportOp, FakeRelOp, FullJoinOp, GroupByOp, IndexJoinOp, InnerJoinOp, InnerJoinApplyOp, InsertOp, IntersectOp, IntersectAllOp, InvertedFilterOp, InvertedJoinOp, LeftJoinOp, LeftJoinApplyOp, LimitOp, LocalityOptimizedSearchOp, LookupJoinOp, Max1RowOp, MemoCycleTestRelOp, MergeJoinOp, NormCycleTestRelOp, OffsetOp, OpaqueDDLOp, OpaqueMutationOp, OpaqueRelOp, OrdinalityOp, PlaceholderScanOp, ProjectOp, ProjectSetOp, RecursiveCTEOp, RightJoinOp, ScalarGroupByOp, ScanOp, SelectOp, SemiJoinOp, SemiJoinApplyOp, SequenceSelectOp, ShowTraceForSessionOp, TopKOp, UnionOp, UnionAllOp, UpdateOp, UpsertOp, UpsertDistinctOnOp, ValuesOp, WindowOp, WithOp, WithScanOp, ZigzagJoinOp, }
var ScalarOperators = [...]Operator{}/* 157 elements not displayed */
var ScalarPropsOperators = [...]Operator{ AggregationsItemOp, FiltersItemOp, ProjectionsItemOp, WindowsItemOp, ZipItemOp, }
var SetOperators = [...]Operator{ ExceptOp, ExceptAllOp, IntersectOp, IntersectAllOp, LocalityOptimizedSearchOp, UnionOp, UnionAllOp, }
var TelemetryOperators = [...]Operator{ AntiJoinApplyOp, DistinctOnOp, DistributeOp, EnsureDistinctOnOp, EnsureUpsertDistinctOnOp, GroupByOp, InnerJoinApplyOp, LeftJoinApplyOp, ProjectSetOp, ScalarGroupByOp, SemiJoinApplyOp, SortOp, UpsertDistinctOnOp, ZigzagJoinOp, }
var UnaryOpReverseMap = map[Operator]tree.UnaryOperatorSymbol{ UnaryMinusOp: tree.UnaryMinus, UnaryComplementOp: tree.UnaryComplement, UnarySqrtOp: tree.UnarySqrt, UnaryCbrtOp: tree.UnaryCbrt, UnaryPlusOp: tree.UnaryPlus, }
UnaryOpReverseMap maps from an optimizer operator type to a semantic tree unary operator type.
var UnaryOperators = [...]Operator{ UnaryCbrtOp, UnaryComplementOp, UnaryMinusOp, UnaryPlusOp, UnarySqrtOp, }
var WindowOpReverseMap = map[Operator]string{
RankOp: "rank",
RowNumberOp: "row_number",
DenseRankOp: "dense_rank",
PercentRankOp: "percent_rank",
CumeDistOp: "cume_dist",
NtileOp: "ntile",
LagOp: "lag",
LeadOp: "lead",
FirstValueOp: "first_value",
LastValueOp: "last_value",
NthValueOp: "nth_value",
}
WindowOpReverseMap maps from an optimizer operator type to the name of a window function.
var WindowOperators = [...]Operator{ CumeDistOp, DenseRankOp, FirstValueOp, LagOp, LastValueOp, LeadOp, NthValueOp, NtileOp, PercentRankOp, RankOp, RowNumberOp, }
var WithBindingOperators = [...]Operator{ DeleteOp, InsertOp, RecursiveCTEOp, UpdateOp, UpsertOp, WithOp, }
Functions ¶
func AggregateIgnoresDuplicates ¶
AggregateIgnoresDuplicates returns true if the output of the given aggregate operator does not change when duplicate rows are added to the input.
func AggregateIgnoresNulls ¶
AggregateIgnoresNulls returns true if the given aggregate operator ignores rows where its first argument evaluates to NULL. In other words, it always evaluates to the same result even if those rows are filtered. For example:
SELECT string_agg(x, y)
FROM (VALUES ('foo', ','), ('bar', ','), (NULL, ',')) t(x, y)
In this example, the NULL row can be removed from the input, and the string_agg function still returns the same result. Contrast this to the array_agg function:
SELECT array_agg(x)
FROM (VALUES ('foo'), (NULL), ('bar')) t(x)
If the NULL row is removed here, array_agg returns {foo,bar} instead of {foo,NULL,bar}.
func AggregateIsNeverNull ¶
AggregateIsNeverNull returns true if the given aggregate operator never returns NULL, even if the input is empty, or if one or more inputs are NULL.
func AggregateIsNeverNullOnNonNullInput ¶
AggregateIsNeverNullOnNonNullInput returns true if the given aggregate operator never returns NULL when the input set contains at least one non-NULL value. This is true of most aggregates.
For multi-input aggregations, returns true if the aggregate is never NULL when all inputs have at least one non-NULL value (though not necessarily on the same input row).
func AggregateIsNullOnEmpty ¶
AggregateIsNullOnEmpty returns true if the given aggregate operator returns NULL when the input set contains no values. This group of aggregates turns out to be the inverse of AggregateIsNeverNull in practice.
func AggregatesCanMerge ¶
AggregatesCanMerge returns true if the given inner and outer operators can be replaced with a single equivalent operator, assuming the outer operator is aggregating on the inner and that both operators are unordered. In other words, the inner-outer aggregate pair forms a valid "decomposition" of a single aggregate. For example, the following pairs of queries are equivalent:
SELECT sum(s) FROM (SELECT sum(y) FROM xy GROUP BY x) AS f(s);
SELECT sum(y) FROM xy;

SELECT sum_int(c) FROM (SELECT count(y) FROM xy GROUP BY x) AS f(c);
SELECT count(y) FROM xy;
Note: some aggregates like StringAggOp are decomposable in theory, but in practice cannot be easily merged as in the examples above.
func BoolOperatorRequiresNotNullArgs ¶
BoolOperatorRequiresNotNullArgs returns true if the operator can never evaluate to true if one of its children is NULL.
func IsAggregateOp ¶
func IsBinaryOp ¶
func IsComparisonOp ¶
func IsConstValueOp ¶
func IsEnforcerOp ¶
func IsGroupingOp ¶
func IsJoinApplyOp ¶
func IsJoinNonApplyOp ¶
func IsListItemOp ¶
func IsMutationOp ¶
func IsPrivateOp ¶
func IsRelationalOp ¶
func IsScalarOp ¶
func IsScalarPropsOp ¶
func IsTelemetryOp ¶
func IsWindowOp ¶
func IsWithBindingOp ¶
func JoinTypeToUseCounter ¶
JoinTypeToUseCounter returns the JoinTypeXyzUseCounter for the given join operator.
func ScalarOperatorTransmitsNulls ¶
ScalarOperatorTransmitsNulls returns true if the given scalar operator always returns NULL when at least one of its inputs is NULL.
Types ¶
type AliasedColumn ¶
AliasedColumn specifies the label and id of a column.
type ColList ¶
type ColList []ColumnID
ColList is a list of column IDs.
TODO(radu): perhaps implement a FastIntList with the same "small" representation as FastIntMap but with a slice for large cases.
func (ColList) Equals ¶
Equals returns true if this column list has the same columns as the given column list, in the same order.
type ColMap ¶
type ColMap = util.FastIntMap
ColMap provides a 1:1 mapping from one column id to another. It is used by operators that need to match columns from their inputs.
type ColSet ¶
type ColSet struct {
// contains filtered or unexported fields
}
ColSet efficiently stores an unordered set of column ids.
func MakeColSet ¶
MakeColSet returns a set initialized with the given values.
func TranslateColSet ¶
TranslateColSet is used to translate a ColSet from one set of column IDs to an equivalent set. This is relevant for set operations such as UNION, INTERSECT and EXCEPT, and can be used to map a ColSet defined on the left relation to an equivalent ColSet on the right relation (or between any two relations with a defined column mapping).
For example, suppose we have a UNION with the following column mapping:
Left:  1, 2, 3
Right: 4, 5, 6
Out:   7, 8, 9
Here are some possible calls to TranslateColSet and their results:
TranslateColSet(ColSet{1, 2}, Left, Right) -> ColSet{4, 5}
TranslateColSet(ColSet{5, 6}, Right, Out) -> ColSet{8, 9}
TranslateColSet(ColSet{9}, Out, Right) -> ColSet{6}
Any columns in the input set that do not appear in the from list are ignored.
Even when all the columns in the input set appear in the from list, it is possible for the input and output sets to have different cardinality. Consider the following case:
SELECT x, x, y FROM xyz UNION SELECT a, b, c FROM abc
TranslateColSet(ColSet{x, y}, {x, x, y}, {a, b, c}) returns ColSet{a, b, c}.
Conversely, TranslateColSet(ColSet{a, b, c}, {a, b, c}, {x, x, y}) returns ColSet{x, y}.
func TranslateColSetStrict ¶
TranslateColSetStrict is a version of TranslateColSet which requires that all columns in the input set appear in the from list.
func (ColSet) Difference ¶
Difference returns the elements of s that are not in rhs as a new set.
func (*ColSet) DifferenceWith ¶
DifferenceWith removes any elements in rhs from this set.
func (ColSet) Intersection ¶
Intersection returns the intersection of s and rhs as a new set.
func (*ColSet) IntersectionWith ¶
IntersectionWith removes any columns not in rhs from this set.
func (ColSet) Intersects ¶
Intersects returns true if s has any elements in common with rhs.
func (ColSet) Next ¶
Next returns the first value in the set which is >= startVal. If there is no such column, the second return value is false.
func (*ColSet) Remove ¶
Remove removes a column from the set. No-op if the column is not in the set.
func (ColSet) SingleColumn ¶
SingleColumn returns the single column in s. Panics if s does not contain exactly one column.
func (ColSet) String ¶
String returns a list representation of elements. Sequential runs of positive numbers are shown as ranges. For example, for the set {1, 2, 3, 5, 6, 10}, the output is "(1-3,5,6,10)".
type ColumnID ¶
type ColumnID int32
ColumnID uniquely identifies the usage of a column within the scope of a query. ColumnID 0 is reserved to mean "unknown column". See the comment for Metadata for more details.
type ColumnMeta ¶
type ColumnMeta struct {
// MetaID is the identifier for this column that is unique within the query
// metadata.
MetaID ColumnID
// Alias is the best-effort name of this column. Since the same column in a
// query can have multiple names (using aliasing), one of those is chosen to
// be used for pretty-printing and debugging. This might be different than
// what is stored in the physical properties and is presented to end users.
Alias string
// Type is the scalar SQL type of this column.
Type *types.T
// Table is the base table to which this column belongs.
// If the column was synthesized (i.e. no base table), then it is 0.
Table TableID
}
ColumnMeta stores information about one of the columns stored in the metadata.
type Expr ¶
type Expr interface {
// Op returns the operator type of the expression.
Op() Operator
// ChildCount returns the number of children of the expression.
ChildCount() int
// Child returns the nth child of the expression.
Child(nth int) Expr
// Private returns operator-specific data. Callers are expected to know the
// type and format of the data, which will differ from operator to operator.
// For example, an operator may choose to return one of its fields, or perhaps
// a pointer to itself, or nil if there is nothing useful to return.
Private() interface{}
// String returns a human-readable string representation for the expression
// that can be used for debugging and testing.
String() string
}
Expr is a node in an expression tree. It offers methods to traverse and inspect the tree. Each node in the tree has an enumerated operator type, zero or more children, and an optional private value. The entire tree can be easily visited using a pattern like this:
var visit func(e Expr)
visit = func(e Expr) {
	for i, n := 0, e.ChildCount(); i < n; i++ {
		visit(e.Child(i))
	}
}
type MDDepName ¶
type MDDepName struct {
// contains filtered or unexported fields
}
MDDepName stores either the unresolved DataSourceName or the StableID from the query that was used to resolve a data source.
func DepByName ¶
func DepByName(name *cat.DataSourceName) MDDepName
DepByName is used with AddDependency when the data source was looked up using a data source name.
type Metadata ¶
type Metadata struct {
// contains filtered or unexported fields
}
Metadata assigns unique ids to the columns, tables, and other metadata used for global identification within the scope of a particular query. These ids tend to be small integers that can be efficiently stored and manipulated.
Within a query, every unique column and every projection should be assigned a unique column id. Additionally, every separate reference to a table in the query should get a new set of output column ids.
For example, consider the query:
SELECT x FROM a WHERE y > 0
There are 2 columns in the above query: x and y. During name resolution, the above query becomes:
SELECT [0] FROM a WHERE [1] > 0
-- [0] -> x
-- [1] -> y
An operator is allowed to reuse some or all of the column ids of an input if:
- For every output row, there exists at least one input row having identical values for those columns.
- OR if no such input row exists, there is at least one output row having NULL values for all those columns (e.g. when outer join NULL-extends).
For example, it is safe for a Select to use its input's column ids because it only filters rows. Likewise, pass-through column ids of a Project can be reused.
For an example where columns cannot be reused, consider the query:
SELECT * FROM a AS l JOIN a AS r ON (l.x = r.y)
In this query, `l.x` is not equivalent to `r.x` and `l.y` is not equivalent to `r.y`. Therefore, we need to give these columns different ids.
func (*Metadata) AddColumn ¶
AddColumn assigns a new unique id to a column within the query and records its alias and type. If the alias is empty, a "column<ID>" alias is created.
func (*Metadata) AddDependency ¶
AddDependency tracks one of the catalog data sources on which the query depends, as well as the privilege required to access that data source. If the Memo using this metadata is cached, then a call to CheckDependencies can detect if the name resolves to a different data source now, or if changes to schema or permissions on the data source have invalidated the cached metadata.
func (*Metadata) AddSequence ¶
func (md *Metadata) AddSequence(seq cat.Sequence) SequenceID
AddSequence adds the sequence to the metadata, returning a SequenceID that can be used to retrieve it.
func (*Metadata) AddTable ¶
AddTable indexes a new reference to a table within the query. Separate references to the same table are assigned different table ids (e.g. in a self-join query). All columns are added to the metadata. If mutation columns are present, they are added after active columns.
The ExplicitCatalog/ExplicitSchema fields of the table's alias are honored so that its original formatting is preserved for error messages, pretty-printing, etc.
func (*Metadata) AddUserDefinedType ¶
AddUserDefinedType adds a user defined type to the metadata for this query.
func (*Metadata) AddWithBinding ¶
AddWithBinding associates a WithID to its bound expression.
func (*Metadata) AllDataSourceNames ¶
func (md *Metadata) AllDataSourceNames(
	fullyQualifiedName func(ds cat.DataSource) (cat.DataSourceName, error),
) (tables, sequences, views []tree.TableName, _ error)
AllDataSourceNames returns the fully qualified names of all datasources referenced by the metadata.
func (*Metadata) AllTables ¶
AllTables returns the metadata for all tables. The result must not be modified.
func (*Metadata) AllUserDefinedTypes ¶
AllUserDefinedTypes returns all user defined types contained in this query.
func (*Metadata) AllViews ¶
AllViews returns the metadata for all views. The result must not be modified.
func (*Metadata) CheckDependencies ¶
func (md *Metadata) CheckDependencies(
	ctx context.Context, catalog cat.Catalog,
) (upToDate bool, err error)
CheckDependencies resolves (again) each data source on which this metadata depends, in order to check that all data source names resolve to the same objects, and that the user still has sufficient privileges to access the objects. If the dependencies are no longer up-to-date, then CheckDependencies returns false.
This function cannot swallow errors and return only a boolean, as it may perform KV operations on behalf of the transaction associated with the provided catalog, and those errors are required to be propagated.
func (*Metadata) ColumnMeta ¶
func (md *Metadata) ColumnMeta(colID ColumnID) *ColumnMeta
ColumnMeta looks up the metadata for the column associated with the given column id. The same column can be added multiple times to the query metadata and associated with multiple column ids.
func (*Metadata) CopyFrom ¶
CopyFrom initializes the metadata with a copy of the provided metadata. This metadata can then be modified independent of the copied metadata.
Table annotations are not transferred over; all annotations are unset on the copy.
copyScalarFn must be a function that returns a copy of the given scalar expression.
func (*Metadata) DuplicateTable ¶
func (md *Metadata) DuplicateTable(
	tabID TableID, remapColumnIDs func(e ScalarExpr, colMap ColMap) ScalarExpr,
) TableID
DuplicateTable creates a new reference to the table with the given ID. All columns are added to the metadata with new column IDs. If mutation columns are present, they are added after active columns. The ID of the new table reference is returned. This function panics if a table with the given ID does not exist in the metadata.
remapColumnIDs must be a function that remaps the column IDs within a ScalarExpr to new column IDs. It takes as arguments a ScalarExpr and a mapping of old column IDs to new column IDs, and returns a new ScalarExpr. This function is used when duplicating Constraints, ComputedCols, and partialIndexPredicates. DuplicateTable requires this callback function, rather than performing the remapping itself, because remapping column IDs requires constructing new expressions with norm.Factory. The norm package depends on opt, and cannot be imported here.
The ExplicitCatalog/ExplicitSchema fields of the table's alias are honored so that its original formatting is preserved for error messages, pretty-printing, etc.
func (*Metadata) NextUniqueID ¶
NextUniqueID returns a fresh UniqueID which is guaranteed to never have been previously allocated in this memo.
func (*Metadata) NumColumns ¶
NumColumns returns the count of columns tracked by this Metadata instance.
func (*Metadata) QualifiedAlias ¶
QualifiedAlias returns the column alias, possibly qualified with the table, schema, or database name:
1. If fullyQualify is true, then the returned alias is prefixed by the original, fully qualified name of the table: tab.Name().FQString().

2. If there's another column in the metadata with the same column alias but a different table name, then the returned alias is prefixed with the table name: "tabName.columnAlias".
func (*Metadata) Schema ¶
Schema looks up the metadata for the schema associated with the given schema id.
func (*Metadata) Sequence ¶
func (md *Metadata) Sequence(seqID SequenceID) cat.Sequence
Sequence looks up the catalog sequence associated with the given metadata id. The same sequence can be associated with multiple metadata ids.
func (*Metadata) SetTableAnnotation ¶
func (md *Metadata) SetTableAnnotation(tabID TableID, tabAnnID TableAnnID, ann interface{})
SetTableAnnotation associates the given annotation with the given table. The annotation is associated by the given ID, which was allocated by calling NewTableAnnID. If an annotation with the ID already exists on the table, then it is overwritten.
See the TableAnnID comment for more details and a usage example.
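A minimal sketch of the caching pattern annotations are designed for; myTableInfo and its placeholder computation are hypothetical:

package example

import "github.com/cockroachdb/cockroach/pkg/sql/opt"

// myTableInfo is a hypothetical stand-in for data derived from a table.
type myTableInfo struct{ rowCountEstimate float64 }

// myAnnID is allocated once, at init time, and identifies this component's
// annotation slot on every table.
var myAnnID = opt.NewTableAnnID()

// cachedInfo computes myTableInfo at most once per table per metadata.
func cachedInfo(md *opt.Metadata, tabID opt.TableID) *myTableInfo {
    if ann := md.TableAnnotation(tabID, myAnnID); ann != nil {
        return ann.(*myTableInfo)
    }
    info := &myTableInfo{rowCountEstimate: 1000} // placeholder computation
    md.SetTableAnnotation(tabID, myAnnID, info)
    return info
}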
func (*Metadata) Table ¶
Table looks up the catalog table associated with the given metadata id. The same table can be associated with multiple metadata ids.
func (*Metadata) TableAnnotation ¶
func (md *Metadata) TableAnnotation(tabID TableID, annID TableAnnID) interface{}
TableAnnotation returns the given annotation that is associated with the given table. If the table has no such annotation, TableAnnotation returns nil.
func (*Metadata) TableMeta ¶
TableMeta looks up the metadata for the table associated with the given table id. The same table can be added multiple times to the query metadata and associated with multiple table ids.
func (*Metadata) UpdateTableMeta ¶
UpdateTableMeta allows the caller to replace the cat.Table struct that a TableMeta instance stores.
func (*Metadata) WithBinding ¶
WithBinding returns the bound expression for the given WithID. Panics with an assertion error if there is none.
type MutableExpr ¶
type MutableExpr interface {
// SetChild updates the nth child of the expression to instead be the given
// child expression.
SetChild(nth int, child Expr)
}
MutableExpr is implemented by expressions that allow their children to be updated.
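For instance, a rewrite pass might use the interface roughly like this sketch (setNthChild is illustrative, not part of the package):

package example

import "github.com/cockroachdb/cockroach/pkg/sql/opt"

// setNthChild replaces the nth child of parent in place when the expression
// supports mutation, reporting whether the replacement happened.
func setNthChild(parent, child opt.Expr, nth int) bool {
    if m, ok := parent.(opt.MutableExpr); ok {
        m.SetChild(nth, child)
        return true
    }
    return false
}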
type OpaqueMetadata ¶
type OpaqueMetadata interface {
ImplementsOpaqueMetadata()
// String is a short description used when printing optimizer trees and when
// forming error messages; it should be the SQL statement tag.
String() string
// Columns returns the columns that are produced by this operator.
Columns() colinfo.ResultColumns
}
OpaqueMetadata is an object stored in OpaqueRelExpr and passed through to the exec factory.
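A minimal sketch of an implementation, for a hypothetical statement that is planned outside the optimizer:

package example

import "github.com/cockroachdb/cockroach/pkg/sql/catalog/colinfo"

// fooMetadata is a hypothetical OpaqueMetadata implementation carried by an
// OpaqueRelExpr through to the exec factory.
type fooMetadata struct {
    cols colinfo.ResultColumns
}

func (m *fooMetadata) ImplementsOpaqueMetadata() {}

// String returns the SQL statement tag, used in optimizer trees and errors.
func (m *fooMetadata) String() string { return "SHOW FOO" }

// Columns reports the result columns produced by the opaque operator.
func (m *fooMetadata) Columns() colinfo.ResultColumns { return m.cols }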
type Operator ¶
type Operator uint16
Operator describes the type of operation that a memo expression performs. Some operators are relational (join, select, project) and others are scalar (and, or, plus, variable).
const (
    UnknownOp Operator = iota

    // AggDistinct is used as a modifier that wraps an aggregate function. It
    // causes the respective aggregation to only process each distinct value
    // once.
    AggDistinctOp

    // AggFilter is used as a modifier that wraps an aggregate function (or an
    // AggDistinct operator that wraps an aggregate function). It causes only
    // rows for which the filter expression is true to be processed. AggFilter
    // should always occur on top of AggDistinct if they are both present.
    AggFilterOp

    // Aggregations is a set of AggregationsItem expressions that specify the
    // ColumnIDs and aggregation expression for output columns projected by a
    // containing grouping operator (GroupBy, ScalarGroupBy, or DistinctOn). It
    // is legal for the set to be empty. See the AggregationsItem header for
    // more details.
    AggregationsOp

    // AggregationsItem encapsulates the information for constructing an
    // aggregate output column, including its ColumnID and the aggregate
    // expression that produces its value. In addition, the AggregationsItem
    // caches a set of scalar properties that are lazily calculated by
    // traversing the Agg scalar expression. This allows the properties for the
    // aggregate expression to be calculated once and then repeatedly reused.
    //
    // The aggregate expression can only consist of aggregate functions,
    // variable references, and modifiers like AggDistinct. Examples of valid
    // expressions:
    //
    //   (Min (Variable 1))
    //   (Count (AggDistinct (Variable 1)))
    //
    // More complex arguments must be formulated using a Project operator as
    // input to the grouping operator.
    AggregationsItemOp

    // AlterRangeRelocate represents an `ALTER RANGE .. RELOCATE ..` statement.
    AlterRangeRelocateOp

    AlterRangeRelocatePrivateOp

    // AlterTableRelocate represents an `ALTER TABLE/INDEX .. RELOCATE ..`
    // statement.
    AlterTableRelocateOp

    AlterTableRelocatePrivateOp

    // AlterTableSplit represents an `ALTER TABLE/INDEX .. SPLIT AT ..`
    // statement.
    AlterTableSplitOp

    AlterTableSplitPrivateOp

    // AlterTableUnsplit represents an `ALTER TABLE/INDEX .. UNSPLIT AT ..`
    // statement.
    AlterTableUnsplitOp

    // AlterTableUnsplitAll represents an `ALTER TABLE/INDEX .. UNSPLIT ALL`
    // statement.
    AlterTableUnsplitAllOp

    // And is the boolean conjunction operator that evaluates to true only if
    // both of its conditions evaluate to true.
    AndOp

    AntiJoinOp
    AntiJoinApplyOp

    // Any is a SQL operator that applies a comparison to every row of an input
    // subquery and returns true if any of the comparisons are true, else
    // returns null if any of the comparisons are null, else returns false. The
    // following transformations map from various SQL operators into the Any
    // operator:
    //
    //   <scalar> IN (<subquery>)
    //   ==> (Any <subquery> <scalar> EqOp)
    //
    //   <scalar> NOT IN (<subquery>)
    //   ==> (Not (Any <subquery> <scalar> EqOp))
    //
    //   <scalar> <cmp> {SOME|ANY}(<subquery>)
    //   ==> (Any <subquery> <scalar> <cmp>)
    //
    //   <scalar> <cmp> ALL(<subquery>)
    //   ==> (Not (Any <subquery> <scalar> <negated-cmp>))
    //
    // Any expects the input subquery to return a single column of any data
    // type. The scalar value is compared with that column using the specified
    // comparison operator.
    AnyOp

    // AnyNotNullAgg returns any non-NULL value it receives, with no other
    // guarantees. If it does not receive any values, it returns NULL.
    //
    // AnyNotNullAgg is not part of SQL, but it's used internally to rewrite
    // correlated subqueries into an efficient and convenient form.
    AnyNotNullAggOp

    // AnyScalar is the form of ANY which refers to an ANY operation on a
    // tuple or array, as opposed to Any which operates on a subquery.
    AnyScalarOp

    // Array is an ARRAY literal of the form
    // ARRAY[<expr1>, <expr2>, ..., <exprN>].
    ArrayOp

    ArrayAggOp

    // ArrayFlatten is an ARRAY(<subquery>) expression. ArrayFlatten takes as
    // input a subquery which returns a single column and constructs a scalar
    // array as the output. Any NULLs are included in the results, and if the
    // subquery has an ORDER BY clause that ordering will be respected by the
    // resulting array.
    ArrayFlattenOp

    // AssignmentCast is similar to CastExpr, but is performed in the context
    // of an INSERT, UPDATE, or UPSERT to match the type of a mutation value to
    // the type of the target column. An expression separate from CastExpr is
    // required because it behaves slightly differently than an explicit cast.
    // For example, while an explicit cast will truncate a value to fit the
    // width of a type, an assignment cast will error instead if the value does
    // not fit the type. See tree.CastContext for more details.
    //
    // An assignment cast is represented as a distinct expression within the
    // optimizer, but is built into a crdb_internal.assignment_cast function
    // call in execbuilder.
    AssignmentCastOp

    AvgOp

    // BBoxCovers is the ~ operator when used with geometry or bounding box
    // operands. It maps to tree.RegMatch.
    BBoxCoversOp

    // BBoxIntersects is the && operator when used with geometry or bounding
    // box operands. It maps to tree.Overlaps.
    BBoxIntersectsOp

    BitAndAggOp
    BitOrAggOp
    BitandOp
    BitorOp
    BitxorOp
    BoolAndOp
    BoolOrOp
    CancelPrivateOp

    // CancelQueries represents a `CANCEL QUERIES` statement.
    CancelQueriesOp

    // CancelSessions represents a `CANCEL SESSIONS` statement.
    CancelSessionsOp

    // Case is a CASE statement of the form:
    //
    //   CASE [ <Input> ]
    //     WHEN <condval1> THEN <expr1>
    //   [ WHEN <condval2> THEN <expr2> ] ...
    //   [ ELSE <expr> ]
    //   END
    //
    // The Case operator evaluates <Input> (if not provided, Input is set to
    // True), then picks the WHEN branch where <condval> is equal to <Input>,
    // then evaluates and returns the corresponding THEN expression. If no WHEN
    // branch matches, the ELSE expression is evaluated and returned, if any.
    // Otherwise, NULL is returned.
    //
    // Note that the Whens list inside Case is used to represent all the WHEN
    // branches. It is of the form:
    //
    //   [(When <condval1> <expr1>),(When <condval2> <expr2>),...]
    CaseOp

    // Cast converts the input expression into an expression of the target
    // type. Note that the conversion may cause truncation based on the target
    // types' width, such as in this example:
    //
    //   'hello'::VARCHAR(2)
    //
    // That expression has the effect of truncating the string to just 'he',
    // since the target data type allows a maximum of two characters. This is
    // one example of a "lossy" cast.
    CastOp

    CoalesceOp

    // Collate is an expression of the form
    //
    //   x COLLATE y
    //
    // where x is a "string type" (meaning either a normal string or a collated
    // string), and y is a locale. It evaluates to the string collated to the
    // given locale.
    CollateOp

    // ColumnAccess is a scalar expression that returns a column from the given
    // input expression (which is assumed to be of type Tuple). Idx is the
    // ordinal index of the column in Input.
    ColumnAccessOp

    ConcatOp
    ConcatAggOp

    // Const is a typed scalar constant value. The Value field is a tree.Datum
    // value having any datum type that's legal in the expression's context. Do
    // NOT call ConstructConst directly; use ConstructConstVal instead.
    ConstOp

    // ConstAgg is used in the special case when the value of a column is known
    // to be constant within a grouping set; it returns that value. If there
    // are no rows in the grouping set, then ConstAgg returns NULL.
    //
    // ConstAgg is not part of SQL, but it's used internally to rewrite
    // correlated subqueries into an efficient and convenient form.
    ConstAggOp

    // ConstNotNullAgg is used in the special case when the value of a column
    // is known to be constant within a grouping set, except on some rows where
    // it can have a NULL value; it returns the non-NULL constant value. If
    // there are no rows in the grouping set, or all rows have a NULL value,
    // then ConstNotNullAgg returns NULL.
    //
    // ConstNotNullAgg is not part of SQL, but it's used internally to rewrite
    // correlated subqueries into an efficient and convenient form.
    ConstNotNullAggOp

    ContainedByOp
    ContainsOp

    // ControlJobs represents a `PAUSE/CANCEL/RESUME JOBS` statement.
    ControlJobsOp

    ControlJobsPrivateOp

    // ControlSchedules represents a `PAUSE/CANCEL/RESUME SCHEDULES` statement.
    ControlSchedulesOp

    ControlSchedulesPrivateOp
    CorrOp
    CountOp
    CountRowsOp
    CovarPopOp
    CovarSampOp

    // CreateStatistics represents a CREATE STATISTICS or ANALYZE statement.
    CreateStatisticsOp

    CreateStatisticsPrivateOp

    // CreateTable represents a CREATE TABLE statement.
    CreateTableOp

    CreateTablePrivateOp
    CreateViewOp
    CreateViewPrivateOp

    // CumeDist is the relative rank of the current row:
    //   (number of rows preceding or peer with current row) / (total rows)
    CumeDistOp

    // Delete is an operator used to delete all rows that are selected by a
    // relational input expression:
    //
    //   DELETE FROM abc WHERE a>0 ORDER BY b LIMIT 10
    DeleteOp

    // DenseRank is like Rank, but without gaps. Instead of 1, 1, 3, it gives
    // 1, 1, 2.
    DenseRankOp

    // DistinctOn filters out rows that are identical on the set of grouping
    // columns; only the first row (according to an ordering) is kept for each
    // set of possible values. It is roughly equivalent to a GroupBy on the
    // same grouping columns except that it uses FirstAgg functions that ensure
    // the value on the first row is chosen (across all aggregations).
    //
    // In addition, the value on that first row must be chosen for all the
    // grouping columns as well; this is relevant in the case of equal but
    // non-identical values, like decimals. For example, if we have rows
    // (1, 2.0) and (1.0, 2) and we are grouping on these two columns, the
    // values output can be either (1, 2.0) or (1.0, 2), but not (1.0, 2.0).
    //
    // The execution of DistinctOn resembles that of Select more than that of
    // GroupBy: each row is tested against a map of what groups we have seen
    // already, and is either passed through or discarded. In particular, note
    // that this preserves the input ordering.
    //
    // The ordering in the GroupingPrivate field will be required of the input;
    // it determines which row can get "chosen" for each group of values on the
    // grouping columns. There is no restriction on the ordering; but note that
    // grouping columns are inconsequential - they can appear anywhere in the
    // ordering and they won't change the results (other than the result
    // ordering).
    //
    // Currently when we build DistinctOn, we set all grouping columns as
    // optional cols in Ordering (but this is not required by the operator).
    //
    // TODO(radu): in the future we may want an exploration transform to try
    // out more specific interesting orderings because execution is more
    // efficient when we can rely on an ordering on the grouping columns (or a
    // subset of them).
    //
    // DistinctOn uses an Aggregations child and the GroupingPrivate struct so
    // that it's polymorphic with GroupBy and can be used in the same rules
    // (when appropriate). In the DistinctOn case, the aggregations can be only
    // FirstAgg or ConstAgg.
    DistinctOnOp

    // Distribute enforces the physical distribution of rows returned by its
    // input expression. Currently, it is only used to re-distribute data
    // across different sets of regions in a multi-region cluster. For example,
    // if rows are spread across multiple regions, a Distribute enforcer can be
    // used to route the rows to the gateway region. See the Distribution field
    // in the PhysicalProps struct.
    //
    // TODO(rytaft): We should probably include the input distribution here so
    // we can accurately cost the Distribute operator. This will likely require
    // calculating "interesting distributions", similar to "interesting
    // orderings".
    DistributeOp

    DivOp

    // EnsureDistinctOn is a variation on DistinctOn that is only used to
    // replace a Max1Row operator in a decorrelation attempt. It raises an
    // error if any distinct grouping contains more than one row. Or in other
    // words, it "ensures" that the input is distinct on the grouping columns.
    //
    // EnsureDistinctOn is used when nulls are not considered distinct for
    // grouping purposes and an error should be raised when duplicates are
    // detected.
    //
    // Rules should only "push through" or eliminate an EnsureDistinctOn if
    // they preserve the expected error behavior. For example, it would be
    // invalid to push a Select filter into an EnsureDistinctOn, as it might
    // eliminate rows that would otherwise trigger the EnsureDistinctOn error.
    EnsureDistinctOnOp

    // EnsureUpsertDistinctOn is a variation on UpsertDistinctOn that is only
    // used with UPSERT and INSERT..ON CONFLICT statements. Like
    // UpsertDistinctOn, EnsureUpsertDistinctOn treats NULL values as not equal
    // to one another for purposes of grouping. Unlike UpsertDistinctOn, it
    // raises an error if any distinct grouping contains more than one row. Or
    // in other words, it "ensures" that the input is distinct on the grouping
    // columns.
    //
    // EnsureUpsertDistinctOn is used when nulls are considered distinct for
    // grouping purposes and an error should be raised when duplicates are
    // detected.
    //
    // Rules should only "push through" or eliminate an EnsureUpsertDistinctOn
    // if they preserve the expected error behavior. For example, it would be
    // invalid to push a Select filter into an EnsureUpsertDistinctOn, as it
    // might eliminate rows that would otherwise trigger the
    // EnsureUpsertDistinctOn error.
    EnsureUpsertDistinctOnOp

    EqOp

    // Except is an operator used to perform a set difference between the Left
    // and Right input relations. The result consists only of rows in the Left
    // relation that are not present in the Right relation. Duplicate rows are
    // discarded.
    //
    // The SetPrivate field matches columns from the Left and Right inputs of
    // the Except with the output columns. See the comment above SetPrivate for
    // more details.
    ExceptOp

    // ExceptAll is an operator used to perform a set difference between the
    // Left and Right input relations. The result consists only of rows in the
    // Left relation that do not have a corresponding row in the Right
    // relation. Duplicate rows are not discarded. This effectively creates a
    // one-to-one mapping between the Left and Right rows. For example:
    //
    //   SELECT x FROM xx EXCEPT ALL SELECT y FROM yy
    //     x     y        out
    //   ----- -----     -----
    //     1     1    ->   1
    //     1     1         4
    //     1     2
    //     2     2
    //     2     3
    //     4
    //
    // The SetPrivate field matches columns from the Left and Right inputs of
    // the ExceptAll with the output columns. See the comment above SetPrivate
    // for more details.
    ExceptAllOp

    // Exists takes a relational query as its input, and evaluates to true if
    // the query returns at least one row.
    ExistsOp

    // Explain returns information about the execution plan of the "input"
    // expression.
    ExplainOp

    ExplainPrivateOp

    // Export represents an `EXPORT` statement.
    ExportOp

    ExportPrivateOp

    // FKChecks is a list of foreign key check queries, to be run after the
    // main query.
    FKChecksOp

    // FKChecksItem is a foreign key check query, to be run after the main
    // query. An execution error will be generated if the query returns any
    // results.
    FKChecksItemOp

    FKChecksItemPrivateOp

    // FakeRel is a mock relational operator used for testing and as a dummy
    // binding relation for building cascades; its logical properties are
    // pre-determined and stored in the private. It can be used as the child of
    // an operator for which we are calculating properties or statistics.
    FakeRelOp

    FakeRelPrivateOp

    // False is the boolean false value that is equivalent to the
    // tree.DBoolFalse datum value. It is a separate operator to make matching
    // and replacement simpler and more efficient, as patterns can contain
    // (False) expressions.
    FalseOp

    FetchTextOp
    FetchTextPathOp
    FetchValOp
    FetchValPathOp

    // Filters is a set of FiltersItem expressions that specify a set of
    // conjuncts that filter rows selected by a containing Select or Join
    // operator. A row is filtered only if all conditions evaluate to true. If
    // the set is empty, then it never filters rows. See the Select and
    // FiltersItem headers for more details.
    FiltersOp

    // FiltersItem contains a filter condition that's evaluated to determine
    // whether Select or Join rows should be filtered. In addition, the
    // FiltersItem caches a set of scalar properties that are lazily calculated
    // by traversing the Condition scalar expression. This allows the
    // properties for the entire expression subtree to be calculated once and
    // then repeatedly reused.
    FiltersItemOp

    // FirstAgg is used only by DistinctOn; it returns the value on the first
    // row according to an ordering; if the ordering is unspecified (or
    // partially specified), it is an arbitrary ordering but it must be the
    // same across all FirstAggs in a DistinctOn.
    FirstAggOp

    // FirstValue returns Value evaluated at the first row in the row's frame.
    //
    // TODO(justin): can this be unified with FirstAgg?
    FirstValueOp

    FloorDivOp
    FullJoinOp

    // Function invokes a builtin SQL function like CONCAT or NOW, passing the
    // given arguments. The FunctionPrivate field contains the name of the
    // function as well as pointers to its type and properties.
    FunctionOp

    FunctionPrivateOp
    GeOp

    // GroupBy computes aggregate functions over groups of input rows. Input
    // rows that are equal on the grouping columns are grouped together. The
    // set of computed aggregate functions is described by the Aggregations
    // field (which is always an Aggregations operator).
    //
    // The arguments of the aggregate functions are columns from the input
    // (i.e. Variables), possibly wrapped in aggregate modifiers like
    // AggDistinct.
    //
    // If the set of input rows is empty, then the output of the GroupBy
    // operator will also be empty. If the grouping columns are empty, then all
    // input rows form a single group. GroupBy is used for queries with
    // aggregate functions, HAVING clauses and/or GROUP BY expressions.
    //
    // The GroupingPrivate field contains an ordering; this ordering serves a
    // dual-purpose:
    //   - if we ignore any grouping columns, the remaining columns indicate an
    //     intra-group ordering; this is useful if there is an order-dependent
    //     aggregation (like ARRAY_AGG).
    //   - any prefix containing only grouping columns is used to execute the
    //     aggregation in a streaming fashion.
    //
    // Currently, the initially built GroupBy has all grouping columns as
    // "optional" in the ordering (we call this the "canonical" variant).
    // Subsequently, the GenerateStreamingGroupBy exploration rule can add more
    // variants, based on interesting orderings.
    GroupByOp

    // GroupingPrivate is shared between the grouping-related operators:
    // GroupBy, ScalarGroupBy, DistinctOn, EnsureDistinctOn, UpsertDistinctOn,
    // and EnsureUpsertDistinctOn. This allows the operators to be treated
    // polymorphically.
    GroupingPrivateOp

    GtOp
    ILikeOp

    // IfErr is roughly a runtime try-catch operator. It has different
    // semantics depending on which of its fields are set.
    //
    // If ErrCode is set, only errors which match the given error code will be
    // caught. If ErrCode is not set, all errors will be caught.
    //
    // If OrElse is not set, IfErr evaluates to true or false indicating
    // whether an error was caught. If OrElse is set, IfErr evaluates to Cond
    // if no error was caught and to OrElse if an error was caught.
    //
    // TODO(justin): The implementation here is a hack: ErrCode and OrElse are
    // optional, so we repurpose lists as an optional field (since it's not
    // valid to use nil). If this comes up again, we might want to consider
    // adding an explicit Option type.
    IfErrOp

    InOp

    // IndexJoin represents an inner join between an input expression and a
    // primary index. It is a special case of LookupJoin where the input
    // columns are the PK columns of the table we are looking up into, and
    // every input row results in exactly one output row.
    //
    // IndexJoin operators are created from Scan operators (unlike lookup joins
    // which are created from Join operators).
    IndexJoinOp

    IndexJoinPrivateOp

    // Indirection is a subscripting expression of the form <expr>[<index>].
    // Input must be an Array type and Index must be an int. Multiple
    // indirections and slicing are not supported.
    IndirectionOp

    // InnerJoin creates a result set that combines columns from its left and
    // right inputs, based upon its "on" join predicate. Rows which do not
    // match the predicate are filtered. While expressions in the predicate can
    // refer to columns projected by either the left or right inputs, the
    // inputs are not allowed to refer to the other's projected columns.
    InnerJoinOp

    // InnerJoinApply has the same join semantics as InnerJoin. However, unlike
    // InnerJoin, it allows the right input to refer to columns projected by
    // the left input.
    InnerJoinApplyOp

    // Insert evaluates a relational input expression, and inserts values from
    // it into a target table. The input may be an arbitrarily complex
    // expression:
    //
    //   INSERT INTO ab SELECT x, y+1 FROM xy ORDER BY y
    //
    // It can also be a simple VALUES clause:
    //
    //   INSERT INTO ab VALUES (1, 2)
    //
    // It may also return rows, which can be further composed:
    //
    //   SELECT a + b FROM [INSERT INTO ab VALUES (1, 2) RETURNING a, b]
    //
    // The Insert operator is capable of inserting values into computed columns
    // and mutation columns, which are not writable (or even visible in the
    // case of mutation columns) by SQL users.
    InsertOp

    // Intersect is an operator used to perform an intersection between the
    // Left and Right input relations. The result consists only of rows in the
    // Left relation that are also present in the Right relation. Duplicate
    // rows are discarded.
    //
    // The SetPrivate field matches columns from the Left and Right inputs of
    // the Intersect with the output columns. See the comment above SetPrivate
    // for more details.
    //
    // Note that Intersect is symmetric in most cases, but there are
    // exceptions: some types allow values that are equal but not identical
    // (e.g. collated strings) in which case it could be visible which side a
    // row is coming from.
    IntersectOp

    // IntersectAll is an operator used to perform an intersection between the
    // Left and Right input relations. The result consists only of rows in the
    // Left relation that have a corresponding row in the Right relation.
    // Duplicate rows are not discarded. This effectively creates a one-to-one
    // mapping between the Left and Right rows. For example:
    //
    //   SELECT x FROM xx INTERSECT ALL SELECT y FROM yy
    //     x     y        out
    //   ----- -----     -----
    //     1     1         1
    //     1     1    ->   1
    //     1     2         2
    //     2     2         2
    //     2     3
    //     4
    //
    // The SetPrivate field matches columns from the Left and Right inputs of
    // the IntersectAll with the output columns. See the comment above
    // SetPrivate for more details.
    //
    // Note that IntersectAll is symmetric in most cases, but there are
    // exceptions: some types allow values that are equal but not identical
    // (e.g. collated strings) in which case it could be visible which side a
    // row is coming from.
    IntersectAllOp

    // InvertedFilter filters rows from its input result set, based on the
    // InvertedExpression predicate (which is defined in
    // InvertedFilterPrivate). Rows which do not match the filter are
    // discarded. The input should be a constrained scan of an inverted index,
    // possibly wrapped in other operators such as Select.
    InvertedFilterOp

    InvertedFilterPrivateOp

    // InvertedJoin represents a join between an input expression and an
    // inverted index. The type of join is in the InvertedJoinPrivate field.
    InvertedJoinOp

    InvertedJoinPrivateOp

    // Is maps to the IS NOT DISTINCT FROM operator which is equivalent to IS
    // for non-tuples. See IsTupleNull for the tuple-specific IS NULL operator.
    IsOp

    // IsNot is the inverse of Is. It maps to the IS DISTINCT FROM operator
    // which is equivalent to IS NOT for non-tuples. See IsTupleNotNull for the
    // tuple-specific IS NOT NULL operator.
    IsNotOp

    // IsTupleNotNull is the boolean expression with a tuple input that
    // evaluates to true if the tuple is not null and all elements in the tuple
    // are not null.
    IsTupleNotNullOp

    // IsTupleNull is the boolean expression with a tuple input that evaluates
    // to true if the tuple is null or all elements in the tuple are null.
    IsTupleNullOp

    // JoinPrivate is shared between the various join operators including apply
    // variants, but excluding IndexJoin, LookupJoin, MergeJoin.
    JoinPrivateOp

    JsonAggOp
    JsonAllExistsOp
    JsonExistsOp
    JsonObjectAggOp
    JsonSomeExistsOp
    JsonbAggOp
    JsonbObjectAggOp

    // KVOptions is a set of KVOptionItems that specify arbitrary keys and
    // values that are used as modifiers for various statements (see
    // tree.KVOptions). The key is a constant string but the value can be a
    // scalar expression.
    KVOptionsOp

    // KVOptionsItem is the key and value of an option (see tree.KVOption). For
    // keys that don't have values, the value is Null.
    KVOptionsItemOp

    LShiftOp

    // Lag returns Value evaluated at the row Offset rows before this one. If
    // no such row exists, returns Def.
    LagOp

    // LastValue returns Value evaluated at the last row in the row's frame.
    LastValueOp

    LeOp

    // Lead returns Value evaluated at the row Offset rows after this one. If
    // no such row exists, returns Def.
    LeadOp

    LeftJoinOp
    LeftJoinApplyOp
    LikeOp

    // Limit returns a limited subset of the results in the input relation. The
    // limit expression is a scalar value; the operator returns at most this
    // many rows. The Ordering field is a physical.OrderingChoice which
    // indicates the row ordering required from the input (the first rows with
    // respect to this ordering are returned).
    LimitOp

    // LocalityOptimizedSearch is similar to UnionAll, but it is designed to
    // avoid communicating with remote nodes (relative to the gateway region)
    // if at all possible. LocalityOptimizedSearch can be planned when a scan
    // is known to produce at most one row, but it is not known which region
    // contains that row (if any). In this case, the scan can be split in two,
    // and the resulting scans will become the children of the
    // LocalityOptimizedSearch operator. The left scan should only contain
    // spans targeting partitions on local nodes, and the right scan should
    // contain the remaining spans. The LocalityOptimizedSearch operator
    // ensures that the right child (containing remote spans) is only executed
    // if the left child (containing local spans) does not return any rows.
    //
    // This is a useful optimization if there is locality of access in the
    // workload, such that rows tend to be accessed from the region where they
    // are located. If there is no locality of access, using
    // LocalityOptimizedSearch could be a slight pessimization, since rows
    // residing in remote regions will be fetched slightly more slowly than
    // they would be otherwise.
    //
    // For example, suppose we have a multi-region database with regions
    // 'us-east1', 'us-west1' and 'europe-west1', and we have the following
    // table and query, issued from 'us-east1':
    //
    //   CREATE TABLE tab (
    //     k INT PRIMARY KEY,
    //     v INT
    //   ) LOCALITY REGIONAL BY ROW;
    //
    //   SELECT * FROM tab WHERE k = 10;
    //
    // Normally, this would produce the following plan:
    //
    //   scan tab
    //    └── constraint: /3/1
    //         ├── [/'europe-west1'/10 - /'europe-west1'/10]
    //         ├── [/'us-east1'/10 - /'us-east1'/10]
    //         └── [/'us-west1'/10 - /'us-west1'/10]
    //
    // but if the session setting locality_optimized_partitioned_index_scan is
    // enabled, the optimizer will produce this plan, using locality optimized
    // search:
    //
    //   locality-optimized-search
    //    ├── scan tab
    //    │    └── constraint: /9/7: [/'us-east1'/10 - /'us-east1'/10]
    //    └── scan tab
    //         └── constraint: /14/12
    //              ├── [/'europe-west1'/10 - /'europe-west1'/10]
    //              └── [/'us-west1'/10 - /'us-west1'/10]
    //
    // As long as k = 10 is located in 'us-east1', the second plan will be much
    // faster. But if k = 10 is located in one of the other regions, the first
    // plan would be slightly faster.
    LocalityOptimizedSearchOp

    // LookupJoin represents a join between an input expression and an index.
    // The type of join is in the LookupJoinPrivate field.
    LookupJoinOp

    LookupJoinPrivateOp
    LtOp
    MaxOp

    // Max1Row enforces that its input must return at most one row. If the
    // input has more than one row, Max1Row raises an error with the specified
    // error text.
    //
    // Max1Row is most often used as input to the Subquery operator. See the
    // comment above Subquery for more details.
    Max1RowOp

    // MemoCycleTestRel is a relational expression for testing that memo cycles
    // are detected by the optimizer and a stack overflow is prevented. A cycle
    // in the memo occurs when there is a path from a group member's children
    // back to the group member's group. MemoCycleTestRel is similar in
    // structure to the Select expression, but matches a rule,
    // MemoCycleTestRelRule, that creates a memo cycle.
    MemoCycleTestRelOp

    // MergeJoin represents a join that is executed using merge-join. MergeOn
    // is a scalar which contains the ON condition and merge-join ordering
    // information; see the MergeOn scalar operator. It can be any type of join
    // (identified in the MergeJoinPrivate field).
    MergeJoinOp

    MergeJoinPrivateOp
    MinOp
    MinusOp
    ModOp
    MultOp
    MutationPrivateOp
    NeOp

    // NormCycleTestRel is a relational operator for testing that normalization
    // rule cycles are detected by the Factory and a stack overflow is
    // prevented. Two rules for this expression, NormCycleTestRelTrueToFalse
    // and NormCycleTestRelFalseToTrue, create a normalization rule cycle. See
    // the cycle test file for tests that use this expression.
    NormCycleTestRelOp

    // Not is the boolean negation operator that evaluates to true if its input
    // evaluates to false.
    NotOp

    NotILikeOp
    NotInOp
    NotLikeOp
    NotRegIMatchOp
    NotRegMatchOp
    NotSimilarToOp

    // NthValue returns Value evaluated at the nth row in the row's frame.
    // Out-of-bounds references evaluate to NULL.
    NthValueOp

    // Ntile builds a histogram with the specified number of buckets and
    // evaluates to which bucket the row falls in.
    NtileOp

    // Null is the constant SQL null value that has "unknown value" semantics.
    // If the Typ field is not types.Unknown, then the value is known to be in
    // the domain of that type. This is important for preserving correct types
    // in replacement patterns. For example:
    //
    //   (Plus (Function ...) (Const 1))
    //
    // If the function in that expression has a static type of Int, but then it
    // gets constant folded to (Null), then its type must remain as Int. Any
    // other type violates logical equivalence of the expression, breaking type
    // inference and possibly changing the results of execution. The solution
    // is to tag the null with the correct type:
    //
    //   (Plus (Null (Int)) (Const 1))
    //
    // Null is its own operator rather than a Const datum in order to make
    // matching and replacement easier and more efficient, as patterns can
    // contain (Null) expressions.
    NullOp

    // Offset filters out the first Offset rows of the input relation; used in
    // conjunction with Limit.
    OffsetOp

    // OpaqueDDL is a variant of OpaqueRel for operators that cause a schema
    // change and cannot be executed following a mutation in the same
    // transaction.
    OpaqueDDLOp

    // OpaqueMutation is a variant of OpaqueRel for operators that can mutate
    // data as part of the transaction.
    OpaqueMutationOp

    // OpaqueRel is an opaque relational operator which is planned outside of
    // the optimizer. The operator contains an opaque metadata which is passed
    // to the exec factory.
    //
    // This is used for statements that are not directly supported by the
    // optimizer, and which don't use the result of other relational
    // expressions (in other words, they are a "leaf" operator).
    //
    // OpaqueRel can produce data and can be used as a data source as part of a
    // larger enclosing query.
    OpaqueRelOp

    OpaqueRelPrivateOp

    // Or is the boolean disjunction operator that evaluates to true if either
    // one of its conditions evaluates to true.
    OrOp

    // Ordinality adds a column to each row in its input containing a unique,
    // increasing number.
    OrdinalityOp

    OrdinalityPrivateOp
    OverlapsOp

    // PercentRank is (rank - 1) / (total rows - 1).
    PercentRankOp

    // PercentileCont returns a value such that N% of values are below it in a
    // given window, interpolating values if needed. Ignores nulls in the
    // input.
    PercentileContOp

    // PercentileDisc returns a value such that N% of values are below it in a
    // given window. Ignores nulls in the input.
    PercentileDiscOp

    PlaceholderOp

    // PlaceholderScan is a special variant of Scan. It scans exactly one span
    // of a non-inverted index, and the span has the same start and end key.
    // The values for the span key are child scalar operators which are either
    // constants or placeholders. This operator evaluates the placeholders at
    // execbuild time.
    //
    // PlaceholderScan cannot have a Constraint or InvertedConstraint.
    PlaceholderScanOp

    PlusOp
    PowOp

    // Project modifies the set of columns returned by the input result set.
    // Columns can be removed, reordered, or renamed. In addition, new columns
    // can be synthesized.
    //
    // Projections describes the synthesized columns constructed by Project,
    // and Passthrough describes the input columns that are passed through as
    // Project output columns.
    ProjectOp

    // ProjectSet represents a relational operator which zips through a list of
    // generators for every row of the input.
    //
    // As a reminder, a functional zip over generators a,b,c returns tuples of
    // values from a,b,c picked "simultaneously". NULLs are used when a
    // generator is "shorter" than another. For example:
    //
    //   zip([1,2,3], ['a','b']) = [(1,'a'), (2,'b'), (3, null)]
    //
    // ProjectSet corresponds to a relational operator project(R, a, b, c, ...)
    // which, for each row in R, produces all the rows produced by
    // zip(a, b, c, ...) with the values of R prefixed. Formally, this performs
    // a lateral cross join of R with zip(a,b,c).
    //
    // See the Zip header for more details.
    ProjectSetOp

    // Projections is a set of ProjectionsItem expressions that specify the
    // ColumnIDs and scalar expressions for the synthesized output columns
    // projected by a containing Project operator. It is legal for the set to
    // be empty. See the Project and ProjectionsItem headers for more details.
    ProjectionsOp

    // ProjectionsItem encapsulates the information needed to synthesize an
    // output column, including its ColumnID and the scalar expression that
    // produces its value. In addition, the ProjectionsItem caches a set of
    // scalar properties that are lazily calculated by traversing the Element
    // scalar expression. This allows the properties for the entire expression
    // subtree to be calculated once and then repeatedly reused.
    //
    // The Element scalar expression cannot contain a simple VariableOp with
    // the same ColumnID as the one stored in the Col field, since that would
    // make it a pass-through column. Pass-through columns are always stored on
    // the containing Project operator instead. However, the Element field can
    // contain a VariableOp when a new ColumnID is being assigned, such as in
    // the case of an outer column reference.
    ProjectionsItemOp

    RShiftOp

    // Range contains an And expression that constrains a single variable to a
    // range. For example, the And expression might be x > 5 AND x < 10. The
    // children of the And expression can be arbitrary expressions (including
    // nested And expressions), but they must all constrain the same variable,
    // and the constraints must be tight.
    //
    // Currently, Range expressions are only created by the
    // ConsolidateSelectFilters normalization rule.
    RangeOp

    // Rank computes the position of a row relative to an ordering, with
    // same-valued rows receiving the same value.
    RankOp

    // RecursiveCTE implements the logic of a recursive CTE:
    //  * the Initial query is evaluated; the results are emitted and also
    //    saved into a "working table".
    //  * so long as the working table is not empty:
    //    - the Recursive query (which refers to the working table using a
    //      specific WithID) is evaluated; the results are emitted and also
    //      saved into a new "working table" for the next iteration.
    RecursiveCTEOp

    RecursiveCTEPrivateOp
    RegIMatchOp
    RegMatchOp
    RegressionAvgXOp
    RegressionAvgYOp
    RegressionCountOp
    RegressionInterceptOp
    RegressionR2Op
    RegressionSXXOp
    RegressionSXYOp
    RegressionSYYOp
    RegressionSlopeOp
    RightJoinOp

    // RowNumber computes the position of a row relative to an ordering, with
    // same-valued rows having ties broken arbitrarily.
    RowNumberOp

    STCollectOp
    STExtentOp
    STMakeLineOp
    STUnionOp

    // ScalarGroupBy computes aggregate functions over the complete set of
    // input rows. This is similar to GroupBy with empty grouping columns,
    // where all input rows form a single group. However, there is an important
    // difference. If the input set is empty, then the output of the
    // ScalarGroupBy operator will have a single row containing default values
    // for each aggregate function (typically null or zero, depending on the
    // function). ScalarGroupBy always returns exactly one row - either the
    // single-group aggregates or the default aggregate values.
    //
    // ScalarGroupBy uses the GroupingPrivate struct so that it's polymorphic
    // with GroupBy and can be used in the same rules (when appropriate). In
    // the ScalarGroupBy case, the grouping column field in GroupingPrivate is
    // always empty.
    ScalarGroupByOp

    // ScalarList is a list expression that has scalar expression items of type
    // opt.ScalarExpr. opt.ScalarExpr is an external type that is defined
    // outside of Optgen. It is hard-coded in the code generator to be the item
    // type for ScalarList.
    //
    // TODO(andyk): Consider adding Optgen syntax like:
    //   define ScalarList []ScalarExpr
    ScalarListOp

    // Scan returns a result set containing every row in a table by scanning
    // one of the table's indexes according to its ordering. The ScanPrivate
    // field identifies the table and index to scan, as well as the subset of
    // columns to project from it.
    //
    // The scan can be constrained and/or have an internal row limit. A scan
    // can be executed either as a forward or as a reverse scan (except when it
    // has a limit, in which case the direction is fixed).
    ScanOp

    ScanPrivateOp

    // Select filters rows from its input result set, based on the boolean
    // filter predicate expression. Rows which do not match the filter are
    // discarded. While the Filter operand can be any boolean expression,
    // normalization rules will typically convert it to a Filters operator in
    // order to make conjunction list matching easier.
    SelectOp

    SemiJoinOp
    SemiJoinApplyOp

    // SequenceSelect represents a read from a sequence as a data source. It
    // always returns three columns, last_value, log_cnt, and is_called, with a
    // single row. last_value is the most recent value returned from the
    // sequence and log_cnt and is_called are always 0 and true, respectively.
    SequenceSelectOp

    SequenceSelectPrivateOp

    // SetPrivate contains fields used by the relational set operators: Union,
    // Intersect, Except, UnionAll, IntersectAll, ExceptAll, and
    // LocalityOptimizedSearch. It matches columns from the left and right
    // inputs of the operator with the output columns, since OutputCols are not
    // ordered and may not correspond to each other.
    //
    // For example, consider the following query:
    //
    //   SELECT y, x FROM xy UNION SELECT b, a FROM ab
    //
    // Given:
    //
    //   col  index
    //   x    1
    //   y    2
    //   a    3
    //   b    4
    //
    // SetPrivate will contain the following values:
    //
    //   Left:  [2, 1]
    //   Right: [4, 3]
    //   Out:   [5, 6]  <-- synthesized output columns
    //
    // To make normalization rules and execution simpler, both inputs to the
    // set op must have matching types.
    //
    // SetPrivate can also contain an optional ordering, allowing for more
    // efficient streaming execution.
    SetPrivateOp

    // ShowTraceForSession returns the current session traces.
    ShowTraceForSessionOp

    ShowTracePrivateOp
    SimilarToOp

    // Sort enforces the ordering of rows returned by its input expression.
    // Rows can be sorted by one or more of the input columns, each of which
    // can be sorted in either ascending or descending order. See the Ordering
    // field in the PhysicalProps struct.
    SortOp

    SqrDiffOp
    StdDevOp
    StdDevPopOp
    StringAggOp

    // Subquery is a subquery in a single-row context. Here are some examples:
    //
    //   SELECT 1 = (SELECT 1)
    //   SELECT (1, 'a') = (SELECT 1, 'a')
    //
    // In a single-row context, the outer query is only valid if the subquery
    // returns at most one row. Subqueries in a multi-row context can be
    // transformed to a single row context using the Any operator. See the
    // comment above the Any operator for more details.
    //
    // The Input field contains the subquery itself, which should be wrapped in
    // a Max1Row operator to enforce that the subquery can return at most one
    // row (Max1Row may be removed by the optimizer later if it can determine
    // statically that the subquery will always return at most one row). In
    // addition, the subquery must project exactly one output column. If the
    // subquery returns one row, then that column is bound to the single column
    // value in that row. If the subquery returns zero rows, then that column
    // is bound to NULL.
    SubqueryOp

    // SubqueryPrivate contains information related to a subquery (Subquery,
    // Any, Exists). It is shared between the operators so that the same rules
    // can be used across all the subquery operators.
    SubqueryPrivateOp

    SumOp
    SumIntOp

    // TopK returns the top K rows, where K is a constant, from the input set
    // according to its sort ordering, discarding the remaining rows. The Limit
    // is a constant positive integer; the operator returns at most Limit rows.
    // Rows can be sorted by one or more of the input columns, each of which
    // can be sorted in either ascending or descending order. See the Ordering
    // field in the PhysicalProps struct.
    //
    // Unlike the Limit relational operator, TopK does not require its input to
    // be ordered. However, if the input is known to have a partial ordering of
    // the required ordering, TopK can take advantage of optimizations. TopK
    // can be used to substitute a Limit that requires its input to be ordered,
    // and performs best when the input is not already fully ordered. TopK
    // scans the input, storing the K rows that best meet the ordering
    // requirement in a max heap, then sorts the K rows.
    TopKOp

    TopKPrivateOp

    // True is the boolean true value that is equivalent to the tree.DBoolTrue
    // datum value. It is a separate operator to make matching and replacement
    // simpler and more efficient, as patterns can contain (True) expressions.
    TrueOp

    TupleOp
    UnaryCbrtOp
    UnaryComplementOp
    UnaryMinusOp
    UnaryPlusOp
    UnarySqrtOp

    // Union is an operator used to combine the Left and Right input relations
    // into a single set containing rows from both inputs. Duplicate rows are
    // discarded.
    //
    // The SetPrivate field matches columns from the Left and Right inputs of
    // the Union with the output columns. See the comment above SetPrivate for
    // more details.
    UnionOp

    // UnionAll is an operator used to combine the Left and Right input
    // relations into a single set containing rows from both inputs. Duplicate
    // rows are not discarded. For example:
    //
    //   SELECT x FROM xx UNION ALL SELECT y FROM yy
    //     x     y        out
    //   ----- -----     -----
    //     1     1         1
    //     1     2    ->   1
    //     2     3         1
    //                     2
    //                     2
    //                     3
    //
    // The SetPrivate field matches columns from the Left and Right inputs of
    // the UnionAll with the output columns. See the comment above SetPrivate
    // for more details.
    UnionAllOp

    // UniqueChecks is a list of uniqueness check queries, to be run after the
    // main query.
    UniqueChecksOp

    // UniqueChecksItem is a unique check query, to be run after the main
    // query. An execution error will be generated if the query returns any
    // results.
    UniqueChecksItemOp

    UniqueChecksItemPrivateOp

    // Update evaluates a relational input expression that fetches existing
    // rows from a target table and computes new values for one or more
    // columns. Arbitrary subsets of rows can be selected from the target table
    // and processed in order, as with this example:
    //
    //   UPDATE abc SET b=10 WHERE a>0 ORDER BY b+c LIMIT 10
    //
    // The Update operator will also update any computed columns, including
    // mutation columns that are computed.
    UpdateOp

    // Upsert evaluates a relational input expression that tries to insert a
    // new row into a target table. If a conflicting row already exists, then
    // Upsert will instead update the existing row. The Upsert operator is used
    // for all of these syntactic variants:
    //
    //   INSERT..ON CONFLICT DO UPDATE
    //   INSERT INTO abc VALUES (1, 2, 3) ON CONFLICT (a) DO UPDATE SET b=10
    //
    //   INSERT..ON CONFLICT DO NOTHING
    //   INSERT INTO abc VALUES (1, 2, 3) ON CONFLICT DO NOTHING
    //
    //   UPSERT
    //   UPSERT INTO abc VALUES (1, 2, 3)
    //
    // The Upsert operator will also insert/update any computed columns,
    // including mutation columns that are computed.
    UpsertOp

    // UpsertDistinctOn is a variation on DistinctOn that is only used with
    // UPSERT and INSERT..ON CONFLICT statements. Unlike DistinctOn,
    // UpsertDistinctOn treats NULL values as not equal to one another for
    // purposes of grouping. Two rows having a NULL-valued grouping column will
    // be placed in different groups. This differs from DistinctOn behavior,
    // where the two rows would be grouped together. This behavior difference
    // reflects SQL semantics, in which a unique index key still allows
    // multiple NULL values.
    //
    // UpsertDistinctOn is used when nulls are considered distinct for grouping
    // purposes and duplicates should be filtered out without raising an error.
    UpsertDistinctOnOp

    // Values returns a manufactured result set containing a constant number of
    // rows specified by the Rows list field. Each row must contain the same
    // set of columns in the same order.
    //
    // The Rows field contains a list of Tuples, one for each row. Each tuple
    // has the same length (matching that of Cols).
    //
    // The Cols field contains the set of column indices returned by each row
    // as an opt.ColList. It is legal for Cols to be empty.
    ValuesOp

    ValuesPrivateOp
    VarPopOp

    // Variable is the typed scalar value of a column in the query. The Col
    // field is a metadata ColumnID value that references the column by index.
    VariableOp

    VarianceOp

    // When represents a single WHEN ... THEN ... condition inside a CASE
    // statement. It is the type of each list item in Whens (except for the
    // last item which is a raw expression for the ELSE statement).
    WhenOp

    // Window represents a window function. Window functions are operators
    // which allow computations that take into consideration other rows in the
    // same result set.
    //
    // More concretely, a window function is a relational operator that takes
    // in a result set and appends a single new column whose value depends on
    // the other rows within the result set, and that row's relative position
    // in it.
    //
    // Depending on the exact window function being computed, the value of the
    // new column could be the position of the row in the output
    // (`row_number`), or a cumulative sum, or something else.
    WindowOp

    // WindowFromOffset is used as a modifier that wraps the input of a window
    // function. It supplies the expression to be used as the lower bound of
    // the window frame, if the lower bound uses OFFSET mode.
    WindowFromOffsetOp

    WindowPrivateOp

    // WindowToOffset is used as a modifier that wraps the input of a window
    // function. It supplies the expression to be used as the upper bound of
    // the window frame, if the upper bound uses OFFSET mode.
    WindowToOffsetOp

    // Windows is a set of window functions to be computed in the context of a
    // Window expression.
    WindowsOp

    // WindowsItem is a single window function to be computed in the context of
    // a Window expression.
    WindowsItemOp

    WindowsItemPrivateOp

    // With executes Binding, making its results available to Main. Within
    // Main, the results of Binding may be referenced by a WithScan expression
    // containing the ID of this With.
    WithOp

    WithPrivateOp

    // WithScan returns the results present in the With expression referenced
    // by ID. Note that in order to construct a WithScan, the WithID must have
    // a bound expression in the metadata.
    WithScanOp

    WithScanPrivateOp
    XorAggOp

    // ZigzagJoin represents a join that is executed using the zigzag joiner.
    // All fields except for the ON expression are stored in the private, since
    // the zigzag joiner operates directly on indexes and doesn't support
    // arbitrary inputs.
    //
    // TODO(itsbilal): Add support for representing multi-way zigzag joins.
    ZigzagJoinOp

    ZigzagJoinPrivateOp

    // Zip represents a functional zip over generators a,b,c, which returns
    // tuples of values from a,b,c picked "simultaneously". NULLs are used when
    // a generator is "shorter" than another. In SQL, these generators can be
    // either generator functions such as generate_series(), or scalar
    // functions or expressions such as upper() or CAST. For example, consider
    // this query:
    //
    //   SELECT * FROM ROWS FROM (generate_series(0, 1), upper('abc'));
    //
    // It is equivalent to:
    //
    //   (Zip [
    //     (ZipItem (Function generate_series)),
    //     (ZipItem (Function upper))
    //   ])
    //
    // It produces:
    //
    //    generate_series | upper
    //   -----------------+-------
    //                  0 | ABC
    //                  1 | NULL
    ZipOp

    // ZipItem contains a generator function or scalar expression that is
    // contained in a Zip. It also contains the list of output columns for the
    // generator or scalar expression in the ZipItem. Cols is a list since a
    // single function may output multiple columns (e.g. pg_get_keywords()
    // outputs three columns).
    //
    // See the Zip header for more details.
    ZipItemOp

    NumOperators
)
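Operator values are typically consumed by switching on an expression's Op method (part of the Expr interface); a small illustrative sketch, where the groupings shown are examples rather than an exhaustive taxonomy:

package example

import (
    "fmt"

    "github.com/cockroachdb/cockroach/pkg/sql/opt"
)

// classify names a few operator groups for an expression.
func classify(e opt.Expr) string {
    switch e.Op() {
    case opt.ScanOp, opt.SelectOp, opt.ProjectOp, opt.InnerJoinOp:
        return "relational"
    case opt.AndOp, opt.OrOp, opt.PlusOp, opt.VariableOp:
        return "scalar"
    default:
        return fmt.Sprintf("one of the %d operators", opt.NumOperators)
    }
}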
type OptionalColList ¶
type OptionalColList []ColumnID
OptionalColList is a list of column IDs where some of the IDs can be unset. It is used when the columns map 1-to-1 to a known list of objects (e.g. table columns).
func (OptionalColList) Equals ¶
func (ocl OptionalColList) Equals(other OptionalColList) bool
Equals returns true if this column list is identical to another list.
func (OptionalColList) Find ¶
func (ocl OptionalColList) Find(col ColumnID) (idx int, ok bool)
Find searches for a column in the list and returns its index in the list (if successful).
func (OptionalColList) IsEmpty ¶
func (ocl OptionalColList) IsEmpty() bool
IsEmpty returns true if all columns in the list are unset.
func (OptionalColList) ToSet ¶
func (ocl OptionalColList) ToSet() ColSet
ToSet returns the set of columns that are present in the list.
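Putting these methods together in a small sketch (the column IDs are illustrative; zero marks an unset slot):

package main

import (
    "fmt"

    "github.com/cockroachdb/cockroach/pkg/sql/opt"
)

func main() {
    // One slot per table column; the second table column has no
    // corresponding query column, so its slot is left unset (zero).
    cols := opt.OptionalColList{1, 0, 3}

    if idx, ok := cols.Find(3); ok {
        fmt.Println("column 3 at ordinal", idx) // column 3 at ordinal 2
    }
    fmt.Println(cols.IsEmpty())                            // false
    fmt.Println(cols.Equals(opt.OptionalColList{1, 0, 3})) // true
    fmt.Println(cols.ToSet())                              // the set containing 1 and 3
}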
type Ordering ¶
type Ordering []OrderingColumn
Ordering defines the order of rows provided or required by an operator. A negative value indicates descending order on the column id "-(value)".
type OrderingColumn ¶
type OrderingColumn int32
OrderingColumn is the ColumnID for a column that is part of an ordering, except that it can be negated to indicate a descending ordering on that column.
func MakeOrderingColumn ¶
func MakeOrderingColumn(id ColumnID, descending bool) OrderingColumn
MakeOrderingColumn initializes an ordering column with a ColumnID and a flag indicating whether the direction is descending.
func (OrderingColumn) Ascending ¶
func (c OrderingColumn) Ascending() bool
Ascending returns true if the ordering on this column is ascending.
func (OrderingColumn) Descending ¶
func (c OrderingColumn) Descending() bool
Descending returns true if the ordering on this column is descending.
func (OrderingColumn) Format ¶
func (c OrderingColumn) Format(buf *bytes.Buffer)
Format prints a string representation to the buffer.
func (OrderingColumn) ID ¶
func (c OrderingColumn) ID() ColumnID
ID returns the ColumnID for this OrderingColumn.
func (OrderingColumn) RemapColumn ¶
func (c OrderingColumn) RemapColumn(from, to TableID) OrderingColumn
RemapColumn returns a new OrderingColumn that uses a ColumnID from the 'to' table. The original ColumnID must be from the 'from' table.
func (OrderingColumn) String ¶
func (c OrderingColumn) String() string
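The sign-based encoding is easiest to see in a small sketch:

package main

import (
    "fmt"

    "github.com/cockroachdb/cockroach/pkg/sql/opt"
)

func main() {
    asc := opt.MakeOrderingColumn(2, false /* descending */)
    desc := opt.MakeOrderingColumn(5, true /* descending */)

    // A descending column is stored as the negated ColumnID.
    fmt.Println(int32(asc), int32(desc)) // 2 -5

    fmt.Println(desc.ID(), desc.Descending()) // 5 true

    // An Ordering is just a slice of these columns: ascending on column 2,
    // then descending on column 5.
    ord := opt.Ordering{asc, desc}
    fmt.Println(len(ord)) // 2
}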
type RuleName ¶
type RuleName uint32
RuleName enumerates the names of all the optimizer rules. Manual rule names are defined in this file and rule names generated by Optgen are defined in rule_name.og.go.
const (
    InvalidRuleName RuleName = iota
    SimplifyRootOrdering
    PruneRootCols
    SimplifyZeroCardinalityGroup

    // NumManualRuleNames tracks the number of manually-defined rules.
    NumManualRuleNames
)
Enumeration of all manual rule names.
const (
	// ------------------------------------------------------------
	// Normalize Rule Names
	// ------------------------------------------------------------
	EliminateAggDistinct RuleName
	NormalizeNestedAnds SimplifyTrueAnd SimplifyAndTrue SimplifyFalseAnd
	SimplifyAndFalse SimplifyTrueOr SimplifyOrTrue SimplifyFalseOr
	SimplifyOrFalse SimplifyRange FoldNullAndOr FoldNotTrue FoldNotFalse
	FoldNotNull NegateComparison EliminateNot NegateAnd NegateOr
	ExtractRedundantConjunct CommuteVarInequality CommuteConstInequality
	NormalizeCmpPlusConst NormalizeCmpMinusConst NormalizeCmpConstMinus
	NormalizeTupleEquality FoldNullComparisonLeft FoldNullComparisonRight
	FoldIsNull FoldNonNullIsNull FoldNullTupleIsTupleNull
	FoldNonNullTupleIsTupleNull FoldIsNotNull FoldNonNullIsNotNull
	FoldNonNullTupleIsTupleNotNull FoldNullTupleIsTupleNotNull CommuteNullIs
	NormalizeCmpTimeZoneFunction NormalizeCmpTimeZoneFunctionTZ
	FoldEqZeroSTDistance FoldCmpSTDistanceLeft FoldCmpSTDistanceRight
	FoldCmpSTMaxDistanceLeft FoldCmpSTMaxDistanceRight FoldEqTrue FoldEqFalse
	FoldNeTrue FoldNeFalse NormCycleTestRelTrueToFalse
	NormCycleTestRelFalseToTrue DecorrelateJoin DecorrelateProjectSet
	TryDecorrelateSelect TryDecorrelateProject TryDecorrelateProjectSelect
	TryDecorrelateProjectInnerJoin TryDecorrelateInnerJoin
	TryDecorrelateInnerLeftJoin TryDecorrelateGroupBy
	TryDecorrelateScalarGroupBy TryDecorrelateSemiJoin TryDecorrelateLimitOne
	TryDecorrelateLimit TryDecorrelateProjectSet TryDecorrelateWindow
	TryDecorrelateMax1Row HoistSelectExists HoistSelectNotExists
	HoistSelectSubquery HoistProjectSubquery HoistJoinSubquery
	HoistValuesSubquery HoistProjectSetSubquery NormalizeSelectAnyFilter
	NormalizeJoinAnyFilter NormalizeSelectNotAnyFilter NormalizeJoinNotAnyFilter
	FoldNullCast FoldNullUnary FoldNullBinaryLeft FoldNullBinaryRight
	FoldNullInNonEmpty FoldInEmpty FoldNotInEmpty FoldArray FoldBinary
	FoldUnary FoldComparison FoldCast FoldAssignmentCast FoldIndirection
	FoldColumnAccess FoldFunctionWithNullArg FoldFunction FoldEqualsAnyNull
	ConvertGroupByToDistinct EliminateGroupByProject
	EliminateJoinUnderGroupByLeft EliminateJoinUnderGroupByRight
	EliminateDistinct ReduceGroupingCols ReduceNotNullGroupingCols
	EliminateAggDistinctForKeys EliminateAggFilteredDistinctForKeys
	EliminateDistinctNoColumns EliminateEnsureDistinctNoColumns
	EliminateDistinctOnValues PushAggDistinctIntoGroupBy
	PushAggFilterIntoScalarGroupBy ConvertCountToCountRows
	ConvertRegressionCountToCount FoldGroupingOperators InlineConstVar
	InlineProjectConstants InlineSelectConstants InlineJoinConstantsLeft
	InlineJoinConstantsRight PushSelectIntoInlinableProject
	InlineSelectVirtualColumns InlineProjectInProject CommuteRightJoin
	SimplifyJoinFilters DetectJoinContradiction PushFilterIntoJoinLeftAndRight
	MapFilterIntoJoinLeft MapFilterIntoJoinRight
	MapEqualityIntoJoinLeftAndRight PushFilterIntoJoinLeft
	PushFilterIntoJoinRight SimplifyLeftJoin SimplifyRightJoin
	EliminateSemiJoin SimplifyZeroCardinalitySemiJoin EliminateAntiJoin
	SimplifyZeroCardinalityAntiJoin EliminateJoinNoColsLeft
	EliminateJoinNoColsRight HoistJoinProjectRight HoistJoinProjectLeft
	SimplifyJoinNotNullEquality ExtractJoinEqualities SortFiltersInJoin
	LeftAssociateJoinsLeft LeftAssociateJoinsRight RightAssociateJoinsLeft
	RightAssociateJoinsRight RemoveJoinNotNullCondition ProjectInnerJoinValues
	EliminateLimit EliminateOffset PushLimitIntoProject PushOffsetIntoProject
	PushLimitIntoOffset PushLimitIntoOrdinality PushLimitIntoJoinLeft
	PushLimitIntoJoinRight FoldLimits AssociateLimitJoinsLeft
	AssociateLimitJoinsRight EliminateMax1Row SimplifyPartialIndexProjections
	FoldPlusZero FoldZeroPlus FoldMinusZero FoldMultOne FoldOneMult FoldDivOne
	InvertMinus EliminateUnaryMinus SimplifyLimitOrdering
	SimplifyOffsetOrdering SimplifyGroupByOrdering SimplifyOrdinalityOrdering
	SimplifyExplainOrdering SimplifyWithBindingOrdering
	EliminateJoinUnderProjectLeft EliminateJoinUnderProjectRight
	EliminateProject MergeProjects MergeProjectWithValues
	PushColumnRemappingIntoValues PushAssignmentCastsIntoValues
	FoldTupleAccessIntoValues FoldJSONAccessIntoValues
	ConvertZipArraysToValues PruneProjectCols PruneScanCols PruneSelectCols
	PruneLimitCols PruneOffsetCols PruneJoinLeftCols PruneJoinRightCols
	PruneSemiAntiJoinRightCols PruneAggCols PruneGroupByCols PruneValuesCols
	PruneOrdinalityCols PruneExplainCols PruneProjectSetCols
	PruneWindowOutputCols PruneWindowInputCols PruneMutationFetchCols
	PruneMutationInputCols PruneMutationReturnCols PruneWithScanCols
	PruneWithCols PruneUnionAllCols RejectNullsLeftJoin RejectNullsRightJoin
	RejectNullsGroupBy RejectNullsUnderJoinLeft RejectNullsUnderJoinRight
	RejectNullsProject CommuteVar CommuteConst EliminateCoalesce
	SimplifyCoalesce EliminateCast NormalizeInConst FoldInNull
	SimplifyInSingleElement SimplifyNotInSingleElement UnifyComparisonTypes
	EliminateExistsZeroRows EliminateExistsProject EliminateExistsGroupBy
	InlineExistsSelectTuple IntroduceExistsLimit EliminateExistsLimit
	SimplifyCaseWhenConstValue InlineAnyValuesSingleCol InlineAnyValuesMultiCol
	SimplifyEqualsAnyTuple SimplifyAnyScalarArray FoldCollate
	NormalizeArrayFlattenToAgg SimplifySameVarEqualities
	SimplifySameVarInequalities SimplifyNotDisjoint SimplifySelectFilters
	ConsolidateSelectFilters DeduplicateSelectFilters EliminateSelect
	MergeSelects PushSelectIntoProject MergeSelectInnerJoin
	PushSelectCondLeftIntoJoinLeftAndRight PushSelectIntoJoinLeft
	PushSelectIntoGroupBy RemoveNotNullCondition PushSelectIntoProjectSet
	PushFilterIntoSetOp EliminateSetLeft EliminateSetRight
	EliminateDistinctSetLeft EliminateDistinctSetRight SimplifyExcept
	SimplifyIntersectLeft SimplifyIntersectRight ConvertUnionToDistinctUnionAll
	EliminateWindow ReduceWindowPartitionCols SimplifyWindowOrdering
	PushSelectIntoWindow PushLimitIntoWindow InlineWith

	// ------------------------------------------------------------
	// Explore Rule Names
	// ------------------------------------------------------------
	MemoCycleTestRelRule ReplaceScalarMinMaxWithScalarSubqueries
	ReplaceFilteredScalarMinMaxWithSubqueries ReplaceScalarMinMaxWithLimit
	ReplaceMinWithLimit ReplaceMaxWithLimit GenerateStreamingGroupBy
	SplitGroupByScanIntoUnionScans SplitGroupByFilteredScanIntoUnionScans
	EliminateIndexJoinOrProjectInsideGroupBy GenerateLimitedGroupByScans
	ReorderJoins CommuteLeftJoin CommuteSemiJoin ConvertSemiToInnerJoin
	GenerateMergeJoins GenerateLookupJoins GenerateInvertedJoins
	GenerateInvertedJoinsFromSelect GenerateLookupJoinsWithFilter
	GenerateLookupJoinsWithVirtualCols GenerateLookupJoinsWithVirtualColsAndFilter
	PushJoinIntoIndexJoin HoistProjectFromInnerJoin HoistProjectFromLeftJoin
	GenerateLocalityOptimizedAntiJoin GenerateLocalityOptimizedLookupJoin
	GenerateLimitedScans PushLimitIntoFilteredScan PushLimitIntoIndexJoin
	SplitLimitedScanIntoUnionScans SplitLimitedSelectIntoUnionSelects
	GenerateTopK GenerateLimitedTopKScans GeneratePartialOrderTopK
	EliminateIndexJoinInsideProject GenerateIndexScans
	GenerateLocalityOptimizedScan GeneratePartialIndexScans
	GenerateConstrainedScans GenerateInvertedIndexScans GenerateZigzagJoins
	GenerateInvertedIndexZigzagJoins SplitDisjunction SplitDisjunctionAddKey
	GenerateStreamingSetOp

	// NumRuleNames tracks the total count of rule names.
	NumRuleNames
)
func (RuleName) IsNormalize ¶
IsNormalize returns true if r is a normalization rule.
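For example, the enumeration can be partitioned into normalization rules and everything else by iterating up to the NumRuleNames sentinel. This is an illustrative sketch, not package code; it assumes RuleName values are contiguous and that the package is imported as opt:

	// Count normalization rules; the remainder are manual or explore rules.
	// (Requires the standard library "fmt" import.)
	var norm int
	for r := opt.RuleName(1); r < opt.NumRuleNames; r++ {
		if r.IsNormalize() {
			norm++
		}
	}
	fmt.Println("normalization rules:", norm)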
type ScalarExpr ¶
type ScalarExpr interface {
Expr
// Rank is a value that defines how the scalar expression should be ordered
// among a collection of scalar expressions. It defines a total order over
// ScalarExprs within the context of a memo.
Rank() ScalarRank
// DataType is the SQL type of the expression.
DataType() *types.T
}
ScalarExpr is a scalar expression, which is an expression that returns a primitive-typed value like boolean or string rather than rows and columns.
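Because Rank defines a total order, a collection of scalar expressions can be ordered deterministically. The helper below is a hedged sketch written against only the interface shown above; sortScalars is not part of the package:

	// sortScalars orders scalar expressions by their memo-assigned rank.
	// (Requires the standard library "sort" import.)
	func sortScalars(exprs []ScalarExpr) {
		sort.Slice(exprs, func(i, j int) bool {
			return exprs[i].Rank() < exprs[j].Rank()
		})
	}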
type ScalarRank ¶
type ScalarRank int
ScalarRank is the type of the sort order given to every scalar expression.
type SchemaID ¶
type SchemaID int32
SchemaID uniquely identifies the usage of a schema within the scope of a query. SchemaID 0 is reserved to mean "unknown schema". Internally, the SchemaID consists of an index into the Metadata.schemas slice.
See the comment for Metadata for more details on identifiers.
type SequenceID ¶
type SequenceID uint64
SequenceID uniquely identifies the usage of a sequence within the scope of a query. SequenceID 0 is reserved to mean "unknown sequence".
type TableAnnID ¶
type TableAnnID int
TableAnnID uniquely identifies an annotation on an instance of table metadata. A table annotation allows arbitrary values to be cached with table metadata, which can be used to avoid recalculating base table properties or other information each time it's needed.
WARNING! When copying memo metadata (which happens when we use a cached memo), the annotations are cleared. Any code using an annotation must treat this as a best-effort cache and be prepared to repopulate the annotation as necessary.
To create a TableAnnID, call NewTableAnnID during Go's program initialization phase. The returned TableAnnID never clashes with other annotations on the same table. Here is a usage example:
	var myAnnID = NewTableAnnID()

	md.SetTableAnnotation(TableID(1), myAnnID, "foo")
	ann := md.TableAnnotation(TableID(1), myAnnID)
Currently, the following annotations are in use:
- WeakKeys: weak keys derived from the base table
- Stats: statistics derived from the base table
To add an additional annotation, increase the value of maxTableAnnIDCount and add a call to NewTableAnnID.
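In practice the annotation ID is paired with a lazy getter that repopulates the cache when it has been cleared. The sketch below is illustrative only; computeExpensiveInfo is a hypothetical stand-in for whatever derivation is being cached:

	// Allocated once, during Go's init phase (NewTableAnnID is not
	// thread-safe).
	var myInfoAnnID = opt.NewTableAnnID()

	// getInfo returns the cached value for tabID, recomputing it if the
	// annotation was cleared (e.g. after cached memo metadata was copied).
	func getInfo(md *opt.Metadata, tabID opt.TableID) string {
		if ann := md.TableAnnotation(tabID, myInfoAnnID); ann != nil {
			return ann.(string)
		}
		info := computeExpensiveInfo(md, tabID) // hypothetical helper
		md.SetTableAnnotation(tabID, myInfoAnnID, info)
		return info
	}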
func NewTableAnnID ¶
func NewTableAnnID() TableAnnID
NewTableAnnID allocates a unique annotation identifier that is used to associate arbitrary data with table metadata. Only maxTableAnnIDCount total annotation IDs can exist in the system. Attempting to exceed the maximum results in a panic.
This method is not thread-safe, and therefore should only be called during Go's program initialization phase (which uses a single goroutine to init variables).
See the TableAnnID comment for more details and a usage example.
type TableID ¶
type TableID uint64
TableID uniquely identifies the usage of a table within the scope of a query. TableID 0 is reserved to mean "unknown table".
Internally, the TableID consists of an index into the Metadata.tables slice, as well as the ColumnID of the first column in the table. Subsequent columns have sequential ids, relative to their ordinal position in the table.
See the comment for Metadata for more details on identifiers.
func (TableID) ColumnID ¶
ColumnID returns the metadata id of the column at the given ordinal position in the table.
NOTE: This method cannot do bounds checking, so it's up to the caller to ensure that a column really does exist at this ordinal position.
func (TableID) ColumnOrdinal ¶
ColumnOrdinal returns the ordinal position of the given column in its base table.
NOTE: This method cannot do complete bounds checking, so it's up to the caller to ensure that this column is really in the given base table.
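To make the id arithmetic concrete, here is a hedged illustration (the ColumnID values are hypothetical). Suppose tabID refers to a table whose first column was assigned metadata ColumnID 5:

	c0 := tabID.ColumnID(0)        // ColumnID 5
	c2 := tabID.ColumnID(2)        // ColumnID 7: first column's id plus ordinal 2
	ord := tabID.ColumnOrdinal(c2) // 2, the inverse mapping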
type TableMeta ¶
type TableMeta struct {
// MetaID is the identifier for this table that is unique within the query
// metadata.
MetaID TableID
// Table is a reference to the table in the catalog.
Table cat.Table
// Alias stores the identifier used in the query to identify the table. This
// might be explicitly qualified (e.g. <catalog>.<schema>.<table>), or not
// (e.g. <table>). Or, it may be an alias used in the query, in which case it
// is always an unqualified name.
Alias tree.TableName
// IgnoreForeignKeys is true if we should disable any rules that depend on the
// consistency of outgoing foreign key references. Set by the
// IGNORE_FOREIGN_KEYS table hint; useful for scrub queries meant to verify
// the consistency of foreign keys.
IgnoreForeignKeys bool
// IgnoreUniqueWithoutIndexKeys is true if we should disable any rules that
// depend on the consistency of unique without index constraints.
IgnoreUniqueWithoutIndexKeys bool
// Constraints stores a *FiltersExpr containing filters that are known to
// evaluate to true on the table data. This list is extracted from validated
// check constraints; specifically, those check constraints that we can prove
// never evaluate to NULL (as NULL is treated differently in check constraints
// and filters).
//
// If nil, there are no check constraints.
//
// See comment above GenerateConstrainedScans for more detail.
Constraints ScalarExpr
// ComputedCols stores ScalarExprs for computed columns on the table, indexed
// by ColumnID. These will be used as "known truths" about data when
// constraining indexes. See comment above GenerateConstrainedScans for more
// detail.
//
// Computed columns with non-immutable operators are omitted.
ComputedCols map[ColumnID]ScalarExpr
// contains filtered or unexported fields
}
TableMeta stores information about one of the tables stored in the metadata.
NOTE: Metadata.DuplicateTable and TableMeta.copyFrom must be kept in sync with changes to this struct.
func (*TableMeta) AddCheckConstraintsStats ¶
AddCheckConstraintsStats adds a (column, ColumnStatistic) pair to the checkConstraintsStats map. When the table is duplicated, the mapping from the new check constraint ColumnID back to the original ColumnStatistic is recorded so that it can be reused.
func (*TableMeta) AddComputedCol ¶
func (tm *TableMeta) AddComputedCol(colID ColumnID, computedCol ScalarExpr)
AddComputedCol adds a computed column expression to the table's metadata.
func (*TableMeta) AddIndexPartitionLocality ¶
func (tm *TableMeta) AddIndexPartitionLocality(ord cat.IndexOrdinal, ps *partition.PrefixSorter)
AddIndexPartitionLocality adds a PrefixSorter to the table's metadata for the index with IndexOrdinal ord.
func (*TableMeta) AddPartialIndexPredicate ¶
func (tm *TableMeta) AddPartialIndexPredicate(ord cat.IndexOrdinal, pred ScalarExpr)
AddPartialIndexPredicate adds a partial index predicate to the table's metadata.
func (*TableMeta) ComputedColExpr ¶
func (tm *TableMeta) ComputedColExpr(id ColumnID) (_ ScalarExpr, ok bool)
ComputedColExpr returns the computed expression for the given column, if it is a computed column. If it is not a computed column, ok=false is returned.
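A hedged usage sketch pairing AddComputedCol with the lookup (colID and computedExpr are assumed to come from elsewhere):

	tm.AddComputedCol(colID, computedExpr)

	if expr, ok := tm.ComputedColExpr(colID); ok {
		// colID is a computed column; expr is its defining expression and
		// can serve as a "known truth" when constraining indexes.
		_ = expr
	}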
func (*TableMeta) IndexColumns ¶
IndexColumns returns the set of table columns in the given index. TODO(justin): cache this value in the table metadata.
func (*TableMeta) IndexColumnsMapInverted ¶
IndexColumnsMapInverted returns the set of table columns in the given index. Inverted index columns are mapped to their source column.
func (*TableMeta) IndexKeyColumns ¶
IndexKeyColumns returns the set of strict key columns in the given index.
func (*TableMeta) IndexKeyColumnsMapInverted ¶
IndexKeyColumnsMapInverted returns the set of strict key columns in the given index. Inverted index columns are mapped to their source column.
func (*TableMeta) IndexPartitionLocality ¶
func (tm *TableMeta) IndexPartitionLocality(ord cat.IndexOrdinal, index cat.Index, evalCtx *tree.EvalContext) (ps *partition.PrefixSorter, ok bool)
IndexPartitionLocality returns the given index's PrefixSorter.
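A minimal usage sketch against the signature above (ord, index, and evalCtx are assumed to already be in scope):

	if ps, ok := tm.IndexPartitionLocality(ord, index, evalCtx); ok {
		// ps orders the index's partitions by locality, which rules such as
		// GenerateLocalityOptimizedScan can take advantage of.
		_ = ps
	}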
func (*TableMeta) OrigCheckConstraintsStats ¶
func (tm *TableMeta) OrigCheckConstraintsStats(colID ColumnID) (origColumnStatistic interface{}, ok bool)
OrigCheckConstraintsStats looks up whether statistics were ever created based on a CHECK constraint on colID, and if so, returns the original ColumnStatistic.
func (*TableMeta) PartialIndexPredicate ¶
func (tm *TableMeta) PartialIndexPredicate(ord cat.IndexOrdinal) (pred ScalarExpr, ok bool)
PartialIndexPredicate returns the given index's predicate scalar expression, if the index is a partial index. Returns ok=false if the index is not a partial index. Panics if the index is a partial index according to the catalog, but a predicate scalar expression does not exist in the table metadata.
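A hedged lookup sketch (ord is assumed to be a valid index ordinal):

	if pred, ok := tm.PartialIndexPredicate(ord); ok {
		// The index is a partial index and pred is its predicate expression.
		_ = pred
	}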
func (*TableMeta) PartialIndexPredicatesUnsafe ¶
func (tm *TableMeta) PartialIndexPredicatesUnsafe() map[cat.IndexOrdinal]ScalarExpr
PartialIndexPredicatesUnsafe returns the partialIndexPredicates map.
WARNING: The returned map is NOT a source-of-truth for determining if an index is a partial index. It does not verify that all partial indexes in the catalog are included in the returned map. This function should only be used in tests or for formatting optimizer expressions. Use PartialIndexPredicate in all other cases.
func (*TableMeta) SetConstraints ¶
func (tm *TableMeta) SetConstraints(constraints ScalarExpr)
SetConstraints sets the filters derived from check constraints; see TableMeta.Constraints. The argument must be a *FiltersExpr.
func (*TableMeta) VirtualComputedColumns ¶
VirtualComputedColumns returns the set of virtual computed table columns.
type UniqueID ¶
type UniqueID uint64
UniqueID should be used to disambiguate multiple uses of an expression within the scope of a query. For example, a UniqueID field should be added to an expression type if two instances of that type might otherwise be indistinguishable based on the values of their other fields.
See the comment for Metadata for more details on identifiers.
type ViewDep ¶
type ViewDep struct {
DataSource cat.DataSource
// ColumnOrdinals is the set of column ordinals that are referenced by the
// view for this table.
ColumnOrdinals util.FastIntSet
// ColumnIDToOrd maps a scopeColumn's ColumnID to its ColumnOrdinal.
// This helps us add only the columns that are actually referenced
// by the view's query into the view dependencies. We add a
// dependency on a column only when the column is referenced by the view
// and created as a scopeColumn.
ColumnIDToOrd map[ColumnID]int
// If an index is referenced specifically (via an index hint), SpecificIndex
// is true and Index is the ordinal of that index.
SpecificIndex bool
Index cat.IndexOrdinal
}
ViewDep contains information about a view dependency.
func (ViewDep) GetColumnNames ¶
GetColumnNames returns a sorted list of the names of the column dependencies, along with a boolean indicating whether the dependency is on a table. We only track column dependencies on tables.
type ViewDeps ¶
type ViewDeps []ViewDep
ViewDeps contains information about the dependencies of a view.
type ViewTypeDeps ¶
type ViewTypeDeps = util.FastIntSet
ViewTypeDeps contains a set of the IDs of types that this view depends on.
Source Files ¶

Directories ¶
| Path | Synopsis |
|---|---|
| bench | Package bench houses benchmarks for the SQL optimizer. |
| cat | Package cat contains interfaces that are used by the query optimizer to avoid including specifics of sqlbase structures in the opt code. |
| optgen | |
| cmd/langgen (command) | |
| cmd/optfmt (command) | optfmt pretty prints .opt files. |
| cmd/optgen (command) | |
| lang | Package lang implements a language called Optgen, short for "optimizer generator". |
| ordering | Package ordering contains operator-specific logic related to orderings - whether ops can provide Required orderings, what orderings do they need to require from their children, etc. |