CalciteBuilder#

Public API#

class modin.experimental.core.execution.native.implementations.hdk_on_native.calcite_builder.CalciteBuilder#

A translator that transforms a DFAlgNode tree into a sequence of calcite nodes.

class CompoundAggregate(builder, arg)#

A base class for a compound aggregate translation.

Translation is done in three steps. Step 1 generates additional values using a projection. Step 2 generates the aggregates that are later used to compute the compound aggregate value. Step 3 generates the final aggregate value using another projection.

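Parameters:
  • builder (CalciteBuilder) – A builder to use for translation.

  • arg (BaseExpr or list of BaseExpr) – Aggregated value(s).
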
gen_agg_exprs()#

Generate intermediate aggregates required for a compound aggregate computation.

Returns:

New aggregate expressions mapped to their names.

Return type:

dict

gen_proj_exprs()#

Generate values required for intermediate aggregates computation.

Returns:

New column expressions mapped to their names.

Return type:

dict

gen_reduce_expr()#

Generate an expression for a compound aggregate.

Returns:

A final compound aggregate expression.

Return type:

BaseExpr
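
The three-step translation described for CompoundAggregate can be illustrated outside of the builder. The following is a minimal sketch in plain Python, not the builder's actual code: the function name `compound_std`, the row layout, and the choice of a sample standard deviation as the compound aggregate are all assumptions made for illustration. The comments indicate which step loosely corresponds to gen_proj_exprs(), gen_agg_exprs() and gen_reduce_expr().

```python
import math


def compound_std(rows, col):
    """Sample standard deviation of `col` via project -> aggregate -> reduce.

    Hypothetical helper; it only mirrors the three-step idea.
    """
    # Step 1 (projection, cf. gen_proj_exprs): add the extra value that the
    # intermediate aggregates need, here the square of the column.
    projected = [{"x": r[col], "x2": r[col] ** 2} for r in rows]

    # Step 2 (aggregation, cf. gen_agg_exprs): simple aggregates over the
    # projected columns, keyed by name.
    aggs = {
        "count": len(projected),
        "sum": sum(p["x"] for p in projected),
        "sum_sq": sum(p["x2"] for p in projected),
    }

    # Step 3 (final projection, cf. gen_reduce_expr): combine the
    # intermediate aggregates into the compound aggregate value.
    n, s, q = aggs["count"], aggs["sum"], aggs["sum_sq"]
    return math.sqrt((q - s * s / n) / (n - 1))


rows = [{"a": v} for v in (2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0)]
result = compound_std(rows, "a")
```

Splitting the computation this way means steps 1 and 3 are ordinary row-wise expressions and step 2 uses only simple aggregates such as counts and sums.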

class CompoundAggregateWithColArg(agg, builder, arg, dtype=None)#

A base class for a compound aggregate that requires a LiteralExpr column argument.

This aggregate requires two arguments. The first is an InputRefExpr referring to the aggregation column. The second is a LiteralExpr, which is added to the frame as a new column.

Parameters:
  • agg (str) – Aggregate name.

  • builder (CalciteBuilder) – A builder to use for translation.

  • arg (List of BaseExpr) – Aggregate arguments.

  • dtype (dtype, optional) – Aggregate data type. If not specified, _dtype from the first argument is used.

gen_agg_exprs()#

Generate intermediate aggregates required for a compound aggregate computation.

Returns:

New aggregate expressions mapped to their names.

Return type:

dict

gen_proj_exprs()#

Generate values required for intermediate aggregates computation.

Returns:

New column expressions mapped to their names.

Return type:

dict

gen_reduce_expr()#

Generate an expression for a compound aggregate.

Returns:

A final compound aggregate expression.

Return type:

BaseExpr

class InputContext(input_frames, input_nodes)#

A class to track current input frames and corresponding nodes.

Used to translate input column references to numeric indices.

Parameters:
  • input_frames (list of DFAlgNode) – Input nodes of the currently translated node.

  • input_nodes (list of CalciteBaseNode) – Translated input nodes.

input_nodes#

Input nodes of the currently translated node.

Type:

list of CalciteBaseNode

frame_to_node#

Maps input frames to corresponding calcite nodes.

Type:

dict

input_offsets#

Maps each input frame to the input index used for its first column.

Type:

dict

replacements#

Maps input frame to a new list of columns to use. Used when a single DFAlgNode is lowered into multiple computation steps, e.g. for compound aggregates requiring additional projections.

Type:

dict

input_ids()#

Get ids of all input nodes.

Return type:

list of int

ref(frame, col)#

Translate input column into CalciteInputRefExpr.

Parameters:
  • frame (DFAlgNode) – An input frame.

  • col (str) – An input column.

Return type:

CalciteInputRefExpr

ref_idx(frame, col)#

Translate input column into CalciteInputIdxExpr.

Parameters:
  • frame (DFAlgNode) – An input frame.

  • col (str) – An input column.

Return type:

CalciteInputIdxExpr

replace_input_node(frame, node, new_cols)#

Use node as an input node for references to columns of frame.

Parameters:
  • frame (DFAlgNode) – Replaced input frame.

  • node (CalciteBaseNode) – A new node to use.

  • new_cols (list of str) – A new list of columns to use.

translate(expr)#

Translate an expression.

Translation is done by replacing InputRefExpr with CalciteInputRefExpr and CalciteInputIdxExpr.

Parameters:

expr (BaseExpr) – An expression to translate.

Returns:

Translated expression.

Return type:

BaseExpr
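
To make the role of input_offsets and replacements more concrete, here is a minimal sketch of how a (frame, column) pair could be resolved to a flat numeric index. The class `InputContextSketch`, its constructor argument, and the method `replace_input` are hypothetical stand-ins written for this example; frames are represented by plain strings rather than DFAlgNode objects, and the real class may handle further cases that are omitted here.

```python
class InputContextSketch:
    """Hypothetical stand-in: resolves (frame, column) pairs to flat indices."""

    def __init__(self, frames):
        # `frames` maps a frame name to its ordered list of column names.
        self.columns = {name: list(cols) for name, cols in frames.items()}
        # Each frame gets the index that its first column occupies when all
        # input columns are laid out one frame after another.
        self.input_offsets = {}
        offset = 0
        for name, cols in self.columns.items():
            self.input_offsets[name] = offset
            offset += len(cols)
        # Frames whose column list was replaced by an intermediate step.
        self.replacements = {}

    def replace_input(self, frame, new_cols):
        # Later references to `frame` resolve against `new_cols`.
        # Offsets are left unchanged in this sketch.
        self.replacements[frame] = list(new_cols)

    def ref(self, frame, col):
        # Index = offset of the frame's first column + position of the column.
        cols = self.replacements.get(frame, self.columns[frame])
        return self.input_offsets[frame] + cols.index(col)


ctx = InputContextSketch({"left": ["a", "b"], "right": ["c", "d", "e"]})
left_b = ctx.ref("left", "b")
right_d = ctx.ref("right", "d")

# Simulate an extra projection that changes the columns of "right".
ctx.replace_input("right", ["c", "d", "d_sq"])
right_d_sq = ctx.ref("right", "d_sq")
```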

class InputContextMgr(builder, input_frames, input_nodes)#

A helper class to manage an input context stack.

The class is designed to be used in a recursion with nested ‘with’ statements.

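Parameters:
  • builder (CalciteBuilder) – An outer builder.

  • input_frames (list of DFAlgNode) – Input nodes for the new context.

  • input_nodes (list of CalciteBaseNode) – Translated input nodes.
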
builder#

An outer builder.

Type:

CalciteBuilder

input_frames#

Input nodes for the new context.

Type:

list of DFAlgNode

input_nodes#

Translated input nodes.

Type:

list of CalciteBaseNode
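
The nested ‘with’ usage mentioned for InputContextMgr can be sketched as a context manager that pushes a context onto a stack owned by the builder on entry and pops it on exit. The names `BuilderSketch`, `InputContextMgrSketch` and `input_ctx_stack` are assumptions made for this example and are not taken from the class documented here.

```python
class BuilderSketch:
    """Hypothetical builder that owns a stack of input contexts."""

    def __init__(self):
        self.input_ctx_stack = []


class InputContextMgrSketch:
    """Hypothetical manager: push a context on enter, pop it on exit."""

    def __init__(self, builder, input_frames):
        self.builder = builder
        self.input_frames = input_frames

    def __enter__(self):
        self.builder.input_ctx_stack.append(self.input_frames)
        return self.builder.input_ctx_stack[-1]

    def __exit__(self, exc_type, exc, tb):
        # Always restore the previous context, even if an error occurred.
        self.builder.input_ctx_stack.pop()
        return False  # do not swallow exceptions


builder = BuilderSketch()
depths = []
with InputContextMgrSketch(builder, ["outer"]):
    depths.append(len(builder.input_ctx_stack))
    with InputContextMgrSketch(builder, ["inner"]) as ctx:
        depths.append(len(builder.input_ctx_stack))
        innermost = ctx
    depths.append(len(builder.input_ctx_stack))
depths.append(len(builder.input_ctx_stack))
```

Because each ‘with’ block restores the previous context on exit, a recursive translation can always treat the top of the stack as the context of the node it is currently handling.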

class QuantileAggregate(builder, arg)#

A QUANTILE aggregate generator.

Parameters:
  • builder (CalciteBuilder) – A builder to use for translation.

  • arg (List of BaseExpr) –

    A list of 3 values:
    1. InputRefExpr - the column to compute the quantiles for.

    2. LiteralExpr - the quantile value.

    3. str - the interpolation method to use.

gen_agg_exprs()#

Generate intermediate aggregates required for a compound aggregate computation.

Returns:

New aggregate expressions mapped to their names.

Return type:

dict

class SkewAggregate(builder, arg)#

An unbiased skew aggregate generator.

Parameters:
  • builder (CalciteBuilder) – A builder to use for translation.

  • arg (list of BaseExpr) – An aggregated value.

gen_agg_exprs()#

Generate intermediate aggregates required for a compound aggregate computation.

Returns:

New aggregate expressions mapped to their names.

Return type:

dict

gen_proj_exprs()#

Generate values required for intermediate aggregates computation.

Returns:

New column expressions mapped to their names.

Return type:

dict

gen_reduce_expr()#

Generate an expression for a compound aggregate.

Returns:

A final compound aggregate expression.

Return type:

BaseExpr
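
As an illustration of how an unbiased skew can be expressed through the three steps, the sketch below computes the adjusted Fisher-Pearson skewness coefficient from power sums. This is the standard textbook formula; whether SkewAggregate emits exactly these intermediate aggregates is not stated on this page, so both function names and the decomposition are assumptions.

```python
import math


def skew_from_power_sums(values):
    """Unbiased skew via project -> aggregate -> reduce (hypothetical helper)."""
    # Step 1 (projection): powers of each value.
    projected = [(x, x ** 2, x ** 3) for x in values]

    # Step 2 (aggregation): count and power sums.
    n = len(projected)
    s1 = sum(p[0] for p in projected)
    s2 = sum(p[1] for p in projected)
    s3 = sum(p[2] for p in projected)

    # Step 3 (reduction): central moments from the power sums, then the
    # sample-size correction factor.
    mean = s1 / n
    m2 = s2 / n - mean ** 2
    m3 = s3 / n - 3 * mean * s2 / n + 2 * mean ** 3
    return math.sqrt(n * (n - 1)) / (n - 2) * m3 / m2 ** 1.5


def skew_direct(values):
    """Reference computation using deviations from the mean."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return math.sqrt(n * (n - 1)) / (n - 2) * m3 / m2 ** 1.5


data = [1.0, 2.0, 3.0, 4.0, 10.0]
via_sums = skew_from_power_sums(data)
reference = skew_direct(data)
```

The power-sum form needs only one pass of simple aggregates, at the cost of being less numerically stable than the deviation-based form when the values are large relative to their spread.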

class StdAggregate(builder, arg)#

A sample standard deviation aggregate generator.

Parameters:
  • builder (CalciteBuilder) – A builder to use for translation.

  • arg (list of BaseExpr) – An aggregated value.

gen_agg_exprs()#

Generate intermediate aggregates required for a compound aggregate computation.

Returns:

New aggregate expressions mapped to their names.

Return type:

dict

gen_proj_exprs()#

Generate values required for intermediate aggregates computation.

Returns:

New column expressions mapped to their names.

Return type:

dict

gen_reduce_expr()#

Generate an expression for a compound aggregate.

Returns:

A final compound aggregate expression.

Return type:

BaseExpr

class TopkAggregate(builder, arg)#

A TOP_K aggregate generator.

Parameters:
  • builder (CalciteBuilder) – A builder to use for translation.

  • arg (list of BaseExpr) – Aggregated values.

gen_reduce_expr()#

Generate an expression for a compound aggregate.

Returns:

A final compound aggregate expression.

Return type:

BaseExpr

build(op)#

Translate a DFAlgNode tree into a sequence of calcite nodes.

Parameters:

op (DFAlgNode) – A tree to translate.

Returns:

The resulting sequence of calcite nodes.

Return type:

list of CalciteBaseNode
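
The idea of turning a tree into a node sequence can be sketched with a post-order traversal in which every node is emitted after its inputs and refers to them by id. The `OpNode` dataclass, the `build_sequence` function and the dictionary layout of the emitted nodes are hypothetical and stand in for DFAlgNode and CalciteBaseNode only for the purpose of this example; the real build() performs an actual translation of each node.

```python
from dataclasses import dataclass, field


@dataclass
class OpNode:
    """Hypothetical stand-in for a node of the input tree."""

    name: str
    inputs: list = field(default_factory=list)


def build_sequence(op):
    """Flatten a tree into a list in which every node follows its inputs."""
    sequence = []

    def visit(node):
        # Translate the inputs first so that their ids are known.
        input_ids = [visit(child) for child in node.inputs]
        node_id = len(sequence)
        sequence.append({"id": node_id, "op": node.name, "inputs": input_ids})
        return node_id

    visit(op)
    return sequence


tree = OpNode("join", [OpNode("scan_a"), OpNode("filter", [OpNode("scan_b")])])
plan = build_sequence(tree)
order = [node["op"] for node in plan]
```

In this sketch the last element of the sequence is always the root of the tree, and each node's inputs appear at earlier positions.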