CalciteBaseNode#

Public API#

class modin.experimental.core.execution.native.implementations.hdk_on_native.calcite_algebra.CalciteBaseNode(relOp)#

A base class for a Calcite computation sequence node.

Calcite nodes are not combined into a tree. They are usually stored in a sequence that works like a stack machine: the result of the previous operation is an implicit operand of the current one. Input nodes can also be referenced directly by their unique ID numbers.

The structure of Calcite nodes is based on the JSON representation that HDK uses to serialize and deserialize parsed queries when interacting with a Calcite server. Currently, this format is internal and is not part of the public API. It is undocumented and may change in an incompatible way in the future.
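The sequence semantics described above can be sketched as follows. This is a minimal illustration and not Modin's implementation: SketchNode, resolve_inputs, and the operation names are hypothetical stand-ins, and the rule that a node without explicit inputs consumes the result of the node right before it is an assumption drawn from the description.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class SketchNode:
    """Hypothetical stand-in for one node of a Calcite-style sequence."""

    id: int
    relOp: str
    # Explicit input IDs; None means "use the previous node's result".
    inputs: Optional[List[int]] = None


def resolve_inputs(sequence: List[SketchNode]) -> Dict[int, List[int]]:
    """Map each node ID to the IDs of the nodes whose results it consumes."""
    resolved = {}
    for pos, node in enumerate(sequence):
        if node.inputs is not None:
            # Inputs referenced directly by their IDs.
            resolved[node.id] = list(node.inputs)
        elif pos > 0:
            # Implicit operand: the result of the previous node.
            resolved[node.id] = [sequence[pos - 1].id]
        else:
            resolved[node.id] = []
    return resolved


sequence = [
    SketchNode(0, "SketchScan", inputs=[]),
    SketchNode(1, "SketchFilter"),            # implicitly reads node 0
    SketchNode(2, "SketchScan", inputs=[]),
    SketchNode(3, "SketchJoin", inputs=[1, 2]),  # explicit references
    SketchNode(4, "SketchProject"),           # implicitly reads node 3
]
resolved = resolve_inputs(sequence)
```

In this sketch only the join needs explicit IDs, because its operands are not both the immediately preceding node.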

Parameters:

relOp (str) – An operation name.

id#

ID of the node. Should be unique within a single query.

Type:

int

relOp#

Operation name.

Type:

str

classmethod reset_id(next_id=0)#

Set the ID that will be assigned to the next new node to next_id.

Can be used to restart numbering from zero for each generated query.

Parameters:

next_id (int, default: 0) – Next node id.
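The ID behaviour documented here can be illustrated with a small stand-in. SketchBaseNode is hypothetical and only assumes that IDs are handed out sequentially from a class-level counter; the real class may manage its counter differently.

```python
class SketchBaseNode:
    """Hypothetical stand-in that numbers nodes from a class-level counter."""

    _next_id = 0

    def __init__(self, relOp):
        # Take the next free ID, then advance the shared counter.
        self.id = SketchBaseNode._next_id
        SketchBaseNode._next_id += 1
        self.relOp = relOp

    @classmethod
    def reset_id(cls, next_id=0):
        """Make the next created node receive `next_id`."""
        SketchBaseNode._next_id = next_id


# Reset before each query so that every query is numbered from zero.
SketchBaseNode.reset_id()
first_query = [SketchBaseNode("A"), SketchBaseNode("B")]

SketchBaseNode.reset_id()
second_query = [SketchBaseNode("C")]

ids = ([n.id for n in first_query], [n.id for n in second_query])
```

Without the second reset, the node in second_query would continue the numbering of the first query, which still keeps IDs unique but no longer zero-based.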

CalciteScanNode#

Public API#

class modin.experimental.core.execution.native.implementations.hdk_on_native.calcite_algebra.CalciteScanNode(modin_frame)#

A node to represent a scan operation.

A scan operation can only be applied to physical tables.

Parameters:

modin_frame (HdkOnNativeDataframe) – A frame to scan. The frame should have a materialized table in HDK.

table#

A list holding a database name and a table name.

Type:

list of str

fieldNames#

A list of columns to include in the scan.

Type:

list of str

inputs#

An empty list kept only to simplify serialization. It carries no meaning but is expected by the HDK deserializer.

Type:

list

CalciteProjectionNode#

Public API#

class modin.experimental.core.execution.native.implementations.hdk_on_native.calcite_algebra.CalciteProjectionNode(fields, exprs)#

A node to represent a projection operation.

Parameters:
  • fields (list of str) – Output column names.

  • exprs (list of BaseExpr) – Output column expressions.

fields#

A list of output columns.

Type:

list of str

exprs#

A list of expressions describing how output columns are computed. The order of expressions follows the order of fields.

Type:

list of BaseExpr
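The positional pairing of fields and exprs can be sketched as below. This is an illustration under stated assumptions: project is a hypothetical helper, rows are plain tuples, and expressions are modelled as ordinary Python callables rather than BaseExpr objects.

```python
def project(rows, fields, exprs):
    """Compute one output column per (field, expr) pair, matched by position."""
    if len(fields) != len(exprs):
        raise ValueError("fields and exprs must have the same length")
    return [
        {name: expr(row) for name, expr in zip(fields, exprs)}
        for row in rows
    ]


rows = [(1, 10), (2, 20)]
fields = ["a", "a_plus_b"]
# exprs[i] describes how fields[i] is computed from an input row.
exprs = [lambda row: row[0], lambda row: row[0] + row[1]]

projected = project(rows, fields, exprs)
```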

CalciteFilterNode#

Public API#

class modin.experimental.core.execution.native.implementations.hdk_on_native.calcite_algebra.CalciteFilterNode(condition)#

A node to represent a filter operation.

Parameters:

condition (BaseExpr) – A filtering condition.

condition#

A filter to apply.

Type:

BaseExpr

CalciteAggregateNode#

Public API#

class modin.experimental.core.execution.native.implementations.hdk_on_native.calcite_algebra.CalciteAggregateNode(fields, group, aggs)#

A node to represent an aggregate operation.

Parameters:
  • fields (list of str) – Output field names.

  • group (list of CalciteInputIdxExpr) – Group key columns.

  • aggs (list of BaseExpr) – Aggregates to compute.

fields#

Output field names.

Type:

list of str

group#

Group key columns.

Type:

list of CalciteInputIdxExpr

aggs#

Aggregates to compute.

Type:

list of BaseExpr
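A rough sketch of how the three parameters relate is shown below. The aggregate helper is hypothetical: group keys are given as plain column indexes, aggregates as Python callables over the rows of one group, and the output layout (group keys first, then aggregates) is an assumption made for the illustration.

```python
from collections import defaultdict


def aggregate(rows, fields, group, aggs):
    """Group rows by the columns at indexes `group` and apply each aggregate."""
    buckets = defaultdict(list)
    for row in rows:
        key = tuple(row[idx] for idx in group)
        buckets[key].append(row)

    result = []
    for key, members in buckets.items():
        # Assumed layout: group key values first, then one value per aggregate.
        values = list(key) + [agg(members) for agg in aggs]
        result.append(dict(zip(fields, values)))
    return result


rows = [("x", 1), ("y", 5), ("x", 3)]
aggregated = aggregate(
    rows,
    fields=["key", "total"],
    group=[0],  # group by the first input column
    aggs=[lambda members: sum(r[1] for r in members)],
)
```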

CalciteCollation#

Public API#

class modin.experimental.core.execution.native.implementations.hdk_on_native.calcite_algebra.CalciteCollation(field, dir='ASCENDING', nulls='LAST')#

A structure to describe sorting order.

Parameters:
  • field (CalciteInputIdxExpr) – A column to sort by.

  • dir ({"ASCENDING", "DESCENDING"}, default: "ASCENDING") – A sort order.

  • nulls ({"LAST", "FIRST"}, default: "LAST") – NULLs position after the sort.

field#

A column to sort by.

Type:

CalciteInputIdxExpr

dir#

A sort order.

Type:

{“ASCENDING”, “DESCENDING”}

nulls#

NULLs position after the sort.

Type:

{“LAST”, “FIRST”}

CalciteSortNode#

Public API#

class modin.experimental.core.execution.native.implementations.hdk_on_native.calcite_algebra.CalciteSortNode(collation)#

A node to represent a sort operation.

Parameters:

collation (list of CalciteCollation) – Sort keys.

collation#

Sort keys.

Type:

list of CalciteCollation
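The effect of a list of sort keys can be sketched in plain Python. SketchCollation and sort_rows are hypothetical stand-ins: the field is a plain column index, NULL is modelled as None, and earlier keys in the list are assumed to take precedence over later ones.

```python
from dataclasses import dataclass


@dataclass
class SketchCollation:
    """Hypothetical stand-in mirroring the field, dir and nulls attributes."""

    field: int  # column index to sort by
    dir: str = "ASCENDING"
    nulls: str = "LAST"


def sort_rows(rows, collation):
    """Sort tuples by several keys, honouring direction and NULL placement."""
    result = list(rows)
    # Python's sort is stable, so apply the least significant key first.
    for key in reversed(collation):
        non_null = [r for r in result if r[key.field] is not None]
        null = [r for r in result if r[key.field] is None]
        non_null.sort(
            key=lambda r: r[key.field],
            reverse=(key.dir == "DESCENDING"),
        )
        result = null + non_null if key.nulls == "FIRST" else non_null + null
    return result


rows = [(2, "b"), (None, "a"), (1, "c"), (2, "a")]
collation = [
    SketchCollation(0, dir="DESCENDING", nulls="FIRST"),
    SketchCollation(1),  # defaults: ASCENDING, NULLs LAST
]
sorted_rows = sort_rows(rows, collation)
```

Here the two rows with 2 in the first column are ordered by the second key, and the row with a NULL first column is placed at the front.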

CalciteJoinNode#

Public API#

class modin.experimental.core.execution.native.implementations.hdk_on_native.calcite_algebra.CalciteJoinNode(left_id, right_id, how, condition)#

A node to represent a join operation.

Parameters:
  • left_id (int) – ID of the left join operand.

  • right_id (int) – ID of the right join operand.

  • how (str) – Type of the join.

  • condition (BaseExpr) – Join condition.

inputs#

IDs of the left and the right operands of the join.

Type:

list of int

joinType#

Type of the join.

Type:

str

condition#

Join condition.

Type:

BaseExpr

CalciteUnionNode#

Public API#

class modin.experimental.core.execution.native.implementations.hdk_on_native.calcite_algebra.CalciteUnionNode(inputs, all)#

A node to represent a union operation.

Parameters:
  • inputs (list of int) – Input frame IDs.

  • all (bool) – True for UNION ALL operation.

inputs#

Input frame IDs.

Type:

list of int

all#

True for UNION ALL operation.

Type:

bool
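The difference the all flag makes follows standard SQL semantics and can be sketched as below. The union helper is hypothetical and takes the input frames directly as lists of tuples instead of node IDs.

```python
def union(frames, all_):
    """Concatenate frames; drop duplicate rows unless `all_` is True."""
    combined = [row for frame in frames for row in frame]
    if all_:
        return combined  # UNION ALL keeps every row

    seen = set()
    distinct = []
    for row in combined:
        if row not in seen:  # plain UNION keeps the first occurrence only
            seen.add(row)
            distinct.append(row)
    return distinct


left = [(1,), (2,)]
right = [(2,), (3,)]
union_all = union([left, right], all_=True)
union_distinct = union([left, right], all_=False)
```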

CalciteInputRefExpr#

Public API#

class modin.experimental.core.execution.native.implementations.hdk_on_native.calcite_algebra.CalciteInputRefExpr(idx)#

Calcite version of input column reference.

Calcite translation should replace every InputRefExpr with this expression.

Calcite references columns by their indexes (positions in the input table). If a Calcite node has multiple input tables, the position in the concatenated list of all their columns is used.

Parameters:

idx (int) – Input column index.

input#

Input column index.

Type:

int

copy()#

Make a shallow copy of the expression.

Return type:

CalciteInputRefExpr
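The indexing rule for nodes with several inputs can be sketched as follows. The resolve_column helper and the column names are hypothetical; the sketch only assumes that the columns of the inputs are concatenated in input order.

```python
def resolve_column(input_columns, idx):
    """Translate an index into the concatenated column list.

    Returns a pair (input position, column name).
    """
    if idx < 0:
        raise IndexError("column index must be non-negative")
    offset = 0
    for pos, columns in enumerate(input_columns):
        if idx < offset + len(columns):
            return pos, columns[idx - offset]
        offset += len(columns)
    raise IndexError("column index out of range")


left_columns = ["a", "b"]
right_columns = ["c", "d", "e"]

# Concatenated list: a=0, b=1, c=2, d=3, e=4
from_left = resolve_column([left_columns, right_columns], 1)
from_right = resolve_column([left_columns, right_columns], 3)
```

With a single input the index is simply the column position in that table.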

CalciteInputIdxExpr#

Public API#

class modin.experimental.core.execution.native.implementations.hdk_on_native.calcite_algebra.CalciteInputIdxExpr(idx)#

Essentially the same as CalciteInputRefExpr, but serialized differently.

Parameters:

idx (int) – Input column index.

input#

Input column index.

Type:

int

copy()#

Make a shallow copy of the expression.

Return type:

CalciteInputIdxExpr