TransformMapper#
Public API#
- class modin.experimental.core.execution.native.implementations.hdk_on_native.df_algebra.TransformMapper(op)#
A helper class for InputMapper.
This class is used to map column references to expressions used for their computation. This mapper is used to fold expressions from multiple TransformNode-s into a single expression.
- Parameters
op (TransformNode) – Transformation used for mapping.
- _op#
Transformation used for mapping.
- Type
FrameMapper#
Public API#
- class modin.experimental.core.execution.native.implementations.hdk_on_native.df_algebra.FrameMapper(frame)#
A helper class for InputMapper.
This class is used to map column references to another frame. This mapper is used to replace the input frame in expressions.
- Parameters
frame (HdkOnNativeDataframe) – Target frame.
- _frame#
Target frame.
- Type
InputMapper#
Public API#
- class modin.experimental.core.execution.native.implementations.hdk_on_native.df_algebra.InputMapper#
Input reference mapper.
This class is used for input translation/replacement in expressions via the BaseExpr.translate_input method.
Translation is performed using column mappers registered via the add_mapper method. Each input frame can have at most one mapper. References to frames with no registered mapper are not translated.
- _mappers#
Column mappers to use for translation.
- Type
dict
- add_mapper(frame, mapper)#
Register a mapper for a frame.
- Parameters
frame (HdkOnNativeDataframe) – A frame for which a mapper is registered.
mapper (object) – A mapper to register.
- translate(ref)#
Translate a column reference by its name.
- Parameters
ref (InputRefExpr) – A column reference to translate.
- Returns
Translated expression.
- Return type
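The registration and translation rules described for InputMapper can be illustrated with a small, self-contained sketch. The names ToyRef, ToyFrameMapper and ToyInputMapper are hypothetical stand-ins rather than Modin classes, and frames are represented by plain strings. The sketch assumes only what the text above states: each frame has at most one mapper, and references to frames without a mapper are left untranslated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyRef:
    """Hypothetical stand-in for a column reference (frame + column name)."""

    frame: str
    column: str


class ToyFrameMapper:
    """Hypothetical mapper that redirects references to another frame."""

    def __init__(self, frame):
        self._frame = frame

    def translate(self, ref):
        # Keep the column name, swap the frame.
        return ToyRef(self._frame, ref.column)


class ToyInputMapper:
    """Hypothetical mapper registry keyed by frame."""

    def __init__(self):
        self._mappers = {}

    def add_mapper(self, frame, mapper):
        # A dict key holds one value, so each frame has at most one mapper.
        self._mappers[frame] = mapper

    def translate(self, ref):
        mapper = self._mappers.get(ref.frame)
        # References to frames with no registered mapper are returned as-is.
        return ref if mapper is None else mapper.translate(ref)


mapper = ToyInputMapper()
mapper.add_mapper("old_frame", ToyFrameMapper("new_frame"))

translated = mapper.translate(ToyRef("old_frame", "a"))
untouched = mapper.translate(ToyRef("other_frame", "b"))
```

Here `translated` points at `new_frame` while `untouched` is returned unchanged, because no mapper was registered for `other_frame`.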
DFAlgNode#
Public API#
- class modin.experimental.core.execution.native.implementations.hdk_on_native.df_algebra.DFAlgNode#
A base class for a dataframe algebra tree node.
A dataframe algebra tree is used to describe how a dataframe is computed.
- collect_frames()#
Collect all frames participating in a tree.
- Returns
A list of collected frames.
- Return type
list
- collect_partitions()#
Collect all partitions participating in a tree.
- Returns
A list of collected partitions.
- Return type
list
- dump(prefix='')#
Dump the tree.
- Parameters
prefix (str, default: '') – A prefix to add to each line of the dump.
- dumps(prefix='')#
Return a string representation of the tree.
- Parameters
prefix (str, default: '') – A prefix to add to each line of the dump.
- Return type
str
- walk_dfs(cb, *args, **kwargs)#
Perform a depth-first walk over a tree.
Walk over an input in the depth-first order and call a callback function for each node.
- Parameters
cb (callable) – A callback function.
*args (list) – Arguments for the callback.
**kwargs (dict) – Keyword arguments for the callback.
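As a rough illustration of a callback-driven depth-first walk and a prefixed string dump, consider the sketch below. ToyNode is a hypothetical class, not the Modin implementation. Visiting children before their parent and indenting children by two spaces are both assumptions of this sketch.

```python
class ToyNode:
    """Hypothetical tree node; `input` holds child nodes."""

    def __init__(self, name, inputs=()):
        self.name = name
        self.input = list(inputs)

    def walk_dfs(self, cb, *args, **kwargs):
        # Visit children first, then call the callback for this node.
        # The visiting order is an assumption of this sketch.
        for child in self.input:
            child.walk_dfs(cb, *args, **kwargs)
        cb(self, *args, **kwargs)

    def dumps(self, prefix=""):
        # One line per node; children are indented under their parent.
        lines = [f"{prefix}{self.name}"]
        for child in self.input:
            lines.append(child.dumps(prefix + "  "))
        return "\n".join(lines)


tree = ToyNode("join", [ToyNode("frame_a"), ToyNode("mask", [ToyNode("frame_b")])])

# Extra positional arguments are forwarded to the callback on every call.
visited = []
tree.walk_dfs(lambda node, out: out.append(node.name), visited)

text = tree.dumps(prefix="> ")
```

The callback receives the node followed by the extra arguments, so shared state such as the `visited` list can be passed in without using globals.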
FrameNode#
Public API#
- class modin.experimental.core.execution.native.implementations.hdk_on_native.df_algebra.FrameNode(modin_frame)#
A node to reference a materialized frame.
- Parameters
modin_frame (HdkOnNativeDataframe) – Referenced frame.
- modin_frame#
Referenced frame.
- Type
MaskNode#
Public API#
- class modin.experimental.core.execution.native.implementations.hdk_on_native.df_algebra.MaskNode(base, row_labels=None, row_positions=None)#
A filtering node which filters rows by index values or row id.
- Parameters
base (DFAlgNode) – A filtered frame.
row_labels (list, optional) – List of row labels to select.
row_positions (list of int, optional) – List of row ids to select.
- row_labels#
List of row labels to select.
- Type
list or None
- row_positions#
List of row ids to select.
- Type
list of int or None
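The difference between selecting by row labels and by row positions can be sketched in plain Python. The helper `mask_rows` and the list-of-pairs frame representation are hypothetical. Giving labels precedence when both arguments are supplied is an assumption of this sketch, not documented behaviour.

```python
def mask_rows(rows, row_labels=None, row_positions=None):
    """Hypothetical helper: select rows by label or by position.

    `rows` is a list of (label, value) pairs standing in for a frame.
    """
    if row_labels is not None:
        # Assumption: labels take precedence if both arguments are given.
        wanted = set(row_labels)
        return [row for row in rows if row[0] in wanted]
    if row_positions is not None:
        # Positions are zero-based offsets into the frame.
        return [rows[pos] for pos in row_positions]
    # With neither argument set, nothing is filtered out.
    return list(rows)


rows = [("x", 10), ("y", 20), ("z", 30)]

by_label = mask_rows(rows, row_labels=["x", "z"])
by_position = mask_rows(rows, row_positions=[1, 2])
```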
GroupbyAggNode#
Public API#
- class modin.experimental.core.execution.native.implementations.hdk_on_native.df_algebra.GroupbyAggNode(base, by, agg_exprs, groupby_opts)#
A node to represent a groupby aggregation operation.
- Parameters
base (DFAlgNode) – An aggregated frame.
by (list of str) – A list of columns used for grouping.
agg_exprs (dict) – Aggregates to compute.
groupby_opts (dict) – Additional groupby parameters.
- by#
A list of columns used for grouping.
- Type
list of str
- agg_exprs#
Aggregates to compute.
- Type
dict
- groupby_opts#
Additional groupby parameters.
- Type
dict
- copy()#
Make a shallow copy of the node.
- Return type
TransformNode#
Public API#
- class modin.experimental.core.execution.native.implementations.hdk_on_native.df_algebra.TransformNode(base, exprs, fold=True)#
A node to represent a projection of a single frame.
Provides expressions to compute each column of the projection.
- Parameters
base (DFAlgNode) – A transformed frame.
exprs (dict) – Expressions used to compute the frame’s columns.
fold (bool, default: True) – If True and base is another TransformNode, then translate all expressions in exprs to its base.
- exprs#
Expressions used to compute frame’s columns.
- Type
dict
- _original_refs#
Set of columns expressed with InputRefExpr prior to folding.
- Type
set
- copy()#
Make a shallow copy of the node.
- Return type
- fold()#
Fold two TransformNode-s.
If base of this node is another TransformNode, then translate all expressions in exprs to its base.
- is_original_ref(col)#
Check original column expression type.
Return True if col is an InputRefExpr expression or originally was an InputRefExpr expression before folding.
- Parameters
col (str) – Column name.
- Return type
bool
- is_simple_select()#
Check if transform node is a simple selection.
Simple selection can only use InputRefExpr expressions.
- Returns
True for simple select and False otherwise.
- Return type
bool
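Folding two stacked projections into one can be sketched with a toy expression encoding. Everything here is hypothetical: a plain string stands for a column reference, a tuple such as ("add", "a", "b") for a computed expression, and any other value for a literal. The functions below are not the Modin implementation. They only mirror the ideas of folding, tracking original references, and detecting a simple selection.

```python
def is_ref(expr):
    # In this sketch a plain string is a column reference.
    return isinstance(expr, str)


def substitute(expr, inner_exprs):
    """Replace every column reference in `expr` with its inner expression."""
    if is_ref(expr):
        return inner_exprs[expr]
    if not isinstance(expr, tuple):
        return expr  # literal value, nothing to translate
    op, *operands = expr
    return (op, *(substitute(o, inner_exprs) for o in operands))


def fold(outer_exprs, inner_exprs):
    """Fold two projections into one and remember original references."""
    # Columns that were plain references before folding.
    original_refs = {col for col, expr in outer_exprs.items() if is_ref(expr)}
    folded = {col: substitute(expr, inner_exprs) for col, expr in outer_exprs.items()}
    return folded, original_refs


def is_simple_select(exprs):
    # A simple selection uses only column references.
    return all(is_ref(expr) for expr in exprs.values())


inner = {"s": ("add", "a", "b"), "a": "a"}  # computed over the base frame
outer = {"s": "s", "d": ("mul", "s", 2)}    # computed over `inner`

folded, original_refs = fold(outer, inner)
```

After folding, every expression refers to the base frame directly, and `original_refs` still records that column `s` was a plain reference in the outer projection even though its folded expression is now a computation.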
JoinNode#
Public API#
- class modin.experimental.core.execution.native.implementations.hdk_on_native.df_algebra.JoinNode(left, right, how='inner', exprs=None, condition=None)#
A node to represent a join of two frames.
- Parameters
- input#
Holds joined frames. The first frame in the list is considered the left join operand.
- Type
list of DFAlgNode
- how#
A type of join.
- Type
str
- exprs#
Expressions for the resulting frame’s columns.
- Type
dict
UnionNode#
Public API#
- class modin.experimental.core.execution.native.implementations.hdk_on_native.df_algebra.UnionNode(frames, join, sort, ignore_index)#
A node to represent a row-wise union of input frames.
- Parameters
frames (list of DFAlgNode) – Input frames.
join (str) – Either outer or inner.
sort (bool) – Sort columns.
ignore_index (bool) – Ignore index columns.
SortNode#
Public API#
- class modin.experimental.core.execution.native.implementations.hdk_on_native.df_algebra.SortNode(frame, columns, ascending, na_position)#
A sort node to order frame’s rows in a specified order.
- Parameters
frame (DFAlgNode) – Sorted frame.
columns (list of str) – A list of key columns for a sort.
ascending (bool) – Ascending or descending sort.
na_position ({"first", "last"}) – “first” to put NULLs at the start of the result, “last” to put NULLs at the end of the result.
- columns#
A list of key columns for a sort.
- Type
list of str
- ascending#
Ascending or descending sort.
- Type
bool
- na_position#
“first” to put NULLs at the start of the result, “last” to put NULLs at the end of the result.
- Type
{“first”, “last”}
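The effect of `ascending` and `na_position` can be sketched with a small pure-Python helper. The function `sort_rows` and the list-of-dicts frame representation are hypothetical, and None stands in for NULL. As a simplification, any row with a NULL in one of the key columns is moved to the NULL group as a whole.

```python
def sort_rows(rows, columns, ascending=True, na_position="last"):
    """Hypothetical helper: sort dict rows by key columns, placing None values."""
    if na_position not in ("first", "last"):
        raise ValueError("na_position must be 'first' or 'last'")

    def has_null(row):
        return any(row[c] is None for c in columns)

    # Split rows so that None never takes part in comparisons.
    nulls = [r for r in rows if has_null(r)]
    values = [r for r in rows if not has_null(r)]
    values.sort(key=lambda r: tuple(r[c] for c in columns), reverse=not ascending)
    return nulls + values if na_position == "first" else values + nulls


rows = [{"k": 2}, {"k": None}, {"k": 1}]

nulls_first = sort_rows(rows, ["k"], ascending=True, na_position="first")
nulls_last_desc = sort_rows(rows, ["k"], ascending=False, na_position="last")
```

Note that the NULL placement is independent of the sort direction in this sketch: a descending sort with "last" still puts NULLs at the end.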
FilterNode#
Public API#
Utilities#
Public API#
- modin.experimental.core.execution.native.implementations.hdk_on_native.df_algebra.translate_exprs_to_base(exprs, base)#
Fold expressions.
Fold expressions with their input nodes until base frame is the only input frame.
- Parameters
exprs (dict) – Expressions to translate.
base (HdkOnNativeDataframe) – Required input frame for translated expressions.
- Returns
Translated expressions.
- Return type
dict
- modin.experimental.core.execution.native.implementations.hdk_on_native.df_algebra.replace_frame_in_exprs(exprs, old_frame, new_frame)#
Translate input expressions, replacing an input frame in them.
- Parameters
exprs (dict) – Expressions to translate.
old_frame (HdkOnNativeDataframe) – An input frame to replace.
new_frame (HdkOnNativeDataframe) – A new input frame to use.
- Returns
Translated expressions.
- Return type
dict
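Replacing an input frame across a dictionary of expressions can be sketched as a recursive rewrite. The encoding below is hypothetical and unrelated to Modin's expression classes: a reference is a ("ref", frame, column) tuple, any other tuple is (operation, operand, ...), and anything else is a literal. Frames are plain strings compared by equality, which is an assumption of this sketch.

```python
def replace_frame(expr, old_frame, new_frame):
    """Recursively swap `old_frame` for `new_frame` in a toy expression."""
    if not isinstance(expr, tuple):
        return expr  # literal value
    if expr[0] == "ref":
        _, frame, column = expr
        # Only references to `old_frame` are redirected.
        return ("ref", new_frame if frame == old_frame else frame, column)
    op, *operands = expr
    return (op, *(replace_frame(o, old_frame, new_frame) for o in operands))


def replace_frame_in_toy_exprs(exprs, old_frame, new_frame):
    """Apply the replacement to every expression, keeping column names."""
    return {col: replace_frame(e, old_frame, new_frame) for col, e in exprs.items()}


exprs = {
    "a": ("ref", "f1", "a"),
    "total": ("add", ("ref", "f1", "a"), ("ref", "f2", "b")),
    "one": 1,
}
result = replace_frame_in_toy_exprs(exprs, "f1", "f3")
```

References to `f2` and the literal are untouched, and the input dictionary is not modified because a new dictionary is built.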