TransformMapper

Public API

class modin.experimental.engines.omnisci_on_native.frame.df_algebra.TransformMapper(op)

A helper class for InputMapper.

Maps column references to the expressions used to compute them. The mapper is used to fold expressions from multiple TransformNode-s into a single expression.

Parameters

op (TransformNode) – Transformation used for mapping.

_op

Transformation used for mapping.

Type

TransformNode

translate(col)

Translate column reference by its name.

Parameters

col (str) – A name of the column to translate.

Returns

Translated expression.

Return type

BaseExpr
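The lookup performed by translate can be illustrated with a minimal, self-contained sketch. It does not import Modin: `ToyTransform` and `ToyTransformMapper` are hypothetical stand-ins, and expressions are modeled as plain strings rather than BaseExpr objects.

```python
class ToyTransform:
    """Hypothetical stand-in for a TransformNode: column name -> expression."""

    def __init__(self, exprs):
        self.exprs = dict(exprs)


class ToyTransformMapper:
    """Hypothetical stand-in for TransformMapper."""

    def __init__(self, op):
        self._op = op

    def translate(self, col):
        # Return the expression that computes column `col` in the transform.
        return self._op.exprs[col]


inner = ToyTransform({"c": "(a + b)"})
mapper = ToyTransformMapper(inner)
translated = mapper.translate("c")
```

In this sketch a reference to column `c` is replaced by the expression that produced it, which is the step that lets expressions from stacked transformations be merged.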

FrameMapper

Public API

class modin.experimental.engines.omnisci_on_native.frame.df_algebra.FrameMapper(frame)

A helper class for InputMapper.

Maps column references to another frame. The mapper is used to replace the input frame in expressions.

Parameters

frame (OmnisciOnNativeFrame) – Target frame.

_frame

Target frame.

Type

OmnisciOnNativeFrame

translate(col)

Translate column reference by its name.

Parameters

col (str) – A name of the column to translate.

Returns

Translated expression.

Return type

BaseExpr

InputMapper

Public API

class modin.experimental.engines.omnisci_on_native.frame.df_algebra.InputMapper

Input reference mapper.

This class is used for input translation/replacement in expressions via the BaseExpr.translate_input method.

Translation is performed using column mappers registered via the add_mapper method. Each input frame can have at most one mapper. References to frames with no registered mapper are not translated.

_mappers

Column mappers to use for translation.

Type

dict

add_mapper(frame, mapper)

Register a mapper for a frame.

Parameters
  • frame (OmnisciOnNativeFrame) – A frame for which a mapper is registered.

  • mapper (object) – A mapper to register.

translate(ref)

Translate a column reference using the mapper registered for its frame.

Parameters

ref (InputRefExpr) – A column reference to translate.

Returns

Translated expression.

Return type

BaseExpr
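The registration and fallback rules described for InputMapper can be sketched as follows. This is an illustration only, not the Modin implementation: `ToyRef`, `ToyFrameMapper` and `ToyInputMapper` are hypothetical stand-ins, and frames are represented by plain strings.

```python
class ToyRef:
    """Hypothetical stand-in for InputRefExpr: a (frame, column) reference."""

    def __init__(self, frame, column):
        self.frame = frame
        self.column = column


class ToyFrameMapper:
    """Hypothetical stand-in for FrameMapper: re-points a column at another frame."""

    def __init__(self, frame):
        self._frame = frame

    def translate(self, col):
        return ToyRef(self._frame, col)


class ToyInputMapper:
    """Hypothetical stand-in for InputMapper."""

    def __init__(self):
        # A dict keyed by frame gives at most one mapper per frame.
        # Assumption of this sketch: a second registration overwrites the first.
        self._mappers = {}

    def add_mapper(self, frame, mapper):
        self._mappers[frame] = mapper

    def translate(self, ref):
        if ref.frame in self._mappers:
            return self._mappers[ref.frame].translate(ref.column)
        # No mapper registered for this frame: leave the reference untouched.
        return ref


mapper = ToyInputMapper()
mapper.add_mapper("old", ToyFrameMapper("new"))
moved = mapper.translate(ToyRef("old", "a"))
kept = mapper.translate(ToyRef("other", "b"))
```

Here `moved` points at the frame `"new"`, while `kept` is returned unchanged because no mapper was registered for `"other"`.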

DFAlgNode

Public API

class modin.experimental.engines.omnisci_on_native.frame.df_algebra.DFAlgNode

A base class for dataframe algebra tree nodes.

A dataframe algebra tree is used to describe how a dataframe is computed.

input

Holds child nodes.

Type

list of DFAlgNode, optional

collect_frames()

Collect all frames participating in a tree.

Returns

A list of collected frames.

Return type

list

collect_partitions()

Collect all partitions participating in a tree.

Returns

A list of collected partitions.

Return type

list

abstract copy()

Make a shallow copy of the node.

Returns

A shallow copy of the node.

Return type

DFAlgNode

dump(prefix='')

Dump the tree.

Parameters

prefix (str, default: '') – A prefix to add to each line of the dump.

dumps(prefix='')

Return a string representation of the tree.

Parameters

prefix (str, default: '') – A prefix to add to each line of the dump.

Returns

A string representation of the tree.

Return type

str

walk_dfs(cb, *args, **kwargs)

Perform a depth-first walk over a tree.

Walk over the inputs in depth-first order and call the callback function for each node.

Parameters
  • cb (callable) – A callback function.

  • *args (list) – Arguments for the callback.

  • **kwargs (dict) – Keyword arguments for the callback.
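A depth-first walk with a callback can be sketched with a small stand-in tree. `ToyNode` is hypothetical and not a Modin class. The sketch assumes children are visited before their parent (post-order); the text above does not state the visiting order.

```python
class ToyNode:
    """Hypothetical stand-in for a DFAlgNode with an optional list of children."""

    def __init__(self, name, inputs=None):
        self.name = name
        self.input = inputs  # list of child nodes, or None for a leaf

    def walk_dfs(self, cb, *args, **kwargs):
        # Assumption: descend into the children first, then call back on self.
        if self.input:
            for child in self.input:
                child.walk_dfs(cb, *args, **kwargs)
        cb(self, *args, **kwargs)


def record(node, out):
    """Callback that appends the visited node's name to `out`."""
    out.append(node.name)


root = ToyNode("join", [ToyNode("left"), ToyNode("right")])
visited = []
root.walk_dfs(record, visited)
```

The extra positional argument `visited` is forwarded to the callback unchanged, which is how `*args` and `**kwargs` are meant to be used.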

FrameNode

Public API

class modin.experimental.engines.omnisci_on_native.frame.df_algebra.FrameNode(modin_frame)

A node to reference a materialized frame.

Parameters

modin_frame (OmnisciOnNativeFrame) – Referenced frame.

modin_frame

Referenced frame.

Type

OmnisciOnNativeFrame

copy()

Make a shallow copy of the node.

Returns

A shallow copy of the node.

Return type

FrameNode

MaskNode

Public API

class modin.experimental.engines.omnisci_on_native.frame.df_algebra.MaskNode(base, row_indices=None, row_numeric_idx=None)

A filtering node that selects rows by index value or by row id.

Parameters
  • base (DFAlgNode) – A filtered frame.

  • row_indices (list, optional) – List of index values to select.

  • row_numeric_idx (list of int, optional) – List of row ids to select.

input

Holds a single filtered frame.

Type

list of DFAlgNode

row_indices

List of index values to select.

Type

list or None

row_numeric_idx

List of row ids to select.

Type

list of int or None

copy()

Make a shallow copy of the node.

Returns

A shallow copy of the node.

Return type

MaskNode

GroupbyAggNode

Public API

class modin.experimental.engines.omnisci_on_native.frame.df_algebra.GroupbyAggNode(base, by, agg_exprs, groupby_opts)

A node to represent a groupby aggregation operation.

Parameters
  • base (DFAlgNode) – An aggregated frame.

  • by (list of str) – A list of columns used for grouping.

  • agg_exprs (dict) – Aggregates to compute.

  • groupby_opts (dict) – Additional groupby parameters.

input

Holds a single aggregated frame.

Type

list of DFAlgNode

by

A list of columns used for grouping.

Type

list of str

agg_exprs

Aggregates to compute.

Type

dict

groupby_opts

Additional groupby parameters.

Type

dict

copy()

Make a shallow copy of the node.

Returns

A shallow copy of the node.

Return type

GroupbyAggNode

TransformNode

Public API

class modin.experimental.engines.omnisci_on_native.frame.df_algebra.TransformNode(base, exprs, fold=True)

A node to represent a projection of a single frame.

Provides expressions to compute each column of the projection.

Parameters
  • base (DFAlgNode) – A transformed frame.

  • exprs (dict) – Expressions used to compute the frame’s columns.

  • fold (bool, default: True) – If True and base is another TransformNode, then translate all expressions in exprs to its base.

input

Holds a single projected frame.

Type

list of DFAlgNode

exprs

Expressions used to compute frame’s columns.

Type

dict

_original_refs

Set of columns expressed with InputRefExpr prior to folding.

Type

set

copy()

Make a shallow copy of the node.

Returns

A shallow copy of the node.

Return type

TransformNode

fold()

Fold two TransformNode-s.

If the base of this node is another TransformNode, then translate all expressions in exprs to its base.

is_original_ref(col)

Check original column expression type.

Return True if col is an InputRefExpr expression or originally was an InputRefExpr expression before folding.

Parameters

col (str) – Column name.

Returns

True if col is an InputRefExpr expression or was one before folding, False otherwise.

Return type

bool

is_simple_select()

Check if transform node is a simple selection.

Simple selection can only use InputRefExpr expressions.

Returns

True for simple select and False otherwise.

Return type

bool
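Folding two stacked projections, together with the checks made by is_original_ref and is_simple_select, can be sketched without Modin. In this hypothetical model an expression is a nested tuple: `("ref", column)` for a column reference, `("lit", value)` for a literal, and `(operator, operand, ...)` otherwise. None of these names come from the library.

```python
def substitute(expr, base_exprs):
    """Replace every column reference in `expr` by its definition in `base_exprs`."""
    kind = expr[0]
    if kind == "ref":
        return base_exprs[expr[1]]
    if kind == "lit":
        return expr
    # Operator node: rebuild it with substituted operands.
    return (kind,) + tuple(substitute(arg, base_exprs) for arg in expr[1:])


def fold(outer_exprs, inner_exprs):
    """Express the outer projection directly in terms of the inner one's input."""
    return {col: substitute(e, inner_exprs) for col, e in outer_exprs.items()}


def is_simple_select(exprs):
    """True when every column is a plain column reference."""
    return all(e[0] == "ref" for e in exprs.values())


inner = {"x": ("ref", "a"), "y": ("add", ("ref", "a"), ("ref", "b"))}
outer = {"x": ("ref", "x"), "z": ("mul", ("ref", "y"), ("lit", 2))}

# Remember which outer columns were plain references before folding.
original_refs = {col for col, e in outer.items() if e[0] == "ref"}
folded = fold(outer, inner)
```

After folding, `folded["z"]` no longer refers to the intermediate column `y`, and `original_refs` still records that `x` started out as a plain reference.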

JoinNode

Public API

class modin.experimental.engines.omnisci_on_native.frame.df_algebra.JoinNode(left, right, how='inner', exprs=None, condition=None)

A node to represent a join of two frames.

Parameters
  • left (DFAlgNode) – A left frame to join.

  • right (DFAlgNode) – A right frame to join.

  • how (str, default: "inner") – A type of join.

  • exprs (dict, default: None) – Expressions for the resulting frame’s columns.

  • condition (BaseExpr, default: None) – Join condition.

input

Holds joined frames. The first frame in the list is treated as the left join operand.

Type

list of DFAlgNode

how

A type of join.

Type

str

exprs

Expressions for the resulting frame’s columns.

Type

dict

condition

Join condition.

Type

BaseExpr

copy()

Make a shallow copy of the node.

Returns

A shallow copy of the node.

Return type

JoinNode

UnionNode

Public API

class modin.experimental.engines.omnisci_on_native.frame.df_algebra.UnionNode(frames)

A node to represent a row-wise union of input frames.

Parameters

frames (list of DFAlgNode) – Input frames.

input

Input frames.

Type

list of DFAlgNode

copy()

Make a shallow copy of the node.

Returns

A shallow copy of the node.

Return type

UnionNode

SortNode

Public API

class modin.experimental.engines.omnisci_on_native.frame.df_algebra.SortNode(frame, columns, ascending, na_position)

A sort node that orders a frame’s rows as specified.

Parameters
  • frame (DFAlgNode) – Sorted frame.

  • columns (list of str) – A list of key columns for a sort.

  • ascending (bool) – Ascending or descending sort.

  • na_position ({"first", "last"}) – “first” to put NULLs at the start of the result, “last” to put NULLs at the end of the result.

input

Holds a single sorted frame.

Type

list of DFAlgNode

columns

A list of key columns for a sort.

Type

list of str

ascending

Ascending or descending sort.

Type

bool

na_position

“first” to put NULLs at the start of the result, “last” to put NULLs at the end of the result.

Type

{“first”, “last”}

copy()

Make a shallow copy of the node.

Returns

A shallow copy of the node.

Return type

SortNode
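The effect of ascending and na_position on a single key column can be sketched in plain Python. `toy_sort` is a hypothetical helper, not part of Modin, and `None` stands in for NULL.

```python
def toy_sort(values, ascending=True, na_position="last"):
    """Sort `values`, placing None entries first or last."""
    if na_position not in ("first", "last"):
        raise ValueError("na_position must be 'first' or 'last'")
    nulls = [v for v in values if v is None]
    present = sorted((v for v in values if v is not None), reverse=not ascending)
    return nulls + present if na_position == "first" else present + nulls


data = [3, None, 1, 2]
nulls_last = toy_sort(data, ascending=True, na_position="last")
nulls_first_desc = toy_sort(data, ascending=False, na_position="first")
```

In this sketch the NULL placement is independent of the sort direction: `na_position` alone decides whether the `None` entries lead or trail the result.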

FilterNode

Public API

class modin.experimental.engines.omnisci_on_native.frame.df_algebra.FilterNode(frame, condition)

A node for generic row filtering.

To filter rows by row id, prefer MaskNode.

Parameters
  • frame (DFAlgNode) – A filtered frame.

  • condition (BaseExpr) – Filter condition.

input

Holds a single filtered frame.

Type

list of DFAlgNode

condition

Filter condition.

Type

BaseExpr

copy()

Make a shallow copy of the node.

Returns

A shallow copy of the node.

Return type

FilterNode

Utilities

Public API

modin.experimental.engines.omnisci_on_native.frame.df_algebra.translate_exprs_to_base(exprs, base)

Fold expressions.

Fold expressions with their input nodes until the base frame is the only input frame.

Parameters
  • exprs (dict) – Expressions to translate.

  • base (OmnisciOnNativeFrame) – Required input frame for translated expressions.

Returns

Translated expressions.

Return type

dict
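Folding through several levels until only the base frame is referenced can be sketched as below. The model is hypothetical and independent of Modin: frames are named by strings, a reference is `("ref", frame, column)`, any other tuple is `(operator, operand, ...)`, and non-tuple operands are literals. The `transforms` dict maps each derived frame to the expressions that compute its columns.

```python
def translate_to_base(expr, transforms, base):
    """Rewrite `expr` so that every reference points at `base`."""
    if not isinstance(expr, tuple):
        return expr  # literal operand
    if expr[0] == "ref":
        _, frame, col = expr
        if frame == base:
            return expr
        # Replace the reference by its defining expression and keep folding.
        return translate_to_base(transforms[frame][col], transforms, base)
    return (expr[0],) + tuple(
        translate_to_base(arg, transforms, base) for arg in expr[1:]
    )


def translate_exprs(exprs, transforms, base):
    return {col: translate_to_base(e, transforms, base) for col, e in exprs.items()}


# Frame "t1" is computed from "f0", and "t2" is computed from "t1".
transforms = {
    "t1": {"y": ("add", ("ref", "f0", "a"), ("ref", "f0", "b"))},
    "t2": {"z": ("mul", ("ref", "t1", "y"), 2)},
}
exprs = {"out": ("neg", ("ref", "t2", "z"))}
result = translate_exprs(exprs, transforms, "f0")
```

The resulting expression for `out` refers only to columns of `f0`; the intermediate frames `t1` and `t2` have been folded away.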

modin.experimental.engines.omnisci_on_native.frame.df_algebra.replace_frame_in_exprs(exprs, old_frame, new_frame)

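Translate the input expressions, replacing an input frame in them.

Parameters
  • exprs (dict) – Expressions to translate.

  • old_frame (OmnisciOnNativeFrame) – An input frame to replace.

  • new_frame (OmnisciOnNativeFrame) – A new input frame to use.

Returns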

Translated expressions.

Return type

dict