HdkOnNativeDataframePartitionManager

Public API

class modin.experimental.core.execution.native.implementations.hdk_on_native.partitioning.partition_manager.HdkOnNativeDataframePartitionManager

Frame manager for HdkOnNativeDataframe.

This class accounts for several specifics of HdkOnNativeDataframe:
  • the frame always has a single partition

  • the frame cannot process some data types

  • the frame has to mangle index labels

  • the frame uses the HDK storage format for execution
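
The first two points can be pictured with a toy model in plain Python. This is only a sketch, not Modin's implementation: `ToyPartition`, `find_unsupported_cols` and `toy_from_table` are hypothetical names, and the rule "a column mixing several Python types is unsupported" is an assumption made for illustration. The sketch returns a 1x1 grid plus a list of unsupported column names, mirroring the shape of the tuple that `from_arrow` and `from_pandas` are documented to return by default.

```python
from typing import Any, Dict, List, Tuple


class ToyPartition:
    """Hypothetical stand-in for a partition: it wraps the whole table."""

    def __init__(self, table: Dict[str, List[Any]]) -> None:
        self.table = table


def find_unsupported_cols(table: Dict[str, List[Any]]) -> List[str]:
    """Return names of columns whose non-null values mix several Python types.

    Assumption for illustration only: the real check is backend specific.
    """
    unsupported = []
    for name, values in table.items():
        kinds = {type(v) for v in values if v is not None}
        if len(kinds) > 1:
            unsupported.append(name)
    return unsupported


def toy_from_table(
    table: Dict[str, List[Any]],
) -> Tuple[List[List[ToyPartition]], List[str]]:
    """Build a 1x1 partition grid and report the unsupported columns."""
    grid = [[ToyPartition(table)]]  # always exactly one partition
    return grid, find_unsupported_cols(table)


table = {"a": [1, 2, 3], "b": ["x", 2.5, None], "c": ["u", "v", "w"]}
parts, unsupported_cols = toy_from_table(table)
print(len(parts), len(parts[0]))  # grid shape: one row, one column
print(unsupported_cols)  # only "b" mixes str and float
```

Keeping the whole table in one partition means the grid never needs to be split or reassembled; the list of unsupported columns lets a caller decide whether to fall back to another execution path.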

classmethod from_arrow(at, return_dims=False, unsupported_cols=None, encode_col_names=True)

Build partitions from a pyarrow.Table.

Parameters:
  • at (pyarrow.Table) – Input table.

  • return_dims (bool, default: False) – If True, include the partitions’ dimensions in the returned tuple.

  • unsupported_cols (list of str, optional) – List of columns holding unsupported data. If None, all columns are checked to compute the list.

  • encode_col_names (bool, default: True) – Encode column names.

Returns:

Tuple holding array of partitions, list of columns with unsupported data and optionally partitions’ dimensions.

Return type:

tuple

classmethod from_pandas(df, return_dims=False, encode_col_names=True)

Build partitions from a pandas.DataFrame.

Parameters:
  • df (pandas.DataFrame) – Source frame.

  • return_dims (bool, default: False) – If True, include the partitions’ dimensions in the returned tuple.

  • encode_col_names (bool, default: True) – Encode column names.

Returns:

Tuple holding array of partitions, list of columns with unsupported data and optionally partitions’ dimensions.

Return type:

tuple
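
A minimal usage sketch for `from_pandas` follows. It assumes a Modin build that still ships the experimental HDK backend together with its `pyhdk` dependency; the helper name `build_hdk_partitions` is hypothetical. The import is guarded so the helper returns `None` when the backend is unavailable, and the returned tuple is passed through untouched because its exact layout is only described, not spelled out, above.

```python
def build_hdk_partitions(df):
    """Return the tuple produced by from_pandas, or None without HDK support.

    Hypothetical helper; `df` is expected to be a pandas.DataFrame.
    """
    try:
        # Import path taken from the fully qualified class name documented above.
        from modin.experimental.core.execution.native.implementations.hdk_on_native.partitioning.partition_manager import (
            HdkOnNativeDataframePartitionManager,
        )
    except ImportError:
        # Modin, the HDK backend, or pyhdk is not installed.
        return None
    # Default arguments: return_dims=False, encode_col_names=True.
    return HdkOnNativeDataframePartitionManager.from_pandas(df)
```

When the backend is present, the first element of the result is documented as the array of partitions, and the list of columns with unsupported data is part of the same tuple.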

classmethod import_table(frame, worker=<modin.experimental.core.execution.native.implementations.hdk_on_native.hdk_worker.HdkWorker object>) → DbTable

Import the frame’s partition data, if required.

Parameters:
  • frame (HdkOnNativeDataframe)

  • worker (HdkWorker, optional)

Return type:

DbTable

classmethod run_exec_plan(plan)

Run an execution plan in the HDK storage format to materialize the frame.

Parameters:

plan (DFAlgNode) – A root of an execution plan tree.

Returns:

Created frame’s partitions.

Return type:

np.array