cuDFOnRayDataframePartitionManager

This class is the cuDF-on-Ray implementation of GenericRayDataframePartitionManager. It serves as an intermediate layer between the cuDFOnRayDataframe and cuDFOnRayDataframePartition classes, and it is responsible for manipulating partitions and applying functions to block, row, and column partitions.

Public API

class modin.core.execution.ray.implementations.cudf_on_ray.partitioning.cuDFOnRayDataframePartitionManager

The class implements the interface in GenericRayDataframePartitionManager using cuDF on Ray.

classmethod from_pandas(df, return_dims=False)

Create partitions from pandas.DataFrame/pandas.Series.

Parameters:
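  • df (pandas.DataFrame/pandas.Series) – The pandas.DataFrame or pandas.Series to split into partitions.

  • return_dims (boolean, default: False) – Whether to return the partition dimensions (row lengths and column widths) along with the partitions.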

Returns:

A list of partitions if return_dims is False; otherwise a tuple (partitions, row lengths, column widths).

Return type:

list or tuple
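
The two return shapes can be illustrated with a minimal CPU-only sketch. It is not Modin's implementation: split_into_partitions, num_row_parts, and num_col_parts are hypothetical names, the blocks are plain pandas objects rather than cuDF partitions held by Ray actors, and the even split via np.array_split is an assumption made for illustration.

```python
import numpy as np
import pandas as pd


def split_into_partitions(df, num_row_parts=2, num_col_parts=2, return_dims=False):
    """Hypothetical stand-in for from_pandas: split df into a 2D grid of blocks."""
    if isinstance(df, pd.Series):
        # Treat a Series as a single-column frame.
        df = df.to_frame()
    # Positional row and column indices for each block of the grid.
    row_chunks = np.array_split(np.arange(len(df)), num_row_parts)
    col_chunks = np.array_split(np.arange(len(df.columns)), num_col_parts)
    parts = [
        [df.iloc[rows, cols] for cols in col_chunks]
        for rows in row_chunks
    ]
    if not return_dims:
        return parts
    # Dimensions are reported per row block and per column block.
    row_lengths = [len(rows) for rows in row_chunks]
    col_widths = [len(cols) for cols in col_chunks]
    return parts, row_lengths, col_widths


df = pd.DataFrame(np.arange(20).reshape(5, 4), columns=list("abcd"))

# return_dims=False: only the grid of partitions comes back.
only_parts = split_into_partitions(df)

# return_dims=True: a (partitions, row lengths, column widths) tuple comes back.
parts, row_lengths, col_widths = split_into_partitions(df, return_dims=True)
```

In this sketch the row lengths sum to the number of rows in df and the column widths sum to its number of columns, so a caller can reconstruct the overall shape without materializing any block.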

classmethod lazy_map_partitions(partitions, map_func)

Apply map_func to every partition lazily.

Compared to Modin-CPU, the lazy version in Modin-GPU represents:

  1. A scheduled function in the Ray task graph.

  2. A non-materialized key.

Parameters:
  • partitions (np.ndarray) – NumPy array with partitions.

  • map_func (callable) – The function to apply.

Returns:

A NumPy array of cuDFOnRayDataframePartition objects.

Return type:

np.ndarray
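
The "handle now, value later" behaviour can be sketched without Ray or a GPU. This is only an illustration under stated assumptions: LazyPartition, defer, and lazy_map are hypothetical names, and a plain Python closure stands in for the scheduled Ray task and its non-materialized key. How and when the real partitions execute is not modelled here.

```python
import numpy as np


class LazyPartition:
    """Hypothetical placeholder: holds a deferred computation, not a result."""

    def __init__(self, thunk):
        self._thunk = thunk
        self._value = None
        self._done = False

    def defer(self, func):
        # Chain func onto the pending computation without running anything.
        return LazyPartition(lambda: func(self.get()))

    def get(self):
        # Materialize on first access and cache the result.
        if not self._done:
            self._value = self._thunk()
            self._done = True
        return self._value


def lazy_map(partitions, map_func):
    """Return a same-shaped object array of new lazy partitions."""
    return np.array(
        [[part.defer(map_func) for part in row] for row in partitions],
        dtype=object,
    )


calls = []


def double(x):
    calls.append(x)  # record that the function actually ran
    return x * 2


# A 2x2 grid of lazy partitions wrapping the values 1..4.
grid = np.array(
    [[LazyPartition(lambda v=v: v) for v in row] for row in [[1, 2], [3, 4]]],
    dtype=object,
)

mapped = lazy_map(grid, double)
calls_before = len(calls)  # no call has happened yet

# Values are computed only when each partition is materialized.
results = [[part.get() for part in row] for row in mapped]
```

The mapped array has the same shape as the input, and double runs only once per partition, when get() is first called on it.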