cuDFOnRayFramePartition

This class is the cuDF-on-Ray implementation of PandasFramePartition. It provides the API for performing operations on a block partition, namely a cudf.DataFrame, using Ray as the execution engine.

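An operation on a block partition can be performed asynchronously in two ways:

  • apply() returns a ray.ObjectRef to the integer key of the result, which is stored in the GPUManager.

  • add_to_apply_calls() returns a new cuDFOnRayFramePartition based on the result.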

Public API

class modin.engines.ray.cudf_on_ray.frame.partition.cuDFOnRayFramePartition(gpu_manager, key, length=None, width=None)

Implements the PandasFramePartition interface using cuDF on Ray.

Parameters
  • gpu_manager (modin.engines.ray.cudf_on_ray.frame.GPUManager) – A GPU manager that stores cuDF dataframes.

  • key (ray.ObjectRef or int) – An integer key (or a reference to the key) associated with the cudf.DataFrame stored in gpu_manager.

  • length (ray.ObjectRef or int, optional) – Length of the wrapped pandas.DataFrame, or a reference to it.

  • width (ray.ObjectRef or int, optional) – Width of the wrapped pandas.DataFrame, or a reference to it.
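
The relationship between gpu_manager and key can be pictured with a minimal pure-Python sketch. ToyGPUManager and ToyPartition are hypothetical stand-ins, not Modin classes: plain lists of rows take the place of cudf.DataFrame, and no Ray actor or GPU is involved. The sketch only illustrates the idea that a partition holds an integer key into the manager's dict-storage rather than the data itself.

```python
class ToyGPUManager:
    """Hypothetical stand-in for GPUManager: an integer-keyed dict of frames."""

    def __init__(self):
        self._store = {}     # key -> stored frame
        self._next_key = 0   # next unused integer key

    def store(self, frame):
        """Store `frame` and return the integer key that identifies it."""
        key = self._next_key
        self._store[key] = frame
        self._next_key += 1
        return key

    def lookup(self, key):
        """Return the frame stored under `key`."""
        return self._store[key]


class ToyPartition:
    """Hypothetical stand-in for a partition: holds a key, not the data."""

    def __init__(self, gpu_manager, key, length=None, width=None):
        self.gpu_manager = gpu_manager
        self.key = key
        # Optional metadata, mirroring the `length` and `width` parameters.
        self.known_length = length
        self.known_width = width

    def get(self):
        """Fetch the wrapped frame from the manager via the key."""
        return self.gpu_manager.lookup(self.key)


manager = ToyGPUManager()
rows = [[1, 2, 3], [4, 5, 6]]          # 2 rows x 3 columns
key = manager.store(rows)
part = ToyPartition(manager, key, length=2, width=3)
```

Here part carries only key and the optional shape metadata; the rows themselves stay inside manager.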

add_to_apply_calls(func, **kwargs)

Apply func to this partition and create a new one.

Parameters
  • func (callable) – A function to apply.

  • **kwargs (dict) – Additional keyword arguments to be passed to func.

Returns

A new partition based on the result of func.

Return type

cuDFOnRayFramePartition

Notes

The application of func is scheduled eagerly and produces a new cuDFOnRayFramePartition.
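
The eager scheduling described in the note can be sketched with the standard library. This is a toy model under stated assumptions, not Modin's implementation: ToyManager and ToyPartition are hypothetical names, a single-worker ThreadPoolExecutor stands in for Ray, and a concurrent.futures.Future stands in for ray.ObjectRef. It shows apply() handing back a reference to a key, and add_to_apply_calls() wrapping that pending key in a new partition without waiting for the result.

```python
from concurrent.futures import Future, ThreadPoolExecutor


class ToyManager:
    """Hypothetical integer-keyed store; stands in for GPUManager."""

    def __init__(self):
        self._store = {}

    def store(self, frame):
        key = len(self._store)
        self._store[key] = frame
        return key

    def apply(self, key, func, **kwargs):
        """Run func on the frame under `key`; store the result, return its key."""
        return self.store(func(self._store[key], **kwargs))

    def lookup(self, key):
        return self._store[key]


class ToyPartition:
    """Hypothetical partition whose key may be an int or a pending Future."""

    def __init__(self, manager, key, pool):
        self.manager, self.key, self.pool = manager, key, pool

    def get_key(self):
        # Resolve the key if it is still a pending reference.
        return self.key.result() if isinstance(self.key, Future) else self.key

    def apply(self, func, **kwargs):
        # Submit the work now; the returned Future resolves to the result's key.
        return self.pool.submit(self.manager.apply, self.get_key(), func, **kwargs)

    def add_to_apply_calls(self, func, **kwargs):
        # Schedule immediately and wrap the pending key in a new partition.
        return ToyPartition(self.manager, self.apply(func, **kwargs), self.pool)


pool = ThreadPoolExecutor(max_workers=1)   # one worker keeps the toy store race-free
manager = ToyManager()
part = ToyPartition(manager, manager.store([1, 2, 3]), pool)

doubled = part.add_to_apply_calls(
    lambda rows, factor: [r * factor for r in rows], factor=2
)
result_key = doubled.get_key()             # blocks until the scheduled call finishes
result = manager.lookup(result_key)
pool.shutdown()
```

The original partition is left untouched; the result lives under a new key owned by the new partition.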

apply(func, **kwargs)

Apply func to this partition.

Parameters
  • func (callable) – A function to apply.

  • **kwargs (dict) – Additional keyword arguments to be passed to func.

Returns

A reference to the integer key of the result in the internal dict-storage of self.gpu_manager.

Return type

ray.ObjectRef

apply_result_not_dataframe(func, **kwargs)

Apply func to this partition.

Parameters
  • func (callable) – A function to apply.

  • **kwargs (dict) – Additional keyword arguments to be passed to func.

Returns

A reference to the integer key of the result in the internal dict-storage of self.gpu_manager.

Return type

ray.ObjectRef

copy()

Create a full copy of this object.

Returns

A copy of this partition.

Return type

cuDFOnRayFramePartition

free()

Free the dataframe and the associated self.key from self.gpu_manager.

get()

Get the object stored by this partition from self.gpu_manager.

Returns

A reference to the object stored by this partition.

Return type

ray.ObjectRef

get_gpu_manager()

Get the GPU manager associated with this partition.

Returns

The GPUManager associated with this object.

Return type

modin.engines.ray.cudf_on_ray.frame.GPUManager

get_key()

Get the integer key of this partition in the dict-storage of self.gpu_manager.

Returns

The integer key of this partition.

Return type

int

get_object_id()

Get the object stored for this partition from self.gpu_manager.

Returns

A reference to the object stored for this partition.

Return type

ray.ObjectRef

length()

Get the length of the object wrapped by this partition.

Returns

The length (or reference to length) of the object.

Return type

int or ray.ObjectRef
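
Because the return value may be either a plain int or a reference, callers have to handle both cases. The following is a minimal sketch of that pattern, assuming a cached length is returned directly and an unknown length is computed asynchronously. ToyPartition and materialize are hypothetical names, and a concurrent.futures.Future stands in for ray.ObjectRef; with Ray itself, the analogous resolving step would be ray.get on the reference. The same pattern would apply to width().

```python
from concurrent.futures import Future, ThreadPoolExecutor


def materialize(value):
    """Return a plain int whether `value` is an int or a pending reference."""
    return value.result() if isinstance(value, Future) else value


class ToyPartition:
    """Hypothetical partition that may or may not know its length up front."""

    def __init__(self, rows, pool, length=None):
        self._rows = rows
        self._pool = pool
        self._length_cache = length

    def length(self):
        if self._length_cache is not None:
            return self._length_cache              # already known: plain int
        return self._pool.submit(len, self._rows)  # computed elsewhere: reference


with ThreadPoolExecutor(max_workers=1) as pool:
    known = ToyPartition([[1, 2], [3, 4]], pool, length=2)
    unknown = ToyPartition([[1, 2], [3, 4], [5, 6]], pool)

    known_is_int = isinstance(known.length(), int)
    unknown_is_ref = isinstance(unknown.length(), Future)

    known_len = materialize(known.length())
    unknown_len = materialize(unknown.length())
```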

mask(row_indices, col_indices)

Select rows or columns at the given indices.

Parameters
  • row_indices (list of hashable) – The row labels to extract.

  • col_indices (list of hashable) – The column labels to extract.

Returns

A reference to the integer key of the result in the internal dict-storage of self.gpu_manager.

Return type

ray.ObjectRef
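
The selection that mask() describes can be illustrated on a toy frame. This sketch is an assumption-laden illustration only: toy_mask is a hypothetical helper, the frame is a nested dict rather than a cudf.DataFrame, and the selection runs locally, whereas the real method returns a reference to the key of the result.

```python
def toy_mask(frame, row_indices, col_indices):
    """Keep only the requested row and column labels of a toy frame.

    `frame` maps column label -> {row label -> value}.
    """
    return {
        col: {row: frame[col][row] for row in row_indices}
        for col in col_indices
    }


frame = {
    "a": {0: 10, 1: 11, 2: 12},
    "b": {0: 20, 1: 21, 2: 22},
    "c": {0: 30, 1: 31, 2: 32},
}

# Keep rows 0 and 2 of columns "a" and "c".
subset = toy_mask(frame, row_indices=[0, 2], col_indices=["a", "c"])
```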

classmethod preprocess_func(func)

Put func into the Ray object store.

Parameters
  • func (callable) – The function to put.

Returns

A reference to func in the Ray object store.

Return type

ray.ObjectRef
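
One plausible use of such a reference is to store a function once and reuse it across many partitions. The sketch below models that idea with a hypothetical ToyObjectStore; it is not Ray's object store and performs no serialization. With Ray itself, ray.put(obj) returns a ray.ObjectRef and ray.get(ref) retrieves the object.

```python
import itertools


class ToyObjectStore:
    """Hypothetical stand-in for an object store: put once, look up by reference."""

    def __init__(self):
        self._objects = {}
        self._ids = itertools.count()

    def put(self, obj):
        ref = next(self._ids)
        self._objects[ref] = obj
        return ref

    def get(self, ref):
        return self._objects[ref]


def add_one(rows):
    """Example function to be shared across partitions."""
    return [r + 1 for r in rows]


store = ToyObjectStore()
func_ref = store.put(add_one)            # store the function a single time

partitions = [[1, 2], [3, 4], [5, 6]]
# Every partition reuses the same reference instead of a fresh copy of the function.
results = [store.get(func_ref)(rows) for rows in partitions]
```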

classmethod put(gpu_manager, pandas_dataframe)

Put pandas_dataframe into gpu_manager.

Parameters
  • gpu_manager (modin.engines.ray.cudf_on_ray.frame.GPUManager) – A GPU manager that stores cuDF dataframes.

  • pandas_dataframe (pandas.DataFrame or pandas.Series) – The pandas.DataFrame or pandas.Series to put.

Returns

A reference to the integer key of the added pandas.DataFrame in the internal dict-storage of gpu_manager.

Return type

ray.ObjectRef

to_numpy()

Convert this partition to a NumPy array.

Returns

The data of this partition as a NumPy array.

Return type

NumPy array

to_pandas()

Convert this partition to a pandas.DataFrame.

Returns

The data of this partition as a pandas.DataFrame.

Return type

pandas.DataFrame

width()

Get the width of the object wrapped by this partition.

Returns

The width (or reference to width) of the object.

Return type

int or ray.ObjectRef