Generic Ray-based members

Objects that are storage-format agnostic but require a Ray-specific implementation are placed in modin.core.execution.ray.generic.

Their purpose is to implement certain parallel I/O operations and to serve as a foundation for building storage-format-specific objects:

class modin.core.execution.ray.generic.io.RayIO

Base class for doing I/O operations over Ray.

classmethod from_ray(ray_obj)

Create a Modin query compiler from a Ray Dataset.

Parameters:

ray_obj (ray.data.Dataset) – The Ray Dataset to convert from.

Returns:

QueryCompiler containing data from the Ray Dataset.

Return type:

BaseQueryCompiler

Notes

Every subclass must implement this method; otherwise, NotImplementedError is raised.

classmethod to_ray(modin_obj)

Convert a Modin DataFrame/Series to a Ray Dataset.

Parameters:

modin_obj (modin.pandas.DataFrame, modin.pandas.Series) – The Modin DataFrame/Series to convert.

Returns:

Converted object with type depending on input.

Return type:

ray.data.Dataset

Notes

Every subclass must implement this method; otherwise, NotImplementedError is raised.
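
The contract described in the notes for from_ray and to_ray can be illustrated with a minimal, library-free sketch. It does not use Modin or Ray: BaseIOSketch, ListBackedIO, and is_implemented are hypothetical names introduced only for illustration, and plain lists and dicts stand in for the Ray Dataset and the query compiler.

```python
class BaseIOSketch:
    """Hypothetical stand-in for a base I/O class: conversions are unimplemented by default."""

    @classmethod
    def from_ray(cls, ray_obj):
        raise NotImplementedError(f"from_ray is not implemented by {cls.__name__}")

    @classmethod
    def to_ray(cls, modin_obj):
        raise NotImplementedError(f"to_ray is not implemented by {cls.__name__}")


class ListBackedIO(BaseIOSketch):
    """Hypothetical subclass that overrides both conversions."""

    @classmethod
    def from_ray(cls, ray_obj):
        # A dict plays the role of the query compiler here (assumption).
        return {"rows": list(ray_obj)}

    @classmethod
    def to_ray(cls, modin_obj):
        # A plain list plays the role of the Ray Dataset here (assumption).
        return list(modin_obj["rows"])


def is_implemented(io_cls, method_name, arg):
    """Return False if calling the classmethod raises NotImplementedError."""
    try:
        getattr(io_cls, method_name)(arg)
    except NotImplementedError:
        return False
    return True


base_ok = is_implemented(BaseIOSketch, "from_ray", [1, 2, 3])
round_trip = ListBackedIO.to_ray(ListBackedIO.from_ray([1, 2, 3]))
```

Calling either method on the base class raises, while the subclass that overrides both can round-trip its input.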

class modin.core.execution.ray.generic.partitioning.GenericRayDataframePartitionManager

The class implements the interface in PandasDataframePartitionManager.

classmethod to_numpy(partitions, **kwargs)

Convert partitions into a NumPy array.

Parameters:
  • partitions (NumPy array) – A 2-D array of partitions to convert to a local NumPy array.

  • **kwargs (dict) – Keyword arguments to pass to each partition's .to_numpy() call.

Return type:

NumPy array
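
As a rough illustration of what converting a 2-D grid of partitions into one local array involves, the sketch below assembles per-partition blocks with np.block. It is not Modin's implementation: FakePartition and partitions_to_numpy are hypothetical names, the partitions hold their data locally rather than in Ray's object store, and forwarding dtype through **kwargs is just one example of a keyword argument.

```python
import numpy as np


class FakePartition:
    """Hypothetical partition wrapper holding one 2-D block of data."""

    def __init__(self, block):
        self._block = np.asarray(block)

    def to_numpy(self, **kwargs):
        # Keyword arguments (for example dtype) are forwarded to the conversion.
        return np.array(self._block, **kwargs)


def partitions_to_numpy(partitions, **kwargs):
    """Convert a 2-D grid of partitions into a single local NumPy array."""
    # Convert every partition, preserving the row/column layout of the grid.
    blocks = [[part.to_numpy(**kwargs) for part in row] for row in partitions]
    # np.block joins the inner lists column-wise, then stacks the rows.
    return np.block(blocks)


# A 2x2 grid of partitions covering a 3x3 matrix.
grid = np.array(
    [
        [FakePartition([[1, 2], [5, 6]]), FakePartition([[3], [7]])],
        [FakePartition([[9, 10]]), FakePartition([[11]])],
    ],
    dtype=object,
)

result = partitions_to_numpy(grid, dtype=float)
```

Here result is a 3x3 floating-point array whose layout mirrors the position of each partition in the grid.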