PandasOnRayFrame

This class is a specific implementation of the PandasFrame class that uses the Ray distributed engine. It serves as an intermediate layer between PandasQueryCompiler and PandasOnRayFramePartitionManager.

Public API

class modin.engines.ray.pandas_on_ray.frame.data.PandasOnRayFrame(partitions, index, columns, row_lengths=None, column_widths=None, dtypes=None)

The class implements the interface in PandasFrame using Ray.

Parameters
  • partitions (np.ndarray) – A 2D NumPy array of partitions.

  • index (sequence) – The index for the dataframe. Converted to a pandas.Index.

  • columns (sequence) – The columns object for the dataframe. Converted to a pandas.Index.

  • row_lengths (list, optional) – The number of rows in each row of block partitions (the “height” of each block). Computed if not provided.

  • column_widths (list, optional) – The number of columns in each column of block partitions (the “width” of each block). Computed if not provided.

  • dtypes (pandas.Series, optional) – The data types for the dataframe columns.
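
The relationship between the partition grid, row_lengths, and column_widths can be illustrated with a minimal sketch. It is not Modin code: each partition is represented only by a hypothetical (rows, columns) shape tuple, and block_dimensions is an illustrative helper, not part of the Modin API. It assumes that all blocks in one grid row share a height and all blocks in one grid column share a width.

```python
def block_dimensions(block_shapes):
    """Derive row lengths and column widths from a 2D grid of block shapes.

    block_shapes is a list of lists of (n_rows, n_cols) tuples standing in
    for real partition objects (an assumption made for illustration).
    """
    # The height of each grid row is read from the first block in that row.
    row_lengths = [grid_row[0][0] for grid_row in block_shapes]
    # The width of each grid column is read from the first row of blocks.
    column_widths = [shape[1] for shape in block_shapes[0]]
    return row_lengths, column_widths


# A frame with 5 rows and 4 columns, split into a 2x2 grid of blocks.
index = range(5)
columns = ["a", "b", "c", "d"]
grid = [
    [(3, 2), (3, 2)],
    [(2, 2), (2, 2)],
]

row_lengths, column_widths = block_dimensions(grid)

# The block heights cover the whole index; the block widths cover all columns.
assert sum(row_lengths) == len(index)
assert sum(column_widths) == len(columns)
```

In this sketch the lengths and widths always sum to the sizes of index and columns, which is the consistency the constructor arguments are expected to satisfy.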

classmethod combine_dtypes(list_of_dtypes, column_names)

Describe how data types should be combined when they do not match.

Parameters
  • list_of_dtypes (list) – A list of pandas.Series with the data types.

  • column_names (list) – The names of the columns that the data types map to.

Returns

A pandas.Series containing the finalized data types.

Return type

pandas.Series
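
One way such a combination could work is sketched below. This is an assumption for illustration, not the method's actual implementation: combine_dtypes_sketch is a hypothetical standalone function, and it uses NumPy's np.result_type as the promotion rule, which only handles NumPy dtypes (not pandas extension dtypes). It also assumes every Series in list_of_dtypes lists the columns in the same order.

```python
import numpy as np
import pandas as pd


def combine_dtypes_sketch(list_of_dtypes, column_names):
    """Combine per-partition dtypes into one dtype per column (illustrative)."""
    combined = []
    for position in range(len(column_names)):
        # Collect the dtype each partition reported for this column position.
        candidates = [dtypes.iloc[position] for dtypes in list_of_dtypes]
        # Promote them to a single dtype that can represent all candidates.
        combined.append(np.result_type(*candidates))
    return pd.Series(combined, index=column_names, dtype=object)


# Two partitions disagree on the first column: one saw integers, one saw floats.
first = pd.Series([np.dtype("int64"), np.dtype("float64")])
second = pd.Series([np.dtype("float64"), np.dtype("float64")])

result = combine_dtypes_sketch([first, second], ["x", "y"])
```

Here the mismatched column "x" is promoted to float64, since that type can hold the values seen by both partitions.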