PandasOnRayDataframe
This class is a specific implementation of the PandasDataframe class using the Ray distributed engine. It serves as an intermediate level between PandasQueryCompiler and PandasOnRayDataframePartitionManager.
Public API
- class modin.core.execution.ray.implementations.pandas_on_ray.dataframe.dataframe.PandasOnRayDataframe(partitions, index, columns, row_lengths=None, column_widths=None, dtypes=None)
The class implements the interface in PandasDataframe using Ray.
- Parameters
partitions (np.ndarray) – A 2D NumPy array of partitions.
index (sequence) – The index for the dataframe. Converted to a pandas.Index.
columns (sequence) – The columns object for the dataframe. Converted to a pandas.Index.
row_lengths (list, optional) – The length of each partition along the row axis, i.e. the “height” of each block partition. Computed if not provided.
column_widths (list, optional) – The width of each partition along the column axis, i.e. the “width” of each block partition. Computed if not provided.
dtypes (pandas.Series, optional) – The data types for the dataframe columns.
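To make the relationship between partitions, row_lengths and column_widths concrete, here is a minimal sketch. It does not use Modin or Ray: plain pandas DataFrames stand in for the partition objects (an assumption made purely for illustration), and the split points are arbitrary. It only shows how the “height” and “width” lists could be derived from a 2D grid of blocks.

```python
import numpy as np
import pandas as pd

# A small frame to split into a 2x2 grid of blocks.
df = pd.DataFrame(np.arange(20).reshape(5, 4), columns=list("abcd"))

# Hypothetical split points chosen for this illustration only.
row_splits = [(0, 3), (3, 5)]
col_splits = [(0, 2), (2, 4)]

# 2D object array of blocks; plain DataFrames stand in for real partitions.
blocks = np.empty((len(row_splits), len(col_splits)), dtype=object)
for i, (r0, r1) in enumerate(row_splits):
    for j, (c0, c1) in enumerate(col_splits):
        blocks[i, j] = df.iloc[r0:r1, c0:c1]

# "Height" of each block row, read from the first column of blocks.
row_lengths = [len(blocks[i, 0]) for i in range(blocks.shape[0])]
# "Width" of each block column, read from the first row of blocks.
column_widths = [blocks[0, j].shape[1] for j in range(blocks.shape[1])]
```

In this sketch the lengths sum to the number of rows in the full index and the widths sum to the number of columns, which is the invariant the two optional parameters are expected to satisfy.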
- classmethod combine_dtypes(list_of_dtypes, column_names)
Describe how data types should be combined when they do not match.
- Parameters
list_of_dtypes (list) – A list of pandas.Series with the data types.
column_names (list) – The names of the columns that the data types map to.
- Returns
A pandas.Series containing the finalized data types.
- Return type
pandas.Series
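As an illustration of what combining mismatched data types can look like, the sketch below promotes each column to a common NumPy dtype with np.result_type. The function name combine_dtypes_sketch is hypothetical, and the promotion rule is an assumption chosen for this example; it is not necessarily what combine_dtypes does internally, and it only handles NumPy dtypes.

```python
import numpy as np
import pandas as pd


def combine_dtypes_sketch(list_of_dtypes, column_names):
    """Hypothetical stand-in: promote dtypes position by position."""
    combined = [
        # Find a dtype able to hold every candidate for this column.
        np.result_type(*[dtypes.iloc[pos] for dtypes in list_of_dtypes])
        for pos in range(len(column_names))
    ]
    return pd.Series(combined, index=column_names, dtype=object)


# Two frames whose column "a" disagrees on dtype (int64 vs float64).
left = pd.DataFrame({"a": np.array([1, 2], dtype="int64"), "b": [1.0, 2.0]}).dtypes
right = pd.DataFrame({"a": [0.5, 1.5], "b": [3.0, 4.0]}).dtypes

result = combine_dtypes_sketch([left, right], ["a", "b"])
```

Here column "a" ends up as float64, because that dtype can represent both the integer and the floating-point inputs, while column "b" stays float64.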