PandasOnDaskDataframe#
The class is the specific implementation of the dataframe algebra for the Dask execution engine.
It serves as an intermediate level between the pandas query compiler and PandasOnDaskDataframePartitionManager.
Public API#
- class modin.core.execution.dask.implementations.pandas_on_dask.dataframe.PandasOnDaskDataframe(partitions, index=None, columns=None, row_lengths=None, column_widths=None, dtypes=None)#
The class implements the interface in PandasDataframe.
- Parameters
partitions (np.ndarray) – A 2D NumPy array of partitions.
index (sequence, optional) – The index for the dataframe. Converted to a pandas.Index.
columns (sequence, optional) – The columns object for the dataframe. Converted to a pandas.Index.
row_lengths (list, optional) – The length of each partition along the rows, i.e. the “height” of each block partition. Computed if not provided.
column_widths (list, optional) – The width of each partition along the columns, i.e. the “width” of each block partition. Computed if not provided.
dtypes (pandas.Series, optional) – The data types for the dataframe columns.
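To make the relationship between these parameters concrete, the following is a minimal sketch that uses only NumPy and pandas. It does not call Modin or Dask: the blocks are plain pandas DataFrames standing in for real partition objects, and `split_into_blocks` is a hypothetical helper introduced purely for illustration. It shows how a 2D array of blocks relates to the row lengths, column widths, index, columns, and dtypes described above.

```python
import numpy as np
import pandas as pd


def split_into_blocks(df, row_splits, col_splits):
    """Hypothetical helper: cut ``df`` into a 2D object array of pandas blocks.

    ``row_splits`` and ``col_splits`` are the positional boundaries at which
    the frame is cut along rows and columns respectively.
    """
    row_bounds = [0, *row_splits, len(df)]
    col_bounds = [0, *col_splits, len(df.columns)]
    n_row_parts = len(row_bounds) - 1
    n_col_parts = len(col_bounds) - 1

    # A 2D NumPy array whose elements are the individual blocks.
    blocks = np.empty((n_row_parts, n_col_parts), dtype=object)
    for i in range(n_row_parts):
        for j in range(n_col_parts):
            blocks[i, j] = df.iloc[
                row_bounds[i]:row_bounds[i + 1],
                col_bounds[j]:col_bounds[j + 1],
            ]
    return blocks


df = pd.DataFrame(np.arange(20).reshape(5, 4), columns=list("abcd"))
blocks = split_into_blocks(df, row_splits=[3], col_splits=[2])

# The "height" of each row of blocks and the "width" of each column of blocks.
row_lengths = [len(blocks[i, 0]) for i in range(blocks.shape[0])]
column_widths = [blocks[0, j].shape[1] for j in range(blocks.shape[1])]

# Metadata that describes the frame as a whole.
index = pd.Index(df.index)
columns = pd.Index(df.columns)
dtypes = df.dtypes

print(blocks.shape, row_lengths, column_widths)
```

In this sketch the row lengths sum to the length of the index and the column widths sum to the number of columns, which is the invariant the metadata is meant to capture. In Modin itself the array elements would be partition objects managed by the execution engine rather than in-memory pandas frames.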