PandasOnDaskFrame

This class is the PandasOnDask-specific implementation of the dataframe algebra. It serves as an intermediate layer between the pandas query compiler and PandasOnDaskFramePartitionManager.

Public API

class modin.engines.dask.pandas_on_dask.frame.data.PandasOnDaskFrame(partitions, index, columns, row_lengths=None, column_widths=None, dtypes=None)

The class implements the interface defined in PandasFrame.

Parameters
  • partitions (np.ndarray) – A 2D NumPy array of partitions.

  • index (sequence) – The index for the dataframe. Converted to a pandas.Index.

  • columns (sequence) – The columns object for the dataframe. Converted to a pandas.Index.

  • row_lengths (list, optional) – The length of each partition along the rows, i.e. the “height” of each block partition. Computed if not provided.

  • column_widths (list, optional) – The width of each partition along the columns, i.e. the “width” of each block partition. Computed if not provided.

  • dtypes (pandas.Series, optional) – The data types for the dataframe columns.
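
To make the relationship between these parameters concrete, here is a minimal sketch that uses only NumPy and pandas. It does not call Modin: plain pandas DataFrames stand in for the real partition objects, and the split points (row_splits, col_splits) are hypothetical values chosen for illustration. Reading the heights from the first column of blocks and the widths from the first row is one plausible way the optional values could be derived, not necessarily how the library does it.

```python
import numpy as np
import pandas as pd

# A small pandas DataFrame standing in for the full dataset.
df = pd.DataFrame(np.arange(20).reshape(5, 4), columns=list("abcd"))

# Hypothetical split points: rows split 3 + 2, columns split 2 + 2.
row_splits = [(0, 3), (3, 5)]
col_splits = [(0, 2), (2, 4)]

# Build the 2D object array of blocks. Plain DataFrames are stand-ins
# (an assumption of this sketch) for the backend's partition objects.
partitions = np.empty((len(row_splits), len(col_splits)), dtype=object)
for i, (r0, r1) in enumerate(row_splits):
    for j, (c0, c1) in enumerate(col_splits):
        partitions[i, j] = df.iloc[r0:r1, c0:c1]

# Heights are read from the first column of blocks,
# widths from the first row of blocks.
row_lengths = [len(block) for block in partitions[:, 0]]
column_widths = [block.shape[1] for block in partitions[0, :]]

# The remaining constructor-style arguments, taken from the source frame.
index = pd.Index(df.index)
columns = pd.Index(df.columns)
dtypes = df.dtypes
```

In this sketch the row lengths sum to len(index) and the column widths sum to len(columns), which is the consistency the parameters are expected to maintain.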