PandasOnPythonDataframe#

The class is a specific implementation of PandasDataframe for the Python execution engine. It serves as an intermediate level between PandasQueryCompiler and PandasOnPythonDataframePartitionManager.

Public API#

class modin.core.execution.python.implementations.pandas_on_python.dataframe.dataframe.PandasOnPythonDataframe(partitions, index=None, columns=None, row_lengths=None, column_widths=None, dtypes=None)#

Class for dataframes with pandas storage format and Python engine.

PandasOnPythonDataframe doesn’t implement any specific interfaces; all functionality is inherited from the PandasDataframe class.

Parameters:
  • partitions (np.ndarray) – A 2D NumPy array of partitions.

  • index (sequence, optional) – The index for the dataframe. Converted to a pandas.Index.

  • columns (sequence, optional) – The columns object for the dataframe. Converted to a pandas.Index.

  • row_lengths (list, optional) – The length of each partition along the row axis, i.e. the “height” of each block partition. Computed if not provided.

  • column_widths (list, optional) – The width of each partition along the column axis, i.e. the “width” of each block partition. Computed if not provided.

  • dtypes (pandas.Series, optional) – The data types for the dataframe columns.
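
To make the relationship between partitions, row_lengths and column_widths concrete, here is a minimal sketch that uses plain pandas and NumPy only. It does not call Modin, and it is not how Modin builds its partitions: the helper split_into_blocks, the sample frame and the split sizes are all hypothetical, chosen only to show what a 2D grid of blocks and its metadata look like.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: a 5x4 frame to be split into a 2x2 grid of blocks.
df = pd.DataFrame(np.arange(20).reshape(5, 4), columns=list("abcd"))

row_splits = [3, 2]  # assumed heights of the two block rows
col_splits = [1, 3]  # assumed widths of the two block columns


def split_into_blocks(frame, row_splits, col_splits):
    """Hypothetical helper: cut a frame into a 2D object array of blocks."""
    row_edges = np.cumsum([0] + row_splits)
    col_edges = np.cumsum([0] + col_splits)
    grid = np.empty((len(row_splits), len(col_splits)), dtype=object)
    for i in range(len(row_splits)):
        for j in range(len(col_splits)):
            grid[i, j] = frame.iloc[
                row_edges[i]:row_edges[i + 1],
                col_edges[j]:col_edges[j + 1],
            ]
    return grid


partitions = split_into_blocks(df, row_splits, col_splits)

# The "height" of each block row, read from the first column of blocks.
row_lengths = [len(block) for block in partitions[:, 0]]
# The "width" of each block column, read from the first row of blocks.
column_widths = [len(block.columns) for block in partitions[0, :]]
```

In this sketch the lengths sum to the number of rows in the frame and the widths sum to the number of columns, which is the invariant the two optional parameters describe for the block grid.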