Core Modin Dataframe Objects

Modin partitions data to scale efficiently. To keep track of the partitioned data, a few key classes are introduced: Dataframe, Partition, AxisPartition and PartitionManager.

  • Dataframe is the class conforming to Dataframe Algebra.

  • Partition is an element of an NxM grid; taken together, the partitions in the grid represent the Dataframe.

  • AxisPartition is a group of Partition objects joined along one axis (either rows or columns).

  • PartitionManager implements the primitives used for Dataframe Algebra operations over Partition objects.
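To make the relationship between the grid, its partitions, and axis partitions concrete, here is a toy sketch in plain Python. It is not Modin's implementation: the helper names (`split_into_grid`, `join_row_axis`, `join_col_axis`, `reassemble`) are hypothetical, and a list of lists stands in for the real data. The sketch only illustrates how an NxM grid of blocks can be joined along one axis or recombined into the whole table.

```python
# Toy illustration only: these helpers are hypothetical and are NOT Modin's classes.

def split_into_grid(rows, row_chunk, col_chunk):
    """Split a 2D list into a grid of blocks (each block plays the 'Partition' role)."""
    grid = []
    for r in range(0, len(rows), row_chunk):
        grid_row = []
        for c in range(0, len(rows[0]), col_chunk):
            block = [row[c:c + col_chunk] for row in rows[r:r + row_chunk]]
            grid_row.append(block)
        grid.append(grid_row)
    return grid

def join_row_axis(grid, i):
    """Join the blocks of grid row i side by side (a row-wise 'AxisPartition')."""
    blocks = grid[i]
    return [sum((b[k] for b in blocks), []) for k in range(len(blocks[0]))]

def join_col_axis(grid, j):
    """Stack the blocks of grid column j vertically (a column-wise 'AxisPartition')."""
    return [row for grid_row in grid for row in grid_row[j]]

def reassemble(grid):
    """Combine every block back into the full table."""
    return [row for i in range(len(grid)) for row in join_row_axis(grid, i)]

# A 4x4 table split into a 2x2 grid of 2x2 blocks.
data = [[r * 10 + c for c in range(4)] for r in range(4)]
grid = split_into_grid(data, row_chunk=2, col_chunk=2)

print(join_row_axis(grid, 0))  # full-width rows covered by the first grid row
print(join_col_axis(grid, 1))  # full-height columns covered by the second grid column
```

In this sketch the functions that operate over the whole grid loosely correspond to the role the PartitionManager plays, while the individual blocks correspond to Partition objects.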

Each storage format, each execution engine, and each execution system (storage format + execution engine) may have its own implementations of these core Dataframe entities. The current stable implementations are the following:

  • Base ModinDataframe defines a common interface and algebra operators for Dataframe implementations.

Storage format specific:

Engine specific:

Execution system specific:

Note

At the current stage of Modin development, the base interfaces of the Dataframe objects are not yet defined. For now, all changes to the Dataframe interfaces originate from the Dataframe for the pandas storage format.