Core Modin Dataframe Objects
Modin partitions data to scale efficiently.
To keep track of everything, a few key classes are introduced:
Dataframe is the class conforming to Dataframe Algebra.
Partition is an element of an NxM grid which, when combined with the other elements, represents the Dataframe.
AxisPartition is a joined group of Partition-s along some axis (either rows or columns).
PartitionManager is the manager that implements the primitives used for Dataframe Algebra operations over Partition-s.
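The relationship between a Partition grid, an AxisPartition, and the full Dataframe can be illustrated with a minimal sketch. It is a toy model in plain Python, not Modin's implementation: `ToyPartition`, `row_axis_partition`, and `to_full_frame` are hypothetical names introduced only for illustration, and the blocks are nested lists rather than the pandas objects a real partition would hold.

```python
from dataclasses import dataclass
from typing import List

Block = List[List[int]]  # a small 2D block of values: a list of rows


@dataclass
class ToyPartition:
    """Hypothetical stand-in for one cell of the NxM partition grid."""
    data: Block


def row_axis_partition(grid: List[List[ToyPartition]], i: int) -> Block:
    """Join all partitions in grid row i side by side into full-width rows."""
    parts = grid[i]
    n_rows = len(parts[0].data)
    return [
        [cell for part in parts for cell in part.data[r]]
        for r in range(n_rows)
    ]


def to_full_frame(grid: List[List[ToyPartition]]) -> Block:
    """Stack every row-wise axis partition to recover the whole table."""
    rows: Block = []
    for i in range(len(grid)):
        rows.extend(row_axis_partition(grid, i))
    return rows


# A 2x2 grid of partitions that together cover a 3x4 table.
grid = [
    [ToyPartition([[1, 2], [5, 6]]), ToyPartition([[3, 4], [7, 8]])],
    [ToyPartition([[9, 10]]), ToyPartition([[11, 12]])],
]

first_row_block = row_axis_partition(grid, 0)  # rows 0-1, all columns
full = to_full_frame(grid)                     # the whole 3x4 table
```

In this sketch the two helper functions play the role that a partition manager would play: they are the only code that knows how the grid is laid out, so the per-partition data never needs to be aware of its neighbours.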
Each storage format, each execution engine, and each execution system (storage format + execution engine) may have its own implementation of these core Dataframe entities. The current stable implementations are the following:
Base ModinDataframe defines a common interface and algebra operators for Dataframe implementations.
Storage format specific:
Modin PandasDataframe is an implementation for any frame class of the pandas storage format.
Modin GenericRayDataframe is an implementation for any frame class that works on the Ray execution engine.
Modin GenericUnidistDataframe is an implementation for any frame class that works on the Unidist execution engine.
Execution system specific:
Modin PandasOnRayDataframe is a specialization of the Core Modin Dataframe for PandasOnRay execution.
Modin cuDFOnRayDataframe is a specialization of the Core Modin Dataframe for cuDFOnRay execution.
Modin PandasOnDaskDataframe is a specialization of the Core Modin Dataframe for PandasOnDask execution.
Modin PandasOnPythonDataframe is a specialization of the Core Modin Dataframe for PandasOnPython execution.
Modin PandasOnUnidistDataframe is a specialization of the Core Modin Dataframe for PandasOnUnidist execution.
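The layering described above (a base interface, a storage-format-specific implementation, and execution-system specializations) can be sketched roughly as a small class hierarchy. All class names below are hypothetical toys prefixed with `Toy`, and the inheritance shown is an assumption made for illustration; it is not a description of Modin's actual classes or their methods.

```python
from abc import ABC, abstractmethod


class ToyBaseDataframe(ABC):
    """Hypothetical base interface shared by every frame implementation."""

    @abstractmethod
    def describe(self) -> str:
        ...


class ToyPandasDataframe(ToyBaseDataframe):
    """Storage-format level: fixes the storage format, leaves the engine open."""

    storage_format = "pandas"
    engine = None  # set by execution-system subclasses

    def describe(self) -> str:
        return f"{self.storage_format} on {self.engine}"


class ToyPandasOnRayDataframe(ToyPandasDataframe):
    """Execution-system level: pandas storage format + Ray engine."""
    engine = "ray"


class ToyPandasOnDaskDataframe(ToyPandasDataframe):
    """Execution-system level: pandas storage format + Dask engine."""
    engine = "dask"


labels = [
    cls().describe()
    for cls in (ToyPandasOnRayDataframe, ToyPandasOnDaskDataframe)
]
```

The point of the sketch is only that an execution system is the combination of a storage format and an engine, so each specialization needs to supply little beyond the engine-specific pieces.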
At the current stage of Modin development, the base interfaces of the Dataframe objects are not defined yet. For now, all changes to the Dataframe interfaces originate in the Dataframe for the pandas storage format.