PyArrow backend
In general, the PyArrow backend follows the same flow as the pandas backend: the query compiler contains an instance of Modin Frame, which is internally split into partitions. The main difference is that the partitions contain PyArrow tables instead of the pandas DataFrames used by the pandas backend. To learn more about this approach, see the PyArrow execution engine section.
High-Level Module Overview
This module houses the submodules responsible for communication between the query compiler level and the execution engine level for the PyArrow backend:
Query compiler is responsible for compiling efficient queries for PyarrowOnRayFrame.
Parsers are responsible for parsing data on workers during IO operations.
Note
Currently, the only available PyArrow backend factory is PyarrowOnRay, which works in experimental mode only.