PyArrow Query Compiler
PyarrowQueryCompiler compiles efficient DataFrame algebra
queries for PyarrowOnRayFrame, the frame type
backed by pyarrow.Table objects.
Each PyarrowQueryCompiler holds an instance of
PyarrowOnRayFrame, which it queries to produce results.
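The relationship between a query compiler and the frame it owns can be sketched in a few lines of plain Python. This is only an illustration of the composition pattern described above: ToyFrame and ToyQueryCompiler are hypothetical stand-ins, not Modin classes, and the real frame operates on pyarrow.Table partitions rather than Python lists.

```python
# Hypothetical stand-ins for illustration only; these are not Modin classes.
class ToyFrame:
    """Holds column data and executes the operations asked of it."""

    def __init__(self, columns):
        self._columns = dict(columns)  # column name -> list of values

    def filter_rows(self, predicate):
        # Keep the rows for which predicate(row_as_dict) is true.
        names = list(self._columns)
        rows = zip(*(self._columns[n] for n in names))
        kept = [r for r in rows if predicate(dict(zip(names, r)))]
        return ToyFrame({n: [r[i] for r in kept] for i, n in enumerate(names)})

    def to_dict(self):
        return {n: list(v) for n, v in self._columns.items()}


class ToyQueryCompiler:
    """Owns a frame and forwards each compiled query to it."""

    def __init__(self, frame):
        self._frame = frame

    def query(self, predicate):
        # Return a new compiler wrapping the frame produced by the query.
        return ToyQueryCompiler(self._frame.filter_rows(predicate))

    def to_dict(self):
        return self._frame.to_dict()


compiler = ToyQueryCompiler(ToyFrame({"a": [1, 2, 3], "b": [10, 20, 30]}))
result = compiler.query(lambda row: row["a"] > 1).to_dict()
```

The compiler never touches the data directly: each operation is delegated to the frame, and the result is wrapped in a new compiler, leaving the original unchanged.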
Public API
PyarrowQueryCompiler implements the common query compiler API
defined by BaseQueryCompiler. Most functionality
is inherited from PandasQueryCompiler, so only the overridden
methods are presented in the following section.
- class modin.backends.pyarrow.query_compiler.PyarrowQueryCompiler(modin_frame)
Bases:
modin.backends.pandas.query_compiler.PandasQueryCompiler
Query compiler for the PyArrow backend.
This class translates the common query compiler API into DataFrame algebra queries, which are executed by
PyarrowOnRayFrame.
- Parameters
modin_frame (PyarrowOnRayFrame) – Modin Frame to query with the compiled queries.
- property dtypes
Get the dtype of each column.
- Returns
Series with dtypes of each column.
- Return type
pandas.Series
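The shape of the returned value can be seen with plain pandas, used here as a stand-in on the assumption that the query compiler's dtypes matches what DataFrame.dtypes exposes. The column names and values below are made up for illustration.

```python
import pandas as pd

# Explicit dtypes keep the example platform-independent.
df = pd.DataFrame(
    {
        "a": pd.Series([1, 2], dtype="int64"),
        "b": pd.Series([1.5, 2.5], dtype="float64"),
    }
)

# A pandas.Series indexed by column name, one dtype per column.
dtypes = df.dtypes
```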
- query(expr, **kwargs)
Query columns of the QueryCompiler with a boolean expression.
- Parameters
expr (str) –
**kwargs (dict) –
- Returns
New QueryCompiler containing the rows where the boolean expression is satisfied.
- Return type
Notes
Please refer to
modin.pandas.DataFrame.query for more information about parameters and output format.
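As a rough illustration of the expression format, the sketch below uses plain pandas DataFrame.query, whose interface modin.pandas.DataFrame.query follows. The data and the expression are invented for the example; it does not exercise the PyArrow backend itself.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [30, 20, 10]})

# The expression is a string evaluated against column names;
# only rows where it is true are kept, and df itself is not modified.
filtered = df.query("a > 1 and b < 25")
```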
- to_numpy(**kwargs)
Convert the underlying query compiler's data to a NumPy array.
- Parameters
dtype (dtype) – The dtype of the resulting array.
copy (bool) – Whether to ensure that the returned value is not a view on another array.
na_value (object) – The value to replace missing values with.
**kwargs (dict) – Accepted for compatibility only. Does not affect the result.
- Returns
The QueryCompiler converted to a NumPy array.
- Return type
np.ndarray
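The effect of the dtype, copy, and na_value parameters can be sketched with plain pandas DataFrame.to_numpy, which takes the same three arguments. This assumes the query compiler forwards them with the same meaning; the data is made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1.0, None], "b": [3.0, 4.0]})

# dtype sets the element type, copy=True guarantees the result is not
# a view on the frame's data, and na_value replaces missing entries.
arr = df.to_numpy(dtype="float64", copy=True, na_value=0.0)
```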
- to_pandas()
Convert the underlying query compiler's data to
pandas.DataFrame.
- Returns
The QueryCompiler converted to pandas.
- Return type