PyArrow Query Compiler

PyarrowQueryCompiler compiles efficient DataFrame algebra queries for PyarrowOnRayFrame, the frame type backed by pyarrow.Table objects.

Each PyarrowQueryCompiler holds an instance of PyarrowOnRayFrame, which it queries to produce the result.

Public API

PyarrowQueryCompiler implements the common query compiler API defined by BaseQueryCompiler. Most functionality is inherited from PandasQueryCompiler, so the following section presents only the overridden methods.

class modin.backends.pyarrow.query_compiler.PyarrowQueryCompiler(modin_frame)

Bases: modin.backends.pandas.query_compiler.PandasQueryCompiler

Query compiler for the PyArrow backend.

This class translates the common query compiler API into DataFrame algebra queries that are intended to be executed by PyarrowOnRayFrame.

Parameters

modin_frame (PyarrowOnRayFrame) – Modin Frame to query with the compiled queries.

property dtypes

Get the dtypes of the columns.

Returns

Series with dtypes of each column.

Return type

pandas.Series
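To illustrate the shape of the returned value, here is a minimal sketch that uses plain pandas as a stand-in. It does not call PyarrowQueryCompiler itself; the frame and its column names are hypothetical, and the only assumption is that the property returns a pandas.Series indexed by column label, as stated above.

```python
import pandas as pd

# Hypothetical two-column frame; plain pandas stands in for the Modin frame.
df = pd.DataFrame({"a": [1, 2, 3], "b": [0.5, 1.5, 2.5]})

# A Series whose index holds the column labels and whose values are dtypes.
dtypes = df.dtypes

for column, dtype in dtypes.items():
    print(column, dtype)
```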

query(expr, **kwargs)

Query columns of the QueryCompiler with a boolean expression.

Parameters
  • expr (str) – Boolean expression used to select rows.

  • **kwargs (dict) –

Returns

New QueryCompiler containing the rows where the boolean expression is satisfied.

Return type

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.query for more information about parameters and output format.
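The row-filtering semantics can be sketched with pandas.DataFrame.query, which modin.pandas.DataFrame.query mirrors. This is an illustration of the expression format only, not of how the PyArrow backend evaluates it; the frame and column names are hypothetical.

```python
import pandas as pd

# Hypothetical data; plain pandas is used as a stand-in for modin.pandas.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# Keep only the rows for which the boolean expression holds.
filtered = df.query("a > 1")

print(filtered)
```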

to_numpy(**kwargs)

Convert the underlying query compiler's data to a NumPy array.

Parameters
  • dtype (dtype) – The dtype of the resulting array.

  • copy (bool) – Whether to ensure that the returned value is not a view on another array.

  • na_value (object) – The value to replace missing values with.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns

The QueryCompiler converted to a NumPy array.

Return type

np.ndarray
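The effect of the dtype, copy, and na_value parameters can be sketched with pandas.DataFrame.to_numpy, which accepts the same three names. This is a stand-in for illustration and assumes the query compiler method follows the same semantics; the data is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one missing value in column "a".
df = pd.DataFrame({"a": [1.0, None], "b": [3.0, 4.0]})

# Request float64 output, force a copy, and replace missing values with 0.0.
arr = df.to_numpy(dtype="float64", copy=True, na_value=0.0)

print(arr)
```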

to_pandas()

Convert the underlying query compiler's data to pandas.DataFrame.

Returns

The QueryCompiler converted to pandas.

Return type

pandas.DataFrame