OmniSci Query Compiler

DFAlgQueryCompiler implements a query compiler for a lazy frame. Each compiler instance holds an instance of OmnisciOnNativeFrame, which is used to build a lazy execution tree.

Public API

class modin.experimental.backends.omnisci.query_compiler.DFAlgQueryCompiler(frame, shape_hint=None)

Query compiler for the OmniSci backend.

This class doesn’t perform much processing itself; it mostly forwards calls to OmnisciOnNativeFrame, which builds the lazy execution trees.

Parameters
  • frame (OmnisciOnNativeFrame) – Modin Frame to query with the compiled queries.

  • shape_hint ({"row", "column", None}, default: None) – Shape hint for frames known to be a column or a row, otherwise None.

_modin_frame

Modin Frame to query with the compiled queries.

Type

OmnisciOnNativeFrame

_shape_hint

Shape hint for frames known to be a column or a row, otherwise None.

Type

{"row", "column", None}
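
The forwarding pattern described above can be sketched in plain Python. This is only an illustration under stated assumptions: ToyLazyFrame and ToyQueryCompiler are hypothetical stand-ins invented for this example, not Modin classes, and the real OmnisciOnNativeFrame builds a much richer tree.

```python
class ToyLazyFrame:
    """Hypothetical lazy frame: records operations instead of running them."""

    def __init__(self, op, inputs=(), **params):
        self.op = op          # node name, e.g. "scan", "filter" or "map"
        self.inputs = inputs  # child nodes this node depends on
        self.params = params  # arguments captured for later execution

    def execute(self):
        # Walk the tree bottom-up only when a result is requested.
        if self.op == "scan":
            return list(self.params["data"])
        (child,) = self.inputs
        rows = child.execute()
        if self.op == "filter":
            return [r for r in rows if self.params["pred"](r)]
        if self.op == "map":
            return [self.params["fn"](r) for r in rows]
        raise NotImplementedError(self.op)


class ToyQueryCompiler:
    """Hypothetical compiler: does little itself and forwards to the frame."""

    def __init__(self, frame):
        self._frame = frame

    def filter(self, pred):
        return ToyQueryCompiler(ToyLazyFrame("filter", (self._frame,), pred=pred))

    def map(self, fn):
        return ToyQueryCompiler(ToyLazyFrame("map", (self._frame,), fn=fn))

    def to_list(self):
        return self._frame.execute()


qc = ToyQueryCompiler(ToyLazyFrame("scan", data=[1, 2, 3, 4]))
lazy = qc.filter(lambda x: x % 2 == 0).map(lambda x: x * 10)
result = lazy.to_list()  # nothing is computed until this call
```

Each compiler call only adds a node on top of the existing tree; the work happens when the result is materialized.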

add(other, **kwargs)

Perform element-wise addition (self + other).

If axes are not equal, perform frames alignment first.

Parameters
  • other (BaseQueryCompiler, scalar or array-like) – Other operand of the binary operation.

  • broadcast (bool, default: False) – If other is a one-column query compiler, indicates whether it is a Series or not. Frames and Series have to be processed differently; however, we can’t distinguish them at the query compiler level, so this parameter is a hint passed from the high-level API.

  • level (int or label) – In case of MultiIndex, match index values on the passed level.

  • axis ({0, 1}) – Axis to match indices along for 1D other (list or QueryCompiler that represents Series). 0 is for index, 1 is for columns.

  • fill_value (float or None) – Value to fill missing elements during frame alignment.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns

Result of binary operation.

Return type

BaseQueryCompiler
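
To make the alignment step and the role of fill_value concrete, here is a minimal sketch that uses plain dicts (label -> value) in place of frames. The helper aligned_add is hypothetical and only mirrors the pandas-style behaviour assumed here: labels are unioned, a value missing on one side is replaced by fill_value, and the result stays missing when no substitute is available.

```python
def aligned_add(left, right, fill_value=None):
    """Add two label -> value mappings after aligning their labels."""
    result = {}
    for label in sorted(set(left) | set(right)):
        a = left.get(label)
        b = right.get(label)
        if a is None and b is None:
            # Missing on both sides: stays missing regardless of fill_value.
            result[label] = None
            continue
        a = fill_value if a is None else a
        b = fill_value if b is None else b
        result[label] = None if a is None or b is None else a + b
    return result


left = {"x": 1, "y": 2}
right = {"y": 10, "z": 20}
no_fill = aligned_add(left, right)
zero_fill = aligned_add(left, right, fill_value=0)
```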

astype(col_dtypes, **kwargs)

Convert column dtypes to the given dtypes.

Parameters
  • col_dtypes (dict) – Map for column names and new dtypes.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns

New QueryCompiler with updated dtypes.

Return type

BaseQueryCompiler

cat_codes()

Convert the underlying categorical data into its codes.

Returns

New QueryCompiler containing the integer codes of the underlying categories.

Return type

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.cat.codes for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

columnarize()

Transpose this QueryCompiler if it has a single row but multiple columns.

This method should be called for QueryCompilers representing a Series object, i.e. self.is_series_like() should be True.

Returns

Transposed new QueryCompiler or self.

Return type

BaseQueryCompiler
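
The rule can be illustrated on a list-of-rows table. This is a sketch only; both helpers below are hypothetical and operate on nested lists rather than on a query compiler.

```python
def is_series_like(rows):
    """True if the table has a single row or a single column."""
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    return n_rows == 1 or n_cols == 1


def columnarize(rows):
    """Transpose a single-row, multi-column table into one column."""
    if len(rows) == 1 and len(rows[0]) > 1:
        return [[value] for value in rows[0]]
    return rows  # already a column (or not series-like): return unchanged


single_row = [[1, 2, 3]]
single_col = [[1], [2]]
as_column = columnarize(single_row)
unchanged = columnarize(single_col)
```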

property columns

Return frame’s columns.

Returns

Return type

pandas.Index

concat(axis, other, **kwargs)

Concatenate self with passed query compilers along specified axis.

Parameters
  • axis ({0, 1}) – Axis to concatenate along. 0 is for index and 1 is for columns.

  • other (BaseQueryCompiler or list of such) – Objects to concatenate with self.

  • join ({'outer', 'inner', 'right', 'left'}, default: 'outer') – Type of join that will be used if indices on the other axis are different. (note: if specified, has to be passed as join=value).

  • ignore_index (bool, default: False) – If True, do not use the index values along the concatenation axis. The resulting axis will be labeled 0, …, n - 1. (note: if specified, has to be passed as ignore_index=value).

  • sort (bool, default: False) – Whether or not to sort non-concatenation axis. (note: if specified, has to be passed as sort=value).

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns

Concatenated objects.

Return type

BaseQueryCompiler
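
The effect of join on the non-concatenation axis can be sketched as follows for axis=0. The helper concat_rows is hypothetical, represents frames as lists of row dicts, and covers only the "outer" and "inner" modes; missing cells are shown as None.

```python
def concat_rows(frames, join="outer"):
    """Stack frames (lists of row dicts) along axis 0."""
    column_sets = [set().union(*(row.keys() for row in f)) for f in frames]
    if join == "outer":
        columns = set().union(*column_sets)       # keep every column
    elif join == "inner":
        columns = set.intersection(*column_sets)  # keep shared columns only
    else:
        raise ValueError(f"unsupported join in this sketch: {join!r}")
    columns = sorted(columns)
    return [{c: row.get(c) for c in columns} for f in frames for row in f]


a = [{"x": 1, "y": 2}]
b = [{"y": 3, "z": 4}]
outer = concat_rows([a, b], join="outer")
inner = concat_rows([a, b], join="inner")
```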

copy()

Make a copy of this object.

Returns

Copy of self.

Return type

BaseQueryCompiler

Notes

Copying also copies all of the metadata, so that later modifications to this object do not alter the metadata of its copies.

count(**kwargs)

Get the number of non-NaN values for each column or row.

Parameters
  • axis ({0, 1}) –

  • level (None, default: None) – Serves the compatibility purpose. Always has to be None.

  • numeric_only (bool, optional) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns

One-column QueryCompiler with index labels of the specified axis, where each row contains the number of non-NaN values for the corresponding row or column.

Return type

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.count for more information about parameters and output format.
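
As a rough illustration of the output shape, the sketch below counts non-missing values over a dict of columns. The helper count_non_nan is hypothetical, and it treats both None and float NaN as missing, which is an assumption made for this example.

```python
import math


def count_non_nan(columns, axis=0):
    """Count non-missing values per column (axis=0) or per row (axis=1)."""

    def present(value):
        is_nan = isinstance(value, float) and math.isnan(value)
        return not (value is None or is_nan)

    if axis == 0:
        # One result per column label.
        return {label: sum(present(v) for v in values)
                for label, values in columns.items()}
    # One result per row position.
    n_rows = len(next(iter(columns.values()), []))
    return {i: sum(present(values[i]) for values in columns.values())
            for i in range(n_rows)}


data = {"a": [1.0, float("nan"), 3.0], "b": [None, "x", "y"]}
per_column = count_non_nan(data, axis=0)
per_row = count_non_nan(data, axis=1)
```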

default_to_pandas(pandas_op, *args, **kwargs)

Fall back to pandas for the passed function.

Parameters
  • pandas_op (callable(pandas.DataFrame) -> object) – Function to apply to the frame after it has been converted to pandas.

  • *args (iterable) – Positional arguments to pass to pandas_op.

  • **kwargs (dict) – Keyword arguments to pass to pandas_op.

Returns

The result of the pandas_op, converted back to BaseQueryCompiler.

Return type

BaseQueryCompiler

drop(index=None, columns=None)

Drop specified rows or columns.

Parameters
  • index (list of labels, optional) – Labels of rows to drop.

  • columns (list of labels, optional) – Labels of columns to drop.

Returns

New QueryCompiler with removed data.

Return type

BaseQueryCompiler

dropna(axis=0, how='any', thresh=None, subset=None)

Remove missing values.

Parameters
  • axis ({0, 1}) –

  • how ({"any", "all"}) –

  • thresh (int, optional) –

  • subset (list of labels) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns

New QueryCompiler with null values dropped along given axis.

Return type

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.dropna for more information about parameters and output format.
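
The difference between how="any" and how="all" for axis=0 can be sketched on a list of rows. The helper dropna_rows is hypothetical, uses None as the only missing marker, and ignores thresh and subset.

```python
def dropna_rows(rows, how="any"):
    """Drop rows that contain any (or only) missing values."""
    if how not in ("any", "all"):
        raise ValueError(f"how must be 'any' or 'all', got {how!r}")
    check = any if how == "any" else all
    return [row for row in rows if not check(v is None for v in row)]


rows = [[1, None], [None, None], [2, 3]]
kept_any = dropna_rows(rows, how="any")
kept_all = dropna_rows(rows, how="all")
```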

dt_day()

Get day component for each datetime value.

Returns

New QueryCompiler with the same shape as self, where each element is day component for the corresponding datetime value.

Return type

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.day for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

dt_month()

Get month component for each datetime value.

Returns

New QueryCompiler with the same shape as self, where each element is month component for the corresponding datetime value.

Return type

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.month for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

dt_year()

Get year component for each datetime value.

Returns

New QueryCompiler with the same shape as self, where each element is year component for the corresponding datetime value.

Return type

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.year for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

property dtypes

Get columns dtypes.

Returns

Series with dtypes of each column.

Return type

pandas.Series

eq(other, **kwargs)

Perform element-wise equality comparison (self == other).

If axes are not equal, perform frames alignment first.

Parameters
  • other (BaseQueryCompiler, scalar or array-like) – Other operand of the binary operation.

  • broadcast (bool, default: False) – If other is a one-column query compiler, indicates whether it is a Series or not. Frames and Series have to be processed differently; however, we can’t distinguish them at the query compiler level, so this parameter is a hint passed from the high-level API.

  • level (int or label) – In case of MultiIndex, match index values on the passed level.

  • axis ({0, 1}) – Axis to match indices along for 1D other (list or QueryCompiler that represents Series). 0 is for index, 1 is for columns.

  • fill_value (float or None) – Value to fill missing elements during frame alignment.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns

Result of binary operation.

Return type

BaseQueryCompiler

fillna(squeeze_self=False, squeeze_value=False, value=None, method=None, axis=None, inplace=False, limit=None, downcast=None)

Replace NaN values using provided method.

Parameters
  • value (scalar or dict) –

  • method ({"backfill", "bfill", "pad", "ffill", None}) –

  • axis ({0, 1}) –

  • inplace ({False}) – This parameter serves the compatibility purpose. Always has to be False.

  • limit (int, optional) –

  • downcast (dict, optional) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns

New QueryCompiler with all null values filled.

Return type

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.fillna for more information about parameters and output format.

finalize()

Finalize constructing the dataframe by calling all deferred functions that were used to build it.

floordiv(other, **kwargs)

Perform element-wise integer division (self // other).

If axes are not equal, perform frames alignment first.

Parameters
  • other (BaseQueryCompiler, scalar or array-like) – Other operand of the binary operation.

  • broadcast (bool, default: False) – If other is a one-column query compiler, indicates whether it is a Series or not. Frames and Series have to be processed differently; however, we can’t distinguish them at the query compiler level, so this parameter is a hint passed from the high-level API.

  • level (int or label) – In case of MultiIndex, match index values on the passed level.

  • axis ({0, 1}) – Axis to match indices along for 1D other (list or QueryCompiler that represents Series). 0 is for index, 1 is for columns.

  • fill_value (float or None) – Value to fill missing elements during frame alignment.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns

Result of binary operation.

Return type

BaseQueryCompiler

free()

Trigger a cleanup of this object.

classmethod from_arrow(at, data_cls)

Build QueryCompiler from Arrow Table.

Parameters
  • at (Arrow Table) – The Arrow Table to convert from.

  • data_cls (type) – BasePandasFrame class (or its descendant) to convert to.

Returns

QueryCompiler containing data from the Arrow Table.

Return type

BaseQueryCompiler

classmethod from_pandas(df, data_cls)

Build QueryCompiler from pandas DataFrame.

Parameters
  • df (pandas.DataFrame) – The pandas DataFrame to convert from.

  • data_cls (type) – BasePandasFrame class (or its descendant) to convert to.

Returns

QueryCompiler containing data from the pandas DataFrame.

Return type

BaseQueryCompiler

ge(other, **kwargs)

Perform element-wise greater than or equal comparison (self >= other).

If axes are not equal, perform frames alignment first.

Parameters
  • other (BaseQueryCompiler, scalar or array-like) – Other operand of the binary operation.

  • broadcast (bool, default: False) – If other is a one-column query compiler, indicates whether it is a Series or not. Frames and Series have to be processed differently; however, we can’t distinguish them at the query compiler level, so this parameter is a hint passed from the high-level API.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns

Result of binary operation.

Return type

BaseQueryCompiler

get_index_name(axis=0)

Get index name of specified axis.

Parameters

axis ({0, 1}, default: 0) – Axis to get index name on.

Returns

Index name, None for MultiIndex.

Return type

hashable

get_index_names(axis=0)

Get index names of specified axis.

Parameters

axis ({0, 1}, default: 0) – Axis to get index names on.

Returns

Index names.

Return type

list

getitem_array(key)

Mask QueryCompiler with key.

Parameters

key (BaseQueryCompiler, np.ndarray or list of column labels) – Boolean mask represented by QueryCompiler or np.ndarray of the same shape as self, or enumerable of columns to pick.

Returns

New masked QueryCompiler.

Return type

BaseQueryCompiler

getitem_column_array(key, numeric=False)

Get column data for target labels.

Parameters
  • key (list-like) – Target labels by which to retrieve data.

  • numeric (bool, default: False) – Whether or not the key passed in represents the numeric index or the named index.

Returns

New QueryCompiler that contains specified columns.

Return type

BaseQueryCompiler

groupby_agg(by, is_multi_by, axis, agg_func, agg_args, agg_kwargs, groupby_kwargs, drop=False)

Group QueryCompiler data and apply passed aggregation function.

Parameters
  • by (BaseQueryCompiler, column or index label, Grouper or list of such) – Object that determines groups.

  • is_multi_by (bool) – If by is a QueryCompiler or list of such, indicates whether it’s grouping on multiple columns/rows.

  • axis ({0, 1}) – Axis to group and apply aggregation function along. 0 is for index, 1 is for columns.

  • agg_func (dict or callable(DataFrameGroupBy) -> DataFrame) – Function to apply to the GroupBy object.

  • agg_args (dict) – Positional arguments to pass to the agg_func.

  • agg_kwargs (dict) – Keyword arguments to pass to the agg_func.

  • groupby_kwargs (dict) – GroupBy parameters as expected by modin.pandas.DataFrame.groupby signature.

  • drop (bool, default: False) – If by is a QueryCompiler, indicates whether or not by-data came from the self.

Returns

QueryCompiler containing the result of groupby aggregation.

Return type

BaseQueryCompiler

Notes

Please refer to modin.pandas.GroupBy.aggregate for more information about parameters and output format.

groupby_count(by, axis, groupby_args, map_args, **kwargs)

Group QueryCompiler data and count non-null values for every group.

Parameters
  • by (BaseQueryCompiler, column or index label, Grouper or list of such) – Object that determines groups.

  • axis ({0, 1}) – Axis to group and apply reduction function along. 0 is for index, 1 is for columns.

  • groupby_args (dict) – GroupBy parameters as expected by modin.pandas.DataFrame.groupby signature.

  • map_args (dict) – Keyword arguments to pass to the reduction function. If GroupBy is implemented via MapReduce approach, this argument is passed at the map phase only.

  • reduce_args (dict, optional) – If GroupBy is implemented with MapReduce approach, specifies arguments to pass to the reduction function at the reduce phase, has no effect otherwise.

  • numeric_only (bool, default: True) – Whether or not to drop non-numeric columns before executing GroupBy.

  • drop (bool, default: False) – If by is a QueryCompiler, indicates whether or not by-data came from the self.

Returns

  • BaseQueryCompiler – QueryCompiler containing the result of groupby reduction built by the following rules:

    • Labels on the opposite of the specified axis are preserved.

    • If groupby_args["as_index"] is True, then labels on the specified axis are the group names; otherwise labels are the default: 0, 1 … n.

    • If groupby_args["as_index"] is False, then the first N columns/rows of the frame contain group names, where N is the number of columns/rows to group on.

    • Each element of QueryCompiler is the number of non-null values for the corresponding group and column/row.

Warning

The map_args and reduce_args parameters are deprecated. They leaked here from PandasQueryCompiler.groupby_*: the pandas backend implements groupby via a MapReduce approach, but for other backends these parameters make no sense, so they will be removed in the future.

Notes

Please refer to modin.pandas.GroupBy.count for more information about parameters and output format.
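
The labelling rules for the result can be sketched in plain Python. The helper below is hypothetical: it groups a list of row dicts on a single column along axis 0, treats None as the only null marker, and returns a label -> row mapping to make the resulting labels visible.

```python
def groupby_count(rows, by, as_index=True):
    """Count non-null values per group for every non-key column."""
    value_columns = [c for c in rows[0] if c != by]
    counts = {}
    for row in rows:
        group = counts.setdefault(row[by], {c: 0 for c in value_columns})
        for c in value_columns:
            if row[c] is not None:
                group[c] += 1
    names = sorted(counts)
    if as_index:
        # Group names become the labels on the grouped axis.
        return {name: counts[name] for name in names}
    # Default labels 0, 1, ...; group names go into the first column.
    return {i: {by: name, **counts[name]} for i, name in enumerate(names)}


rows = [{"k": "a", "v": 1}, {"k": "a", "v": None}, {"k": "b", "v": 5}]
indexed = groupby_count(rows, by="k", as_index=True)
flat = groupby_count(rows, by="k", as_index=False)
```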

groupby_size(by, axis, groupby_args, map_args, **kwargs)

Group QueryCompiler data and get the number of elements for every group.

Parameters
  • by (BaseQueryCompiler, column or index label, Grouper or list of such) – Object that determines groups.

  • axis ({0, 1}) – Axis to group and apply reduction function along. 0 is for index, 1 is for columns.

  • groupby_args (dict) – GroupBy parameters as expected by modin.pandas.DataFrame.groupby signature.

  • map_args (dict) – Keyword arguments to pass to the reduction function. If GroupBy is implemented via MapReduce approach, this argument is passed at the map phase only.

  • reduce_args (dict, optional) – If GroupBy is implemented with MapReduce approach, specifies arguments to pass to the reduction function at the reduce phase, has no effect otherwise.

  • numeric_only (bool, default: True) – Whether or not to drop non-numeric columns before executing GroupBy.

  • drop (bool, default: False) – If by is a QueryCompiler, indicates whether or not by-data came from the self.

Returns

  • BaseQueryCompiler – QueryCompiler containing the result of groupby reduction built by the following rules:

    • Labels on the opposite of the specified axis are preserved.

    • If groupby_args["as_index"] is True, then labels on the specified axis are the group names; otherwise labels are the default: 0, 1 … n.

    • If groupby_args["as_index"] is False, then the first N columns/rows of the frame contain group names, where N is the number of columns/rows to group on.

    • Each element of QueryCompiler is the number of elements for the corresponding group and column/row.

Warning

The map_args and reduce_args parameters are deprecated. They leaked here from PandasQueryCompiler.groupby_*: the pandas backend implements groupby via a MapReduce approach, but for other backends these parameters make no sense, so they will be removed in the future.

Notes

Please refer to modin.pandas.GroupBy.size for more information about parameters and output format.

groupby_sum(by, axis, groupby_args, map_args, **kwargs)

Group QueryCompiler data and compute sum for every group.

Parameters
  • by (BaseQueryCompiler, column or index label, Grouper or list of such) – Object that determines groups.

  • axis ({0, 1}) – Axis to group and apply reduction function along. 0 is for index, 1 is for columns.

  • groupby_args (dict) – GroupBy parameters as expected by modin.pandas.DataFrame.groupby signature.

  • map_args (dict) – Keyword arguments to pass to the reduction function. If GroupBy is implemented via MapReduce approach, this argument is passed at the map phase only.

  • reduce_args (dict, optional) – If GroupBy is implemented with MapReduce approach, specifies arguments to pass to the reduction function at the reduce phase, has no effect otherwise.

  • numeric_only (bool, default: True) – Whether or not to drop non-numeric columns before executing GroupBy.

  • drop (bool, default: False) – If by is a QueryCompiler, indicates whether or not by-data came from the self.

Returns

  • BaseQueryCompiler – QueryCompiler containing the result of groupby reduction built by the following rules:

    • Labels on the opposite of the specified axis are preserved.

    • If groupby_args["as_index"] is True, then labels on the specified axis are the group names; otherwise labels are the default: 0, 1 … n.

    • If groupby_args["as_index"] is False, then the first N columns/rows of the frame contain group names, where N is the number of columns/rows to group on.

    • Each element of QueryCompiler is the sum for the corresponding group and column/row.

Warning

The map_args and reduce_args parameters are deprecated. They leaked here from PandasQueryCompiler.groupby_*: the pandas backend implements groupby via a MapReduce approach, but for other backends these parameters make no sense, so they will be removed in the future.

Notes

Please refer to modin.pandas.GroupBy.sum for more information about parameters and output format.

gt(other, **kwargs)

Perform element-wise greater than comparison (self > other).

If axes are not equal, perform frames alignment first.

Parameters
  • other (BaseQueryCompiler, scalar or array-like) – Other operand of the binary operation.

  • broadcast (bool, default: False) – If other is a one-column query compiler, indicates whether it is a Series or not. Frames and Series have to be processed differently; however, we can’t distinguish them at the query compiler level, so this parameter is a hint passed from the high-level API.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns

Result of binary operation.

Return type

BaseQueryCompiler

has_multiindex(axis=0)

Check if specified axis is indexed by MultiIndex.

Parameters

axis ({0, 1}, default: 0) – The axis to check (0 - index, 1 - columns).

Returns

True if index at specified axis is MultiIndex and False otherwise.

Return type

bool

property index

Return frame’s index.

Returns

Return type

pandas.Index

insert(loc, column, value)

Insert new column.

Parameters
  • loc (int) – Insertion position.

  • column (label) – Label of the new column.

  • value (One-column BaseQueryCompiler, 1D array or scalar) – Data to fill new column with.

Returns

QueryCompiler with new column inserted.

Return type

BaseQueryCompiler

is_series_like()

Check whether this QueryCompiler can represent a modin.pandas.Series object.

Returns

Return True if QueryCompiler has a single column or row, False otherwise.

Return type

bool

le(other, **kwargs)

Perform element-wise less than or equal comparison (self <= other).

If axes are not equal, perform frames alignment first.

Parameters
  • other (BaseQueryCompiler, scalar or array-like) – Other operand of the binary operation.

  • broadcast (bool, default: False) – If other is a one-column query compiler, indicates whether it is a Series or not. Frames and Series have to be processed differently; however, we can’t distinguish them at the query compiler level, so this parameter is a hint passed from the high-level API.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns

Result of binary operation.

Return type

BaseQueryCompiler

lt(other, **kwargs)

Perform element-wise less than comparison (self < other).

If axes are not equal, perform frames alignment first.

Parameters
  • other (BaseQueryCompiler, scalar or array-like) – Other operand of the binary operation.

  • broadcast (bool, default: False) – If other is a one-column query compiler, indicates whether it is a Series or not. Frames and Series have to be processed differently; however, we can’t distinguish them at the query compiler level, so this parameter is a hint passed from the high-level API.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns

Result of binary operation.

Return type

BaseQueryCompiler

max(**kwargs)

Get the maximum value for each column or row.

Parameters
  • axis ({0, 1}) –

  • level (None, default: None) – Serves the compatibility purpose. Always has to be None.

  • numeric_only (bool, optional) –

  • skipna (bool, default: True) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns

One-column QueryCompiler with index labels of the specified axis, where each row contains the maximum value for the corresponding row or column.

Return type

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.max for more information about parameters and output format.

mean(**kwargs)

Get the mean value for each column or row.

Parameters
  • axis ({0, 1}) –

  • level (None, default: None) – Serves the compatibility purpose. Always has to be None.

  • numeric_only (bool, optional) –

  • skipna (bool, default: True) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns

One-column QueryCompiler with index labels of the specified axis, where each row contains the mean value for the corresponding row or column.

Return type

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.mean for more information about parameters and output format.

merge(right, **kwargs)

Merge QueryCompiler objects using a database-style join.

Parameters
  • right (BaseQueryCompiler) – QueryCompiler of the right frame to merge with.

  • how ({"left", "right", "outer", "inner", "cross"}) –

  • on (label or list of such) –

  • left_on (label or list of such) –

  • right_on (label or list of such) –

  • left_index (bool) –

  • right_index (bool) –

  • sort (bool) –

  • suffixes (list-like) –

  • copy (bool) –

  • indicator (bool or str) –

  • validate (str) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns

QueryCompiler that contains result of the merge.

Return type

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.merge for more information about parameters and output format.
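
For readers unfamiliar with the term, a database-style inner join on one key can be sketched as a hash join over lists of row dicts. The helper inner_merge is hypothetical; it handles only how="inner" with a single on label and does not apply suffixes to overlapping column names.

```python
def inner_merge(left, right, on):
    """Inner-join two lists of row dicts on a shared key column."""
    # Build a lookup from key value to all matching right-hand rows.
    lookup = {}
    for right_row in right:
        lookup.setdefault(right_row[on], []).append(right_row)
    merged = []
    for left_row in left:
        # Rows without a match on the other side are dropped.
        for right_row in lookup.get(left_row[on], []):
            merged.append({**left_row, **right_row})
    return merged


left = [{"id": 1, "a": "x"}, {"id": 2, "a": "y"}]
right = [{"id": 2, "b": "p"}, {"id": 2, "b": "q"}, {"id": 3, "b": "r"}]
joined = inner_merge(left, right, on="id")
```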

min(**kwargs)

Get the minimum value for each column or row.

Parameters
  • axis ({0, 1}) –

  • level (None, default: None) – Serves the compatibility purpose. Always has to be None.

  • numeric_only (bool, optional) –

  • skipna (bool, default: True) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns

One-column QueryCompiler with index labels of the specified axis, where each row contains the minimum value for the corresponding row or column.

Return type

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.min for more information about parameters and output format.

mod(other, **kwargs)

Perform element-wise modulo (self % other).

If axes are not equal, perform frames alignment first.

Parameters
  • other (BaseQueryCompiler, scalar or array-like) – Other operand of the binary operation.

  • broadcast (bool, default: False) – If other is a one-column query compiler, indicates whether it is a Series or not. Frames and Series have to be processed differently; however, we can’t distinguish them at the query compiler level, so this parameter is a hint passed from the high-level API.

  • level (int or label) – In case of MultiIndex, match index values on the passed level.

  • axis ({0, 1}) – Axis to match indices along for 1D other (list or QueryCompiler that represents Series). 0 is for index, 1 is for columns.

  • fill_value (float or None) – Value to fill missing elements during frame alignment.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns

Result of binary operation.

Return type

BaseQueryCompiler

mul(other, **kwargs)

Perform element-wise multiplication (self * other).

If axes are not equal, perform frames alignment first.

Parameters
  • other (BaseQueryCompiler, scalar or array-like) – Other operand of the binary operation.

  • broadcast (bool, default: False) – If other is a one-column query compiler, indicates whether it is a Series or not. Frames and Series have to be processed differently; however, we can’t distinguish them at the query compiler level, so this parameter is a hint passed from the high-level API.

  • level (int or label) – In case of MultiIndex, match index values on the passed level.

  • axis ({0, 1}) – Axis to match indices along for 1D other (list or QueryCompiler that represents Series). 0 is for index, 1 is for columns.

  • fill_value (float or None) – Value to fill missing elements during frame alignment.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns

Result of binary operation.

Return type

BaseQueryCompiler

ne(other, **kwargs)

Perform element-wise not equal comparison (self != other).

If axes are not equal, perform frames alignment first.

Parameters
  • other (BaseQueryCompiler, scalar or array-like) – Other operand of the binary operation.

  • broadcast (bool, default: False) – If other is a one-column query compiler, indicates whether it is a Series or not. Frames and Series have to be processed differently; however, we can’t distinguish them at the query compiler level, so this parameter is a hint passed from the high-level API.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns

Result of binary operation.

Return type

BaseQueryCompiler

nunique(axis=0, dropna=True)

Get the number of unique values for each column or row.

Parameters
  • axis ({0, 1}) –

  • dropna (bool) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns

One-column QueryCompiler with index labels of the specified axis, where each row contains the number of unique values for the corresponding row or column.

Return type

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.nunique for more information about parameters and output format.
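
The effect of dropna can be sketched for a single column. The helper nunique below is hypothetical, and it assumes that None is the only missing marker and that missing values count as one extra unique value when dropna is False.

```python
def nunique(values, dropna=True):
    """Count distinct values, optionally counting 'missing' as one value."""
    present = {v for v in values if v is not None}
    has_missing = any(v is None for v in values)
    extra = 1 if (has_missing and not dropna) else 0
    return len(present) + extra


column = [1, 1, 2, None]
without_missing = nunique(column, dropna=True)
with_missing = nunique(column, dropna=False)
```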

reset_index(**kwargs)

Reset the index, or a level of it.

Parameters
  • drop (bool) – Whether to drop the reset index or insert it at the beginning of the frame.

  • level (int or label, optional) – Level to remove from index. Removes all levels by default.

  • col_level (int or label) – If the columns have multiple levels, determines which level the labels are inserted into.

  • col_fill (label) – If the columns have multiple levels, determines how the other levels are named.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns

QueryCompiler with reset index.

Return type

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.reset_index for more information about parameters and output format.

set_index_name(name, axis=0)

Set index name for the specified axis.

Parameters
  • name (hashable) – New index name.

  • axis ({0, 1}, default: 0) – Axis to set name along.

set_index_names(names=None, axis=0)

Set index names for the specified axis.

Parameters
  • names (list) – New index names.

  • axis ({0, 1}, default: 0) – Axis to set names along.

setitem(axis, key, value)

Set the row/column defined by key to the value provided.

Parameters
  • axis ({0, 1}) – Axis to set value along. 0 means set row, 1 means set column.

  • key (label) – Row/column label to set value in.

  • value (BaseQueryCompiler, list-like or scalar) – Define new row/column value.

Returns

New QueryCompiler with updated key value.

Return type

BaseQueryCompiler

sort_rows_by_column_values(columns, ascending=True, **kwargs)

Reorder the rows based on the lexicographic order of the given columns.

Parameters
  • columns (label or list of labels) – The column or columns to sort by.

  • ascending (bool, default: True) – Sort in ascending order (True) or descending order (False).

  • kind ({"quicksort", "mergesort", "heapsort"}) –

  • na_position ({"first", "last"}) –

  • ignore_index (bool) –

  • key (callable(pandas.Index) -> pandas.Index, optional) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns

New QueryCompiler that contains result of the sort.

Return type

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.sort_values for more information about parameters and output format.
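
Lexicographic ordering over several columns means that rows are compared on the first column, with ties broken by the next one, and so on. A minimal sketch follows, with a hypothetical sort_rows helper over row dicts that ignores kind, na_position, ignore_index and key.

```python
def sort_rows(rows, columns, ascending=True):
    """Sort row dicts by the tuple of values in the given columns."""
    if isinstance(columns, str):
        columns = [columns]  # accept a single label as well as a list
    return sorted(
        rows,
        key=lambda row: tuple(row[c] for c in columns),
        reverse=not ascending,
    )


rows = [{"a": 2, "b": 1}, {"a": 1, "b": 2}, {"a": 1, "b": 1}]
ascending_rows = sort_rows(rows, ["a", "b"])
descending_rows = sort_rows(rows, ["a", "b"], ascending=False)
```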

sub(other, **kwargs)

Perform element-wise subtraction (self - other).

If axes are not equal, perform frames alignment first.

Parameters
  • other (BaseQueryCompiler, scalar or array-like) – Other operand of the binary operation.

  • broadcast (bool, default: False) – If other is a one-column query compiler, indicates whether it is a Series or not. Frames and Series have to be processed differently; however, we can’t distinguish them at the query compiler level, so this parameter is a hint passed from the high-level API.

  • level (int or label) – In case of MultiIndex, match index values on the passed level.

  • axis ({0, 1}) – Axis to match indices along for 1D other (list or QueryCompiler that represents Series). 0 is for index, 1 is for columns.

  • fill_value (float or None) – Value to fill missing elements during frame alignment.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns

Result of binary operation.

Return type

BaseQueryCompiler

sum(**kwargs)

Get the sum for each column or row.

Parameters
  • axis ({0, 1}) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns

One-column QueryCompiler with index labels of the specified axis, where each row contains the sum for the corresponding row or column.

Return type

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.sum for more information about parameters and output format.

to_pandas()

Convert the underlying query compiler’s data to a pandas.DataFrame.

Returns

The QueryCompiler converted to pandas.

Return type

pandas.DataFrame

truediv(other, **kwargs)

Perform element-wise division (self / other).

If axes are not equal, perform frames alignment first.

Parameters
  • other (BaseQueryCompiler, scalar or array-like) – Other operand of the binary operation.

  • broadcast (bool, default: False) – If other is a one-column query compiler, indicates whether it is a Series or not. Frames and Series have to be processed differently; however, we can’t distinguish them at the query compiler level, so this parameter is a hint passed from the high-level API.

  • level (int or label) – In case of MultiIndex, match index values on the passed level.

  • axis ({0, 1}) – Axis to match indices along for 1D other (list or QueryCompiler that represents Series). 0 is for index, 1 is for columns.

  • fill_value (float or None) – Value to fill missing elements during frame alignment.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns

Result of binary operation.

Return type

BaseQueryCompiler

view(index=None, columns=None)

Mask QueryCompiler with passed keys.

Parameters
  • index (list of ints, optional) – Positional indices of rows to grab.

  • columns (list of ints, optional) – Positional indices of columns to grab.

Returns

New masked QueryCompiler.

Return type

BaseQueryCompiler
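
Positional masking can be sketched on a list-of-rows table. The view function below is a hypothetical stand-in that only illustrates how the two positional lists select rows and columns; passing None keeps the whole axis.

```python
def view(rows, index=None, columns=None):
    """Pick rows and columns by integer position."""
    picked = rows if index is None else [rows[i] for i in index]
    if columns is None:
        return [list(row) for row in picked]
    return [[row[j] for j in columns] for row in picked]


data = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
subset = view(data, index=[2, 0], columns=[1])
everything = view(data)
```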