BaseQueryCompiler#

Brief description#

BaseQueryCompiler is an abstract query compiler class that defines the common interface every other query compiler implementation in Modin must follow. The base class contains basic implementations for most of the interface methods, all of which fall back to pandas.

Subclassing BaseQueryCompiler#

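If you want to add a new type of query compiler to Modin, the new class needs to inherit from BaseQueryCompiler and implement its abstract methods. The example below implements each of them: from_pandas, from_arrow, to_pandas, default_to_pandas, finalize and free.

(Please refer to the code documentation to see the full documentation for these functions.)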

This is the minimum set of operations needed for a new query compiler to function in the Modin architecture. The rest of the API can safely default to pandas via the base class implementation. To add a storage-format-specific implementation for a query compiler operation, override the corresponding method in your query compiler class.
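The override pattern described above can be sketched without Modin at all. The two classes below are hypothetical stand-ins (MiniBaseCompiler and CountingCompiler are not Modin names); they only illustrate how a default method can route through a pandas fallback while a subclass replaces a single operation.

```python
import pandas


class MiniBaseCompiler:
    """Hypothetical stand-in for a base query compiler (not Modin's class)."""

    def __init__(self, pandas_df):
        self._pandas_df = pandas_df

    def to_pandas(self):
        return self._pandas_df.copy()

    def default_to_pandas(self, pandas_op, *args, **kwargs):
        # Generic fallback: materialize as pandas, run the operation, re-wrap.
        return type(self)(pandas_op(self.to_pandas(), *args, **kwargs))

    def abs(self):
        # Default implementation goes through the pandas fallback.
        return self.default_to_pandas(pandas.DataFrame.abs)


class CountingCompiler(MiniBaseCompiler):
    """Hypothetical subclass that overrides exactly one operation."""

    native_calls = 0

    def abs(self):
        # Storage-format-specific path; here it only records that it ran.
        type(self).native_calls += 1
        return type(self)(self._pandas_df.abs())


qc = CountingCompiler(pandas.DataFrame({"a": [-1, 2]}))
result = qc.abs().to_pandas()
```

Every method that the subclass does not override keeps using the fallback, which is the property the base class relies on.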

Example#

As an exercise, let’s define a new query compiler in Modin, just to see how easy it is. Usually, the query compiler routes formed queries to the underlying frame class, which submits operators to an execution engine. To keep this example simple and self-contained, our execution engine will be pandas itself.

We need to inherit a new class from BaseQueryCompiler and implement all of the abstract methods. In this case, with pandas as an execution engine, it’s trivial:

from modin.core.storage_formats import BaseQueryCompiler

class DefaultToPandasQueryCompiler(BaseQueryCompiler):
    def __init__(self, pandas_df):
        self._pandas_df = pandas_df

    @classmethod
    def from_pandas(cls, df, *args, **kwargs):
        return cls(df)

    @classmethod
    def from_arrow(cls, at, *args, **kwargs):
        return cls(at.to_pandas())

    def to_pandas(self):
        return self._pandas_df.copy()

    def default_to_pandas(self, pandas_op, *args, **kwargs):
        return type(self)(pandas_op(self.to_pandas(), *args, **kwargs))

    def finalize(self):
        pass

    def free(self):
        pass

All done! Now you’ve got a fully functional query compiler, which is ready for extensions and can already be used in a Modin DataFrame:

import pandas
pandas_df = pandas.DataFrame({"col1": [1, 2, 2, 1], "col2": [10, 2, 3, 40]})
# Building our query compiler from pandas object
qc = DefaultToPandasQueryCompiler.from_pandas(pandas_df)

import modin.pandas as pd
# Building Modin DataFrame from newly created query compiler
modin_df = pd.DataFrame(query_compiler=qc)

# Got fully functional Modin DataFrame
print(modin_df.groupby("col1").sum().reset_index())
#    col1  col2
# 0     1    50
# 1     2     5

To be able to select this query compiler as the default via modin.config, you also need to define the combination of your query compiler and the pandas engine as an execution by adding the corresponding factory. For more information about factories, see the dispatching page.

Query Compiler API#

class modin.core.storage_formats.base.query_compiler.BaseQueryCompiler#

Abstract class that handles the queries to Modin dataframes.

This class defines the common query compiler API. Most of the methods are already implemented and default to pandas.

lazy_execution#

Whether the underlying execution engine is designed to be executed in lazy mode only. If True, such a QueryCompiler is handled differently at the front end in order to reduce execution triggering as much as possible.

Type:

bool

_shape_hint#

Shape hint for frames known to be a column or a row, otherwise None.

Type:

{“row”, “column”, None}, default: None

Notes

See the Abstract Methods and Fields section immediately below this for a list of requirements for subclassing this object.

abs()#

Get absolute numeric value of each element.

Returns:

QueryCompiler with absolute numeric value of each element.

Return type:

BaseQueryCompiler

add(other, **kwargs)#

Perform element-wise addition (self + other).

If axes are not equal, perform frames alignment first.

Parameters:
  • other (BaseQueryCompiler, scalar or array-like) – Other operand of the binary operation.

  • broadcast (bool, default: False) – If other is a one-column query compiler, indicates whether it is a Series or not. Frames and Series have to be processed differently, however we can’t distinguish them at the query compiler level, so this parameter is a hint that is passed from a high-level API.

  • level (int or label) – In case of MultiIndex match index values on the passed level.

  • axis ({0, 1}) – Axis to match indices along for 1D other (list or QueryCompiler that represents Series). 0 is for index, 1 is for columns.

  • fill_value (float or None) – Value to fill missing elements during frame alignment.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

Result of binary operation.

Return type:

BaseQueryCompiler
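As a rough illustration of the alignment and fill_value behaviour described above, here is the equivalent operation in plain pandas. This is a sketch that assumes the query compiler follows pandas semantics; it does not call the query compiler itself.

```python
import pandas

left = pandas.DataFrame({"a": [1, 2]}, index=["x", "y"])
right = pandas.DataFrame({"a": [10, 20]}, index=["y", "z"])

# Axes differ, so the frames are aligned first; labels present in only
# one frame produce NaN in the result.
plain = left.add(right)

# With fill_value, the missing side is treated as 0 before adding.
filled = left.add(right, fill_value=0)
```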

add_prefix(prefix, axis=1)#

Add string prefix to the index labels along specified axis.

Parameters:
  • prefix (str) – The string to add before each label.

  • axis ({0, 1}, default: 1) – Axis to add prefix along. 0 is for index and 1 is for columns.

Returns:

New query compiler with updated labels.

Return type:

BaseQueryCompiler

add_suffix(suffix, axis=1)#

Add string suffix to the index labels along specified axis.

Parameters:
  • suffix (str) – The string to add after each label.

  • axis ({0, 1}, default: 1) – Axis to add suffix along. 0 is for index and 1 is for columns.

Returns:

New query compiler with updated labels.

Return type:

BaseQueryCompiler
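The effect of add_prefix and add_suffix on labels can be shown with plain pandas, assuming the query compiler mirrors it. The row-label case is written with rename here so the sketch does not depend on which pandas versions accept an axis keyword.

```python
import pandas

df = pandas.DataFrame({"a": [1], "b": [2]}, index=["r0"])

# axis=1 in the query compiler API corresponds to column labels.
with_prefix = df.add_prefix("col_")
with_suffix = df.add_suffix("_v1")

# axis=0 corresponds to row labels; emulated with rename.
row_prefixed = df.rename(index=lambda label: "row_" + label)
```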

align(other, **kwargs)#

Align two objects on their axes with the specified join method.

Join method is specified for each axis Index.

Returns:

  • BaseQueryCompiler – Aligned self.

  • BaseQueryCompiler – Aligned other.

Notes

Please refer to modin.pandas.DataFrame.align for more information about parameters and output format.
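The two-object return value is easiest to see in a small pandas example. This sketch assumes pandas-compatible behaviour and uses an outer join on the row axis only.

```python
import pandas

left = pandas.DataFrame({"a": [1, 2]}, index=[0, 1])
right = pandas.DataFrame({"a": [3, 4]}, index=[1, 2])

# Both results share the union of the row labels; gaps are filled with NaN.
aligned_left, aligned_right = left.align(right, join="outer", axis=0)
```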

all(**kwargs)#

Return whether all the elements are true, potentially over an axis.

Parameters:
  • axis ({0, 1}, optional) –

  • bool_only (bool, optional) –

  • skipna (bool) –

  • level (int or label) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

If axis was specified return one-column QueryCompiler with index labels of the specified axis, where each row contains boolean of whether all elements at the corresponding row or column are True. Otherwise return QueryCompiler with a single bool of whether all elements are True.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.all for more information about parameters and output format.

any(**kwargs)#

Return whether any element is true, potentially over an axis.

Parameters:
  • axis ({0, 1}, optional) –

  • bool_only (bool, optional) –

  • skipna (bool) –

  • level (int or label) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

If axis was specified return one-column QueryCompiler with index labels of the specified axis, where each row contains boolean of whether any element at the corresponding row or column is True. Otherwise return QueryCompiler with a single bool of whether any element is True.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.any for more information about parameters and output format.
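The difference between the per-axis and whole-frame results of all and any can be sketched with pandas, assuming the query compiler follows the same rules.

```python
import pandas

df = pandas.DataFrame({"a": [True, True], "b": [True, False]})

# One boolean per column (labels come from the column axis).
per_column_all = df.all(axis=0)

# One boolean per row.
per_row_any = df.any(axis=1)

# axis=None reduces over the whole frame to a single boolean.
overall_all = df.all(axis=None)
```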

apply(func, axis, raw=False, result_type=None, *args, **kwargs)#

Apply passed function across given axis.

Parameters:
  • func (callable(pandas.Series) -> scalar, str, list or dict of such) – The function to apply to each column or row.

  • axis ({0, 1}) – Target axis to apply the function along. 0 is for index, 1 is for columns.

  • raw (bool, default: False) – Whether to pass a high-level Series object (False) or a raw representation of the data (True).

  • result_type ({"expand", "reduce", "broadcast", None}, default: None) –

    Determines how to treat list-like return type of the func (works only if a single function was passed):

    • “expand”: expand list-like result into columns.

    • “reduce”: keep result in a single cell (opposite of “expand”).

    • “broadcast”: broadcast result to original data shape (overwrite the existing column/row with the function result).

    • None: use “expand” strategy if Series is returned, “reduce” otherwise.

  • *args (iterable) – Positional arguments to pass to func.

  • **kwargs (dict) – Keyword arguments to pass to func.

Returns:

QueryCompiler that contains the results of execution and is built by the following rules:

  • Index of the specified axis contains: the names of the passed functions if multiple functions are passed, otherwise: indices of the func result if “expand” strategy is used, indices of the original frame if “broadcast” strategy is used, a single label MODIN_UNNAMED_SERIES_LABEL if “reduce” strategy is used.

  • Labels of the opposite axis are preserved.

  • Each element is the result of execution of func against corresponding row/column.

Return type:

BaseQueryCompiler
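The three result_type strategies are easier to compare side by side. The sketch below uses plain pandas with a hypothetical helper max_min, on the assumption that the query compiler produces results of the same shape.

```python
import pandas

df = pandas.DataFrame({"a": [1, 2], "b": [3, 4]})


def max_min(row):
    # Return a list-like so that result_type decides the output layout.
    return [int(row.max()), int(row.min())]


# "expand": the list becomes new columns labelled 0 and 1.
expanded = df.apply(max_min, axis=1, result_type="expand")

# "reduce": each list stays in a single cell of a one-column result.
reduced = df.apply(max_min, axis=1, result_type="reduce")

# "broadcast": the list overwrites the row, keeping the original labels.
broadcast = df.apply(max_min, axis=1, result_type="broadcast")
```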

apply_on_series(func, *args, **kwargs)#

Apply passed function on underlying Series.

Parameters:
  • func (callable(pandas.Series) -> scalar, str, list or dict of such) – The function to apply to each row.

  • *args (iterable) – Positional arguments to pass to func.

  • **kwargs (dict) – Keyword arguments to pass to func.

Return type:

BaseQueryCompiler

argsort(**kwargs)#

Return the integer indices that would sort the Series values.

Overrides ndarray.argsort. Argsorts the values, omitting NA/null values, and places the result in the same locations as the non-NA values.

Parameters:
  • axis ({0 or 'index'}) – Unused. Parameter needed for compatibility with DataFrame.

  • kind ({'mergesort', 'quicksort', 'heapsort', 'stable'}, default 'quicksort') – Choice of sorting algorithm. See numpy.sort() for more information. ‘mergesort’ and ‘stable’ are the only stable algorithms.

  • order (None) – Has no effect but is accepted for compatibility with NumPy.

  • **kwargs (dict) – Serves compatibility purposes.

Returns:

One-column QueryCompiler with positions of values within the sort order with -1 indicating nan values.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.argsort for more information about parameters and output format.
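A short pandas sketch of what the returned positions mean, assuming pandas-compatible behaviour. It deliberately avoids missing values, since their handling is described separately above.

```python
import pandas

s = pandas.Series([30, 10, 20], index=["a", "b", "c"])

# Integer positions that would sort the values in ascending order.
order = s.argsort()

# Taking the values at those positions yields the sorted data.
sorted_values = s.iloc[order.to_numpy()]
```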

asfreq(**kwargs)#

Convert time series to specified frequency.

Returns the original data conformed to a new index with the specified frequency.

Returns:

New QueryCompiler reindexed to the specified frequency.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.asfreq for more information about parameters and output format.

astype(col_dtypes, errors: str = 'raise')#

Convert columns dtypes to given dtypes.

Parameters:
  • col_dtypes (dict or str) – Map for column names and new dtypes.

  • errors ({'raise', 'ignore'}, default: 'raise') –

    Control raising of exceptions on invalid data for provided dtype:

    • raise: allow exceptions to be raised.

    • ignore: suppress exceptions. On error return original object.

Returns:

New QueryCompiler with updated dtypes.

Return type:

BaseQueryCompiler

between_time(**kwargs)#

Select values between particular times of the day (e.g., 9:00-9:30 AM).

By setting start_time to be later than end_time, you can get the times that are not between the two times.

Return type:

BaseQueryCompiler

case_when(caselist)#

Replace values where the conditions are True.

Notes

Please refer to modin.pandas.Series.case_when for more information about parameters and output format.

cat_codes()#

Convert underlying categories data into its codes.

Returns:

New QueryCompiler containing the integer codes of the underlying categories.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.cat.codes for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

clip(lower, upper, **kwargs)#

Trim values at input threshold.

Parameters:
  • lower (float or list-like) –

  • upper (float or list-like) –

  • axis ({0, 1}) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

QueryCompiler with values limited by the specified thresholds.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.clip for more information about parameters and output format.

columnarize()#

Transpose this QueryCompiler if it has a single row but multiple columns.

This method should be called for QueryCompilers representing a Series object, i.e. self.is_series_like() should be True.

Returns:

Transposed new QueryCompiler or self.

Return type:

BaseQueryCompiler

combine(other, **kwargs)#

Perform column-wise combine with another QueryCompiler with passed func.

If axes are not equal, perform frames alignment first.

Parameters:
  • other (BaseQueryCompiler) – Other operand of the binary operation.

  • func (callable(pandas.Series, pandas.Series) -> pandas.Series) – Function that takes two pandas.Series with aligned axes and returns one pandas.Series as resulting combination.

  • fill_value (float or None) – Value to fill missing values with after frame alignment occurred.

  • overwrite (bool) – If True, columns in self that do not exist in other will be overwritten with NaNs.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

Result of combine.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.combine for more information about parameters and output format.
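To make the shape of func concrete, here is a pandas sketch with a hypothetical function take_larger_sum that receives one pair of aligned columns at a time. It assumes the query compiler applies func the same way pandas does.

```python
import pandas

df1 = pandas.DataFrame({"a": [1, 5], "b": [4, 1]})
df2 = pandas.DataFrame({"a": [3, 2], "b": [2, 6]})


def take_larger_sum(s1, s2):
    # Called once per column pair; keeps the column with the larger total.
    return s1 if s1.sum() >= s2.sum() else s2


combined = df1.combine(df2, take_larger_sum)
```

Column a comes from df1 (6 versus 5) and column b from df2 (5 versus 8).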

combine_first(other, **kwargs)#

Fill null elements of self with value in the same location in other.

If axes are not equal, perform frames alignment first.

Parameters:
  • other (BaseQueryCompiler) – Provided frame to use to fill null values from.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.combine_first for more information about parameters and output format.
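A minimal pandas sketch of the fill-from-other behaviour, including the alignment step when other has extra rows. It assumes pandas-compatible semantics.

```python
import numpy as np
import pandas

df1 = pandas.DataFrame({"a": [1.0, np.nan]}, index=[0, 1])
df2 = pandas.DataFrame({"a": [9.0, 8.0, 7.0]}, index=[0, 1, 2])

# Existing values in df1 win; nulls and missing rows are taken from df2.
filled = df1.combine_first(df2)
```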

compare(other, align_axis, keep_shape, keep_equal, result_names)#

Compare data of two QueryCompilers and highlight the difference.

Parameters:
  • other (BaseQueryCompiler) – Query compiler to compare with. It must have the same shape and the same labels as self.

  • align_axis ({0, 1}) –

  • keep_shape (bool) –

  • keep_equal (bool) –

  • result_names (tuple) –

Returns:

New QueryCompiler containing the differences between self and passed query compiler.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.compare for more information about parameters and output format.
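The output layout of compare is not obvious from the parameter list, so here is the pandas equivalent as a sketch. It assumes pandas-compatible behaviour and leaves result_names at the pandas default of "self" and "other".

```python
import pandas

df1 = pandas.DataFrame({"a": [1, 2], "b": [3, 4]})
df2 = pandas.DataFrame({"a": [1, 5], "b": [3, 4]})

# Only differing cells are kept: equal rows and columns are dropped
# because keep_shape and keep_equal are both False.
differences = df1.compare(df2, align_axis=1, keep_shape=False, keep_equal=False)
```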

concat(axis, other, **kwargs)#

Concatenate self with passed query compilers along specified axis.

Parameters:
  • axis ({0, 1}) – Axis to concatenate along. 0 is for index and 1 is for columns.

  • other (BaseQueryCompiler or list of such) – Objects to concatenate with self.

  • join ({'outer', 'inner', 'right', 'left'}, default: 'outer') – Type of join that will be used if indices on the other axis are different. (note: if specified, has to be passed as join=value).

  • ignore_index (bool, default: False) – If True, do not use the index values along the concatenation axis. The resulting axis will be labeled 0, …, n - 1. (note: if specified, has to be passed as ignore_index=value).

  • sort (bool, default: False) – Whether or not to sort non-concatenation axis. (note: if specified, has to be passed as sort=value).

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

Concatenated objects.

Return type:

BaseQueryCompiler
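The join, ignore_index and sort keywords behave as in pandas.concat, assuming the query compiler mirrors it. A small sketch with two frames whose columns only partly overlap:

```python
import pandas

top = pandas.DataFrame({"a": [1], "b": [2]})
bottom = pandas.DataFrame({"b": [3], "c": [4]})

# Outer join keeps every column; ignore_index relabels rows 0..n-1.
outer = pandas.concat(
    [top, bottom], axis=0, join="outer", ignore_index=True, sort=False
)

# Inner join keeps only the columns shared by all inputs.
inner = pandas.concat([top, bottom], axis=0, join="inner", ignore_index=True)
```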

conj(**kwargs)#

Get the complex conjugate for every element of self.

Parameters:

**kwargs (dict) –

Returns:

QueryCompiler with conjugate applied element-wise.

Return type:

BaseQueryCompiler

Notes

Please refer to numpy.conj for parameters description.

convert_dtypes(infer_objects: bool = True, convert_string: bool = True, convert_integer: bool = True, convert_boolean: bool = True, convert_floating: bool = True, dtype_backend: Literal['pyarrow', 'numpy_nullable'] = 'numpy_nullable')#

Convert columns to best possible dtypes using dtypes supporting pd.NA.

Parameters:
  • infer_objects (bool, default: True) – Whether object dtypes should be converted to the best possible types.

  • convert_string (bool, default: True) – Whether object dtypes should be converted to pd.StringDtype().

  • convert_integer (bool, default: True) – Whether, if possible, conversion should be done to integer extension types.

  • convert_boolean (bool, default: True) – Whether object dtypes should be converted to pd.BooleanDtype().

  • convert_floating (bool, default: True) – Whether, if possible, conversion can be done to floating extension types. If convert_integer is also True, preference will be given to integer dtypes if the floats can be faithfully cast to integers.

  • dtype_backend ({"numpy_nullable", "pyarrow"}, default: "numpy_nullable") – Which dtype backend to use. With “numpy_nullable”, nullable dtypes are used for all dtypes that have a nullable implementation. With “pyarrow”, PyArrow is used for all dtypes.

Returns:

New QueryCompiler with updated dtypes.

Return type:

BaseQueryCompiler
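With the default arguments, the conversion targets the nullable extension dtypes. The pandas sketch below shows the resulting dtypes for numeric and boolean columns, assuming the query compiler produces the same dtypes.

```python
import pandas

df = pandas.DataFrame(
    {"ints": [1, 2], "floats": [1.0, 2.5], "flags": [True, False]}
)

# int64 -> Int64, float64 -> Float64, bool -> boolean (all support pd.NA).
converted = df.convert_dtypes()
dtype_names = {name: str(dtype) for name, dtype in converted.dtypes.items()}
```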

copy()#

Make a copy of this object.

Returns:

Copy of self.

Return type:

BaseQueryCompiler

Notes

Copying also duplicates all of the metadata, so that later modifications to this object cannot change the metadata of its copies.

corr(**kwargs)#

Compute pairwise correlation of columns, excluding NA/null values.

Parameters:
  • method ({'pearson', 'kendall', 'spearman'} or callable(pandas.Series, pandas.Series) -> pandas.Series) – Correlation method.

  • min_periods (int) – Minimum number of observations required per pair of columns to have a valid result. If fewer than min_periods non-NA values are present the result will be NA.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

Correlation matrix.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.corr for more information about parameters and output format.

corrwith(**kwargs)#

Compute pairwise correlation.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.corrwith for more information about parameters and output format.

count(**kwargs)#

Get the number of non-NaN values for each column or row.

Parameters:
  • axis ({0, 1}) –

  • numeric_only (bool, optional) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

One-column QueryCompiler with index labels of the specified axis, where each row contains the number of non-NaN values for the corresponding row or column.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.count for more information about parameters and output format.

cov(**kwargs)#

Compute pairwise covariance of columns, excluding NA/null values.

Parameters:
  • min_periods (int) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

Covariance matrix.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.cov for more information about parameters and output format.

cummax(fold_axis, **kwargs)#

Get cumulative maximum for every row or column.

Parameters:
  • fold_axis ({0, 1}) –

  • skipna (bool) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

QueryCompiler of the same shape as self, where each element is the maximum of all values up to and including the corresponding position in this row or column.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.cummax for more information about parameters and output format.

cummin(fold_axis, **kwargs)#

Get cumulative minimum for every row or column.

Parameters:
  • fold_axis ({0, 1}) –

  • skipna (bool) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

QueryCompiler of the same shape as self, where each element is the minimum of all values up to and including the corresponding position in this row or column.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.cummin for more information about parameters and output format.

cumprod(fold_axis, **kwargs)#

Get cumulative product for every row or column.

Parameters:
  • fold_axis ({0, 1}) –

  • skipna (bool) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

QueryCompiler of the same shape as self, where each element is the product of all values up to and including the corresponding position in this row or column.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.cumprod for more information about parameters and output format.

cumsum(fold_axis, **kwargs)#

Get cumulative sum for every row or column.

Parameters:
  • fold_axis ({0, 1}) –

  • skipna (bool) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

QueryCompiler of the same shape as self, where each element is the sum of all values up to and including the corresponding position in this row or column.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.cumsum for more information about parameters and output format.
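The cumulative methods (cummax, cummin, cumprod, cumsum) all share the same shape-preserving behaviour. The pandas sketch below assumes that fold_axis corresponds to the pandas axis argument, which is an assumption of this example rather than something stated above.

```python
import pandas

df = pandas.DataFrame({"a": [1, 3, 2]})

# Each output element accumulates the values down the column so far.
running_sum = df.cumsum(axis=0)
running_max = df.cummax(axis=0)
running_min = df.cummin(axis=0)
running_prod = df.cumprod(axis=0)
```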

cut(bins, **kwargs)#

Bin values into discrete intervals.

Parameters:
  • bins (int, array of ints, or IntervalIndex) – The criteria to bin by.

  • **kwargs (dict) – The keyword arguments to pass through.

Returns:

Returns the result of pd.cut.

Return type:

BaseQueryCompiler or np.ndarray or list[np.ndarray]

Notes

Please refer to modin.pandas.cut for more information about parameters and output format.
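For reference, this is what binning looks like with pandas.cut directly. The labels keyword here is one of the pass-through keyword arguments; the sketch assumes it is forwarded unchanged.

```python
import pandas

values = pandas.Series([1, 5, 9])

# Bin edges define the intervals (0, 3], (3, 6] and (6, 10].
binned = pandas.cut(values, bins=[0, 3, 6, 10], labels=["low", "mid", "high"])
```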

dataframe_to_dict(orient='dict', into=<class 'dict'>, index=True)#

Convert the DataFrame to a dictionary.

Return type:

dict or into instance

Notes

Please refer to modin.pandas.DataFrame.to_dict for more information about parameters and output format.

default_to_pandas(pandas_op, *args, **kwargs)#

Do fallback to pandas for the passed function.

Parameters:
  • pandas_op (callable(pandas.DataFrame) -> object) – Function to apply to the frame after it has been converted to pandas.

  • *args (iterable) – Positional arguments to pass to pandas_op.

  • **kwargs (dict) – Key-value arguments to pass to pandas_op.

Returns:

The result of the pandas_op, converted back to BaseQueryCompiler.

Return type:

BaseQueryCompiler

delitem(key)#

Drop key column.

Parameters:

key (label) – Column name to drop.

Returns:

New QueryCompiler without key column.

Return type:

BaseQueryCompiler

describe(percentiles: ndarray)#

Generate descriptive statistics.

Parameters:

percentiles (list-like) –

Returns:

QueryCompiler object containing the descriptive statistics of the underlying data.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.describe for more information about parameters and output format.

df_update(other, **kwargs)#

Update values of self using non-NA values of other at the corresponding positions.

If axes are not equal, perform frames alignment first.

Parameters:
  • other (BaseQueryCompiler) – Frame to grab replacement values from.

  • join ({"left"}) – Specify type of join to align frames if axes are not equal (note: currently only one type of join is implemented).

  • overwrite (bool) – Whether to overwrite every corresponding value of self, or only the values that are NaN.

  • filter_func (callable(pandas.Series, pandas.Series) -> numpy.ndarray<bool>) – Function that takes a column of self and returns a bool mask of the values that should be overwritten in the self frame.

  • errors ({"raise", "ignore"}) – If “raise”, will raise a ValueError if self and other both contain non-NA data in the same place.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

New QueryCompiler with updated values.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.update for more information about parameters and output format.
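The role of overwrite can be sketched with pandas. Note one difference that this example does not hide: pandas DataFrame.update modifies the frame in place, whereas the method above is documented as returning a new QueryCompiler.

```python
import numpy as np
import pandas

df = pandas.DataFrame({"a": [1.0, 2.0, 3.0]})
other = pandas.DataFrame({"a": [10.0, np.nan, 30.0]})

# Default overwrite=True: non-NA values of other replace values of df;
# the NaN in other leaves 2.0 untouched.
df.update(other)

sparse = pandas.DataFrame({"a": [1.0, np.nan]})
# overwrite=False: only positions that are NaN in sparse are filled.
sparse.update(pandas.DataFrame({"a": [7.0, 8.0]}), overwrite=False)
```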

diff(**kwargs)#

First discrete difference of element.

Parameters:
  • periods (int) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

QueryCompiler of the same shape as self, where each element is the difference between the corresponding value and the previous value in this row or column.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.diff for more information about parameters and output format.

divmod(other, **kwargs)#

Return integer division and modulo of self and other, element-wise (binary operator divmod).

Equivalent to divmod(self, other), but with support for substituting a fill_value for missing data in either one of the inputs.

Parameters:
  • other (BaseQueryCompiler or scalar value) –

  • **kwargs (dict) – Other arguments for division.

Returns:

  • BaseQueryCompiler – Compiler representing Series with quotient (integer division) part of division.

  • BaseQueryCompiler – Compiler representing Series with modulo part of division.

Notes

Please refer to modin.pandas.Series.divmod for more information about parameters and output format.
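The two returned objects correspond to the two halves of Python's divmod. A pandas sketch, assuming the query compiler returns results in the same order:

```python
import pandas

s = pandas.Series([7, 8, 9])

# First result is the integer division, second is the remainder.
quotient, remainder = s.divmod(2)
```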

dot(other, **kwargs)#

Compute the matrix multiplication of self and other.

Parameters:
  • other (BaseQueryCompiler or NumPy array) – The other query compiler or NumPy array to matrix multiply with self.

  • squeeze_self (boolean) – If self is a one-column query compiler, indicates whether it represents Series object.

  • squeeze_other (boolean) – If other is a one-column query compiler, indicates whether it represents Series object.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

A new query compiler that contains result of the matrix multiply.

Return type:

BaseQueryCompiler
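The squeeze flags exist because a one-column operand may stand for either a Series or a frame. In plain pandas the two cases look like this (a sketch assuming pandas-compatible results):

```python
import pandas

df = pandas.DataFrame([[1, 2], [3, 4]])
vec = pandas.Series([1, 1])

# Frame times Series: one value per row of the frame.
frame_by_series = df.dot(vec)

# Frame times frame: ordinary matrix product.
frame_by_frame = df.dot(df)
```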

drop(index=None, columns=None, errors: str = 'raise')#

Drop specified rows or columns.

Parameters:
  • index (list of labels, optional) – Labels of rows to drop.

  • columns (list of labels, optional) – Labels of columns to drop.

  • errors (str, default: "raise") – If ‘ignore’, suppress error and only existing labels are dropped.

Returns:

New QueryCompiler with removed data.

Return type:

BaseQueryCompiler

dropna(**kwargs)#

Remove missing values.

Parameters:
  • axis ({0, 1}) –

  • how ({"any", "all"}) –

  • thresh (int, optional) –

  • subset (list of labels) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

New QueryCompiler with null values dropped along given axis.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.dropna for more information about parameters and output format.
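Since the parameters above are listed without descriptions, a pandas sketch may help show how how, thresh and subset differ. It assumes the query compiler follows pandas semantics.

```python
import numpy as np
import pandas

df = pandas.DataFrame(
    {"a": [1.0, np.nan, 3.0], "b": [np.nan, np.nan, 6.0]}
)

# how="any": drop rows containing at least one missing value.
drop_any = df.dropna(axis=0, how="any")

# how="all": drop only rows where every value is missing.
drop_all = df.dropna(axis=0, how="all")

# thresh=1: keep rows with at least one non-missing value.
keep_thresh = df.dropna(axis=0, thresh=1)

# subset: only look at column "a" when deciding which rows to drop.
drop_subset = df.dropna(axis=0, subset=["a"])
```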

dt_as_unit(*args, **kwargs)#

Notes

Please refer to modin.pandas.Series.dt.as_unit for more information about parameters and output format.

dt_asfreq(freq=None, how: str = 'E')#

Convert the PeriodArray to the specified frequency freq.

Equivalent to applying pandas.Period.asfreq() with the given arguments to each Period in this PeriodArray.

Parameters:
  • freq (str, optional) – A frequency.

  • how (str {'E', 'S'}, default: 'E') –

    Whether the elements should be aligned to the end or start within a period:

    • ‘E’, “END”, or “FINISH” for end.

    • ‘S’, “START”, or “BEGIN” for start.

    For example, January 31st (“END”) vs. January 1st (“START”).

Returns:

New QueryCompiler containing period data.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.asfreq for more information about parameters and output format.
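The end versus start alignment can be checked with a small pandas sketch on monthly periods converted to daily frequency. It assumes the query compiler matches Series.dt.asfreq.

```python
import pandas

periods = pandas.Series(pandas.period_range("2023-01", periods=2, freq="M"))

# how="E" picks the last day of each month, how="S" the first.
as_end = periods.dt.asfreq("D", how="E")
as_start = periods.dt.asfreq("D", how="S")

end_labels = as_end.astype(str).tolist()
start_labels = as_start.astype(str).tolist()
```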

dt_ceil(freq, ambiguous='raise', nonexistent='raise')#

Perform ceil operation on the underlying time-series data to the specified freq.

Parameters:
  • freq (str) –

  • ambiguous ({"raise", "infer", "NaT"} or bool mask, default: "raise") –

  • nonexistent ({"raise", "shift_forward", "shift_backward", "NaT"} or timedelta, default: "raise") –

Returns:

New QueryCompiler with performed ceil operation on every element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.ceil for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.
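A pandas sketch of rounding timestamps up to a frequency, with the floor counterpart shown next to it for contrast. The 15-minute frequency is an arbitrary choice for illustration, and pandas-compatible behaviour is assumed.

```python
import pandas

stamps = pandas.Series(
    pandas.to_datetime(["2023-01-01 10:20:00", "2023-01-01 10:40:00"])
)

# ceil rounds each timestamp up to the next 15-minute boundary,
# floor rounds it down to the previous one.
ceiled = stamps.dt.ceil("15min")
floored = stamps.dt.floor("15min")

ceiled_labels = ceiled.dt.strftime("%H:%M").tolist()
floored_labels = floored.dt.strftime("%H:%M").tolist()
```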

dt_components()#

Spread each date-time value into its components (days, hours, minutes…).

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.components for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.
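In pandas, the components accessor is available for timedelta data and returns one column per unit. The sketch below assumes the query compiler produces the same layout for a one-column timedelta input.

```python
import pandas

deltas = pandas.Series(pandas.to_timedelta(["1 days 02:03:04"]))

# One row per input value, one column per component.
parts = deltas.dt.components
first = {name: int(parts.loc[0, name]) for name in ["days", "hours", "minutes", "seconds"]}
```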

dt_date()#

Get the date without timezone information for each datetime value.

Returns:

New QueryCompiler with the same shape as self, where each element is the date without timezone information for the corresponding datetime value.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.date for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

dt_day()#

Get day component for each datetime value.

Returns:

New QueryCompiler with the same shape as self, where each element is day component for the corresponding datetime value.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.day for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

dt_day_name(locale=None)#

Get day name for each datetime value.

Parameters:

locale (str, optional) –

Returns:

New QueryCompiler with the same shape as self, where each element is day name for the corresponding datetime value.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.day_name for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

dt_dayofweek()#

Get integer day of week for each datetime value.

Returns:

New QueryCompiler with the same shape as self, where each element is integer day of week for the corresponding datetime value.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.dayofweek for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

dt_dayofyear()#

Get day of year for each datetime value.

Returns:

New QueryCompiler with the same shape as self, where each element is day of year for the corresponding datetime value.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.dayofyear for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

dt_days()#

Get days for each interval value.

Returns:

New QueryCompiler with the same shape as self, where each element is days for the corresponding interval value.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.days for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

dt_days_in_month()#

Get number of days in month for each datetime value.

Returns:

New QueryCompiler with the same shape as self, where each element is number of days in month for the corresponding datetime value.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.days_in_month for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

dt_daysinmonth()#

Get number of days in month for each datetime value.

Returns:

New QueryCompiler with the same shape as self, where each element is number of days in month for the corresponding datetime value.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.daysinmonth for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

dt_end_time()#

Get the timestamp of end time for each period value.

Returns:

New QueryCompiler with the same shape as self, where each element is the timestamp of end time for the corresponding period value.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.end_time for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

dt_floor(freq, ambiguous='raise', nonexistent='raise')#

Perform floor operation on the underlying time-series data to the specified freq.

Parameters:
  • freq (str) –

  • ambiguous ({"raise", "infer", "NaT"} or bool mask, default: "raise") –

  • nonexistent ({"raise", "shift_forward", "shift_backward", "NaT"} or timedelta, default: "raise") –

Returns:

New QueryCompiler with performed floor operation on every element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.floor for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.
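As an illustration of the floor semantics, the sketch below uses plain pandas rather than a query compiler, on the assumption that dt_floor mirrors the pandas Series.dt.floor call it is documented against. The sample timestamps and variable names are made up for the example.

```python
import pandas as pd

# Two timestamps that fall inside different 15-minute buckets (illustrative data).
s = pd.Series(pd.to_datetime(["2024-03-10 14:37:52", "2024-03-10 14:07:05"]))

# Floor each value down to the start of its 15-minute bucket.
floored = s.dt.floor("15min")
print(floored.tolist())
```

Flooring always moves a value to the bucket start at or before it, which is why 14:37:52 lands on 14:30:00 and not on 14:45:00.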

dt_freq()#

Get the time frequency of the underlying time-series data.

Returns:

QueryCompiler containing a single value, the frequency of the data.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.freq for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

dt_hour()#

Get hour for each datetime value.

Returns:

New QueryCompiler with the same shape as self, where each element is hour for the corresponding datetime value.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.hour for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

dt_is_leap_year()#

Get the boolean of whether the corresponding year is a leap year for each datetime value.

Returns:

New QueryCompiler with the same shape as self, where each element is the boolean of whether the corresponding year is a leap year for the corresponding datetime value.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.is_leap_year for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

dt_is_month_end()#

Get the boolean of whether the date is the last day of the month for each datetime value.

Returns:

New QueryCompiler with the same shape as self, where each element is the boolean of whether the date is the last day of the month for the corresponding datetime value.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.is_month_end for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

dt_is_month_start()#

Get the boolean of whether the date is the first day of the month for each datetime value.

Returns:

New QueryCompiler with the same shape as self, where each element is the boolean of whether the date is the first day of the month for the corresponding datetime value.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.is_month_start for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

dt_is_quarter_end()#

Get the boolean of whether the date is the last day of the quarter for each datetime value.

Returns:

New QueryCompiler with the same shape as self, where each element is the boolean of whether the date is the last day of the quarter for the corresponding datetime value.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.is_quarter_end for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

dt_is_quarter_start()#

Get the boolean of whether the date is the first day of the quarter for each datetime value.

Returns:

New QueryCompiler with the same shape as self, where each element is the boolean of whether the date is the first day of the quarter for the corresponding datetime value.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.is_quarter_start for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

dt_is_year_end()#

Get the boolean of whether the date is the last day of the year for each datetime value.

Returns:

New QueryCompiler with the same shape as self, where each element is the boolean of whether the date is the last day of the year for the corresponding datetime value.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.is_year_end for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

dt_is_year_start()#

Get the boolean of whether the date is the first day of the year for each datetime value.

Returns:

New QueryCompiler with the same shape as self, where each element is the boolean of whether the date is the first day of the year for the corresponding datetime value.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.is_year_start for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.
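The calendar-boundary flags documented above (dt_is_leap_year through dt_is_year_start) can be compared side by side. This is a minimal sketch in plain pandas, assuming the query compiler methods follow the corresponding Series.dt accessors; the dates and the flags frame are illustrative only.

```python
import pandas as pd

# Leap day, last day of a year, first day of a year (illustrative data).
s = pd.Series(pd.to_datetime(["2024-02-29", "2023-12-31", "2024-01-01"]))

# Collect several boolean flags into one frame for comparison.
flags = pd.DataFrame({
    "is_month_end": s.dt.is_month_end,
    "is_quarter_end": s.dt.is_quarter_end,
    "is_year_end": s.dt.is_year_end,
    "is_year_start": s.dt.is_year_start,
    "is_leap_year": s.dt.is_leap_year,
})
print(flags)
```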

dt_isocalendar()#

Calculate year, week, and day according to the ISO 8601 standard for each datetime value.

Returns:

New QueryCompiler where each row holds the year, week, and day calculated according to the ISO 8601 standard for the corresponding datetime value.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.isocalendar for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.
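A short sketch of the ISO 8601 calendar output, written against plain pandas on the assumption that dt_isocalendar matches Series.dt.isocalendar. The two dates are chosen because their ISO year differs from their calendar year.

```python
import pandas as pd

# Dates near a year boundary, where ISO year and calendar year disagree.
s = pd.Series(pd.to_datetime(["2021-01-01", "2024-12-30"]))

# pandas returns a frame with one column each for year, week and day.
iso = s.dt.isocalendar()
print(iso)
```

In pandas the result has three columns (year, week, day) per input value, so it is wider than the one-column input.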

dt_microsecond()#

Get microseconds component for each datetime value.

Returns:

New QueryCompiler with the same shape as self, where each element is microseconds component for the corresponding datetime value.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.microsecond for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

dt_microseconds()#

Get microseconds component for each interval value.

Returns:

New QueryCompiler with the same shape as self, where each element is microseconds component for the corresponding interval value.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.microseconds for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

dt_minute()#

Get minute component for each datetime value.

Returns:

New QueryCompiler with the same shape as self, where each element is minute component for the corresponding datetime value.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.minute for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

dt_month()#

Get month component for each datetime value.

Returns:

New QueryCompiler with the same shape as self, where each element is month component for the corresponding datetime value.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.month for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

dt_month_name(locale=None)#

Get the month name for each datetime value.

Parameters:

locale (str, optional) –

Returns:

New QueryCompiler with the same shape as self, where each element is the month name for the corresponding datetime value.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.month_name for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.
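For reference, a minimal pandas sketch of the month-name lookup, assuming dt_month_name behaves like Series.dt.month_name. With locale left as None, pandas returns English names; the dates are illustrative.

```python
import pandas as pd

s = pd.Series(pd.to_datetime(["2024-01-15", "2024-07-04"]))

# locale=None (the default) yields English month names in pandas.
names = s.dt.month_name()
print(names.tolist())
```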

dt_nanosecond()#

Get nanoseconds component for each datetime value.

Returns:

New QueryCompiler with the same shape as self, where each element is nanoseconds component for the corresponding datetime value.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.nanosecond for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

dt_nanoseconds()#

Get nanoseconds component for each interval value.

Returns:

New QueryCompiler with the same shape as self, where each element is nanoseconds component for the corresponding interval value.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.nanoseconds for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

dt_normalize()#

Set the time component of each date-time value to midnight.

Returns:

New QueryCompiler containing date-time values with midnight time.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.normalize for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

dt_quarter()#

Get quarter component for each datetime value.

Returns:

New QueryCompiler with the same shape as self, where each element is quarter component for the corresponding datetime value.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.quarter for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

dt_qyear()#

Get the fiscal year for each period value.

Returns:

New QueryCompiler with the same shape as self, where each element is the fiscal year for the corresponding period value.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.qyear for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

dt_round(freq, ambiguous='raise', nonexistent='raise')#

Perform round operation on the underlying time-series data to the specified freq.

Parameters:
  • freq (str) –

  • ambiguous ({"raise", "infer", "NaT"} or bool mask, default: "raise") –

  • nonexistent ({"raise", "shift_forward", "shift_backward", "NaT"} or timedelta, default: "raise") –

Returns:

New QueryCompiler with performed round operation on every element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.round for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

dt_second()#

Get seconds component for each datetime value.

Returns:

New QueryCompiler with the same shape as self, where each element is seconds component for the corresponding datetime value.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.second for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

dt_seconds()#

Get seconds component for each interval value.

Returns:

New QueryCompiler with the same shape as self, where each element is seconds component for the corresponding interval value.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.seconds for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

dt_start_time()#

Get the timestamp of start time for each period value.

Returns:

New QueryCompiler with the same shape as self, where each element is the timestamp of start time for the corresponding period value.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.start_time for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

dt_strftime(date_format)#

Format underlying date-time data using specified format.

Parameters:

date_format (str) –

Returns:

New QueryCompiler containing formatted date-time values.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.strftime for more information about parameters and output format.
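The date_format argument uses the standard strftime directives. The sketch below shows this with plain pandas, assuming dt_strftime mirrors Series.dt.strftime; the format string and data are examples only.

```python
import pandas as pd

s = pd.Series(pd.to_datetime(["2024-01-15 08:30:00", "2024-07-04 17:05:00"]))

# Each datetime becomes a string rendered with the given directives.
formatted = s.dt.strftime("%Y/%m/%d %H:%M")
print(formatted.tolist())
```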

dt_time()#

Get time component for each datetime value.

Returns:

New QueryCompiler with the same shape as self, where each element is time component for the corresponding datetime value.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.time for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

dt_timetz()#

Get time component with timezone information for each datetime value.

Returns:

New QueryCompiler with the same shape as self, where each element is time component with timezone information for the corresponding datetime value.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.timetz for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

dt_to_period(freq=None)#

Convert underlying data to the period at a particular frequency.

Parameters:

freq (str, optional) –

Returns:

New QueryCompiler containing period data.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.to_period for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

dt_to_pydatetime()#

Convert underlying data to an array of native Python datetime objects.

Returns:

New QueryCompiler containing 1D array of datetime objects.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.to_pydatetime for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

dt_to_pytimedelta()#

Convert underlying data to an array of native Python datetime.timedelta objects.

Returns:

New QueryCompiler containing 1D array of datetime.timedelta.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.to_pytimedelta for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

dt_to_timestamp()#

Get the timestamp representation for each period value.

Returns:

New QueryCompiler with the same shape as self, where each element is the timestamp representation for the corresponding period value.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.to_timestamp for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.
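The period-related methods (dt_to_period, dt_start_time, dt_end_time, dt_to_timestamp) form a round trip. The following is a sketch in plain pandas, assuming the query compiler methods follow the matching Series.dt accessors; the monthly frequency and the dates are illustrative.

```python
import pandas as pd

s = pd.Series(pd.to_datetime(["2024-02-14", "2024-11-30"]))

# Convert datetimes to monthly periods.
periods = s.dt.to_period("M")

# Each period exposes the timestamps at which it starts and ends.
starts = periods.dt.start_time
ends = periods.dt.end_time

# to_timestamp converts periods back to datetimes (period start by default).
back = periods.dt.to_timestamp()
print(periods.astype(str).tolist(), back.tolist())
```

Note that the round trip does not recover the original day: 2024-02-14 comes back as 2024-02-01 because the period only keeps month resolution.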

dt_total_seconds()#

Get duration in seconds for each interval value.

Returns:

New QueryCompiler with the same shape as self, where each element is duration in seconds for the corresponding interval value.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.total_seconds for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.
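The difference between the component accessors (dt_days, dt_seconds) and dt_total_seconds is easy to miss. The sketch below uses plain pandas timedelta values, assuming that the "interval" values in this documentation correspond to pandas timedeltas and that the methods mirror the Series.dt accessors.

```python
import pandas as pd

# Illustrative durations: just over one day, and thirty seconds.
s = pd.Series(pd.to_timedelta(["1 days 02:03:04", "0 days 00:00:30"]))

days = s.dt.days              # whole-day component
seconds = s.dt.seconds        # seconds component, excluding whole days
total = s.dt.total_seconds()  # full duration expressed in seconds
print(days.tolist(), seconds.tolist(), total.tolist())
```

For the first value the seconds component is 2 * 3600 + 3 * 60 + 4 = 7384, while the total duration adds the 86400 seconds of the whole day.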

dt_tz()#

Get the time-zone of the underlying time-series data.

Returns:

QueryCompiler containing a single value, time-zone of the data.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.tz for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

dt_tz_convert(tz)#

Convert time-series data to the specified time zone.

Parameters:

tz (str, pytz.timezone) –

Returns:

New QueryCompiler containing values with converted time zone.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.tz_convert for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.
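A minimal sketch of time zone conversion in plain pandas, assuming dt_tz_convert follows Series.dt.tz_convert. The zone names are examples; conversion changes the wall-clock representation but not the instant.

```python
import pandas as pd

# Start from tz-aware data in UTC.
utc = pd.Series(pd.to_datetime(["2024-01-15 12:00:00"])).dt.tz_localize("UTC")

# Convert to another zone; the underlying instant is unchanged.
eastern = utc.dt.tz_convert("America/New_York")
print(eastern.tolist())
```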

dt_tz_localize(tz, ambiguous='raise', nonexistent='raise')#

Localize tz-naive time-series data to the specified time zone, making it tz-aware.

Parameters:
  • tz (str, pytz.timezone, optional) –

  • ambiguous ({"raise", "infer", "NaT"} or bool mask, default: "raise") –

  • nonexistent ({"raise", "shift_forward", "shift_backward", "NaT"} or pandas.timedelta, default: "raise") –

Returns:

New QueryCompiler containing values with localized time zone.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.tz_localize for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.
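The nonexistent parameter matters around daylight-saving transitions. The sketch below uses plain pandas and assumes dt_tz_localize mirrors Series.dt.tz_localize; in Europe/Warsaw the wall-clock time 02:30 on 2015-03-29 does not exist because clocks jump from 02:00 to 03:00.

```python
import pandas as pd

s = pd.Series(pd.to_datetime(["2015-03-29 02:30:00", "2015-03-29 03:30:00"]))

# Move the nonexistent time forward to the first valid instant.
shifted = s.dt.tz_localize("Europe/Warsaw", nonexistent="shift_forward")

# Alternatively, replace the nonexistent time with NaT.
as_nat = s.dt.tz_localize("Europe/Warsaw", nonexistent="NaT")
print(shifted.tolist(), as_nat.tolist())
```

With the default nonexistent="raise", the same call raises an error instead of adjusting the value.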

dt_unit()#

Notes

Please refer to modin.pandas.Series.dt.unit for more information about parameters and output format.

dt_weekday()#

Get integer day of week for each datetime value.

Returns:

New QueryCompiler with the same shape as self, where each element is integer day of week for the corresponding datetime value.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.weekday for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

dt_year()#

Get year component for each datetime value.

Returns:

New QueryCompiler with the same shape as self, where each element is year component for the corresponding datetime value.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.dt.year for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

property dtypes#

Get columns dtypes.

Returns:

Series with dtypes of each column.

Return type:

pandas.Series

duplicated(**kwargs)#

Return boolean Series denoting duplicate rows.

Parameters:

**kwargs (dict) – Additional keyword arguments to be passed in to pandas.DataFrame.duplicated.

Returns:

New QueryCompiler containing boolean Series denoting duplicate rows.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.duplicated for more information about parameters and output format.
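Since the keyword arguments are passed through to pandas.DataFrame.duplicated, a plain pandas sketch shows what they control. The frame below is illustrative, and keep is one of the keywords pandas accepts.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 1], "b": ["x", "y", "x"]})

# Default keep="first": only later repeats are flagged.
first_kept = df.duplicated()

# keep=False flags every row that has a duplicate anywhere.
all_marked = df.duplicated(keep=False)
print(first_kept.tolist(), all_marked.tolist())
```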

eq(other, **kwargs)#

Perform element-wise equality comparison (self == other).

If axes are not equal, perform frames alignment first.

Parameters:
  • other (BaseQueryCompiler, scalar or array-like) – Other operand of the binary operation.

  • broadcast (bool, default: False) – If other is a one-column query compiler, indicates whether it is a Series or not. Frames and Series have to be processed differently, however we can’t distinguish them at the query compiler level, so this parameter is a hint that is passed from a high-level API.

  • level (int or label) – In case of MultiIndex match index values on the passed level.

  • axis ({0, 1}) – Axis to match indices along for 1D other (list or QueryCompiler that represents Series). 0 is for index, 1 is for columns.

  • fill_value (float or None) – Value to fill missing elements during frame alignment.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

Result of binary operation.

Return type:

BaseQueryCompiler
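To make the role of axis concrete, here is a sketch using pandas.DataFrame.eq, on the assumption that the query compiler's eq produces the same element-wise result as the high-level call. The frame and operands are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [1, 5]})

# Scalar operand: every element is compared with 1.
vs_scalar = df.eq(1)

# 1D operand matched along the index (axis=0): row i is compared with other[i].
vs_series = df.eq(pd.Series([1, 5]), axis=0)
print(vs_scalar, vs_series, sep="\n")
```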

equals(other)#

Notes

Please refer to modin.pandas.DataFrame.equals for more information about parameters and output format.

eval(expr, **kwargs)#

Evaluate string expression on QueryCompiler columns.

Parameters:
  • expr (str) –

  • **kwargs (dict) –

Returns:

QueryCompiler containing the result of evaluation.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.eval for more information about parameters and output format.
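The expression string refers to columns by name. The sketch below uses pandas.DataFrame.eval, assuming the query compiler's eval evaluates expressions the same way; the column names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [10, 20]})

# A plain expression returns the computed values.
total = df.eval("a + b")

# An assignment expression returns a new frame with the extra column.
with_col = df.eval("c = a * b")
print(total.tolist(), with_col["c"].tolist())
```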

abstract execute()#

Wait for all computations to complete without materializing data.

expanding_aggregate(fold_axis, expanding_args, func, *args, **kwargs)#

Create expanding window and apply specified functions for each window over the given axis.

Parameters:
  • fold_axis ({0, 1}) –

  • expanding_args (list) – Expanding window arguments with the same signature as modin.pandas.DataFrame.expanding.

  • func (str, dict, callable(pandas.Series) -> scalar, or list of such) –

  • *args (iterable) –

  • **kwargs (dict) –

Returns:

New QueryCompiler containing the result of passed functions for each window, built by the following rules:

  • Labels on the specified axis are preserved.

  • Labels on the opposite of specified axis are MultiIndex, where first level contains preserved labels of this axis and the second level has the function names.

  • Each element of QueryCompiler is the result of corresponding function for the corresponding window and column/row.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Expanding.aggregate for more information about parameters and output format.
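The label rules above can be seen in a small pandas sketch, assuming expanding_aggregate produces the same layout as Expanding.aggregate at the high level. The frame and the chosen functions are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"a": [1.0, 2.0, 3.0]})

# Apply two functions to each expanding window (rows 0..i for row i).
result = df.expanding().agg(["sum", "max"])
print(result)
```

The row labels are preserved, while the column labels become a two-level MultiIndex of the original column name and the function name.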

expanding_corr(fold_axis, expanding_args, squeeze_self, squeeze_other, other=None, pairwise=None, ddof=1, numeric_only=False, **kwargs)#

Create expanding window and compute correlation for each window over the given axis.

Parameters:
  • fold_axis ({0, 1}) –

  • expanding_args (list) – Expanding window arguments with the same signature as modin.pandas.DataFrame.expanding.

  • squeeze_self (bool) –

  • squeeze_other (bool) –

  • other (pandas.Series or pandas.DataFrame, default: None) –

  • pairwise (bool | None, default: None) –

  • ddof (int, default: 1) –

  • numeric_only (bool, default: False) –

  • **kwargs (dict) –

Returns:

New QueryCompiler containing correlation for each window, built by the following rules:

  • Output QueryCompiler has the same shape and axes labels as the source.

  • Each element is the correlation for the corresponding window.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Expanding.corr for more information about parameters and output format.

expanding_count(fold_axis, expanding_args, ddof=1, *args, **kwargs)#

Create expanding window and compute number of non-NA values for each window over the given axis.

Parameters:
  • fold_axis ({0, 1}) –

  • expanding_args (list) – Expanding window arguments with the same signature as modin.pandas.DataFrame.expanding.

  • ddof (int, default: 1) –

  • *args (iterable) –

  • **kwargs (dict) –

Returns:

New QueryCompiler containing number of non-NA values for each window, built by the following rules:

  • Output QueryCompiler has the same shape and axes labels as the source.

  • Each element is the number of non-NA values for the corresponding window.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Expanding.count for more information about parameters and output format.

expanding_cov(fold_axis, expanding_args, squeeze_self, squeeze_other, other=None, pairwise=None, ddof=1, numeric_only=False, **kwargs)#

Create expanding window and compute sample covariance for each window over the given axis.

Parameters:
  • fold_axis ({0, 1}) –

  • expanding_args (list) – Expanding window arguments with the same signature as modin.pandas.DataFrame.expanding.

  • squeeze_self (bool) –

  • squeeze_other (bool) –

  • other (pandas.Series or pandas.DataFrame, default: None) –

  • pairwise (bool | None, default: None) –

  • ddof (int, default: 1) –

  • numeric_only (bool, default: False) –

  • **kwargs (dict) –

Returns:

New QueryCompiler containing sample covariance for each window, built by the following rules:

  • Output QueryCompiler has the same shape and axes labels as the source.

  • Each element is the sample covariance for the corresponding window.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Expanding.cov for more information about parameters and output format.

expanding_kurt(fold_axis, expanding_args, numeric_only=False, **kwargs)#

Create expanding window and compute Fisher’s definition of kurtosis without bias for each window over the given axis.

Parameters:
  • fold_axis ({0, 1}) –

  • expanding_args (list) – Expanding window arguments with the same signature as modin.pandas.DataFrame.expanding.

  • numeric_only (bool, default: False) –

  • **kwargs (dict) –

Returns:

New QueryCompiler containing Fisher’s definition of kurtosis without bias for each window, built by the following rules:

  • Output QueryCompiler has the same shape and axes labels as the source.

  • Each element is the Fisher’s definition of kurtosis without bias for the corresponding window.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Expanding.kurt for more information about parameters and output format.

expanding_max(fold_axis, expanding_args, *args, **kwargs)#

Create expanding window and compute maximum value for each window over the given axis.

Parameters:
  • fold_axis ({0, 1}) –

  • expanding_args (list) – Expanding window arguments with the same signature as modin.pandas.DataFrame.expanding.

  • *args (iterable) –

  • **kwargs (dict) –

Returns:

New QueryCompiler containing maximum value for each window, built by the following rules:

  • Output QueryCompiler has the same shape and axes labels as the source.

  • Each element is the maximum value for the corresponding window.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Expanding.max for more information about parameters and output format.

expanding_mean(fold_axis, expanding_args, *args, **kwargs)#

Create expanding window and compute mean value for each window over the given axis.

Parameters:
  • fold_axis ({0, 1}) –

  • expanding_args (list) – Expanding window arguments with the same signature as modin.pandas.DataFrame.expanding.

  • *args (iterable) –

  • **kwargs (dict) –

Returns:

New QueryCompiler containing mean value for each window, built by the following rules:

  • Output QueryCompiler has the same shape and axes labels as the source.

  • Each element is the mean value for the corresponding window.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Expanding.mean for more information about parameters and output format.
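An expanding window at position i covers all values from the start up to and including i. The sketch below shows this with pandas Expanding.mean, assuming expanding_mean computes the same values; min_periods is one of the pandas expanding-window arguments, and the data is illustrative.

```python
import pandas as pd

s = pd.Series([2.0, 4.0, 6.0, 8.0])

# Window i holds s[0..i], so the result is the running mean.
means = s.expanding().mean()

# With min_periods=2 the first window is too small and yields NaN.
means_min2 = s.expanding(min_periods=2).mean()
print(means.tolist(), means_min2.tolist())
```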

expanding_median(fold_axis, expanding_args, numeric_only=False, engine=None, engine_kwargs=None, **kwargs)#

Create expanding window and compute median for each window over the given axis.

Parameters:
  • fold_axis ({0, 1}) –

  • expanding_args (list) – Expanding window arguments with the same signature as modin.pandas.DataFrame.expanding.

  • numeric_only (bool, default: False) –

  • engine (Optional[str], default: None) –

  • engine_kwargs (Optional[dict], default: None) –

  • **kwargs (dict) –

Returns:

New QueryCompiler containing median for each window, built by the following rules:

  • Output QueryCompiler has the same shape and axes labels as the source.

  • Each element is the median for the corresponding window.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Expanding.median for more information about parameters and output format.

expanding_min(fold_axis, expanding_args, *args, **kwargs)#

Create expanding window and compute minimum value for each window over the given axis.

Parameters:
  • fold_axis ({0, 1}) –

  • expanding_args (list) – Expanding window arguments with the same signature as modin.pandas.DataFrame.expanding.

  • *args (iterable) –

  • **kwargs (dict) –

Returns:

New QueryCompiler containing minimum value for each window, built by the following rules:

  • Output QueryCompiler has the same shape and axes labels as the source.

  • Each element is the minimum value for the corresponding window.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Expanding.min for more information about parameters and output format.

expanding_quantile(fold_axis, expanding_args, quantile, interpolation, **kwargs)#

Create expanding window and compute quantile for each window over the given axis.

Parameters:
  • fold_axis ({0, 1}) –

  • expanding_args (list) – Expanding window arguments with the same signature as modin.pandas.DataFrame.expanding.

  • quantile (float) –

  • interpolation ({'linear', 'lower', 'higher', 'midpoint', 'nearest'}, default: 'linear') –

  • **kwargs (dict) –

Returns:

New QueryCompiler containing quantile for each window, built by the following rules:

  • Output QueryCompiler has the same shape and axes labels as the source.

  • Each element is the quantile for the corresponding window.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Expanding.quantile for more information about parameters and output format.

expanding_rank(fold_axis, expanding_args, method='average', ascending=True, pct=False, numeric_only=False, *args, **kwargs)#

Create expanding window and compute rank for each window over the given axis.

Parameters:
  • fold_axis ({0, 1}) –

  • expanding_args (list) – Expanding window arguments with the same signature as modin.pandas.DataFrame.expanding.

  • method ({'average', 'min', 'max'}, default: 'average') –

  • ascending (bool, default: True) –

  • pct (bool, default: False) –

  • numeric_only (bool, default: False) –

  • *args (iterable) –

  • **kwargs (dict) –

Returns:

New QueryCompiler containing rank for each window, built by the following rules:

  • Output QueryCompiler has the same shape and axes labels as the source.

  • Each element is the rank for the corresponding window.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Expanding.rank for more information about parameters and output format.

expanding_sem(fold_axis, expanding_args, ddof=1, numeric_only=False, *args, **kwargs)#

Create expanding window and compute unbiased standard error of the mean for each window over the given axis.

Parameters:
  • fold_axis ({0, 1}) –

  • expanding_args (list) – Expanding window arguments with the same signature as modin.pandas.DataFrame.expanding.

  • ddof (int, default: 1) –

  • numeric_only (bool, default: False) –

  • *args (iterable) –

  • **kwargs (dict) –

Returns:

New QueryCompiler containing unbiased standard error of the mean for each window, built by the following rules:

  • Output QueryCompiler has the same shape and axes labels as the source.

  • Each element is the unbiased standard error of the mean for the corresponding window.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Expanding.sem for more information about parameters and output format.

expanding_skew(fold_axis, expanding_args, numeric_only=False, **kwargs)#

Create expanding window and compute unbiased skewness for each window over the given axis.

Parameters:
  • fold_axis ({0, 1}) –

  • expanding_args (list) – Expanding window arguments with the same signature as modin.pandas.DataFrame.expanding.

  • numeric_only (bool, default: False) –

  • **kwargs (dict) –

Returns:

New QueryCompiler containing unbiased skewness for each window, built by the following rules:

  • Output QueryCompiler has the same shape and axes labels as the source.

  • Each element is the unbiased skewness for the corresponding window.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Expanding.skew for more information about parameters and output format.

expanding_std(fold_axis, expanding_args, ddof=1, *args, **kwargs)#

Create expanding window and compute standard deviation for each window over the given axis.

Parameters:
  • fold_axis ({0, 1}) –

  • expanding_args (list) – Expanding window arguments with the same signature as modin.pandas.DataFrame.expanding.

  • ddof (int, default: 1) –

  • *args (iterable) –

  • **kwargs (dict) –

Returns:

New QueryCompiler containing standard deviation for each window, built by the following rules:

  • Output QueryCompiler has the same shape and axes labels as the source.

  • Each element is the standard deviation for the corresponding window.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Expanding.std for more information about parameters and output format.

expanding_sum(fold_axis, expanding_args, *args, **kwargs)#

Create expanding window and compute sum for each window over the given axis.

Parameters:
  • fold_axis ({0, 1}) –

  • expanding_args (list) – Expanding window arguments with the same signature as modin.pandas.DataFrame.expanding.

  • *args (iterable) –

  • **kwargs (dict) –

Returns:

New QueryCompiler containing sum for each window, built by the following rules:

  • Output QueryCompiler has the same shape and axes labels as the source.

  • Each element is the sum for the corresponding window.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Expanding.sum for more information about parameters and output format.
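The shape-preserving rule can be sketched in plain Python. This is only an illustration, not Modin code: the helper name `expanding_sum` and the list-of-rows table are assumptions, and so is the reading that `fold_axis=0` grows the window down the rows while `fold_axis=1` grows it across the columns.

```python
from itertools import accumulate


def expanding_sum(rows, fold_axis=0):
    """Running sums over a list-of-rows table (hypothetical helper)."""
    if fold_axis == 0:
        # Accumulate down each column, then transpose back to rows.
        columns = [list(accumulate(col)) for col in zip(*rows)]
        return [list(row) for row in zip(*columns)]
    # Accumulate across each row.
    return [list(accumulate(row)) for row in rows]


table = [[1, 10], [2, 20], [3, 30]]
down_rows = expanding_sum(table, fold_axis=0)
across_columns = expanding_sum(table, fold_axis=1)
```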

expanding_var(fold_axis, expanding_args, ddof=1, *args, **kwargs)#

Create expanding window and compute variance for each window over the given axis.

Parameters:
  • fold_axis ({0, 1}) –

  • expanding_args (list) – Expanding window arguments with the same signature as modin.pandas.DataFrame.expanding.

  • ddof (int, default: 1) –

  • *args (iterable) –

  • **kwargs (dict) –

Returns:

New QueryCompiler containing variance for each window, built by the following rules:

  • Output QueryCompiler has the same shape and axes labels as the source.

  • Each element is the variance for the corresponding window.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Expanding.var for more information about parameters and output format.

explode(column)#

Explode the given columns.

Parameters:

column (Union[Hashable, Sequence[Hashable]]) – The columns to explode.

Returns:

QueryCompiler that contains the results of execution. For each row in the input QueryCompiler, if the selected columns each contain M items, there will be M rows created by exploding the columns.

Return type:

BaseQueryCompiler
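The row-multiplication rule can be illustrated with a small plain-Python sketch. It is not Modin's implementation: rows are modelled as dictionaries, the helper name `explode` is reused only for readability, and `None` stands in for a missing value when a cell holds an empty list.

```python
def explode(rows, column):
    """Turn each list-like cell of `column` into one row per item (hypothetical helper)."""
    out = []
    for row in rows:
        cell = row[column]
        if isinstance(cell, (list, tuple)):
            items = list(cell) or [None]   # empty list-like -> one row with a missing value
        else:
            items = [cell]                 # scalars are kept as a single row
        for item in items:
            out.append({**row, column: item})
    return out


rows = [
    {"id": 1, "tags": ["a", "b"]},
    {"id": 2, "tags": "c"},
    {"id": 3, "tags": []},
]
exploded = explode(rows, "tags")
```

A row whose cell contains M items yields M output rows, with the other columns repeated.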

fillna(**kwargs)#

Replace NaN values using provided method.

Parameters:
  • value (scalar or dict) –

  • method ({"backfill", "bfill", "pad", "ffill", None}) –

  • axis ({0, 1}) –

  • inplace ({False}) – This parameter serves the compatibility purpose. Always has to be False.

  • limit (int, optional) –

  • downcast (dict, optional) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

New QueryCompiler with all null values filled.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.fillna for more information about parameters and output format.
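To make the value, method and limit parameters concrete, here is a hedged plain-Python sketch over a single column. The helper names are hypothetical and `None` stands in for NaN; it only illustrates the semantics and is not how Modin executes fillna.

```python
def fill_value(values, value):
    """Replace every missing entry with a constant."""
    return [value if v is None else v for v in values]


def ffill(values, limit=None):
    """Propagate the last seen value forward, at most `limit` times per gap."""
    out, last, gap = [], None, 0
    for v in values:
        if v is None:
            can_fill = last is not None and (limit is None or gap < limit)
            out.append(last if can_fill else None)
            gap += 1
        else:
            last, gap = v, 0
            out.append(v)
    return out


def bfill(values, limit=None):
    """Backward fill is a forward fill over the reversed sequence."""
    return ffill(values[::-1], limit)[::-1]


column = [1, None, None, 4]
```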

abstract finalize()#

Finalize construction of the dataframe by calling all deferred functions that were used to build it.

first(offset: DateOffset)#

Select initial periods of time series data based on a date offset.

For a query compiler with dates as its index, this function selects the first few rows based on a date offset.

Parameters:

offset (pandas.DateOffset) – The offset length of the data to select.

Returns:

New compiler containing the selected data.

Return type:

BaseQueryCompiler
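A rough plain-Python sketch of the selection rule follows. It assumes a date-sorted list of `(date, value)` pairs and uses `datetime.timedelta` as a stand-in for a fixed-length date offset; the helper name `first_rows` is hypothetical and calendar-aware offsets are not modelled.

```python
from datetime import date, timedelta


def first_rows(rows, offset):
    """Keep rows dated before (first date + offset); rows must be sorted by date."""
    if not rows:
        return []
    end = rows[0][0] + offset
    return [(day, value) for day, value in rows if day < end]


rows = [
    (date(2018, 4, 9), 1),
    (date(2018, 4, 11), 2),
    (date(2018, 4, 13), 3),
    (date(2018, 4, 15), 4),
]
selected = first_rows(rows, timedelta(days=3))
```

Note that the offset counts calendar time from the first index value, not a number of rows.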

first_valid_index()#

Return index label of first non-NaN/NULL value.

Return type:

scalar

floordiv(other, **kwargs)#

Perform element-wise integer division (self // other).

If axes are not equal, perform frames alignment first.

Parameters:
  • other (BaseQueryCompiler, scalar or array-like) – Other operand of the binary operation.

  • broadcast (bool, default: False) – If other is a one-column query compiler, indicates whether it is a Series or not. Frames and Series have to be processed differently; however, we can’t distinguish them at the query compiler level, so this parameter is a hint passed from the high-level API.

  • level (int or label) – In case of a MultiIndex, match index values on the passed level.

  • axis ({0, 1}) – Axis to match indices along for 1D other (list or QueryCompiler that represents Series). 0 is for index, 1 is for columns.

  • fill_value (float or None) – Value to fill missing elements during frame alignment.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

Result of binary operation.

Return type:

BaseQueryCompiler
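The alignment step and the role of fill_value can be sketched in plain Python. This is an illustration only: the operands are modelled as label-to-value dictionaries, the helper name `aligned_floordiv` is hypothetical, and a label that is still missing on one side after filling produces NaN.

```python
import math


def aligned_floordiv(left, right, fill_value=None):
    """Align two labelled columns on the union of labels, then compute left // right."""
    out = {}
    for label in sorted(set(left) | set(right)):
        a = left.get(label, fill_value)    # missing on the left -> fill_value
        b = right.get(label, fill_value)   # missing on the right -> fill_value
        out[label] = math.nan if a is None or b is None else a // b
    return out


left = {"a": 7, "b": 9}
right = {"b": 2, "c": 4}
plain = aligned_floordiv(left, right)
filled = aligned_floordiv(left, right, fill_value=1)
```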

abstract free()#

Trigger a cleanup of this object.

abstract classmethod from_arrow(at, data_cls)#

Build QueryCompiler from Arrow Table.

Parameters:
  • at (Arrow Table) – The Arrow Table to convert from.

  • data_cls (type) – PandasDataframe class (or its descendant) to convert to.

Returns:

QueryCompiler containing data from the Arrow Table.

Return type:

BaseQueryCompiler

abstract classmethod from_dataframe(df, data_cls)#

Build QueryCompiler from a DataFrame object supporting the dataframe exchange protocol __dataframe__().

Parameters:
  • df (DataFrame) – The DataFrame object supporting the dataframe exchange protocol.

  • data_cls (type) – PandasDataframe class (or its descendant) to convert to.

Returns:

QueryCompiler containing data from the DataFrame.

Return type:

BaseQueryCompiler

abstract classmethod from_pandas(df, data_cls)#

Build QueryCompiler from pandas DataFrame.

Parameters:
  • df (pandas.DataFrame) – The pandas DataFrame to convert from.

  • data_cls (type) – PandasDataframe class (or its descendant) to convert to.

Returns:

QueryCompiler containing data from the pandas DataFrame.

Return type:

BaseQueryCompiler

ge(other, **kwargs)#

Perform element-wise greater than or equal comparison (self >= other).

If axes are not equal, perform frames alignment first.

Parameters:
  • other (BaseQueryCompiler, scalar or array-like) – Other operand of the binary operation.

  • broadcast (bool, default: False) – If other is a one-column query compiler, indicates whether it is a Series or not. Frames and Series have to be processed differently; however, we can’t distinguish them at the query compiler level, so this parameter is a hint passed from the high-level API.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

Result of binary operation.

Return type:

BaseQueryCompiler

get_axis(axis)#

Return index labels of the specified axis.

Parameters:

axis ({0, 1}) – Axis to return labels on. 0 is for index, 1 is for columns.

Return type:

pandas.Index

get_dtypes_set()#

Get a set of dtypes that are in this query compiler.

Return type:

set

get_dummies(columns, **kwargs)#

Convert categorical variables to dummy variables for certain columns.

Parameters:
  • columns (label or list of such) – Columns to convert.

  • prefix (str or list of such) –

  • prefix_sep (str) –

  • dummy_na (bool) –

  • drop_first (bool) –

  • dtype (dtype) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

New QueryCompiler with categorical variables converted to dummy.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.get_dummies for more information about parameters and output format.
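The conversion can be sketched in plain Python as follows. This is not Modin's implementation: rows are dictionaries, the helper name `dummies` is hypothetical, only a single column and the prefix_sep and drop_first options are modelled, and indicator values are emitted as booleans (the actual dtype depends on the dtype parameter).

```python
def dummies(rows, column, prefix_sep="_", drop_first=False):
    """Replace `column` with one indicator column per distinct category."""
    categories = sorted({row[column] for row in rows})
    if drop_first:
        categories = categories[1:]        # drop the first category level
    out = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != column}
        for category in categories:
            new_row[f"{column}{prefix_sep}{category}"] = row[column] == category
        out.append(new_row)
    return out


rows = [{"id": 1, "color": "red"}, {"id": 2, "color": "blue"}]
encoded = dummies(rows, "color")
encoded_drop_first = dummies(rows, "color", drop_first=True)
```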

get_index_name(axis=0)#

Get index name of specified axis.

Parameters:

axis ({0, 1}, default: 0) – Axis to get index name on.

Returns:

Index name, None for MultiIndex.

Return type:

hashable

get_index_names(axis=0)#

Get index names of specified axis.

Parameters:

axis ({0, 1}, default: 0) – Axis to get index names on.

Returns:

Index names.

Return type:

list

get_positions_from_labels(row_loc, col_loc)#

Compute index and column positions from their respective locators.

Inputs to this method are arguments that the pandas user could pass to loc. This function will compute the corresponding index and column positions that the user could equivalently pass to iloc.

Parameters:
  • row_loc (scalar, slice, list, array or tuple) – Row locator.

  • col_loc (scalar, slice, list, array or tuple) – Columns locator.

Returns:

  • row_lookup (slice(None) if full axis grab, pandas.RangeIndex if repetition is detected, numpy.ndarray otherwise) – List of index labels.

  • col_lookup (slice(None) if full axis grab, pandas.RangeIndex if repetition is detected, numpy.ndarray otherwise) – List of columns labels.

Notes

Usage of slice(None) as a resulting lookup is a hack to pass information about a full-axis grab without computing actual indices, which would trigger lazy computations. Ideally, this API should get rid of using slices as indexers and either use a common Indexer object or range and np.ndarray only.
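The loc-to-iloc translation described above can be illustrated with a short plain-Python sketch. It is a simplification under stated assumptions: labels are unique, only scalars, lists and the full slice are handled, and the helper name `positions_from_labels` is hypothetical rather than part of Modin.

```python
def positions_from_labels(axis_labels, locator):
    """Translate a label-based locator into integer positions along one axis."""
    if isinstance(locator, slice) and locator == slice(None):
        return slice(None)                 # full-axis grab: no positions are computed
    lookup = {label: pos for pos, label in enumerate(axis_labels)}
    labels = locator if isinstance(locator, (list, tuple)) else [locator]
    return [lookup[label] for label in labels]


index = ["x", "y", "z"]
row_lookup = positions_from_labels(index, ["z", "x"])
full_lookup = positions_from_labels(index, slice(None))
```

Returning `slice(None)` unchanged mirrors the shortcut mentioned in the note: the full axis is selected without materializing any indices.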

getitem_array(key)#

Mask QueryCompiler with key.

Parameters:

key (BaseQueryCompiler, np.ndarray or list of column labels) – Boolean mask represented by QueryCompiler or np.ndarray of the same shape as self, or enumerable of columns to pick.

Returns:

New masked QueryCompiler.

Return type:

BaseQueryCompiler

getitem_column_array(key, numeric=False, ignore_order=False)#

Get column data for target labels.

Parameters:
  • key (list-like) – Target labels by which to retrieve data.

  • numeric (bool, default: False) – Whether or not the key passed in represents the numeric index or the named index.

  • ignore_order (bool, default: False) – Allow returning columns in an arbitrary order for the sake of performance.

Returns:

New QueryCompiler that contains specified columns.

Return type:

BaseQueryCompiler
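The effect of the numeric flag can be shown with a small plain-Python sketch, assuming a table stored as a dictionary of column name to values. The helper name `select_columns` is hypothetical and ignore_order is not modelled.

```python
def select_columns(table, key, numeric=False):
    """Pick columns either by position (numeric=True) or by label (numeric=False)."""
    names = list(table)
    selected = [names[i] for i in key] if numeric else list(key)
    return {name: table[name] for name in selected}


table = {"a": [1, 2], "b": [3, 4], "c": [5, 6]}
by_label = select_columns(table, ["c", "a"])
by_position = select_columns(table, [0, 2], numeric=True)
```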

getitem_row_array(key)#

Get row data for target indices.

Parameters:

key (list-like) – Numeric indices of the rows to pick.

Returns:

New QueryCompiler that contains specified rows.

Return type:

BaseQueryCompiler

groupby_agg(by, agg_func, axis, groupby_kwargs, agg_args, agg_kwargs, how='axis_wise', drop=False, series_groupby=False)#

Group QueryCompiler data and apply passed aggregation function.

Parameters:
  • by (BaseQueryCompiler, column or index label, Grouper or list of such) – Object that determines groups.

  • agg_func (str, dict or callable(Series | DataFrame) -> scalar | Series | DataFrame) – Function to apply to the GroupBy object.

  • axis ({0, 1}) – Axis to group and apply aggregation function along. 0 is for index, 1 is for columns.

  • groupby_kwargs (dict) – GroupBy parameters as expected by the modin.pandas.DataFrame.groupby signature.

  • agg_args (list-like) – Positional arguments to pass to agg_func.

  • agg_kwargs (dict) – Keyword arguments to pass to agg_func.

  • how ({'axis_wise', 'group_wise', 'transform'}, default: 'axis_wise') –

    How to apply passed agg_func:
    • 'axis_wise': apply the function against each row/column.

    • 'group_wise': apply the function against every group.

    • 'transform': apply the function against every group and broadcast the result to the original QueryCompiler shape.

  • drop (bool, default: False) – If by is a QueryCompiler, indicates whether or not the by-data came from self.

  • series_groupby (bool, default: False) – Whether we should treat self as Series when performing groupby.

Returns:

QueryCompiler containing the result of groupby aggregation.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.GroupBy.aggregate for more information about parameters and output format.
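The difference between the 'group_wise' and 'transform' modes can be sketched in plain Python for a single column. This is only an illustration: the helper name `groupby_apply` is hypothetical, the 'axis_wise' mode is not modelled, and nothing here reflects how Modin schedules the work.

```python
from collections import defaultdict


def groupby_apply(keys, values, func, how="group_wise"):
    """Apply `func` to each group of one column (hypothetical helper)."""
    groups = defaultdict(list)
    for key, value in zip(keys, values):
        groups[key].append(value)
    reduced = {key: func(members) for key, members in groups.items()}
    if how == "group_wise":
        return reduced                          # one result per group
    if how == "transform":
        return [reduced[key] for key in keys]   # broadcast back to the input shape
    raise ValueError(f"unsupported mode in this sketch: {how!r}")


keys = ["a", "b", "a"]
values = [1, 2, 3]
per_group = groupby_apply(keys, values, sum)
broadcast = groupby_apply(keys, values, sum, how="transform")
```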

groupby_all(by, axis, groupby_kwargs, agg_args, agg_kwargs, drop=False)#

Group QueryCompiler data and check whether all elements are True for every group.

Parameters:
  • by (BaseQueryCompiler, column or index label, Grouper or list of such) – Object that determines groups.

  • axis ({0, 1}) – Axis to group and apply aggregation function along. 0 is for index, 1 is for columns.

  • groupby_kwargs (dict) – GroupBy parameters as expected by the modin.pandas.DataFrame.groupby signature.

  • agg_args (list-like) – Positional arguments to pass to the aggregation function.

  • agg_kwargs (dict) – Keyword arguments to pass to the aggregation function.

  • drop (bool, default: False) – If by is a QueryCompiler, indicates whether or not the by-data came from self.

Returns:

  • BaseQueryCompiler – QueryCompiler containing the result of groupby reduce built by the following rules:

    • Labels on the axis opposite to the specified one are preserved.

    • If groupby_kwargs["as_index"] is True, then labels on the specified axis are the group names; otherwise labels are the default: 0, 1 … n.

    • If groupby_kwargs["as_index"] is False, then the first N columns/rows of the frame contain group names, where N is the number of columns/rows to group on.

    • Each element of QueryCompiler is a boolean indicating whether all elements are True for the corresponding group and column/row.

  • Warning: the map_args and reduce_args parameters are deprecated. They leaked here from PandasQueryCompiler.groupby_*: the pandas storage format implements groupby via the TreeReduce approach, but for other storage formats these parameters make no sense, so they will be removed in the future.

Notes

Please refer to modin.pandas.GroupBy.all for more information about parameters and output format.
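The as_index rules listed above can be made concrete with a plain-Python sketch. It is not Modin code: rows are dictionaries, grouping is by a single column, groups are emitted in sorted order, and the helper name `groupby_all` is reused here only for readability.

```python
def groupby_all(rows, by, as_index=True):
    """Check, per group and per column, whether every value is truthy."""
    groups = {}
    for row in rows:
        groups.setdefault(row[by], []).append(row)
    result = {}
    for name in sorted(groups):
        members = groups[name]
        columns = [col for col in members[0] if col != by]
        result[name] = {col: all(bool(m[col]) for m in members) for col in columns}
    if as_index:
        return result                      # group names act as the labels
    # Default positional labels; the group name becomes the first column.
    return [{by: name, **flags} for name, flags in result.items()]


rows = [
    {"g": "a", "x": True, "y": True},
    {"g": "a", "x": False, "y": True},
    {"g": "b", "x": True, "y": True},
]
labelled = groupby_all(rows, "g")
flat = groupby_all(rows, "g", as_index=False)
```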

groupby_any(by, axis, groupby_kwargs, agg_args, agg_kwargs, drop=False)#

Group QueryCompiler data and check whether any element is True for every group.

Parameters:
  • by (BaseQueryCompiler, column or index label, Grouper or list of such) – Object that determines groups.

  • axis ({0, 1}) – Axis to group and apply aggregation function along. 0 is for index, 1 is for columns.

  • groupby_kwargs (dict) – GroupBy parameters as expected by the modin.pandas.DataFrame.groupby signature.

  • agg_args (list-like) – Positional arguments to pass to the aggregation function.

  • agg_kwargs (dict) – Keyword arguments to pass to the aggregation function.

  • drop (bool, default: False) – If by is a QueryCompiler, indicates whether or not the by-data came from self.

Returns:

  • BaseQueryCompiler – QueryCompiler containing the result of groupby reduce built by the following rules:

    • Labels on the axis opposite to the specified one are preserved.

    • If groupby_kwargs["as_index"] is True, then labels on the specified axis are the group names; otherwise labels are the default: 0, 1 … n.

    • If groupby_kwargs["as_index"] is False, then the first N columns/rows of the frame contain group names, where N is the number of columns/rows to group on.

    • Each element of QueryCompiler is a boolean indicating whether any element is True for the corresponding group and column/row.

  • Warning: the map_args and reduce_args parameters are deprecated. They leaked here from PandasQueryCompiler.groupby_*: the pandas storage format implements groupby via the TreeReduce approach, but for other storage formats these parameters make no sense, so they will be removed in the future.

Notes

Please refer to modin.pandas.GroupBy.any for more information about parameters and output format.

groupby_corr(by, axis, groupby_kwargs, agg_args, agg_kwargs, drop=False)#

Group QueryCompiler data and compute correlation for every group.

Parameters:
  • by (BaseQueryCompiler, column or index label, Grouper or list of such) – Object that determines groups.

  • axis ({0, 1}) – Axis to group and apply aggregation function along. 0 is for index, 1 is for columns.

  • groupby_kwargs (dict) – GroupBy parameters as expected by the modin.pandas.DataFrame.groupby signature.

  • agg_args (list-like) – Positional arguments to pass to the aggregation function.

  • agg_kwargs (dict) – Keyword arguments to pass to the aggregation function.

  • drop (bool, default: False) – If by is a QueryCompiler, indicates whether or not the by-data came from self.

Returns:

  • BaseQueryCompiler – QueryCompiler containing the result of groupby reduce built by the following rules:

    • Labels on the axis opposite to the specified one are preserved.

    • If groupby_kwargs["as_index"] is True, then labels on the specified axis are the group names; otherwise labels are the default: 0, 1 … n.

    • If groupby_kwargs["as_index"] is False, then the first N columns/rows of the frame contain group names, where N is the number of columns/rows to group on.

    • Each element of QueryCompiler is the correlation for the corresponding group and column/row.

  • Warning: the map_args and reduce_args parameters are deprecated. They leaked here from PandasQueryCompiler.groupby_*: the pandas storage format implements groupby via the TreeReduce approach, but for other storage formats these parameters make no sense, so they will be removed in the future.

Notes

Please refer to modin.pandas.GroupBy.corr for more information about parameters and output format.

groupby_count(by, axis, groupby_kwargs, agg_args, agg_kwargs, drop=False)#

Group QueryCompiler data and count non-null values for every group.

Parameters:
  • by (BaseQueryCompiler, column or index label, Grouper or list of such) – Object that determines groups.

  • axis ({0, 1}) – Axis to group and apply aggregation function along. 0 is for index, 1 is for columns.

  • groupby_kwargs (dict) – GroupBy parameters as expected by the modin.pandas.DataFrame.groupby signature.

  • agg_args (list-like) – Positional arguments to pass to the aggregation function.

  • agg_kwargs (dict) – Keyword arguments to pass to the aggregation function.

  • drop (bool, default: False) – If by is a QueryCompiler, indicates whether or not the by-data came from self.

Returns:

  • BaseQueryCompiler – QueryCompiler containing the result of groupby reduce built by the following rules:

    • Labels on the axis opposite to the specified one are preserved.

    • If groupby_kwargs["as_index"] is True, then labels on the specified axis are the group names; otherwise labels are the default: 0, 1 … n.

    • If groupby_kwargs["as_index"] is False, then the first N columns/rows of the frame contain group names, where N is the number of columns/rows to group on.

    • Each element of QueryCompiler is the number of non-null values for the corresponding group and column/row.

  • Warning: the map_args and reduce_args parameters are deprecated. They leaked here from PandasQueryCompiler.groupby_*: the pandas storage format implements groupby via the TreeReduce approach, but for other storage formats these parameters make no sense, so they will be removed in the future.

Notes

Please refer to modin.pandas.GroupBy.count for more information about parameters and output format.
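As a quick illustration of "count non-null values for every group", here is a plain-Python sketch over one column. The helper name `groupby_count` is reused only for readability, and `None` stands in for a null value.

```python
def groupby_count(keys, values):
    """Count the non-null entries of one column for each group key."""
    counts = {}
    for key, value in zip(keys, values):
        # A group made up only of nulls still appears, with a count of 0.
        counts[key] = counts.get(key, 0) + (0 if value is None else 1)
    return counts


counts = groupby_count(["a", "a", "b", "b"], [1, None, None, None])
```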

groupby_cov(by, axis, groupby_kwargs, agg_args, agg_kwargs, drop=False)#

Group QueryCompiler data and compute covariance for every group.

Parameters:
  • by (BaseQueryCompiler, column or index label, Grouper or list of such) – Object that determines groups.

  • axis ({0, 1}) – Axis to group and apply aggregation function along. 0 is for index, 1 is for columns.

  • groupby_kwargs (dict) – GroupBy parameters as expected by the modin.pandas.DataFrame.groupby signature.

  • agg_args (list-like) – Positional arguments to pass to the aggregation function.

  • agg_kwargs (dict) – Keyword arguments to pass to the aggregation function.

  • drop (bool, default: False) – If by is a QueryCompiler, indicates whether or not the by-data came from self.

Returns:

  • BaseQueryCompiler – QueryCompiler containing the result of groupby reduce built by the following rules:

    • Labels on the axis opposite to the specified one are preserved.

    • If groupby_kwargs["as_index"] is True, then labels on the specified axis are the group names; otherwise labels are the default: 0, 1 … n.

    • If groupby_kwargs["as_index"] is False, then the first N columns/rows of the frame contain group names, where N is the number of columns/rows to group on.

    • Each element of QueryCompiler is the covariance for the corresponding group and column/row.

  • Warning: the map_args and reduce_args parameters are deprecated. They leaked here from PandasQueryCompiler.groupby_*: the pandas storage format implements groupby via the TreeReduce approach, but for other storage formats these parameters make no sense, so they will be removed in the future.

Notes

Please refer to modin.pandas.GroupBy.cov for more information about parameters and output format.

groupby_cumcount(by, axis, groupby_kwargs, agg_args, agg_kwargs, drop=False)#

Group QueryCompiler data and compute cumulative count for every group.

Parameters:
  • by (BaseQueryCompiler, column or index label, Grouper or list of such) – Object that determines groups.

  • axis ({0, 1}) – Axis to group and apply aggregation function along. 0 is for index, 1 is for columns.

  • groupby_kwargs (dict) – GroupBy parameters as expected by the modin.pandas.DataFrame.groupby signature.

  • agg_args (list-like) – Positional arguments to pass to the aggregation function.

  • agg_kwargs (dict) – Keyword arguments to pass to the aggregation function.

  • drop (bool, default: False) – If by is a QueryCompiler, indicates whether or not the by-data came from self.

Returns:

  • BaseQueryCompiler – QueryCompiler containing the result of groupby reduce built by the following rules:

    • Labels on the axis opposite to the specified one are preserved.

    • If groupby_kwargs["as_index"] is True, then labels on the specified axis are the group names; otherwise labels are the default: 0, 1 … n.

    • If groupby_kwargs["as_index"] is False, then the first N columns/rows of the frame contain group names, where N is the number of columns/rows to group on.

    • Each element of QueryCompiler is the count of all the previous values for the corresponding group and column/row.

  • Warning: the map_args and reduce_args parameters are deprecated. They leaked here from PandasQueryCompiler.groupby_*: the pandas storage format implements groupby via the TreeReduce approach, but for other storage formats these parameters make no sense, so they will be removed in the future.

Notes

Please refer to modin.pandas.GroupBy.cumcount for more information about parameters and output format.

groupby_cummax(by, axis, groupby_kwargs, agg_args, agg_kwargs, drop=False)#

Group QueryCompiler data and get cumulative maximum for every group.

Parameters:
  • by (BaseQueryCompiler, column or index label, Grouper or list of such) – Object that determines groups.

  • axis ({0, 1}) – Axis to group and apply aggregation function along. 0 is for index, 1 is for columns.

  • groupby_kwargs (dict) – GroupBy parameters as expected by the modin.pandas.DataFrame.groupby signature.

  • agg_args (list-like) – Positional arguments to pass to the aggregation function.

  • agg_kwargs (dict) – Keyword arguments to pass to the aggregation function.

  • drop (bool, default: False) – If by is a QueryCompiler, indicates whether or not the by-data came from self.

Returns:

  • BaseQueryCompiler – QueryCompiler containing the result of groupby reduce built by the following rules:

    • Labels on the axis opposite to the specified one are preserved.

    • If groupby_kwargs["as_index"] is True, then labels on the specified axis are the group names; otherwise labels are the default: 0, 1 … n.

    • If groupby_kwargs["as_index"] is False, then the first N columns/rows of the frame contain group names, where N is the number of columns/rows to group on.

    • Each element of QueryCompiler is the maximum of all the previous values for the corresponding group and column/row.

  • Warning: the map_args and reduce_args parameters are deprecated. They leaked here from PandasQueryCompiler.groupby_*: the pandas storage format implements groupby via the TreeReduce approach, but for other storage formats these parameters make no sense, so they will be removed in the future.

Notes

Please refer to modin.pandas.GroupBy.cummax for more information about parameters and output format.

groupby_cummin(by, axis, groupby_kwargs, agg_args, agg_kwargs, drop=False)#

Group QueryCompiler data and get cumulative minimum for every group.

Parameters:
  • by (BaseQueryCompiler, column or index label, Grouper or list of such) – Object that determines groups.

  • axis ({0, 1}) – Axis to group and apply aggregation function along. 0 is for index, 1 is for columns.

  • groupby_kwargs (dict) – GroupBy parameters as expected by the modin.pandas.DataFrame.groupby signature.

  • agg_args (list-like) – Positional arguments to pass to the aggregation function.

  • agg_kwargs (dict) – Keyword arguments to pass to the aggregation function.

  • drop (bool, default: False) – If by is a QueryCompiler, indicates whether or not the by-data came from self.

Returns:

  • BaseQueryCompiler – QueryCompiler containing the result of groupby reduce built by the following rules:

    • Labels on the axis opposite to the specified one are preserved.

    • If groupby_kwargs["as_index"] is True, then labels on the specified axis are the group names; otherwise labels are the default: 0, 1 … n.

    • If groupby_kwargs["as_index"] is False, then the first N columns/rows of the frame contain group names, where N is the number of columns/rows to group on.

    • Each element of QueryCompiler is the minimum of all the previous values for the corresponding group and column/row.

  • Warning: the map_args and reduce_args parameters are deprecated. They leaked here from PandasQueryCompiler.groupby_*: the pandas storage format implements groupby via the TreeReduce approach, but for other storage formats these parameters make no sense, so they will be removed in the future.

Notes

Please refer to modin.pandas.GroupBy.cummin for more information about parameters and output format.

groupby_cumprod(by, axis, groupby_kwargs, agg_args, agg_kwargs, drop=False)#

Group QueryCompiler data and get cumulative product for every group.

Parameters:
  • by (BaseQueryCompiler, column or index label, Grouper or list of such) – Object that determines groups.

  • axis ({0, 1}) – Axis to group and apply aggregation function along. 0 is for index, 1 is for columns.

  • groupby_kwargs (dict) – GroupBy parameters as expected by the modin.pandas.DataFrame.groupby signature.

  • agg_args (list-like) – Positional arguments to pass to the aggregation function.

  • agg_kwargs (dict) – Keyword arguments to pass to the aggregation function.

  • drop (bool, default: False) – If by is a QueryCompiler, indicates whether or not the by-data came from self.

Returns:

  • BaseQueryCompiler – QueryCompiler containing the result of groupby reduce built by the following rules:

    • Labels on the axis opposite to the specified one are preserved.

    • If groupby_kwargs["as_index"] is True, then labels on the specified axis are the group names; otherwise labels are the default: 0, 1 … n.

    • If groupby_kwargs["as_index"] is False, then the first N columns/rows of the frame contain group names, where N is the number of columns/rows to group on.

    • Each element of QueryCompiler is the product of all the previous values for the corresponding group and column/row.

  • Warning: the map_args and reduce_args parameters are deprecated. They leaked here from PandasQueryCompiler.groupby_*: the pandas storage format implements groupby via the TreeReduce approach, but for other storage formats these parameters make no sense, so they will be removed in the future.

Notes

Please refer to modin.pandas.GroupBy.cumprod for more information about parameters and output format.

groupby_cumsum(by, axis, groupby_kwargs, agg_args, agg_kwargs, drop=False)#

Group QueryCompiler data and compute cumulative sum for every group.

Parameters:
  • by (BaseQueryCompiler, column or index label, Grouper or list of such) – Object that determines groups.

  • axis ({0, 1}) – Axis to group and apply aggregation function along. 0 is for index, 1 is for columns.

  • groupby_kwargs (dict) – GroupBy parameters as expected by the modin.pandas.DataFrame.groupby signature.

  • agg_args (list-like) – Positional arguments to pass to the aggregation function.

  • agg_kwargs (dict) – Keyword arguments to pass to the aggregation function.

  • drop (bool, default: False) – If by is a QueryCompiler, indicates whether or not the by-data came from self.

Returns:

  • BaseQueryCompiler – QueryCompiler containing the result of groupby reduce built by the following rules:

    • Labels on the axis opposite to the specified one are preserved.

    • If groupby_kwargs["as_index"] is True, then labels on the specified axis are the group names; otherwise labels are the default: 0, 1 … n.

    • If groupby_kwargs["as_index"] is False, then the first N columns/rows of the frame contain group names, where N is the number of columns/rows to group on.

    • Each element of QueryCompiler is the sum of all the previous values for the corresponding group and column/row.

  • Warning: the map_args and reduce_args parameters are deprecated. They leaked here from PandasQueryCompiler.groupby_*: the pandas storage format implements groupby via the TreeReduce approach, but for other storage formats these parameters make no sense, so they will be removed in the future.

Notes

Please refer to modin.pandas.GroupBy.cumsum for more information about parameters and output format.
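A per-group cumulative sum keeps the original row order and shape, which the following plain-Python sketch illustrates for a single column. The helper name `groupby_cumsum` is reused only for readability; this is not Modin's implementation.

```python
def groupby_cumsum(keys, values):
    """Running sum of `values`, restarted independently for each group key."""
    running = {}
    out = []
    for key, value in zip(keys, values):
        running[key] = running.get(key, 0) + value
        out.append(running[key])           # result stays aligned with the input rows
    return out


cumulative = groupby_cumsum(["a", "b", "a", "b", "a"], [1, 10, 2, 20, 3])
```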

groupby_dtypes(by, axis, groupby_kwargs, agg_args, agg_kwargs, drop=False)#

Group QueryCompiler data and get data types for every group.

Parameters:
  • by (BaseQueryCompiler, column or index label, Grouper or list of such) – Object that determines groups.

  • axis ({0, 1}) – Axis to group and apply aggregation function along. 0 is for index, 1 is for columns.

  • groupby_kwargs (dict) – GroupBy parameters as expected by the modin.pandas.DataFrame.groupby signature.

  • agg_args (list-like) – Positional arguments to pass to the aggregation function.

  • agg_kwargs (dict) – Keyword arguments to pass to the aggregation function.

  • drop (bool, default: False) – If by is a QueryCompiler, indicates whether or not the by-data came from self.

Returns:

  • BaseQueryCompiler – QueryCompiler containing the result of groupby reduce built by the following rules:

    • Labels on the axis opposite to the specified one are preserved.

    • If groupby_kwargs["as_index"] is True, then labels on the specified axis are the group names; otherwise labels are the default: 0, 1 … n.

    • If groupby_kwargs["as_index"] is False, then the first N columns/rows of the frame contain group names, where N is the number of columns/rows to group on.

    • Each element of QueryCompiler is the data type for the corresponding group and column/row.

  • Warning: the map_args and reduce_args parameters are deprecated. They leaked here from PandasQueryCompiler.groupby_*: the pandas storage format implements groupby via the TreeReduce approach, but for other storage formats these parameters make no sense, so they will be removed in the future.

Notes

Please refer to modin.pandas.GroupBy.dtypes for more information about parameters and output format.

groupby_fillna(by, axis, groupby_kwargs, agg_args, agg_kwargs, drop=False)#

Group QueryCompiler data and fill NaN values for every group.

Parameters:
  • by (BaseQueryCompiler, column or index label, Grouper or list of such) – Object that determines groups.

  • axis ({0, 1}) – Axis to group and apply aggregation function along. 0 is for index, 1 is for columns.

  • groupby_kwargs (dict) – GroupBy parameters as expected by the modin.pandas.DataFrame.groupby signature.

  • agg_args (list-like) – Positional arguments to pass to the aggregation function.

  • agg_kwargs (dict) – Keyword arguments to pass to the aggregation function.

  • drop (bool, default: False) – If by is a QueryCompiler, indicates whether or not the by-data came from self.

Returns:

  • BaseQueryCompiler – QueryCompiler containing the result of groupby reduce built by the following rules:

    • Labels on the axis opposite to the specified one are preserved.

    • If groupby_kwargs["as_index"] is True, then labels on the specified axis are the group names; otherwise labels are the default: 0, 1 … n.

    • If groupby_kwargs["as_index"] is False, then the first N columns/rows of the frame contain group names, where N is the number of columns/rows to group on.

    • Each element of QueryCompiler is the fill_value if the original was NaN, or the original value otherwise, for the corresponding group and column/row.

  • Warning: the map_args and reduce_args parameters are deprecated. They leaked here from PandasQueryCompiler.groupby_*: the pandas storage format implements groupby via the TreeReduce approach, but for other storage formats these parameters make no sense, so they will be removed in the future.

Notes

Please refer to modin.pandas.GroupBy.fillna for more information about parameters and output format.

groupby_first(by, axis, groupby_kwargs, agg_args, agg_kwargs, drop=False)#

Group QueryCompiler data and get first value in group for every group.

Parameters:
  • by (BaseQueryCompiler, column or index label, Grouper or list of such) – Object that determine groups.

  • axis ({0, 1}) – Axis to group and apply aggregation function along. 0 is for index, when 1 is for columns.

  • groupby_kwargs (dict) – GroupBy parameters as expected by modin.pandas.DataFrame.groupby signature.

  • agg_args (list-like) – Positional arguments to pass to the agg_func.

  • agg_kwargs (dict) – Key arguments to pass to the agg_func.

  • drop (bool, default: False) – If by is a QueryCompiler indicates whether or not by-data came from the self.

Returns:

  • BaseQueryCompiler – QueryCompiler containing the result of groupby reduce built by the following rules:

    • Labels on the opposite of specified axis are preserved.

    • If groupby_kwargs["as_index"] is True, then labels on the specified axis are the group names; otherwise the labels are the default: 0, 1 … n.

    • If groupby_kwargs["as_index"] is False, then the first N columns/rows of the frame contain group names, where N is the number of columns/rows to group on.

    • Each element of QueryCompiler is the first value for the corresponding group and column/row.

  • Warning: map_args and reduce_args parameters are deprecated. They leaked here from PandasQueryCompiler.groupby_*: the pandas storage format implements groupby via the TreeReduce approach, but for other storage formats these parameters make no sense, so they will be removed in the future.

Notes

Please refer to modin.pandas.GroupBy.first for more information about parameters and output format.
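
The effect of as_index on the result labels can be sketched with plain pandas, whose groupby API modin.pandas mirrors. This is an illustration under assumptions, not the query compiler's implementation: the frame and the column names key and val are hypothetical, and the call is made at the DataFrame level rather than through groupby_first.

```python
import pandas as pd

# Hypothetical data: two groups ("a" and "b") with two rows each.
df = pd.DataFrame({"key": ["a", "b", "a", "b"], "val": [1, 2, 3, 4]})

# as_index=True: the group names become the row labels of the result.
indexed = df.groupby("key", as_index=True).first()

# as_index=False: the group names stay in the first column and the
# row labels fall back to the default 0, 1, ... n.
flat = df.groupby("key", as_index=False).first()

print(indexed)
print(flat)
```

With one grouping column, N in the rules above is 1, so flat carries exactly one leading column of group names.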

groupby_get_group(by, axis, groupby_kwargs, agg_args, agg_kwargs, drop=False)#

Group QueryCompiler data and construct DataFrame from group with provided name for every group.

Parameters:
  • by (BaseQueryCompiler, column or index label, Grouper or list of such) – Object that determines groups.

  • axis ({0, 1}) – Axis to group and apply aggregation function along. 0 is for index, 1 is for columns.

  • groupby_kwargs (dict) – GroupBy parameters as expected by modin.pandas.DataFrame.groupby signature.

  • agg_args (list-like) – Positional arguments to pass to the agg_func.

  • agg_kwargs (dict) – Keyword arguments to pass to the agg_func.

  • drop (bool, default: False) – If by is a QueryCompiler, indicates whether or not the by-data came from self.

Returns:

  • BaseQueryCompiler – QueryCompiler containing the result of groupby reduce built by the following rules:

    • Labels on the opposite of specified axis are preserved.

    • If groupby_kwargs["as_index"] is True, then labels on the specified axis are the group names; otherwise the labels are the default: 0, 1 … n.

    • If groupby_kwargs["as_index"] is False, then the first N columns/rows of the frame contain group names, where N is the number of columns/rows to group on.

    • Each element of QueryCompiler is the DataFrame for given group for the corresponding group and column/row.

  • Warning: map_args and reduce_args parameters are deprecated. They leaked here from PandasQueryCompiler.groupby_*: the pandas storage format implements groupby via the TreeReduce approach, but for other storage formats these parameters make no sense, so they will be removed in the future.

Notes

Please refer to modin.pandas.GroupBy.get_group for more information about parameters and output format.

groupby_head(by, axis, groupby_kwargs, agg_args, agg_kwargs, drop=False)#

Group QueryCompiler data and get first n values of a group for every group.

Parameters:
  • by (BaseQueryCompiler, column or index label, Grouper or list of such) – Object that determines groups.

  • axis ({0, 1}) – Axis to group and apply aggregation function along. 0 is for index, 1 is for columns.

  • groupby_kwargs (dict) – GroupBy parameters as expected by modin.pandas.DataFrame.groupby signature.

  • agg_args (list-like) – Positional arguments to pass to the agg_func.

  • agg_kwargs (dict) – Keyword arguments to pass to the agg_func.

  • drop (bool, default: False) – If by is a QueryCompiler, indicates whether or not the by-data came from self.

Returns:

  • BaseQueryCompiler – QueryCompiler containing the result of groupby reduce built by the following rules:

    • Labels on the opposite of specified axis are preserved.

    • If groupby_kwargs["as_index"] is True, then labels on the specified axis are the group names; otherwise the labels are the default: 0, 1 … n.

    • If groupby_kwargs["as_index"] is False, then the first N columns/rows of the frame contain group names, where N is the number of columns/rows to group on.

    • Each element of QueryCompiler is the first n values of a group for the corresponding group and column/row.

  • Warning: map_args and reduce_args parameters are deprecated. They leaked here from PandasQueryCompiler.groupby_*: the pandas storage format implements groupby via the TreeReduce approach, but for other storage formats these parameters make no sense, so they will be removed in the future.

Notes

Please refer to modin.pandas.GroupBy.head for more information about parameters and output format.

groupby_idxmax(by, axis, groupby_kwargs, agg_args, agg_kwargs, drop=False)#

Group QueryCompiler data and get the index of the maximum value for every group.

Parameters:
  • by (BaseQueryCompiler, column or index label, Grouper or list of such) – Object that determines groups.

  • axis ({0, 1}) – Axis to group and apply aggregation function along. 0 is for index, 1 is for columns.

  • groupby_kwargs (dict) – GroupBy parameters as expected by modin.pandas.DataFrame.groupby signature.

  • agg_args (list-like) – Positional arguments to pass to the agg_func.

  • agg_kwargs (dict) – Keyword arguments to pass to the agg_func.

  • drop (bool, default: False) – If by is a QueryCompiler, indicates whether or not the by-data came from self.

Returns:

  • BaseQueryCompiler – QueryCompiler containing the result of groupby reduce built by the following rules:

    • Labels on the opposite of specified axis are preserved.

    • If groupby_kwargs["as_index"] is True, then labels on the specified axis are the group names; otherwise the labels are the default: 0, 1 … n.

    • If groupby_kwargs["as_index"] is False, then the first N columns/rows of the frame contain group names, where N is the number of columns/rows to group on.

    • Each element of QueryCompiler is the index of maximum value for the corresponding group and column/row.

  • Warning: map_args and reduce_args parameters are deprecated. They leaked here from PandasQueryCompiler.groupby_*: the pandas storage format implements groupby via the TreeReduce approach, but for other storage formats these parameters make no sense, so they will be removed in the future.

Notes

Please refer to modin.pandas.GroupBy.idxmax for more information about parameters and output format.

groupby_idxmin(by, axis, groupby_kwargs, agg_args, agg_kwargs, drop=False)#

Group QueryCompiler data and get the index of the minimum value for every group.

Parameters:
  • by (BaseQueryCompiler, column or index label, Grouper or list of such) – Object that determines groups.

  • axis ({0, 1}) – Axis to group and apply aggregation function along. 0 is for index, 1 is for columns.

  • groupby_kwargs (dict) – GroupBy parameters as expected by modin.pandas.DataFrame.groupby signature.

  • agg_args (list-like) – Positional arguments to pass to the agg_func.

  • agg_kwargs (dict) – Keyword arguments to pass to the agg_func.

  • drop (bool, default: False) – If by is a QueryCompiler, indicates whether or not the by-data came from self.

Returns:

  • BaseQueryCompiler – QueryCompiler containing the result of groupby reduce built by the following rules:

    • Labels on the opposite of specified axis are preserved.

    • If groupby_kwargs["as_index"] is True, then labels on the specified axis are the group names; otherwise the labels are the default: 0, 1 … n.

    • If groupby_kwargs["as_index"] is False, then the first N columns/rows of the frame contain group names, where N is the number of columns/rows to group on.

    • Each element of QueryCompiler is the index of minimum value for the corresponding group and column/row.

  • Warning: map_args and reduce_args parameters are deprecated. They leaked here from PandasQueryCompiler.groupby_*: the pandas storage format implements groupby via the TreeReduce approach, but for other storage formats these parameters make no sense, so they will be removed in the future.

Notes

Please refer to modin.pandas.GroupBy.idxmin for more information about parameters and output format.
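
What "the index of the maximum/minimum value" means is easiest to see with non-default row labels. The sketch below uses plain pandas as a stand-in for modin.pandas; the frame, the labels r0 … r3 and the column names are hypothetical, and the query compiler methods themselves are not called.

```python
import pandas as pd

# Hypothetical data with string row labels so positions and labels differ.
df = pd.DataFrame(
    {"key": ["a", "a", "b", "b"], "val": [10, 30, 40, 20]},
    index=["r0", "r1", "r2", "r3"],
)

grouped = df.groupby("key")

# For every group, the row label at which "val" is largest / smallest.
idx_max = grouped.idxmax()
idx_min = grouped.idxmin()

print(idx_max)
print(idx_min)
```

The results hold row labels of the original frame, not the extreme values themselves.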

groupby_last(by, axis, groupby_kwargs, agg_args, agg_kwargs, drop=False)#

Group QueryCompiler data and get last value in group for every group.

Parameters:
  • by (BaseQueryCompiler, column or index label, Grouper or list of such) – Object that determines groups.

  • axis ({0, 1}) – Axis to group and apply aggregation function along. 0 is for index, 1 is for columns.

  • groupby_kwargs (dict) – GroupBy parameters as expected by modin.pandas.DataFrame.groupby signature.

  • agg_args (list-like) – Positional arguments to pass to the agg_func.

  • agg_kwargs (dict) – Keyword arguments to pass to the agg_func.

  • drop (bool, default: False) – If by is a QueryCompiler, indicates whether or not the by-data came from self.

Returns:

  • BaseQueryCompiler – QueryCompiler containing the result of groupby reduce built by the following rules:

    • Labels on the opposite of specified axis are preserved.

    • If groupby_kwargs["as_index"] is True, then labels on the specified axis are the group names; otherwise the labels are the default: 0, 1 … n.

    • If groupby_kwargs["as_index"] is False, then the first N columns/rows of the frame contain group names, where N is the number of columns/rows to group on.

    • Each element of QueryCompiler is the last value for the corresponding group and column/row.

  • Warning: map_args and reduce_args parameters are deprecated. They leaked here from PandasQueryCompiler.groupby_*: the pandas storage format implements groupby via the TreeReduce approach, but for other storage formats these parameters make no sense, so they will be removed in the future.

Notes

Please refer to modin.pandas.GroupBy.last for more information about parameters and output format.

groupby_max(by, axis, groupby_kwargs, agg_args, agg_kwargs, drop=False)#

Group QueryCompiler data and get the maximum value for every group.

Parameters:
  • by (BaseQueryCompiler, column or index label, Grouper or list of such) – Object that determines groups.

  • axis ({0, 1}) – Axis to group and apply aggregation function along. 0 is for index, 1 is for columns.

  • groupby_kwargs (dict) – GroupBy parameters as expected by modin.pandas.DataFrame.groupby signature.

  • agg_args (list-like) – Positional arguments to pass to the agg_func.

  • agg_kwargs (dict) – Keyword arguments to pass to the agg_func.

  • drop (bool, default: False) – If by is a QueryCompiler, indicates whether or not the by-data came from self.

Returns:

  • BaseQueryCompiler – QueryCompiler containing the result of groupby reduce built by the following rules:

    • Labels on the opposite of specified axis are preserved.

    • If groupby_kwargs["as_index"] is True, then labels on the specified axis are the group names; otherwise the labels are the default: 0, 1 … n.

    • If groupby_kwargs["as_index"] is False, then the first N columns/rows of the frame contain group names, where N is the number of columns/rows to group on.

    • Each element of QueryCompiler is the maximum value for the corresponding group and column/row.

  • Warning: map_args and reduce_args parameters are deprecated. They leaked here from PandasQueryCompiler.groupby_*: the pandas storage format implements groupby via the TreeReduce approach, but for other storage formats these parameters make no sense, so they will be removed in the future.

Notes

Please refer to modin.pandas.GroupBy.max for more information about parameters and output format.

groupby_mean(by, axis, groupby_kwargs, agg_args, agg_kwargs, drop=False)#

Group QueryCompiler data and compute the mean value for every group.

Parameters:
  • by (BaseQueryCompiler, column or index label, Grouper or list of such) – Object that determines groups.

  • axis ({0, 1}) – Axis to group and apply aggregation function along. 0 is for index, 1 is for columns.

  • groupby_kwargs (dict) – GroupBy parameters as expected by modin.pandas.DataFrame.groupby signature.

  • agg_args (list-like) – Positional arguments to pass to the agg_func.

  • agg_kwargs (dict) – Keyword arguments to pass to the agg_func.

  • drop (bool, default: False) – If by is a QueryCompiler, indicates whether or not the by-data came from self.

Returns:

  • BaseQueryCompiler – QueryCompiler containing the result of groupby reduce built by the following rules:

    • Labels on the opposite of specified axis are preserved.

    • If groupby_kwargs["as_index"] is True, then labels on the specified axis are the group names; otherwise the labels are the default: 0, 1 … n.

    • If groupby_kwargs["as_index"] is False, then the first N columns/rows of the frame contain group names, where N is the number of columns/rows to group on.

    • Each element of QueryCompiler is the mean value for the corresponding group and column/row.

  • Warning: map_args and reduce_args parameters are deprecated. They leaked here from PandasQueryCompiler.groupby_*: the pandas storage format implements groupby via the TreeReduce approach, but for other storage formats these parameters make no sense, so they will be removed in the future.

Notes

Please refer to modin.pandas.GroupBy.mean for more information about parameters and output format.

groupby_median(by, axis, groupby_kwargs, agg_args, agg_kwargs, drop=False)#

Group QueryCompiler data and get the median value for every group.

Parameters:
  • by (BaseQueryCompiler, column or index label, Grouper or list of such) – Object that determines groups.

  • axis ({0, 1}) – Axis to group and apply aggregation function along. 0 is for index, 1 is for columns.

  • groupby_kwargs (dict) – GroupBy parameters as expected by modin.pandas.DataFrame.groupby signature.

  • agg_args (list-like) – Positional arguments to pass to the agg_func.

  • agg_kwargs (dict) – Keyword arguments to pass to the agg_func.

  • drop (bool, default: False) – If by is a QueryCompiler, indicates whether or not the by-data came from self.

Returns:

  • BaseQueryCompiler – QueryCompiler containing the result of groupby reduce built by the following rules:

    • Labels on the opposite of specified axis are preserved.

    • If groupby_kwargs["as_index"] is True, then labels on the specified axis are the group names; otherwise the labels are the default: 0, 1 … n.

    • If groupby_kwargs["as_index"] is False, then the first N columns/rows of the frame contain group names, where N is the number of columns/rows to group on.

    • Each element of QueryCompiler is the median value for the corresponding group and column/row.

  • Warning: map_args and reduce_args parameters are deprecated. They leaked here from PandasQueryCompiler.groupby_*: the pandas storage format implements groupby via the TreeReduce approach, but for other storage formats these parameters make no sense, so they will be removed in the future.

Notes

Please refer to modin.pandas.GroupBy.median for more information about parameters and output format.

groupby_min(by, axis, groupby_kwargs, agg_args, agg_kwargs, drop=False)#

Group QueryCompiler data and get the minimum value for every group.

Parameters:
  • by (BaseQueryCompiler, column or index label, Grouper or list of such) – Object that determines groups.

  • axis ({0, 1}) – Axis to group and apply aggregation function along. 0 is for index, 1 is for columns.

  • groupby_kwargs (dict) – GroupBy parameters as expected by modin.pandas.DataFrame.groupby signature.

  • agg_args (list-like) – Positional arguments to pass to the agg_func.

  • agg_kwargs (dict) – Keyword arguments to pass to the agg_func.

  • drop (bool, default: False) – If by is a QueryCompiler, indicates whether or not the by-data came from self.

Returns:

  • BaseQueryCompiler – QueryCompiler containing the result of groupby reduce built by the following rules:

    • Labels on the opposite of specified axis are preserved.

    • If groupby_kwargs["as_index"] is True, then labels on the specified axis are the group names; otherwise the labels are the default: 0, 1 … n.

    • If groupby_kwargs["as_index"] is False, then the first N columns/rows of the frame contain group names, where N is the number of columns/rows to group on.

    • Each element of QueryCompiler is the minimum value for the corresponding group and column/row.

  • Warning: map_args and reduce_args parameters are deprecated. They leaked here from PandasQueryCompiler.groupby_*: the pandas storage format implements groupby via the TreeReduce approach, but for other storage formats these parameters make no sense, so they will be removed in the future.

Notes

Please refer to modin.pandas.GroupBy.min for more information about parameters and output format.

groupby_ngroup(by, axis, groupby_kwargs, agg_args, agg_kwargs, drop=False)#

Group QueryCompiler data and get group number of each value for every group.

Parameters:
  • by (BaseQueryCompiler, column or index label, Grouper or list of such) – Object that determines groups.

  • axis ({0, 1}) – Axis to group and apply aggregation function along. 0 is for index, 1 is for columns.

  • groupby_kwargs (dict) – GroupBy parameters as expected by modin.pandas.DataFrame.groupby signature.

  • agg_args (list-like) – Positional arguments to pass to the agg_func.

  • agg_kwargs (dict) – Keyword arguments to pass to the agg_func.

  • drop (bool, default: False) – If by is a QueryCompiler, indicates whether or not the by-data came from self.

Returns:

  • BaseQueryCompiler – QueryCompiler containing the result of groupby reduce built by the following rules:

    • Labels on the opposite of specified axis are preserved.

    • If groupby_kwargs["as_index"] is True, then labels on the specified axis are the group names; otherwise the labels are the default: 0, 1 … n.

    • If groupby_kwargs["as_index"] is False, then the first N columns/rows of the frame contain group names, where N is the number of columns/rows to group on.

    • Each element of QueryCompiler is the group number of each value for the corresponding group and column/row.

  • Warning: map_args and reduce_args parameters are deprecated. They leaked here from PandasQueryCompiler.groupby_*: the pandas storage format implements groupby via the TreeReduce approach, but for other storage formats these parameters make no sense, so they will be removed in the future.

Notes

Please refer to modin.pandas.GroupBy.ngroup for more information about parameters and output format.
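
As a rough illustration of "group number", the following sketch numbers the groups of a small frame with plain pandas (assumed here to behave like modin.pandas). The data and the column name key are hypothetical.

```python
import pandas as pd

# Hypothetical data: three groups, appearing in the order b, a, b, c.
df = pd.DataFrame({"key": ["b", "a", "b", "c"]})

# With the default sort=True, groups are numbered in sorted key order
# (a -> 0, b -> 1, c -> 2) and every row receives its group's number.
numbers = df.groupby("key").ngroup()

print(numbers)
```

The output has one entry per input row and keeps the original row labels.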

groupby_nlargest(by, axis, groupby_kwargs, agg_args, agg_kwargs, drop=False)#

Group QueryCompiler data and get n largest values in group for every group.

Parameters:
  • by (BaseQueryCompiler, column or index label, Grouper or list of such) – Object that determines groups.

  • axis ({0, 1}) – Axis to group and apply aggregation function along. 0 is for index, 1 is for columns.

  • groupby_kwargs (dict) – GroupBy parameters as expected by modin.pandas.DataFrame.groupby signature.

  • agg_args (list-like) – Positional arguments to pass to the agg_func.

  • agg_kwargs (dict) – Keyword arguments to pass to the agg_func.

  • drop (bool, default: False) – If by is a QueryCompiler, indicates whether or not the by-data came from self.

Returns:

  • BaseQueryCompiler – QueryCompiler containing the result of groupby reduce built by the following rules:

    • Labels on the opposite of specified axis are preserved.

    • If groupby_kwargs["as_index"] is True, then labels on the specified axis are the group names; otherwise the labels are the default: 0, 1 … n.

    • If groupby_kwargs["as_index"] is False, then the first N columns/rows of the frame contain group names, where N is the number of columns/rows to group on.

    • Each element of QueryCompiler is the n largest values for the corresponding group and column/row.

  • Warning: map_args and reduce_args parameters are deprecated. They leaked here from PandasQueryCompiler.groupby_*: the pandas storage format implements groupby via the TreeReduce approach, but for other storage formats these parameters make no sense, so they will be removed in the future.

Notes

Please refer to modin.pandas.GroupBy.nlargest for more information about parameters and output format.

groupby_nsmallest(by, axis, groupby_kwargs, agg_args, agg_kwargs, drop=False)#

Group QueryCompiler data and get n smallest values in group for every group.

Parameters:
  • by (BaseQueryCompiler, column or index label, Grouper or list of such) – Object that determines groups.

  • axis ({0, 1}) – Axis to group and apply aggregation function along. 0 is for index, 1 is for columns.

  • groupby_kwargs (dict) – GroupBy parameters as expected by modin.pandas.DataFrame.groupby signature.

  • agg_args (list-like) – Positional arguments to pass to the agg_func.

  • agg_kwargs (dict) – Keyword arguments to pass to the agg_func.

  • drop (bool, default: False) – If by is a QueryCompiler, indicates whether or not the by-data came from self.

Returns:

  • BaseQueryCompiler – QueryCompiler containing the result of groupby reduce built by the following rules:

    • Labels on the opposite of specified axis are preserved.

    • If groupby_kwargs["as_index"] is True, then labels on the specified axis are the group names; otherwise the labels are the default: 0, 1 … n.

    • If groupby_kwargs["as_index"] is False, then the first N columns/rows of the frame contain group names, where N is the number of columns/rows to group on.

    • Each element of QueryCompiler is the n smallest values for the corresponding group and column/row.

  • Warning: map_args and reduce_args parameters are deprecated. They leaked here from PandasQueryCompiler.groupby_*: the pandas storage format implements groupby via the TreeReduce approach, but for other storage formats these parameters make no sense, so they will be removed in the future.

Notes

Please refer to modin.pandas.GroupBy.nsmallest for more information about parameters and output format.
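
The n largest and n smallest values per group can be sketched side by side. The example uses plain pandas on a single column (assumed to match modin.pandas behaviour); the data, the column names and n=2 are hypothetical choices.

```python
import pandas as pd

# Hypothetical data: group "a" has three values, group "b" has two.
df = pd.DataFrame(
    {"key": ["a", "a", "a", "b", "b"], "val": [1, 3, 2, 10, 20]}
)

grouped = df.groupby("key")["val"]

# Two largest / two smallest values inside every group.
largest = grouped.nlargest(2)
smallest = grouped.nsmallest(2)

print(largest)
print(smallest)
```

Unlike a plain reduction, each group contributes up to n rows, so the result is labelled by both the group name and the original row label.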

groupby_nth(by, axis, groupby_kwargs, agg_args, agg_kwargs, drop=False)#

Group QueryCompiler data and get nth value in group for every group.

Parameters:
  • by (BaseQueryCompiler, column or index label, Grouper or list of such) – Object that determines groups.

  • axis ({0, 1}) – Axis to group and apply aggregation function along. 0 is for index, 1 is for columns.

  • groupby_kwargs (dict) – GroupBy parameters as expected by modin.pandas.DataFrame.groupby signature.

  • agg_args (list-like) – Positional arguments to pass to the agg_func.

  • agg_kwargs (dict) – Keyword arguments to pass to the agg_func.

  • drop (bool, default: False) – If by is a QueryCompiler, indicates whether or not the by-data came from self.

Returns:

  • BaseQueryCompiler – QueryCompiler containing the result of groupby reduce built by the following rules:

    • Labels on the opposite of specified axis are preserved.

    • If groupby_kwargs["as_index"] is True, then labels on the specified axis are the group names; otherwise the labels are the default: 0, 1 … n.

    • If groupby_kwargs["as_index"] is False, then the first N columns/rows of the frame contain group names, where N is the number of columns/rows to group on.

    • Each element of QueryCompiler is the nth value for the corresponding group and column/row.

  • Warning: map_args and reduce_args parameters are deprecated. They leaked here from PandasQueryCompiler.groupby_*: the pandas storage format implements groupby via the TreeReduce approach, but for other storage formats these parameters make no sense, so they will be removed in the future.

Notes

Please refer to modin.pandas.GroupBy.nth for more information about parameters and output format.

groupby_nunique(by, axis, groupby_kwargs, agg_args, agg_kwargs, drop=False)#

Group QueryCompiler data and get the number of unique values for every group.

Parameters:
  • by (BaseQueryCompiler, column or index label, Grouper or list of such) – Object that determines groups.

  • axis ({0, 1}) – Axis to group and apply aggregation function along. 0 is for index, 1 is for columns.

  • groupby_kwargs (dict) – GroupBy parameters as expected by modin.pandas.DataFrame.groupby signature.

  • agg_args (list-like) – Positional arguments to pass to the agg_func.

  • agg_kwargs (dict) – Keyword arguments to pass to the agg_func.

  • drop (bool, default: False) – If by is a QueryCompiler, indicates whether or not the by-data came from self.

Returns:

  • BaseQueryCompiler – QueryCompiler containing the result of groupby reduce built by the following rules:

    • Labels on the opposite of specified axis are preserved.

    • If groupby_kwargs["as_index"] is True, then labels on the specified axis are the group names; otherwise the labels are the default: 0, 1 … n.

    • If groupby_kwargs["as_index"] is False, then the first N columns/rows of the frame contain group names, where N is the number of columns/rows to group on.

    • Each element of QueryCompiler is the number of unique values for the corresponding group and column/row.

  • Warning: map_args and reduce_args parameters are deprecated. They leaked here from PandasQueryCompiler.groupby_*: the pandas storage format implements groupby via the TreeReduce approach, but for other storage formats these parameters make no sense, so they will be removed in the future.

Notes

Please refer to modin.pandas.GroupBy.nunique for more information about parameters and output format.
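
A minimal sketch of the per-group unique count, written against plain pandas as a stand-in for modin.pandas. The frame and its column names are hypothetical.

```python
import pandas as pd

# Hypothetical data: group "a" repeats one value, group "b" has two.
df = pd.DataFrame({"key": ["a", "a", "b", "b"], "val": [1, 1, 2, 3]})

# One row per group, one column per non-grouping column;
# each cell counts the distinct values seen in that group.
unique_counts = df.groupby("key").nunique()

print(unique_counts)
```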

groupby_prod(by, axis, groupby_kwargs, agg_args, agg_kwargs, drop=False)#

Group QueryCompiler data and compute product for every group.

Parameters:
  • by (BaseQueryCompiler, column or index label, Grouper or list of such) – Object that determines groups.

  • axis ({0, 1}) – Axis to group and apply aggregation function along. 0 is for index, 1 is for columns.

  • groupby_kwargs (dict) – GroupBy parameters as expected by modin.pandas.DataFrame.groupby signature.

  • agg_args (list-like) – Positional arguments to pass to the agg_func.

  • agg_kwargs (dict) – Keyword arguments to pass to the agg_func.

  • drop (bool, default: False) – If by is a QueryCompiler, indicates whether or not the by-data came from self.

Returns:

  • BaseQueryCompiler – QueryCompiler containing the result of groupby reduce built by the following rules:

    • Labels on the opposite of specified axis are preserved.

    • If groupby_kwargs["as_index"] is True, then labels on the specified axis are the group names; otherwise the labels are the default: 0, 1 … n.

    • If groupby_kwargs["as_index"] is False, then the first N columns/rows of the frame contain group names, where N is the number of columns/rows to group on.

    • Each element of QueryCompiler is the product for the corresponding group and column/row.

  • Warning: map_args and reduce_args parameters are deprecated. They leaked here from PandasQueryCompiler.groupby_*: the pandas storage format implements groupby via the TreeReduce approach, but for other storage formats these parameters make no sense, so they will be removed in the future.

Notes

Please refer to modin.pandas.GroupBy.prod for more information about parameters and output format.

groupby_quantile(by, axis, groupby_kwargs, agg_args, agg_kwargs, drop=False)#

Group QueryCompiler data and compute specified quantile for every group.

Parameters:
  • by (BaseQueryCompiler, column or index label, Grouper or list of such) – Object that determines groups.

  • axis ({0, 1}) – Axis to group and apply aggregation function along. 0 is for index, 1 is for columns.

  • groupby_kwargs (dict) – GroupBy parameters as expected by modin.pandas.DataFrame.groupby signature.

  • agg_args (list-like) – Positional arguments to pass to the agg_func.

  • agg_kwargs (dict) – Keyword arguments to pass to the agg_func.

  • drop (bool, default: False) – If by is a QueryCompiler, indicates whether or not the by-data came from self.

Returns:

  • BaseQueryCompiler – QueryCompiler containing the result of groupby reduce built by the following rules:

    • Labels on the opposite of specified axis are preserved.

    • If groupby_kwargs["as_index"] is True, then labels on the specified axis are the group names; otherwise the labels are the default: 0, 1 … n.

    • If groupby_kwargs["as_index"] is False, then the first N columns/rows of the frame contain group names, where N is the number of columns/rows to group on.

    • Each element of QueryCompiler is the quantile value for the corresponding group and column/row.

  • Warning: map_args and reduce_args parameters are deprecated. They leaked here from PandasQueryCompiler.groupby_*: the pandas storage format implements groupby via the TreeReduce approach, but for other storage formats these parameters make no sense, so they will be removed in the future.

Notes

Please refer to modin.pandas.GroupBy.quantile for more information about parameters and output format.
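
The split between groupby_kwargs and agg_kwargs can be pictured as two dictionaries that are unpacked at different stages. The sketch below does this with plain pandas; the dictionary contents, the data and the way they are forwarded are assumptions made for illustration, not a description of how any particular query compiler dispatches the call.

```python
import pandas as pd

# Hypothetical data: group "a" holds 1, 2, 3 and group "b" holds 10, 20.
df = pd.DataFrame(
    {"key": ["a", "a", "a", "b", "b"], "val": [1, 2, 3, 10, 20]}
)

# Assumed shape of the two dictionaries: one configures the grouping,
# the other configures the aggregation itself.
groupby_kwargs = {"as_index": True, "sort": True}
agg_kwargs = {"q": 0.5}

# Build the groupby object first, then apply the aggregation.
result = df.groupby("key", **groupby_kwargs).quantile(**agg_kwargs)

print(result)
```

With q=0.5 and pandas' default linear interpolation, the result is the per-group median.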

groupby_rank(by, axis, groupby_kwargs, agg_args, agg_kwargs, drop=False)#

Group QueryCompiler data and compute numerical rank for every group.

Parameters:
  • by (BaseQueryCompiler, column or index label, Grouper or list of such) – Object that determines groups.

  • axis ({0, 1}) – Axis to group and apply aggregation function along. 0 is for index, 1 is for columns.

  • groupby_kwargs (dict) – GroupBy parameters as expected by modin.pandas.DataFrame.groupby signature.

  • agg_args (list-like) – Positional arguments to pass to the agg_func.

  • agg_kwargs (dict) – Keyword arguments to pass to the agg_func.

  • drop (bool, default: False) – If by is a QueryCompiler, indicates whether or not the by-data came from self.

Returns:

  • BaseQueryCompiler – QueryCompiler containing the result of groupby reduce built by the following rules:

    • Labels on the opposite of specified axis are preserved.

    • If groupby_kwargs["as_index"] is True, then labels on the specified axis are the group names; otherwise the labels are the default: 0, 1 … n.

    • If groupby_kwargs["as_index"] is False, then the first N columns/rows of the frame contain group names, where N is the number of columns/rows to group on.

    • Each element of QueryCompiler is the numerical rank for the corresponding group and column/row.

  • Warning: map_args and reduce_args parameters are deprecated. They leaked here from PandasQueryCompiler.groupby_*: the pandas storage format implements groupby via the TreeReduce approach, but for other storage formats these parameters make no sense, so they will be removed in the future.

Notes

Please refer to modin.pandas.GroupBy.rank for more information about parameters and output format.
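
Ranking inside groups can be sketched as follows with plain pandas, assumed to behave like modin.pandas. The data and column names are hypothetical, and the default ranking options (ascending, average of ties) are used.

```python
import pandas as pd

# Hypothetical data: values are ranked separately inside "a" and "b".
df = pd.DataFrame({"key": ["a", "a", "b", "b"], "val": [3, 1, 2, 5]})

# Every row receives its rank within its own group, starting at 1.
ranks = df.groupby("key")["val"].rank()

print(ranks)
```

The output has the same length and row labels as the input, since ranking transforms rows rather than collapsing each group to one value.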

groupby_rolling(by, agg_func, axis, groupby_kwargs, rolling_kwargs, agg_args, agg_kwargs, drop=False)#

Group QueryCompiler data and apply passed aggregation function to a rolling window in each group.

Parameters:
  • by (BaseQueryCompiler, column or index label, Grouper or list of such) – Object that determines groups.

  • agg_func (str, dict or callable(Series | DataFrame) -> scalar | Series | DataFrame) – Function to apply to the GroupBy object.

  • axis ({0, 1}) – Axis to group and apply aggregation function along. 0 is for index, 1 is for columns.

  • groupby_kwargs (dict) – GroupBy parameters as expected by modin.pandas.DataFrame.groupby signature.

  • rolling_kwargs (dict) – Parameters to build a rolling window as expected by modin.pandas.window.RollingGroupby signature.

  • agg_args (list-like) – Positional arguments to pass to the agg_func.

  • agg_kwargs (dict) – Keyword arguments to pass to the agg_func.

  • drop (bool, default: False) – If by is a QueryCompiler, indicates whether or not the by-data came from self.

Returns:

QueryCompiler containing the result of groupby aggregation.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.GroupBy.rolling for more information about parameters and output format.
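
How rolling_kwargs and agg_func relate can be sketched with plain pandas: the first builds the window, the second names the aggregation applied to it. The dictionary contents, the data and the getattr-based dispatch are illustrative assumptions, not the query compiler's actual code path.

```python
import pandas as pd

# Hypothetical data: group "a" holds 1, 2, 3 and group "b" holds 10, 20.
df = pd.DataFrame(
    {"key": ["a", "a", "a", "b", "b"], "val": [1, 2, 3, 10, 20]}
)

# Assumed parameter values: a window of two rows, aggregated by sum.
rolling_kwargs = {"window": 2}
agg_func = "sum"

# Build a rolling window inside every group, then apply the named
# aggregation to each window.
roller = df.groupby("key")["val"].rolling(**rolling_kwargs)
result = getattr(roller, agg_func)()

print(result)
```

The first row of every group has no full window and yields NaN; windows never span two groups.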

groupby_sem(by, axis, groupby_kwargs, agg_args, agg_kwargs, drop=False)#

Group QueryCompiler data and compute standard error for every group.

Parameters:
  • by (BaseQueryCompiler, column or index label, Grouper or list of such) – Object that determines groups.

  • axis ({0, 1}) – Axis to group and apply aggregation function along. 0 is for index, 1 is for columns.

  • groupby_kwargs (dict) – GroupBy parameters as expected by modin.pandas.DataFrame.groupby signature.

  • agg_args (list-like) – Positional arguments to pass to the agg_func.

  • agg_kwargs (dict) – Keyword arguments to pass to the agg_func.

  • drop (bool, default: False) – If by is a QueryCompiler, indicates whether or not the by-data came from self.

Returns:

  • BaseQueryCompiler – QueryCompiler containing the result of groupby reduce built by the following rules:

    • Labels on the opposite of specified axis are preserved.

    • If groupby_args[“as_index”] is True then labels on the specified axis are the group names, otherwise labels would be default: 0, 1 … n.

    • If groupby_args[“as_index”] is False, then first N columns/rows of the frame contain group names, where N is the columns/rows to group on.

    • Each element of QueryCompiler is the standard error for the corresponding group and column/row.

  • .. warningmap_args and reduce_args parameters are deprecated. They’re leaked here from PandasQueryCompiler.groupby_*, pandas storage format implements groupby via TreeReduce approach, but for other storage formats these parameters make no sense, and so they’ll be removed in the future.

Notes

Please refer to modin.pandas.GroupBy.sem for more information about parameters and output format.
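
As a hedged illustration of what "standard error for every group" means, the sketch below uses plain pandas instead of a query compiler and checks the result against the usual definition (sample standard deviation divided by the square root of the group size). The data and names are assumptions made for the example.

```python
import math

import pandas as pd

df = pd.DataFrame({"key": ["a", "a", "b", "b", "b"], "val": [1, 3, 2, 4, 6]})

# Standard error of the mean, computed independently for each group.
sem = df.groupby("key")["val"].sem()

# Manual check for group "b": sample std (ddof=1) over sqrt(group size).
manual_b = df.loc[df["key"] == "b", "val"].std() / math.sqrt(3)
```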

groupby_shift(by, axis, groupby_kwargs, agg_args, agg_kwargs, drop=False)#

Group QueryCompiler data and shift data with the specified settings for every group.

Parameters:
  • by (BaseQueryCompiler, column or index label, Grouper or list of such) – Object that determines groups.

  • axis ({0, 1}) – Axis to group and apply aggregation function along. 0 is for index, 1 is for columns.

  • groupby_kwargs (dict) – GroupBy parameters as expected by modin.pandas.DataFrame.groupby signature.

  • agg_args (list-like) – Positional arguments to pass to the aggregation function.

  • agg_kwargs (dict) – Keyword arguments to pass to the aggregation function.

  • drop (bool, default: False) – If by is a QueryCompiler, indicates whether or not the by-data came from self.

Returns:

  • BaseQueryCompiler – QueryCompiler containing the result of groupby reduce built by the following rules:

    • Labels on the axis opposite to the specified one are preserved.

    • If groupby_kwargs["as_index"] is True, then labels on the specified axis are the group names; otherwise the labels are the default ones: 0, 1 … n.

    • If groupby_kwargs["as_index"] is False, then the first N columns/rows of the frame contain group names, where N is the number of columns/rows to group on.

    • Each element of QueryCompiler is the shifted value for the corresponding group and column/row.

  • Warning: the map_args and reduce_args parameters are deprecated. They leaked here from PandasQueryCompiler.groupby_*: the pandas storage format implements groupby via the TreeReduce approach, but for other storage formats these parameters make no sense, so they will be removed in the future.

Notes

Please refer to modin.pandas.GroupBy.shift for more information about parameters and output format.

groupby_size(by, axis, groupby_kwargs, agg_args, agg_kwargs, drop=False)#

Group QueryCompiler data and get the number of elements for every group.

Parameters:
  • by (BaseQueryCompiler, column or index label, Grouper or list of such) – Object that determines groups.

  • axis ({0, 1}) – Axis to group and apply aggregation function along. 0 is for index, 1 is for columns.

  • groupby_kwargs (dict) – GroupBy parameters as expected by modin.pandas.DataFrame.groupby signature.

  • agg_args (list-like) – Positional arguments to pass to the aggregation function.

  • agg_kwargs (dict) – Keyword arguments to pass to the aggregation function.

  • drop (bool, default: False) – If by is a QueryCompiler, indicates whether or not the by-data came from self.

Returns:

  • BaseQueryCompiler – QueryCompiler containing the result of groupby reduce built by the following rules:

    • Labels on the axis opposite to the specified one are preserved.

    • If groupby_kwargs["as_index"] is True, then labels on the specified axis are the group names; otherwise the labels are the default ones: 0, 1 … n.

    • If groupby_kwargs["as_index"] is False, then the first N columns/rows of the frame contain group names, where N is the number of columns/rows to group on.

    • Each element of QueryCompiler is the number of elements for the corresponding group and column/row.

  • Warning: the map_args and reduce_args parameters are deprecated. They leaked here from PandasQueryCompiler.groupby_*: the pandas storage format implements groupby via the TreeReduce approach, but for other storage formats these parameters make no sense, so they will be removed in the future.

Notes

Please refer to modin.pandas.GroupBy.size for more information about parameters and output format.
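
The as_index rules listed under Returns can be seen in a small sketch. It uses plain pandas as a stand-in for the query compiler (an assumption made for brevity), with made-up column names.

```python
import pandas as pd

df = pd.DataFrame({"key": ["a", "b", "a"], "val": [1, 2, 3]})

# as_index=True: the group names become the labels of the result.
indexed = df.groupby("key", as_index=True).size()

# as_index=False: the group names go into the first column and the
# labels are the default ones: 0, 1 ... n.
flat = df.groupby("key", as_index=False).size()
```

Here indexed is labeled by "a" and "b", while flat has a default integer index with the group names stored in its "key" column.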

groupby_skew(by, axis, groupby_kwargs, agg_args, agg_kwargs, drop=False)#

Group QueryCompiler data and compute unbiased skew for every group.

Parameters:
  • by (BaseQueryCompiler, column or index label, Grouper or list of such) – Object that determines groups.

  • axis ({0, 1}) – Axis to group and apply aggregation function along. 0 is for index, 1 is for columns.

  • groupby_kwargs (dict) – GroupBy parameters as expected by modin.pandas.DataFrame.groupby signature.

  • agg_args (list-like) – Positional arguments to pass to the aggregation function.

  • agg_kwargs (dict) – Keyword arguments to pass to the aggregation function.

  • drop (bool, default: False) – If by is a QueryCompiler, indicates whether or not the by-data came from self.

Returns:

  • BaseQueryCompiler – QueryCompiler containing the result of groupby reduce built by the following rules:

    • Labels on the axis opposite to the specified one are preserved.

    • If groupby_kwargs["as_index"] is True, then labels on the specified axis are the group names; otherwise the labels are the default ones: 0, 1 … n.

    • If groupby_kwargs["as_index"] is False, then the first N columns/rows of the frame contain group names, where N is the number of columns/rows to group on.

    • Each element of QueryCompiler is the unbiased skew for the corresponding group and column/row.

  • Warning: the map_args and reduce_args parameters are deprecated. They leaked here from PandasQueryCompiler.groupby_*: the pandas storage format implements groupby via the TreeReduce approach, but for other storage formats these parameters make no sense, so they will be removed in the future.

Notes

Please refer to modin.pandas.GroupBy.skew for more information about parameters and output format.

groupby_std(by, axis, groupby_kwargs, agg_args, agg_kwargs, drop=False)#

Group QueryCompiler data and compute standard deviation for every group.

Parameters:
  • by (BaseQueryCompiler, column or index label, Grouper or list of such) – Object that determines groups.

  • axis ({0, 1}) – Axis to group and apply aggregation function along. 0 is for index, 1 is for columns.

  • groupby_kwargs (dict) – GroupBy parameters as expected by modin.pandas.DataFrame.groupby signature.

  • agg_args (list-like) – Positional arguments to pass to the aggregation function.

  • agg_kwargs (dict) – Keyword arguments to pass to the aggregation function.

  • drop (bool, default: False) – If by is a QueryCompiler, indicates whether or not the by-data came from self.

Returns:

  • BaseQueryCompiler – QueryCompiler containing the result of groupby reduce built by the following rules:

    • Labels on the axis opposite to the specified one are preserved.

    • If groupby_kwargs["as_index"] is True, then labels on the specified axis are the group names; otherwise the labels are the default ones: 0, 1 … n.

    • If groupby_kwargs["as_index"] is False, then the first N columns/rows of the frame contain group names, where N is the number of columns/rows to group on.

    • Each element of QueryCompiler is the standard deviation for the corresponding group and column/row.

  • Warning: the map_args and reduce_args parameters are deprecated. They leaked here from PandasQueryCompiler.groupby_*: the pandas storage format implements groupby via the TreeReduce approach, but for other storage formats these parameters make no sense, so they will be removed in the future.

Notes

Please refer to modin.pandas.GroupBy.std for more information about parameters and output format.

groupby_sum(by, axis, groupby_kwargs, agg_args, agg_kwargs, drop=False)#

Group QueryCompiler data and compute sum for every group.

Parameters:
  • by (BaseQueryCompiler, column or index label, Grouper or list of such) – Object that determines groups.

  • axis ({0, 1}) – Axis to group and apply aggregation function along. 0 is for index, 1 is for columns.

  • groupby_kwargs (dict) – GroupBy parameters as expected by modin.pandas.DataFrame.groupby signature.

  • agg_args (list-like) – Positional arguments to pass to the aggregation function.

  • agg_kwargs (dict) – Keyword arguments to pass to the aggregation function.

  • drop (bool, default: False) – If by is a QueryCompiler, indicates whether or not the by-data came from self.

Returns:

  • BaseQueryCompiler – QueryCompiler containing the result of groupby reduce built by the following rules:

    • Labels on the axis opposite to the specified one are preserved.

    • If groupby_kwargs["as_index"] is True, then labels on the specified axis are the group names; otherwise the labels are the default ones: 0, 1 … n.

    • If groupby_kwargs["as_index"] is False, then the first N columns/rows of the frame contain group names, where N is the number of columns/rows to group on.

    • Each element of QueryCompiler is the sum for the corresponding group and column/row.

  • Warning: the map_args and reduce_args parameters are deprecated. They leaked here from PandasQueryCompiler.groupby_*: the pandas storage format implements groupby via the TreeReduce approach, but for other storage formats these parameters make no sense, so they will be removed in the future.

Notes

Please refer to modin.pandas.GroupBy.sum for more information about parameters and output format.

groupby_tail(by, axis, groupby_kwargs, agg_args, agg_kwargs, drop=False)#

Group QueryCompiler data and get last n values in group for every group.

Parameters:
  • by (BaseQueryCompiler, column or index label, Grouper or list of such) – Object that determines groups.

  • axis ({0, 1}) – Axis to group and apply aggregation function along. 0 is for index, 1 is for columns.

  • groupby_kwargs (dict) – GroupBy parameters as expected by modin.pandas.DataFrame.groupby signature.

  • agg_args (list-like) – Positional arguments to pass to the aggregation function.

  • agg_kwargs (dict) – Keyword arguments to pass to the aggregation function.

  • drop (bool, default: False) – If by is a QueryCompiler, indicates whether or not the by-data came from self.

Returns:

  • BaseQueryCompiler – QueryCompiler containing the result of groupby reduce built by the following rules:

    • Labels on the axis opposite to the specified one are preserved.

    • If groupby_kwargs["as_index"] is True, then labels on the specified axis are the group names; otherwise the labels are the default ones: 0, 1 … n.

    • If groupby_kwargs["as_index"] is False, then the first N columns/rows of the frame contain group names, where N is the number of columns/rows to group on.

    • Each element of QueryCompiler is the last n values for the corresponding group and column/row.

  • Warning: the map_args and reduce_args parameters are deprecated. They leaked here from PandasQueryCompiler.groupby_*: the pandas storage format implements groupby via the TreeReduce approach, but for other storage formats these parameters make no sense, so they will be removed in the future.

Notes

Please refer to modin.pandas.GroupBy.tail for more information about parameters and output format.

groupby_unique(by, axis, groupby_kwargs, agg_args, agg_kwargs, drop=False)#

Group QueryCompiler data and get unique values in group for every group.

Parameters:
  • by (BaseQueryCompiler, column or index label, Grouper or list of such) – Object that determines groups.

  • axis ({0, 1}) – Axis to group and apply aggregation function along. 0 is for index, 1 is for columns.

  • groupby_kwargs (dict) – GroupBy parameters as expected by modin.pandas.DataFrame.groupby signature.

  • agg_args (list-like) – Positional arguments to pass to the aggregation function.

  • agg_kwargs (dict) – Keyword arguments to pass to the aggregation function.

  • drop (bool, default: False) – If by is a QueryCompiler, indicates whether or not the by-data came from self.

Returns:

  • BaseQueryCompiler – QueryCompiler containing the result of groupby reduce built by the following rules:

    • Labels on the axis opposite to the specified one are preserved.

    • If groupby_kwargs["as_index"] is True, then labels on the specified axis are the group names; otherwise the labels are the default ones: 0, 1 … n.

    • If groupby_kwargs["as_index"] is False, then the first N columns/rows of the frame contain group names, where N is the number of columns/rows to group on.

    • Each element of QueryCompiler is the unique values for the corresponding group and column/row.

  • Warning: the map_args and reduce_args parameters are deprecated. They leaked here from PandasQueryCompiler.groupby_*: the pandas storage format implements groupby via the TreeReduce approach, but for other storage formats these parameters make no sense, so they will be removed in the future.

Notes

Please refer to modin.pandas.GroupBy.unique for more information about parameters and output format.

groupby_var(by, axis, groupby_kwargs, agg_args, agg_kwargs, drop=False)#

Group QueryCompiler data and compute variance for every group.

Parameters:
  • by (BaseQueryCompiler, column or index label, Grouper or list of such) – Object that determines groups.

  • axis ({0, 1}) – Axis to group and apply aggregation function along. 0 is for index, 1 is for columns.

  • groupby_kwargs (dict) – GroupBy parameters as expected by modin.pandas.DataFrame.groupby signature.

  • agg_args (list-like) – Positional arguments to pass to the aggregation function.

  • agg_kwargs (dict) – Keyword arguments to pass to the aggregation function.

  • drop (bool, default: False) – If by is a QueryCompiler, indicates whether or not the by-data came from self.

Returns:

  • BaseQueryCompiler – QueryCompiler containing the result of groupby reduce built by the following rules:

    • Labels on the axis opposite to the specified one are preserved.

    • If groupby_kwargs["as_index"] is True, then labels on the specified axis are the group names; otherwise the labels are the default ones: 0, 1 … n.

    • If groupby_kwargs["as_index"] is False, then the first N columns/rows of the frame contain group names, where N is the number of columns/rows to group on.

    • Each element of QueryCompiler is the variance for the corresponding group and column/row.

  • Warning: the map_args and reduce_args parameters are deprecated. They leaked here from PandasQueryCompiler.groupby_*: the pandas storage format implements groupby via the TreeReduce approach, but for other storage formats these parameters make no sense, so they will be removed in the future.

Notes

Please refer to modin.pandas.GroupBy.var for more information about parameters and output format.

gt(other, **kwargs)#

Perform element-wise greater than comparison (self > other).

If axes are not equal, perform frames alignment first.

Parameters:
  • other (BaseQueryCompiler, scalar or array-like) – Other operand of the binary operation.

  • broadcast (bool, default: False) – If other is a one-column query compiler, indicates whether it is a Series or not. Frames and Series have to be processed differently; however, we can’t distinguish them at the query compiler level, so this parameter is a hint passed from the high-level API.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

Result of binary operation.

Return type:

BaseQueryCompiler

has_multiindex(axis=0)#

Check if specified axis is indexed by MultiIndex.

Parameters:

axis ({0, 1}, default: 0) – The axis to check (0 - index, 1 - columns).

Returns:

True if index at specified axis is MultiIndex and False otherwise.

Return type:

bool
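
A minimal sketch of the check, written as a standalone helper over a pandas DataFrame. The helper below is hypothetical and only mirrors the documented signature; an actual query compiler would consult its own index metadata.

```python
import pandas as pd


def has_multiindex(df, axis=0):
    """Hypothetical helper: True if the given axis is labeled by a MultiIndex."""
    labels = df.index if axis == 0 else df.columns
    return isinstance(labels, pd.MultiIndex)


df = pd.DataFrame(
    [[1, 2], [3, 4]],
    index=pd.MultiIndex.from_tuples([("a", 1), ("a", 2)]),
    columns=["x", "y"],
)

rows_are_multi = has_multiindex(df, axis=0)     # rows use a MultiIndex
columns_are_multi = has_multiindex(df, axis=1)  # columns use a flat Index
```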

idxmax(**kwargs)#

Get position of the first occurrence of the maximum for each row or column.

Parameters:
  • axis ({0, 1}) –

  • skipna (bool) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

One-column QueryCompiler with index labels of the specified axis, where each row contains position of the maximum element for the corresponding row or column.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.idxmax for more information about parameters and output format.
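
To make the "first occurrence" wording concrete, here is a small sketch using plain pandas as a stand-in for the query compiler (an assumption). Note that pandas reports the result as index labels; the frame and labels are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {"a": [1, 3, 3], "b": [5, 2, 5]},
    index=["x", "y", "z"],
)

# axis=0: one result per column. Both columns contain their maximum
# twice, and the label of the first occurrence is reported.
result = df.idxmax(axis=0)
```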

idxmin(**kwargs)#

Get position of the first occurrence of the minimum for each row or column.

Parameters:
  • axis ({0, 1}) –

  • skipna (bool) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

One-column QueryCompiler with index labels of the specified axis, where each row contains position of the minimum element for the corresponding row or column.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.idxmin for more information about parameters and output format.

infer_objects()#

Attempt to infer better dtypes for object columns.

Attempts soft conversion of object-dtyped columns, leaving non-object and unconvertible columns unchanged. The inference rules are the same as during normal Series/DataFrame construction.

Returns:

New query compiler with updated dtypes.

Return type:

BaseQueryCompiler
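
The soft conversion can be sketched with plain pandas, assumed here as a stand-in for the query compiler. A column that started out with mixed values keeps the object dtype after the non-numeric row is dropped, and infer_objects then recovers an integer dtype.

```python
import pandas as pd

# The string forces the column to object dtype; slicing it away does
# not change the dtype by itself.
df = pd.DataFrame({"a": ["x", 1, 2]}).iloc[1:]
before = df["a"].dtype

# Soft conversion: the remaining values are all integers.
after = df.infer_objects()["a"].dtype
```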

insert(loc, column, value)#

Insert new column.

Parameters:
  • loc (int) – Insertion position.

  • column (label) – Label of the new column.

  • value (One-column BaseQueryCompiler, 1D array or scalar) – Data to fill new column with.

Returns:

QueryCompiler with new column inserted.

Return type:

BaseQueryCompiler
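
Unlike pandas.DataFrame.insert, which modifies the frame in place, this method returns a QueryCompiler with the column inserted. One possible way to sketch that behavior over plain pandas is shown below; insert_column is a hypothetical helper, not part of Modin.

```python
import pandas as pd


def insert_column(df, loc, column, value):
    """Hypothetical helper: return a copy of df with a new column at loc."""
    new_df = df.copy()
    new_df.insert(loc, column, value)  # in-place on the copy only
    return new_df


df = pd.DataFrame({"a": [1, 2], "c": [5, 6]})
result = insert_column(df, 1, "b", [3, 4])
```

The original frame keeps its two columns, while result has the columns a, b, c in that order.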

insert_item(axis, loc, value, how='inner', replace=False)#

Insert rows/columns defined by value at the specified position.

If frames are not aligned along specified axis, perform frames alignment first.

Parameters:
  • axis ({0, 1}) – Axis to insert along. 0 means insert rows, 1 means insert columns.

  • loc (int) – Position to insert value.

  • value (BaseQueryCompiler) – Rows/columns to insert.

  • how ({"inner", "outer", "left", "right"}, default: "inner") – Type of join that will be used if frames are not aligned.

  • replace (bool, default: False) – Whether to insert item after column/row at loc-th position or to replace it by value.

Returns:

New QueryCompiler with inserted values.

Return type:

BaseQueryCompiler

interpolate(**kwargs)#

Fill NaN values using an interpolation method.

Returns:

Returns the same object type as the caller, interpolated at some or all NaN values.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.interpolate for more information about parameters and output format.

invert()#

Apply bitwise inversion for each element of the QueryCompiler.

Returns:

New QueryCompiler containing bitwise inversion for each value.

Return type:

BaseQueryCompiler

is_monotonic_decreasing()#

Return boolean if values in the object are monotonically decreasing.

Return type:

bool

is_monotonic_increasing()#

Return boolean if values in the object are monotonically increasing.

Return type:

bool

is_series_like()#

Check whether this QueryCompiler can represent modin.pandas.Series object.

Returns:

Return True if QueryCompiler has a single column or row, False otherwise.

Return type:

bool

isin(values, ignore_indices=False, **kwargs)#

Check for each element of self whether it’s contained in passed values.

Parameters:
  • values (list-like, modin.pandas.Series, modin.pandas.DataFrame or dict) – Values to check elements of self in.

  • ignore_indices (bool, default: False) – Whether to execute isin() only on an intersection of indices.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

Boolean mask for self of whether an element at the corresponding position is contained in values.

Return type:

BaseQueryCompiler
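
A short sketch of the resulting boolean mask for two kinds of values, using plain pandas in place of the query compiler (an assumption); the data is made up.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": ["x", "y", "z"]})

# List-like values: every element of every column is tested against the list.
mask_list = df.isin([1, "z"])

# Dict values: each key names a column; columns without a key match nothing.
mask_dict = df.isin({"a": [2, 3]})
```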

isna()#

Check for each element of self whether it’s NaN.

Returns:

Boolean mask for self of whether an element at the corresponding position is NaN.

Return type:

BaseQueryCompiler

join(right, **kwargs)#

Join columns of another QueryCompiler.

Parameters:
  • right (BaseQueryCompiler) – QueryCompiler of the right frame to join with.

  • on (label or list of such) –

  • how ({"left", "right", "outer", "inner"}) –

  • lsuffix (str) –

  • rsuffix (str) –

  • sort (bool) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

QueryCompiler that contains result of the join.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.join for more information about parameters and output format.
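
The how, lsuffix and rsuffix parameters interact as in the sketch below, which uses plain pandas frames as a stand-in for query compilers (an assumption). The frames and suffixes are illustrative.

```python
import pandas as pd

left = pd.DataFrame({"v": [1, 2]}, index=["a", "b"])
right = pd.DataFrame({"v": [10, 30]}, index=["a", "c"])

# A left join keeps the labels of the left frame; the overlapping
# column name "v" is disambiguated with the suffixes.
result = left.join(right, how="left", lsuffix="_l", rsuffix="_r")
```

Label "b" has no match in the right frame, so its v_r value is NaN, and label "c" is dropped because it is absent from the left frame.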

kurt(axis, numeric_only=False, skipna=True, **kwargs)#

Get the unbiased kurtosis for each column or row.

Parameters:
  • axis ({0, 1}) –

  • numeric_only (bool, optional) –

  • skipna (bool, default: True) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

One-column QueryCompiler with index labels of the specified axis, where each row contains the unbiased kurtosis for the corresponding row or column.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.kurt for more information about parameters and output format.

last(offset: DateOffset)#

Select final periods of time series data based on a date offset.

For a query compiler with a sorted DatetimeIndex, this function selects the last few rows based on a date offset.

Parameters:

offset (pandas.DateOffset) – The offset length of the data to select.

Returns:

New compiler containing the selected data.

Return type:

BaseQueryCompiler

last_valid_index()#

Return index label of last non-NaN/NULL value.

Return type:

scalar

le(other, **kwargs)#

Perform element-wise less than or equal comparison (self <= other).

If axes are not equal, perform frames alignment first.

Parameters:
  • other (BaseQueryCompiler, scalar or array-like) – Other operand of the binary operation.

  • broadcast (bool, default: False) – If other is a one-column query compiler, indicates whether it is a Series or not. Frames and Series have to be processed differently; however, we can’t distinguish them at the query compiler level, so this parameter is a hint passed from the high-level API.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

Result of binary operation.

Return type:

BaseQueryCompiler

list__getitem__(key)#

Index or slice lists in the Series.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.list.__getitem__ for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

list_flatten()#

Flatten list values.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.list.flatten for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

list_len()#

Return the length of each list in the Series.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.list.len for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

lookup(row_labels, col_labels)#

Label-based “fancy indexing” function for DataFrame.

lt(other, **kwargs)#

Perform element-wise less than comparison (self < other).

If axes are not equal, perform frames alignment first.

Parameters:
  • other (BaseQueryCompiler, scalar or array-like) – Other operand of the binary operation.

  • broadcast (bool, default: False) – If other is a one-column query compiler, indicates whether it is a Series or not. Frames and Series have to be processed differently; however, we can’t distinguish them at the query compiler level, so this parameter is a hint passed from the high-level API.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

Result of binary operation.

Return type:

BaseQueryCompiler

map(func, *args, **kwargs)#

Apply passed function elementwise.

Parameters:
  • func (callable(scalar) -> scalar) – Function to apply to each element of the QueryCompiler.

  • *args (iterable) –

  • **kwargs (dict) –

Returns:

Transformed QueryCompiler.

Return type:

BaseQueryCompiler

mask(cond, other, **kwargs)#

Replace values where the condition cond is True.

Returns:

New QueryCompiler with elements replaced with ones from other where cond is True.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.mask for more information about parameters and output format.

max(**kwargs)#

Get the maximum value for each column or row.

Parameters:
  • axis ({0, 1}) –

  • numeric_only (bool, optional) –

  • skipna (bool, default: True) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

One-column QueryCompiler with index labels of the specified axis, where each row contains the maximum value for the corresponding row or column.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.max for more information about parameters and output format.

mean(**kwargs)#

Get the mean value for each column or row.

Parameters:
  • axis ({0, 1}) –

  • numeric_only (bool, optional) –

  • skipna (bool, default: True) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

One-column QueryCompiler with index labels of the specified axis, where each row contains the mean value for the corresponding row or column.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.mean for more information about parameters and output format.

median(**kwargs)#

Get the median value for each column or row.

Parameters:
  • axis ({0, 1}) –

  • numeric_only (bool, optional) –

  • skipna (bool, default: True) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

One-column QueryCompiler with index labels of the specified axis, where each row contains the median value for the corresponding row or column.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.median for more information about parameters and output format.

melt(*args, **kwargs)#

Unpivot QueryCompiler data from wide to long format.

Parameters:
  • id_vars (list of labels, optional) –

  • value_vars (list of labels, optional) –

  • var_name (label) –

  • value_name (label) –

  • col_level (int or label) –

  • ignore_index (bool) –

  • *args (iterable) – Serves the compatibility purpose. Does not affect the result.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

New QueryCompiler with unpivoted data.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.melt for more information about parameters and output format.
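
As an illustration of the wide-to-long reshaping, the sketch below uses plain pandas as a stand-in for the query compiler (an assumption). The column names are invented for the example.

```python
import pandas as pd

wide = pd.DataFrame({"id": [1, 2], "a": [10, 20], "b": [30, 40]})

# Every (id, value column) pair becomes its own row.
long_df = wide.melt(
    id_vars=["id"],
    value_vars=["a", "b"],
    var_name="variable",
    value_name="value",
)
```

The result has one row per original cell of the value columns: all rows for "a" come first, followed by all rows for "b".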

memory_usage(**kwargs)#

Return the memory usage of each column in bytes.

Parameters:
  • index (bool) –

  • deep (bool) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

One-column QueryCompiler with index labels of self, where each row contains the memory usage for the corresponding column.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.memory_usage for more information about parameters and output format.

merge(right, **kwargs)#

Merge QueryCompiler objects using a database-style join.

Parameters:
  • right (BaseQueryCompiler) – QueryCompiler of the right frame to merge with.

  • how ({"left", "right", "outer", "inner", "cross"}) –

  • on (label or list of such) –

  • left_on (label or list of such) –

  • right_on (label or list of such) –

  • left_index (bool) –

  • right_index (bool) –

  • sort (bool) –

  • suffixes (list-like) –

  • copy (bool) –

  • indicator (bool or str) –

  • validate (str) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

QueryCompiler that contains result of the merge.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.merge for more information about parameters and output format.
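
A hedged sketch of how the how, on and indicator parameters combine, using plain pandas frames in place of query compilers (an assumption); the keys and column names are illustrative.

```python
import pandas as pd

left = pd.DataFrame({"k": [1, 2], "l": ["a", "b"]})
right = pd.DataFrame({"k": [1, 3], "r": ["x", "y"]})

# Left join on "k"; indicator=True adds a "_merge" column that records
# where each row came from.
result = left.merge(right, how="left", on="k", indicator=True)
```

Key 1 exists in both frames, key 2 only in the left one, and key 3 is dropped because a left join keeps only the left keys.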

merge_ordered(right, **kwargs)#

Perform a merge for ordered data with optional filling/interpolation.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.merge_ordered for more information about parameters and output format.

min(**kwargs)#

Get the minimum value for each column or row.

Parameters:
  • axis ({0, 1}) –

  • numeric_only (bool, optional) –

  • skipna (bool, default: True) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

One-column QueryCompiler with index labels of the specified axis, where each row contains the minimum value for the corresponding row or column.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.min for more information about parameters and output format.

mod(other, **kwargs)#

Perform element-wise modulo (self % other).

If axes are not equal, perform frames alignment first.

Parameters:
  • other (BaseQueryCompiler, scalar or array-like) – Other operand of the binary operation.

  • broadcast (bool, default: False) – If other is a one-column query compiler, indicates whether it is a Series or not. Frames and Series have to be processed differently; however, we can’t distinguish them at the query compiler level, so this parameter is a hint passed from the high-level API.

  • level (int or label) – In case of MultiIndex match index values on the passed level.

  • axis ({0, 1}) – Axis to match indices along for 1D other (list or QueryCompiler that represents Series). 0 is for index, 1 is for columns.

  • fill_value (float or None) – Value to fill missing elements during frame alignment.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

Result of binary operation.

Return type:

BaseQueryCompiler

mode(**kwargs)#

Get the modes for every column or row.

Parameters:
  • axis ({0, 1}) –

  • numeric_only (bool) –

  • dropna (bool) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

New QueryCompiler with modes calculated along given axis.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.mode for more information about parameters and output format.
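
Columns can have different numbers of modes, which affects the shape of the result. The sketch below shows this with plain pandas, assumed here as a stand-in for the query compiler.

```python
import pandas as pd

# "a" has two modes (1 and 2), "b" has a single mode (5).
df = pd.DataFrame({"a": [1, 1, 2, 2], "b": [5, 5, 5, 6]})

# One row per mode; columns with fewer modes are padded with NaN.
result = df.mode(axis=0)
```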

mul(other, **kwargs)#

Perform element-wise multiplication (self * other).

If axes are not equal, perform frames alignment first.

Parameters:
  • other (BaseQueryCompiler, scalar or array-like) – Other operand of the binary operation.

  • broadcast (bool, default: False) – If other is a one-column query compiler, indicates whether it is a Series or not. Frames and Series have to be processed differently; however, we can’t distinguish them at the query compiler level, so this parameter is a hint passed from the high-level API.

  • level (int or label) – In case of MultiIndex match index values on the passed level.

  • axis ({0, 1}) – Axis to match indices along for 1D other (list or QueryCompiler that represents Series). 0 is for index, 1 is for columns.

  • fill_value (float or None) – Value to fill missing elements during frame alignment.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

Result of binary operation.

Return type:

BaseQueryCompiler
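
Frame alignment and fill_value can be illustrated with two frames whose indices only partly overlap. This is a sketch over plain pandas, assumed to behave like the default implementation; the data is made up.

```python
import pandas as pd

df1 = pd.DataFrame({"a": [1, 2]}, index=[0, 1])
df2 = pd.DataFrame({"a": [10, 20]}, index=[1, 2])

# The frames are aligned on the union of their indices; an element that
# is missing on one side is replaced by fill_value before multiplying.
result = df1.mul(df2, fill_value=1)
```

Row 0 exists only in df1 (1 * 1), row 1 exists in both (2 * 10), and row 2 exists only in df2 (1 * 20).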

ne(other, **kwargs)#

Perform element-wise not equal comparison (self != other).

If axes are not equal, perform frames alignment first.

Parameters:
  • other (BaseQueryCompiler, scalar or array-like) – Other operand of the binary operation.

  • broadcast (bool, default: False) – If other is a one-column query compiler, indicates whether it is a Series or not. Frames and Series have to be processed differently; however, we can’t distinguish them at the query compiler level, so this parameter is a hint passed from the high-level API.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

Result of binary operation.

Return type:

BaseQueryCompiler

negative(**kwargs)#

Change the sign for every value of self.

Parameters:

**kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Return type:

BaseQueryCompiler

Notes

Note that all QueryCompiler values have to be numeric.

nlargest(n=5, columns=None, keep='first')#

Return the first n rows ordered by columns in descending order.

Parameters:
  • n (int, default: 5) –

  • columns (list of labels, optional) – Column labels to order by. This parameter can be omitted only for a single-column query compiler representing a Series object; otherwise columns has to be specified.

  • keep ({"first", "last", "all"}, default: "first") –

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.nlargest for more information about parameters and output format.
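
As a rough illustration of the keep parameter, the sketch below uses plain pandas (assumed to be installed) on a hypothetical frame with a tie for the largest value. It shows the high-level behavior the notes refer to, not the query compiler call itself.

```python
import pandas as pd

df = pd.DataFrame({"name": ["a", "b", "c", "d"], "score": [3, 5, 5, 1]})

# Rows 1 and 2 tie for the largest score; keep decides which of them survive.
first = df.nlargest(1, columns=["score"], keep="first")
last = df.nlargest(1, columns=["score"], keep="last")
everything = df.nlargest(1, columns=["score"], keep="all")
```

With keep="all" the result may contain more than n rows, because every row tied with the n-th value is retained.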

notna()#

Check for each element of self whether it is an existing (non-missing) value.

Returns:

Boolean mask for self of whether an element at the corresponding position is not NaN.

Return type:

BaseQueryCompiler

nsmallest(n=5, columns=None, keep='first')#

Return the first n rows ordered by columns in ascending order.

Parameters:
  • n (int, default: 5) –

  • columns (list of labels, optional) – Column labels to order by. This parameter can be omitted only for a single-column query compiler representing a Series object; otherwise columns has to be specified.

  • keep ({"first", "last", "all"}, default: "first") –

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.nsmallest for more information about parameters and output format.

nunique(**kwargs)#

Get the number of unique values for each column or row.

Parameters:
  • axis ({0, 1}) –

  • dropna (bool) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

One-column QueryCompiler with index labels of the specified axis, where each row contains the number of unique values for the corresponding row or column.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.nunique for more information about parameters and output format.
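
The interplay of axis and dropna can be sketched with plain pandas, which is assumed to be installed; the frame below is hypothetical and the calls are the high-level equivalents rather than query compiler calls.

```python
import pandas as pd

df = pd.DataFrame({"a": [1.0, 1.0, None], "b": [1, 2, 3]})

# Count per column; NaN is ignored by default.
per_column = df.nunique(axis=0)
# Count per column; NaN is treated as a value of its own.
per_column_with_nan = df.nunique(axis=0, dropna=False)
# Count per row.
per_row = df.nunique(axis=1)
```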

pct_change(**kwargs)#

Percentage change between the current and a prior element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.pct_change for more information about parameters and output format.

pivot(index, columns, values)#

Produce pivot table based on column values.

Parameters:
  • index (label or list of such, pandas.Index, optional) –

  • columns (label or list of such) –

  • values (label or list of such, optional) –

Returns:

New QueryCompiler containing pivot table.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.pivot for more information about parameters and output format.
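
A minimal sketch of the reshaping, using plain pandas (assumed to be installed) and a hypothetical long-format frame, might look like this:

```python
import pandas as pd

long = pd.DataFrame(
    {
        "day": ["mon", "mon", "tue", "tue"],
        "city": ["A", "B", "A", "B"],
        "temp": [10, 20, 11, 21],
    }
)

# One row per unique "day", one column per unique "city", cells taken from "temp".
wide = long.pivot(index="day", columns="city", values="temp")
```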

pivot_table(index, values, columns, aggfunc, fill_value, margins, dropna, margins_name, observed, sort)#

Create a spreadsheet-style pivot table from underlying data.

Parameters:
  • index (label, pandas.Grouper, array or list of such) –

  • values (label, optional) –

  • columns (column, pandas.Grouper, array or list of such) –

  • aggfunc (callable(pandas.Series) -> scalar, dict of list of such) –

  • fill_value (scalar, optional) –

  • margins (bool) –

  • dropna (bool) –

  • margins_name (str) –

  • observed (bool) –

  • sort (bool) –

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.pivot_table for more information about parameters and output format.
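
Unlike pivot, a pivot table aggregates duplicate index/column pairs. The sketch below illustrates aggfunc and fill_value with plain pandas (assumed to be installed); the `sales` frame is hypothetical.

```python
import pandas as pd

sales = pd.DataFrame(
    {
        "region": ["north", "north", "south", "south"],
        "product": ["x", "x", "x", "y"],
        "amount": [1, 2, 3, 4],
    }
)

# Duplicate (region, product) pairs are aggregated with aggfunc;
# combinations that never occur are filled with fill_value.
table = sales.pivot_table(
    index="region",
    columns="product",
    values="amount",
    aggfunc="sum",
    fill_value=0,
)
```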

pow(other, **kwargs)#

Perform element-wise exponential power (self ** other).

If axes are not equal, perform frames alignment first.

Parameters:
  • other (BaseQueryCompiler, scalar or array-like) – Other operand of the binary operation.

  • broadcast (bool, default: False) – If other is a one-column query compiler, indicates whether it is a Series or not. Frames and Series have to be processed differently, however we can’t distinguish them at the query compiler level, so this parameter is a hint that is passed from a high-level API.

  • level (int or label) – In case of MultiIndex match index values on the passed level.

  • axis ({0, 1}) – Axis to match indices along for 1D other (list or QueryCompiler that represents Series). 0 is for index, 1 is for columns.

  • fill_value (float or None) – Value to fill missing elements during frame alignment.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

Result of binary operation.

Return type:

BaseQueryCompiler

prod(**kwargs)#

Get the product for each column or row.

Parameters:
  • axis ({0, 1}) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

One-column QueryCompiler with index labels of the specified axis, where each row contains the product for the corresponding row or column.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.prod for more information about parameters and output format.

prod_min_count(**kwargs)#

Get the product for each column or row.

Parameters:
  • axis ({0, 1}) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

One-column QueryCompiler with index labels of the specified axis, where each row contains the product for the corresponding row or column.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.prod for more information about parameters and output format.
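
The two entries above share one description. Judging by its name alone, prod_min_count presumably covers the case where a min_count argument is supplied; that reading is an assumption, not something stated here. What min_count does at the DataFrame level can be sketched with plain pandas (assumed to be installed) on a hypothetical frame:

```python
import pandas as pd

df = pd.DataFrame({"a": [2.0, None], "b": [3.0, 4.0]})

# Default behavior: NaN is skipped, so column "a" yields 2.0.
plain = df.prod(axis=0)

# min_count=2: a column needs at least two valid values, otherwise the result is NaN.
with_min_count = df.prod(axis=0, min_count=2)
```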

quantile_for_list_of_values(**kwargs)#

Get the value at the given quantile for each column or row.

Parameters:
  • q (list-like) –

  • axis ({0, 1}) –

  • numeric_only (bool) –

  • interpolation ({"linear", "lower", "higher", "midpoint", "nearest"}) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

One-column QueryCompiler with index labels of the specified axis, where each row contains the value at the given quantile for the corresponding row or column.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.quantile for more information about parameters and output format.

quantile_for_single_value(**kwargs)#

Get the value at the given quantile for each column or row.

Parameters:
  • q (float) –

  • axis ({0, 1}) –

  • numeric_only (bool) –

  • interpolation ({"linear", "lower", "higher", "midpoint", "nearest"}) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

One-column QueryCompiler with index labels of the specified axis, where each row contains the value at the given quantile for the corresponding row or column.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.quantile for more information about parameters and output format.
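
The two quantile methods differ only in the type of q. The sketch below shows the corresponding high-level calls in plain pandas (assumed to be installed) on a hypothetical frame: a scalar q gives a one-dimensional result, a list-like q gives one row per quantile.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [10, 20, 30, 40]})

# Scalar q: result is indexed by column label.
single = df.quantile(0.5)
# "lower" picks the nearest data point below the quantile instead of interpolating.
single_lower = df.quantile(0.5, interpolation="lower")
# List-like q: result is indexed by the requested quantiles.
several = df.quantile([0.25, 0.75])
```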

radd(other, **kwargs)#

Perform element-wise addition (other + self).

If axes are not equal, perform frames alignment first.

Parameters:
  • other (BaseQueryCompiler, scalar or array-like) – Other operand of the binary operation.

  • broadcast (bool, default: False) – If other is a one-column query compiler, indicates whether it is a Series or not. Frames and Series have to be processed differently, however we can’t distinguish them at the query compiler level, so this parameter is a hint that is passed from a high-level API.

  • level (int or label) – In case of MultiIndex match index values on the passed level.

  • axis ({0, 1}) – Axis to match indices along for 1D other (list or QueryCompiler that represents Series). 0 is for index, 1 is for columns.

  • fill_value (float or None) – Value to fill missing elements during frame alignment.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

Result of binary operation.

Return type:

BaseQueryCompiler

rank(**kwargs)#

Compute numerical rank along the specified axis.

By default, equal values are assigned a rank that is the average of the ranks of those values; this behavior can be changed via the method parameter.

Parameters:
  • axis ({0, 1}) –

  • method ({"average", "min", "max", "first", "dense"}) –

  • numeric_only (bool) –

  • na_option ({"keep", "top", "bottom"}) –

  • ascending (bool) –

  • pct (bool) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

QueryCompiler of the same shape as self, where each element is the numerical rank of the corresponding value along row or column.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.rank for more information about parameters and output format.
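
How the method parameter treats ties can be illustrated with plain pandas (assumed to be installed). The frame is hypothetical and the calls are the high-level equivalents, not query compiler calls.

```python
import pandas as pd

df = pd.DataFrame({"v": [10, 20, 20, 30]})

average = df.rank(method="average")  # ties share the mean of their ranks
lowest = df.rank(method="min")       # ties share the lowest rank of the group
dense = df.rank(method="dense")      # like "min", but ranks have no gaps
as_pct = df.rank(method="average", pct=True)  # ranks scaled to (0, 1]
```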

rdivmod(other, **kwargs)#

Return integer division and modulo of self and other, element-wise (binary operator rdivmod).

Equivalent to divmod(other, self), but with support for substituting a fill_value for missing data in either one of the inputs.

Parameters:
  • other (BaseQueryCompiler or scalar value) –

  • **kwargs (dict) – Other arguments for division.

Returns:

  • BaseQueryCompiler – Compiler representing Series with quotient part of division.

  • BaseQueryCompiler – Compiler representing Series with modulo part of division.

Notes

Please refer to modin.pandas.Series.rdivmod for more information about parameters and output format.
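
Because the operation is reflected, self supplies the divisors and other is the dividend. A small sketch with plain pandas (assumed to be installed) and a hypothetical Series:

```python
import pandas as pd

s = pd.Series([2, 3, 4])

# Reflected operation: 10 is the dividend, the Series values are the divisors.
quotient, remainder = s.rdivmod(10)
```

The two returned objects correspond to `10 // s` and `10 % s` respectively.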

reindex(axis, labels, **kwargs)#

Align QueryCompiler data with a new index along specified axis.

Parameters:
  • axis ({0, 1}) – Axis to align labels along. 0 is for index, 1 is for columns.

  • labels (list-like) – Index-labels to align with.

  • method ({None, "backfill"/"bfill", "pad"/"ffill", "nearest"}) – Method to use for filling holes in reindexed frame.

  • fill_value (scalar) – Value to use for missing values in the resulting frame.

  • limit (int) –

  • tolerance (int) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

QueryCompiler with aligned axis.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.reindex for more information about parameters and output format.
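
The difference between method and fill_value can be sketched with plain pandas (assumed to be installed); the frame and the `labels` list are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"v": [1, 2]}, index=[0, 2])
labels = [0, 1, 2, 3]

# Labels absent from the original index become NaN.
holes = df.reindex(labels, axis=0)
# method="ffill" propagates the last preceding row into each hole.
padded = df.reindex(labels, axis=0, method="ffill")
# fill_value supplies an explicit placeholder instead.
zeros = df.reindex(labels, axis=0, fill_value=0)
```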

repartition(axis=None)#

Repartition QueryCompiler objects to get ideal partitions inside.

Allows improving performance in cases where the query compiler cannot yet do so through implicit repartitioning.

Parameters:

axis ({0, 1, None}, optional) – The axis along which the repartitioning occurs. None is used for repartitioning along both axes.

Returns:

The repartitioned BaseQueryCompiler.

Return type:

BaseQueryCompiler

repeat(repeats)#

Repeat each element of a one-column QueryCompiler the given number of times.

Parameters:

repeats (int or array of ints) – The number of repetitions for each element. This should be a non-negative integer. Repeating 0 times will return an empty QueryCompiler.

Returns:

New QueryCompiler with repeated elements.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.repeat for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.
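
The accepted forms of repeats can be illustrated with the Series-level equivalent in plain pandas (assumed to be installed); the Series is hypothetical.

```python
import pandas as pd

s = pd.Series(["a", "b"])

twice = s.repeat(2)          # every element repeated the same number of times
per_item = s.repeat([1, 3])  # one repeat count per element
nothing = s.repeat(0)        # zero repeats give an empty result
```

Note that index labels are repeated together with the values, so the result generally has duplicate labels.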

replace(**kwargs)#

Replace values given in to_replace by value.

Parameters:
  • to_replace (scalar, list-like, regex, modin.pandas.Series, or None) –

  • value (scalar, list-like, regex or dict) –

  • inplace ({False}) – This parameter serves the compatibility purpose. Always has to be False.

  • limit (int or None) –

  • regex (bool or same types as to_replace) –

  • method ({"pad", "ffill", "bfill", None}) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

QueryCompiler with all to_replace values replaced by value.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.replace for more information about parameters and output format.
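
Two common combinations of to_replace, value and regex are sketched below with plain pandas (assumed to be installed). The frame is hypothetical, and the calls show the high-level behavior rather than the query compiler call.

```python
import pandas as pd

df = pd.DataFrame({"code": ["x0", "y1"], "qty": [0, 5]})

# Scalar replacement: every 0 becomes -1 (string cells are left alone).
scalar = df.replace(to_replace=0, value=-1)

# Regex replacement: only string cells are matched against the pattern.
by_regex = df.replace(to_replace=r"^x", value="z", regex=True)
```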

resample_agg_df(resample_kwargs, func, *args, **kwargs)#

Resample time-series data and apply aggregation on it.

Group data into intervals by time-series row/column with a specified frequency and apply passed aggregation function for each group over the specified axis.

Parameters:
  • resample_kwargs (dict) – Resample parameters as expected by modin.pandas.DataFrame.resample signature.

  • func (str, dict, callable(pandas.Series) -> scalar, or list of such) –

  • *args (iterable) – Positional arguments to pass to the aggregation function.

  • **kwargs (dict) – Keyword arguments to pass to the aggregation function.

Returns:

New QueryCompiler containing the result of resample aggregation built by the following rules:

  • Labels on the specified axis are the group names (time-stamps)

  • Labels on the opposite of specified axis are a MultiIndex, where first level contains preserved labels of this axis and the second level is the function names.

  • Each element of QueryCompiler is the result of corresponding function for the corresponding group and column/row.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.resample.Resampler.agg for more information about parameters and output format.
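
The shape of the result described above can be previewed with plain pandas (assumed to be installed). In this sketch the `resample_kwargs` dictionary is a hypothetical example that holds only the rule argument of DataFrame.resample.

```python
import pandas as pd

index = pd.to_datetime(
    ["2024-01-01 00:00", "2024-01-01 12:00", "2024-01-02 00:00", "2024-01-02 12:00"]
)
df = pd.DataFrame({"v": [1, 3, 5, 7]}, index=index)

# Hypothetical resample_kwargs: keyword arguments for DataFrame.resample.
resample_kwargs = {"rule": "D"}

# A list of functions produces MultiIndex columns: (original label, function name).
result = df.resample(**resample_kwargs).agg(["min", "max"])
```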

resample_agg_ser(resample_kwargs, func, *args, **kwargs)#

Resample time-series data and apply aggregation on it.

Group data into intervals by time-series row/column with a specified frequency and apply passed aggregation function in a one-column query compiler for each group over the specified axis.

Parameters:
  • resample_kwargs (dict) – Resample parameters as expected by modin.pandas.DataFrame.resample signature.

  • func (str, dict, callable(pandas.Series) -> scalar, or list of such) –

  • *args (iterable) – Positional arguments to pass to the aggregation function.

  • **kwargs (dict) – Keyword arguments to pass to the aggregation function.

Returns:

New QueryCompiler containing the result of resample aggregation built by the following rules:

  • Labels on the specified axis are the group names (time-stamps)

  • Labels on the opposite of specified axis are a MultiIndex, where first level contains preserved labels of this axis and the second level is the function names.

  • Each element of QueryCompiler is the result of corresponding function for the corresponding group and column/row.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.resample.Resampler.agg for more information about parameters and output format.

Warning

This method duplicates logic of resample_agg_df and will be removed soon.

resample_app_df(resample_kwargs, func, *args, **kwargs)#

Resample time-series data and apply aggregation on it.

Group data into intervals by time-series row/column with a specified frequency and apply passed aggregation function for each group over the specified axis.

Parameters:
  • resample_kwargs (dict) – Resample parameters as expected by modin.pandas.DataFrame.resample signature.

  • func (str, dict, callable(pandas.Series) -> scalar, or list of such) –

  • *args (iterable) – Positional arguments to pass to the aggregation function.

  • **kwargs (dict) – Keyword arguments to pass to the aggregation function.

Returns:

New QueryCompiler containing the result of resample aggregation built by the following rules:

  • Labels on the specified axis are the group names (time-stamps)

  • Labels on the opposite of specified axis are a MultiIndex, where first level contains preserved labels of this axis and the second level is the function names.

  • Each element of QueryCompiler is the result of corresponding function for the corresponding group and column/row.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.resample.Resampler.apply for more information about parameters and output format.

Warning

This method duplicates logic of resample_agg_df and will be removed soon.

resample_app_ser(resample_kwargs, func, *args, **kwargs)#

Resample time-series data and apply aggregation on it.

Group data into intervals by time-series row/column with a specified frequency and apply passed aggregation function in a one-column query compiler for each group over the specified axis.

Parameters:
  • resample_kwargs (dict) – Resample parameters as expected by modin.pandas.DataFrame.resample signature.

  • func (str, dict, callable(pandas.Series) -> scalar, or list of such) –

  • *args (iterable) – Positional arguments to pass to the aggregation function.

  • **kwargs (dict) – Keyword arguments to pass to the aggregation function.

Returns:

New QueryCompiler containing the result of resample aggregation built by the following rules:

  • Labels on the specified axis are the group names (time-stamps)

  • Labels on the opposite of specified axis are a MultiIndex, where first level contains preserved labels of this axis and the second level is the function names.

  • Each element of QueryCompiler is the result of corresponding function for the corresponding group and column/row.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.resample.Resampler.apply for more information about parameters and output format.

Warning

This method duplicates logic of resample_agg_df and will be removed soon.

resample_asfreq(resample_kwargs, fill_value)#

Resample time-series data and get the values at the new frequency.

Group data into intervals by time-series row/column with a specified frequency and get values at the new frequency.

Parameters:
  • resample_kwargs (dict) – Resample parameters as expected by modin.pandas.DataFrame.resample signature.

  • fill_value (scalar) –

Returns:

New QueryCompiler containing values at the specified frequency.

Return type:

BaseQueryCompiler
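
A short sketch of the effect of fill_value when upsampling, written with plain pandas (assumed to be installed) on a hypothetical frame that has a one-day gap:

```python
import pandas as pd

index = pd.to_datetime(["2024-01-01", "2024-01-03"])
df = pd.DataFrame({"v": [1, 3]}, index=index)

# Upsampling to daily frequency creates a row for 2024-01-02.
with_nan = df.resample("D").asfreq()
with_zero = df.resample("D").asfreq(fill_value=0)
```

fill_value applies only to rows created by the upsampling; NaN values already present in the data are not touched.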

resample_bfill(resample_kwargs, limit)#

Resample time-series data and apply aggregation on it.

Group data into intervals by time-series row/column with a specified frequency and fill missing values in each group independently using back-fill method.

Parameters:
  • resample_kwargs (dict) – Resample parameters as expected by modin.pandas.DataFrame.resample signature.

  • limit (int) –

Returns:

New QueryCompiler containing the result of resample aggregation built by the following rules:

  • QueryCompiler contains upsampled data with missing values filled.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.resample.Resampler.bfill for more information about parameters and output format.

resample_count(resample_kwargs)#

Resample time-series data and apply aggregation on it.

Group data into intervals by time-series row/column with a specified frequency and compute number of non-NA values for each group.

Parameters:

resample_kwargs (dict) – Resample parameters as expected by modin.pandas.DataFrame.resample signature.

Returns:

New QueryCompiler containing the result of resample aggregation built by the following rules:

  • Labels on the specified axis are the group names (time-stamps)

  • Labels on the opposite of specified axis are preserved.

  • Each element of QueryCompiler is the number of non-NA values for the corresponding group and column/row.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.resample.Resampler.count for more information about parameters and output format.

resample_ffill(resample_kwargs, limit)#

Resample time-series data and apply aggregation on it.

Group data into intervals by time-series row/column with a specified frequency and fill missing values in each group independently using forward-fill method.

Parameters:
  • resample_kwargs (dict) – Resample parameters as expected by modin.pandas.DataFrame.resample signature.

  • limit (int) –

Returns:

New QueryCompiler containing the result of resample aggregation built by the following rules:

  • QueryCompiler contains upsampled data with missing values filled.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.resample.Resampler.ffill for more information about parameters and output format.
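
The limit parameter of the back-fill and forward-fill variants caps how many consecutive new rows are filled. A sketch with plain pandas (assumed to be installed) on a hypothetical frame with a two-day gap:

```python
import pandas as pd

index = pd.to_datetime(["2024-01-01", "2024-01-04"])
df = pd.DataFrame({"v": [1.0, 4.0]}, index=index)

# Two new rows (Jan 2 and Jan 3) appear; limit=1 fills only one of them.
forward = df.resample("D").ffill(limit=1)   # fills Jan 2 from Jan 1
backward = df.resample("D").bfill(limit=1)  # fills Jan 3 from Jan 4
```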

resample_fillna(resample_kwargs, method, limit)#

Resample time-series data and apply aggregation on it.

Group data into intervals by time-series row/column with a specified frequency and fill missing values in each group independently using specified method.

Parameters:
  • resample_kwargs (dict) – Resample parameters as expected by modin.pandas.DataFrame.resample signature.

  • method (str) –

  • limit (int) –

Returns:

New QueryCompiler containing the result of resample aggregation built by the following rules:

  • QueryCompiler contains upsampled data with missing values filled.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.resample.Resampler.fillna for more information about parameters and output format.

resample_first(resample_kwargs, *args, **kwargs)#

Resample time-series data and apply aggregation on it.

Group data into intervals by time-series row/column with a specified frequency and compute first element for each group.

Parameters:
  • resample_kwargs (dict) – Resample parameters as expected by modin.pandas.DataFrame.resample signature.

  • *args (iterable) – Serves the compatibility purpose. Does not affect the result.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

New QueryCompiler containing the result of resample aggregation built by the following rules:

  • Labels on the specified axis are the group names (time-stamps)

  • Labels on the opposite of specified axis are preserved.

  • Each element of QueryCompiler is the first element for the corresponding group and column/row.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.resample.Resampler.first for more information about parameters and output format.

resample_get_group(resample_kwargs, name, obj)#

Resample time-series data and get the specified group.

Group data into intervals by time-series row/column with a specified frequency and get the values of the specified group.

Parameters:
  • resample_kwargs (dict) – Resample parameters as expected by modin.pandas.DataFrame.resample signature.

  • name (object) –

  • obj (modin.pandas.DataFrame, optional) –

Returns:

New QueryCompiler containing the values from the specified group.

Return type:

BaseQueryCompiler

resample_interpolate(resample_kwargs, method, axis, limit, inplace, limit_direction, limit_area, downcast, **kwargs)#

Resample time-series data and apply aggregation on it.

Group data into intervals by time-series row/column with a specified frequency and fill missing values in each group independently using specified interpolation method.

Parameters:
  • resample_kwargs (dict) – Resample parameters as expected by modin.pandas.DataFrame.resample signature.

  • method (str) –

  • axis ({0, 1}) –

  • limit (int) –

  • inplace ({False}) – This parameter serves the compatibility purpose. Always has to be False.

  • limit_direction ({"forward", "backward", "both"}) –

  • limit_area ({None, "inside", "outside"}) –

  • downcast (str, optional) –

  • **kwargs (dict) –

Returns:

New QueryCompiler containing the result of resample aggregation built by the following rules:

  • QueryCompiler contains upsampled data with missing values filled.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.resample.Resampler.interpolate for more information about parameters and output format.
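
As an illustration, linear interpolation over an upsampled frame can be sketched with plain pandas (assumed to be installed); the data below is hypothetical.

```python
import pandas as pd

index = pd.to_datetime(["2024-01-01", "2024-01-04"])
df = pd.DataFrame({"v": [1.0, 4.0]}, index=index)

# Upsample to daily rows, then fill the new rows by linear interpolation.
interpolated = df.resample("D").interpolate(method="linear")
```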

resample_last(resample_kwargs, *args, **kwargs)#

Resample time-series data and apply aggregation on it.

Group data into intervals by time-series row/column with a specified frequency and compute last element for each group.

Parameters:
  • resample_kwargs (dict) – Resample parameters as expected by modin.pandas.DataFrame.resample signature.

  • *args (iterable) – Serves the compatibility purpose. Does not affect the result.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

New QueryCompiler containing the result of resample aggregation built by the following rules:

  • Labels on the specified axis are the group names (time-stamps)

  • Labels on the opposite of specified axis are preserved.

  • Each element of QueryCompiler is the last element for the corresponding group and column/row.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.resample.Resampler.last for more information about parameters and output format.

resample_max(resample_kwargs, *args, **kwargs)#

Resample time-series data and apply aggregation on it.

Group data into intervals by time-series row/column with a specified frequency and compute maximum value for each group.

Parameters:
  • resample_kwargs (dict) – Resample parameters as expected by modin.pandas.DataFrame.resample signature.

  • *args (iterable) – Serves the compatibility purpose. Does not affect the result.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

New QueryCompiler containing the result of resample aggregation built by the following rules:

  • Labels on the specified axis are the group names (time-stamps)

  • Labels on the opposite of specified axis are preserved.

  • Each element of QueryCompiler is the maximum value for the corresponding group and column/row.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.resample.Resampler.max for more information about parameters and output format.

resample_mean(resample_kwargs, *args, **kwargs)#

Resample time-series data and apply aggregation on it.

Group data into intervals by time-series row/column with a specified frequency and compute mean value for each group.

Parameters:
  • resample_kwargs (dict) – Resample parameters as expected by modin.pandas.DataFrame.resample signature.

  • *args (iterable) – Serves the compatibility purpose. Does not affect the result.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

New QueryCompiler containing the result of resample aggregation built by the following rules:

  • Labels on the specified axis are the group names (time-stamps)

  • Labels on the opposite of specified axis are preserved.

  • Each element of QueryCompiler is the mean value for the corresponding group and column/row.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.resample.Resampler.mean for more information about parameters and output format.

resample_median(resample_kwargs, *args, **kwargs)#

Resample time-series data and apply aggregation on it.

Group data into intervals by time-series row/column with a specified frequency and compute median value for each group.

Parameters:
  • resample_kwargs (dict) – Resample parameters as expected by modin.pandas.DataFrame.resample signature.

  • *args (iterable) – Serves the compatibility purpose. Does not affect the result.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

New QueryCompiler containing the result of resample aggregation built by the following rules:

  • Labels on the specified axis are the group names (time-stamps)

  • Labels on the opposite of specified axis are preserved.

  • Each element of QueryCompiler is the median value for the corresponding group and column/row.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.resample.Resampler.median for more information about parameters and output format.

resample_min(resample_kwargs, *args, **kwargs)#

Resample time-series data and apply aggregation on it.

Group data into intervals by time-series row/column with a specified frequency and compute minimum value for each group.

Parameters:
  • resample_kwargs (dict) – Resample parameters as expected by modin.pandas.DataFrame.resample signature.

  • *args (iterable) – Serves the compatibility purpose. Does not affect the result.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

New QueryCompiler containing the result of resample aggregation built by the following rules:

  • Labels on the specified axis are the group names (time-stamps)

  • Labels on the opposite of specified axis are preserved.

  • Each element of QueryCompiler is the minimum value for the corresponding group and column/row.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.resample.Resampler.min for more information about parameters and output format.

resample_nearest(resample_kwargs, limit)#

Resample time-series data and apply aggregation on it.

Group data into intervals by time-series row/column with a specified frequency and fill missing values in each group independently using ‘nearest’ method.

Parameters:
  • resample_kwargs (dict) – Resample parameters as expected by modin.pandas.DataFrame.resample signature.

  • limit (int) –

Returns:

New QueryCompiler containing the result of resample aggregation built by the following rules:

  • QueryCompiler contains upsampled data with missing values filled.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.resample.Resampler.nearest for more information about parameters and output format.

resample_nunique(resample_kwargs, *args, **kwargs)#

Resample time-series data and apply aggregation on it.

Group data into intervals by time-series row/column with a specified frequency and compute number of unique values for each group.

Parameters:
  • resample_kwargs (dict) – Resample parameters as expected by modin.pandas.DataFrame.resample signature.

  • *args (iterable) – Serves the compatibility purpose. Does not affect the result.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

New QueryCompiler containing the result of resample aggregation built by the following rules:

  • Labels on the specified axis are the group names (time-stamps)

  • Labels on the opposite of specified axis are preserved.

  • Each element of QueryCompiler is the number of unique values for the corresponding group and column/row.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.resample.Resampler.nunique for more information about parameters and output format.

resample_ohlc_df(resample_kwargs, *args, **kwargs)#

Resample time-series data and apply aggregation on it.

Group data into intervals by time-series row/column with a specified frequency and compute open, high, low and close values for each group over the specified axis.

Parameters:
  • resample_kwargs (dict) – Resample parameters as expected by modin.pandas.DataFrame.resample signature.

  • *args (iterable) – Positional arguments to pass to the aggregation function.

  • **kwargs (dict) – Keyword arguments to pass to the aggregation function.

Returns:

New QueryCompiler containing the result of resample aggregation built by the following rules:

  • Labels on the specified axis are the group names (time-stamps)

  • Labels on the opposite of specified axis are a MultiIndex, where first level contains preserved labels of this axis and the second level is the labels of columns containing computed values.

  • Each element of QueryCompiler is the result of corresponding function for the corresponding group and column/row.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.resample.Resampler.ohlc for more information about parameters and output format.
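
The MultiIndex layout described above can be previewed with plain pandas (assumed to be installed). The `price` column and timestamps are hypothetical.

```python
import pandas as pd

index = pd.to_datetime(
    ["2024-01-01 00:00", "2024-01-01 08:00", "2024-01-01 16:00", "2024-01-02 00:00"]
)
df = pd.DataFrame({"price": [5, 9, 2, 7]}, index=index)

# Each original column expands into four columns: open, high, low, close.
bars = df.resample("D").ohlc()
```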

resample_ohlc_ser(resample_kwargs, *args, **kwargs)#

Resample time-series data and apply aggregation on it.

Group data into intervals by time-series row/column with a specified frequency and compute open, high, low and close values for each group over the specified axis.

Parameters:
  • resample_kwargs (dict) – Resample parameters as expected by modin.pandas.DataFrame.resample signature.

  • *args (iterable) – Positional arguments to pass to the aggregation function.

  • **kwargs (dict) – Keyword arguments to pass to the aggregation function.

Returns:

New QueryCompiler containing the result of resample aggregation built by the following rules:

  • Labels on the specified axis are the group names (time-stamps)

  • Labels on the opposite of specified axis are a MultiIndex, where first level contains preserved labels of this axis and the second level is the labels of columns containing computed values.

  • Each element of QueryCompiler is the result of corresponding function for the corresponding group and column/row.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.resample.Resampler.ohlc for more information about parameters and output format.
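
The grouping rule described above can be illustrated without Modin. The following is a minimal pure-Python sketch, not the query compiler's implementation: the helper name ohlc_by_interval, the sample data, and the choice to floor timestamps to interval starts counted from 1970-01-01 are assumptions made for illustration.

```python
from datetime import datetime, timedelta


def ohlc_by_interval(samples, freq):
    """Group (timestamp, value) pairs into fixed intervals and compute OHLC."""
    epoch = datetime(1970, 1, 1)
    groups = {}
    for ts, value in sorted(samples):
        # Floor the timestamp to the start of the interval it belongs to.
        bucket = epoch + ((ts - epoch) // freq) * freq
        groups.setdefault(bucket, []).append(value)
    return {
        bucket: {"open": v[0], "high": max(v), "low": min(v), "close": v[-1]}
        for bucket, v in groups.items()
    }


# Hypothetical data: four observations spread over two hourly intervals.
samples = [
    (datetime(2024, 1, 1, 9, 0), 10.0),
    (datetime(2024, 1, 1, 9, 20), 12.5),
    (datetime(2024, 1, 1, 9, 40), 9.0),
    (datetime(2024, 1, 1, 10, 5), 11.0),
]
result = ohlc_by_interval(samples, timedelta(hours=1))
```

Each key of result is a group name (the interval start), and the four computed values play the role of the second MultiIndex level described in the build rules.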

resample_pipe(resample_kwargs, func, *args, **kwargs)#

Resample time-series data and apply aggregation on it.

Group data into intervals by time-series row/column with a specified frequency, build equivalent pandas.Resampler object and apply passed function to it.

Parameters:
  • resample_kwargs (dict) – Resample parameters as expected by modin.pandas.DataFrame.resample signature.

  • func (callable(pandas.Resampler) -> object or tuple(callable, str)) –

  • *args (iterable) – Positional arguments to pass to function.

  • **kwargs (dict) – Keyword arguments to pass to function.

Returns:

New QueryCompiler containing the result of passed function.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.resample.Resampler.pipe for more information about parameters and output format.

resample_prod(resample_kwargs, min_count, *args, **kwargs)#

Resample time-series data and apply aggregation on it.

Group data into intervals by time-series row/column with a specified frequency and compute product for each group.

Parameters:
  • resample_kwargs (dict) – Resample parameters as expected by modin.pandas.DataFrame.resample signature.

  • min_count (int) –

  • *args (iterable) – Serves the compatibility purpose. Does not affect the result.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

New QueryCompiler containing the result of resample aggregation built by the following rules:

  • Labels on the specified axis are the group names (time-stamps)

  • Labels on the opposite of specified axis are preserved.

  • Each element of QueryCompiler is the product for the corresponding group and column/row.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.resample.Resampler.prod for more information about parameters and output format.

resample_quantile(resample_kwargs, q, *args, **kwargs)#

Resample time-series data and apply aggregation on it.

Group data into intervals by time-series row/column with a specified frequency and compute quantile for each group.

Parameters:
  • resample_kwargs (dict) – Resample parameters as expected by modin.pandas.DataFrame.resample signature.

  • q (float) –

  • *args (iterable) – Serves the compatibility purpose. Does not affect the result.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

New QueryCompiler containing the result of resample aggregation built by the following rules:

  • Labels on the specified axis are the group names (time-stamps)

  • Labels on the opposite of specified axis are preserved.

  • Each element of QueryCompiler is the quantile for the corresponding group and column/row.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.resample.Resampler.quantile for more information about parameters and output format.

resample_sem(resample_kwargs, *args, **kwargs)#

Resample time-series data and apply aggregation on it.

Group data into intervals by time-series row/column with a specified frequency and compute standard error of the mean for each group.

Parameters:
  • resample_kwargs (dict) – Resample parameters as expected by modin.pandas.DataFrame.resample signature.

  • *args (iterable) – Serves the compatibility purpose. Does not affect the result.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

New QueryCompiler containing the result of resample aggregation built by the following rules:

  • Labels on the specified axis are the group names (time-stamps)

  • Labels on the opposite of specified axis are preserved.

  • Each element of QueryCompiler is the standard error of the mean for the corresponding group and column/row.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.resample.Resampler.sem for more information about parameters and output format.

resample_size(resample_kwargs, *args, **kwargs)#

Resample time-series data and apply aggregation on it.

Group data into intervals by time-series row/column with a specified frequency and compute the number of elements in each group.

Parameters:
  • resample_kwargs (dict) – Resample parameters as expected by modin.pandas.DataFrame.resample signature.

  • *args (iterable) – Serves the compatibility purpose. Does not affect the result.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

New QueryCompiler containing the result of resample aggregation built by the following rules:

  • Labels on the specified axis are the group names (time-stamps)

  • Labels on the opposite of specified axis are preserved.

  • Each element of QueryCompiler is the number of elements in the corresponding group and column/row.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.resample.Resampler.size for more information about parameters and output format.

resample_std(resample_kwargs, ddof, *args, **kwargs)#

Resample time-series data and apply aggregation on it.

Group data into intervals by time-series row/column with a specified frequency and compute standard deviation for each group.

Parameters:
  • resample_kwargs (dict) – Resample parameters as expected by modin.pandas.DataFrame.resample signature.

  • ddof (int) –

  • *args (iterable) – Serves the compatibility purpose. Does not affect the result.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

New QueryCompiler containing the result of resample aggregation built by the following rules:

  • Labels on the specified axis are the group names (time-stamps)

  • Labels on the opposite of specified axis are preserved.

  • Each element of QueryCompiler is the standard deviation for the corresponding group and column/row.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.resample.Resampler.std for more information about parameters and output format.

resample_sum(resample_kwargs, min_count, *args, **kwargs)#

Resample time-series data and apply aggregation on it.

Group data into intervals by time-series row/column with a specified frequency and compute sum for each group.

Parameters:
  • resample_kwargs (dict) – Resample parameters as expected by modin.pandas.DataFrame.resample signature.

  • min_count (int) –

  • *args (iterable) – Serves the compatibility purpose. Does not affect the result.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

New QueryCompiler containing the result of resample aggregation built by the following rules:

  • Labels on the specified axis are the group names (time-stamps)

  • Labels on the opposite of specified axis are preserved.

  • Each element of QueryCompiler is the sum for the corresponding group and column/row.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.resample.Resampler.sum for more information about parameters and output format.
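
The min_count parameter is not described in the entry above. In pandas it is the number of valid (non-NA) values a group must contain for the aggregation to be performed; with fewer, the result is NA. The sketch below assumes the query compiler passes it through with that meaning, and the helper name sum_with_min_count is hypothetical.

```python
import math


def sum_with_min_count(values, min_count=0):
    """Sum the non-NaN values, or return NaN if fewer than min_count are valid."""
    valid = [v for v in values if not math.isnan(v)]
    if len(valid) < min_count:
        return math.nan
    return sum(valid)


group = [1.0, math.nan, 2.5]          # one resampled group with a missing value
enough = sum_with_min_count(group, min_count=2)
too_few = sum_with_min_count(group, min_count=3)
```

Here enough is the sum of the two valid values, while too_few is NaN because the group holds only two valid values.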

resample_transform(resample_kwargs, arg, *args, **kwargs)#

Resample time-series data and apply aggregation on it.

Group data into intervals by time-series row/column with a specified frequency and call the passed function on each group. In contrast to resample_app_df, the function is applied to the whole group instead of a single axis.

Parameters:
  • resample_kwargs (dict) – Resample parameters as expected by modin.pandas.DataFrame.resample signature.

  • arg (callable(pandas.DataFrame) -> pandas.Series) –

  • *args (iterable) – Positional arguments to pass to function.

  • **kwargs (dict) – Keyword arguments to pass to function.

Returns:

New QueryCompiler containing the result of passed function.

Return type:

BaseQueryCompiler

resample_var(resample_kwargs, ddof, *args, **kwargs)#

Resample time-series data and apply aggregation on it.

Group data into intervals by time-series row/column with a specified frequency and compute variance for each group.

Parameters:
  • resample_kwargs (dict) – Resample parameters as expected by modin.pandas.DataFrame.resample signature.

  • ddof (int) –

  • *args (iterable) – Serves the compatibility purpose. Does not affect the result.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

New QueryCompiler containing the result of resample aggregation built by the following rules:

  • Labels on the specified axis are the group names (time-stamps)

  • Labels on the opposite of specified axis are preserved.

  • Each element of QueryCompiler is the variance for the corresponding group and column/row.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.resample.Resampler.var for more information about parameters and output format.

reset_index(**kwargs)#

Reset the index, or a level of it.

Parameters:
  • drop (bool) – Whether to drop the reset index or insert it at the beginning of the frame.

  • level (int or label, optional) – Level to remove from index. Removes all levels by default.

  • col_level (int or label) – If the columns have multiple levels, determines which level the labels are inserted into.

  • col_fill (label) – If the columns have multiple levels, determines how the other levels are named.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

QueryCompiler with reset index.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.reset_index for more information about parameters and output format.
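
A rough sketch of what the drop flag controls, using plain Python containers rather than a query compiler. The function name, the tuple return value, and the column name "index" are illustrative assumptions; level, col_level and col_fill are not modeled.

```python
def reset_index(index, columns, drop=False, name="index"):
    """Return (new_index, new_columns) with a default 0..n-1 index."""
    new_index = list(range(len(index)))
    if drop:
        # The old labels are discarded.
        new_columns = dict(columns)
    else:
        # The old labels become the first column of the frame.
        new_columns = {name: list(index), **columns}
    return new_index, new_columns


index = ["a", "b", "c"]
columns = {"x": [1, 2, 3]}
kept = reset_index(index, columns)
dropped = reset_index(index, columns, drop=True)
```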

rfloordiv(other, **kwargs)#

Perform element-wise integer division (other // self).

If axes are not equal, perform frames alignment first.

Parameters:
  • other (BaseQueryCompiler, scalar or array-like) – Other operand of the binary operation.

  • broadcast (bool, default: False) – If other is a one-column query compiler, indicates whether it is a Series or not. Frames and Series have to be processed differently, however we can’t distinguish them at the query compiler level, so this parameter is a hint that is passed from a high-level API.

  • level (int or label) – In case of MultiIndex match index values on the passed level.

  • axis ({0, 1}) – Axis to match indices along for 1D other (list or QueryCompiler that represents Series). 0 is for index, 1 is for columns.

  • fill_value (float or None) – Value to fill missing elements during frame alignment.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

Result of binary operation.

Return type:

BaseQueryCompiler
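
The reflected operations differ from their regular counterparts only in operand order: other is the left operand and self is the right one. A minimal sketch of that order together with label alignment and fill_value, using dictionaries as stand-ins for one-column frames. The helper reflected_binary_op is hypothetical and does not model broadcast, level or axis.

```python
import operator


def reflected_binary_op(self_data, other_data, op, fill_value=None):
    """Compute op(other, self) label by label after aligning both operands."""
    labels = sorted(set(self_data) | set(other_data))
    result = {}
    for label in labels:
        left = other_data.get(label, fill_value)   # `other` is the left operand
        right = self_data.get(label, fill_value)   # `self` is the right operand
        # A label missing on either side stays missing unless fill_value is set.
        result[label] = None if left is None or right is None else op(left, right)
    return result


self_data = {"a": 2, "b": 3}
other_data = {"b": 10, "c": 7}
filled = reflected_binary_op(self_data, other_data, operator.floordiv, fill_value=1)
unfilled = reflected_binary_op(self_data, other_data, operator.floordiv)
```

Passing operator.mod, operator.mul, operator.pow, operator.sub or operator.truediv in place of operator.floordiv gives the analogous sketch for rmod, rmul, rpow, rsub and rtruediv.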

rmod(other, **kwargs)#

Perform element-wise modulo (other % self).

If axes are not equal, perform frames alignment first.

Parameters:
  • other (BaseQueryCompiler, scalar or array-like) – Other operand of the binary operation.

  • broadcast (bool, default: False) – If other is a one-column query compiler, indicates whether it is a Series or not. Frames and Series have to be processed differently, however we can’t distinguish them at the query compiler level, so this parameter is a hint that is passed from a high-level API.

  • level (int or label) – In case of MultiIndex match index values on the passed level.

  • axis ({0, 1}) – Axis to match indices along for 1D other (list or QueryCompiler that represents Series). 0 is for index, 1 is for columns.

  • fill_value (float or None) – Value to fill missing elements during frame alignment.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

Result of binary operation.

Return type:

BaseQueryCompiler

rmul(other, **kwargs)#

Perform element-wise multiplication (other * self).

If axes are not equal, perform frames alignment first.

Parameters:
  • other (BaseQueryCompiler, scalar or array-like) – Other operand of the binary operation.

  • broadcast (bool, default: False) – If other is a one-column query compiler, indicates whether it is a Series or not. Frames and Series have to be processed differently, however we can’t distinguish them at the query compiler level, so this parameter is a hint that is passed from a high-level API.

  • level (int or label) – In case of MultiIndex match index values on the passed level.

  • axis ({0, 1}) – Axis to match indices along for 1D other (list or QueryCompiler that represents Series). 0 is for index, 1 is for columns.

  • fill_value (float or None) – Value to fill missing elements during frame alignment.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

Result of binary operation.

Return type:

BaseQueryCompiler

rolling_aggregate(fold_axis, rolling_kwargs, func, *args, **kwargs)#

Create rolling window and apply specified functions for each window over the given axis.

Parameters:
  • fold_axis ({0, 1}) –

  • rolling_kwargs (list) – Rolling windows arguments with the same signature as modin.pandas.DataFrame.rolling.

  • func (str, dict, callable(pandas.Series) -> scalar, or list of such) –

  • *args (iterable) –

  • **kwargs (dict) –

Returns:

New QueryCompiler containing the result of passed functions for each window, built by the following rules:

  • Labels on the specified axis are preserved.

  • Labels on the opposite of specified axis are MultiIndex, where first level contains preserved labels of this axis and the second level has the function names.

  • Each element of QueryCompiler is the result of corresponding function for the corresponding window and column/row.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Rolling.aggregate for more information about parameters and output format.
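
The build rules above can be sketched in pure Python. This is an illustration under stated assumptions rather than Modin's code: columns are plain lists, func is a hypothetical dict mapping function names to callables, and windows with fewer than window observations yield None.

```python
def rolling_aggregate(columns, window, funcs):
    """Apply each named function to every full window of every column."""
    result = {}
    for name, values in columns.items():
        for func_name, func in funcs.items():
            out = []
            for end in range(1, len(values) + 1):
                chunk = values[max(0, end - window):end]
                # Incomplete windows at the start produce a missing value.
                out.append(func(chunk) if len(chunk) == window else None)
            # Keys mimic the two-level labels: (original label, function name).
            result[(name, func_name)] = out
    return result


columns = {"x": [1, 2, 3, 4], "y": [10, 20, 30, 40]}
out = rolling_aggregate(columns, window=2, funcs={"sum": sum, "max": max})
```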

rolling_apply(fold_axis, rolling_kwargs, func, raw=False, engine=None, engine_kwargs=None, args=None, kwargs=None)#

Create rolling window and apply specified function for each window over the given axis.

Parameters:
  • fold_axis ({0, 1}) –

  • rolling_kwargs (list) – Rolling windows arguments with the same signature as modin.pandas.DataFrame.rolling.

  • func (callable(pandas.Series) -> scalar) –

  • raw (bool, default: False) –

  • engine (None, default: None) – This parameter serves the compatibility purpose. Always has to be None.

  • engine_kwargs (None, default: None) – This parameter serves the compatibility purpose. Always has to be None.

  • args (tuple, optional) –

  • kwargs (dict, optional) –

Returns:

New QueryCompiler containing the result of passed function for each window, built by the following rules:

  • Labels on the specified axis are preserved.

  • Labels on the opposite of specified axis are MultiIndex, where first level contains preserved labels of this axis and the second level has the function names.

  • Each element of QueryCompiler is the result of corresponding function for the corresponding window and column/row.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Rolling.apply for more information about parameters and output format.

Warning

This method duplicates logic of rolling_aggregate and will be removed soon.

rolling_corr(fold_axis, rolling_kwargs, other=None, pairwise=None, *args, **kwargs)#

Create rolling window and compute correlation for each window over the given axis.

Parameters:
  • fold_axis ({0, 1}) –

  • rolling_kwargs (list) – Rolling windows arguments with the same signature as modin.pandas.DataFrame.rolling.

  • other (modin.pandas.Series, modin.pandas.DataFrame, list-like, optional) –

  • pairwise (bool, optional) –

  • *args (iterable) –

  • **kwargs (dict) –

Returns:

New QueryCompiler containing correlation for each window, built by the following rules:

  • Output QueryCompiler has the same shape and axes labels as the source.

  • Each element is the correlation for the corresponding window.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Rolling.corr for more information about parameters and output format.

rolling_count(fold_axis, rolling_kwargs)#

Create rolling window and compute number of non-NA values for each window over the given axis.

Parameters:
  • fold_axis ({0, 1}) –

  • rolling_kwargs (list) – Rolling windows arguments with the same signature as modin.pandas.DataFrame.rolling.

Returns:

New QueryCompiler containing number of non-NA values for each window, built by the following rules:

  • Output QueryCompiler has the same shape and axes labels as the source.

  • Each element is the number of non-NA values for the corresponding window.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Rolling.count for more information about parameters and output format.

rolling_cov(fold_axis, rolling_kwargs, other=None, pairwise=None, ddof=1, **kwargs)#

Create rolling window and compute covariance for each window over the given axis.

Parameters:
  • fold_axis ({0, 1}) –

  • rolling_kwargs (list) – Rolling windows arguments with the same signature as modin.pandas.DataFrame.rolling.

  • other (modin.pandas.Series, modin.pandas.DataFrame, list-like, optional) –

  • pairwise (bool, optional) –

  • ddof (int, default: 1) –

  • **kwargs (dict) –

Returns:

New QueryCompiler containing covariance for each window, built by the following rules:

  • Output QueryCompiler has the same shape and axes labels as the source.

  • Each element is the covariance for the corresponding window.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Rolling.cov for more information about parameters and output format.

rolling_kurt(fold_axis, rolling_kwargs, **kwargs)#

Create rolling window and compute unbiased kurtosis for each window over the given axis.

Parameters:
  • fold_axis ({0, 1}) –

  • rolling_kwargs (list) – Rolling windows arguments with the same signature as modin.pandas.DataFrame.rolling.

  • **kwargs (dict) –

Returns:

New QueryCompiler containing unbiased kurtosis for each window, built by the following rules:

  • Output QueryCompiler has the same shape and axes labels as the source.

  • Each element is the unbiased kurtosis for the corresponding window.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Rolling.kurt for more information about parameters and output format.

rolling_max(fold_axis, rolling_kwargs, *args, **kwargs)#

Create rolling window and compute maximum value for each window over the given axis.

Parameters:
  • fold_axis ({0, 1}) –

  • rolling_kwargs (list) – Rolling windows arguments with the same signature as modin.pandas.DataFrame.rolling.

  • *args (iterable) –

  • **kwargs (dict) –

Returns:

New QueryCompiler containing maximum value for each window, built by the following rules:

  • Output QueryCompiler has the same shape and axes labels as the source.

  • Each element is the maximum value for the corresponding window.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Rolling.max for more information about parameters and output format.

rolling_mean(fold_axis, rolling_kwargs, *args, **kwargs)#

Create rolling window and compute mean value for each window over the given axis.

Parameters:
  • fold_axis ({0, 1}) –

  • rolling_kwargs (list) – Rolling windows arguments with the same signature as modin.pandas.DataFrame.rolling.

  • *args (iterable) –

  • **kwargs (dict) –

Returns:

New QueryCompiler containing mean value for each window, built by the following rules:

  • Output QueryCompiler has the same shape and axes labels as the source.

  • Each element is the mean value for the corresponding window.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Rolling.mean for more information about parameters and output format.
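
A small sketch of the windowing itself, assuming a fixed-size integer window and a min_periods setting that defaults to the window size, as it does in pandas for integer windows. The function below is a stand-in for illustration and works on a plain list.

```python
from collections import deque


def rolling_mean(values, window, min_periods=None):
    """Mean over a sliding window; None until min_periods values are seen."""
    min_periods = window if min_periods is None else min_periods
    buf = deque(maxlen=window)   # the oldest value is dropped automatically
    out = []
    for v in values:
        buf.append(v)
        out.append(sum(buf) / len(buf) if len(buf) >= min_periods else None)
    return out


strict = rolling_mean([1.0, 2.0, 3.0, 4.0], window=3)
relaxed = rolling_mean([1.0, 2.0, 3.0, 4.0], window=3, min_periods=1)
```

The output has the same length as the input, which matches the rule that shape and axis labels are preserved.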

rolling_median(fold_axis, rolling_kwargs, **kwargs)#

Create rolling window and compute median value for each window over the given axis.

Parameters:
  • fold_axis ({0, 1}) –

  • rolling_kwargs (list) – Rolling windows arguments with the same signature as modin.pandas.DataFrame.rolling.

  • **kwargs (dict) –

Returns:

New QueryCompiler containing median value for each window, built by the following rules:

  • Output QueryCompiler has the same shape and axes labels as the source.

  • Each element is the median value for the corresponding window.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Rolling.median for more information about parameters and output format.

rolling_min(fold_axis, rolling_kwargs, *args, **kwargs)#

Create rolling window and compute minimum value for each window over the given axis.

Parameters:
  • fold_axis ({0, 1}) –

  • rolling_kwargs (list) – Rolling windows arguments with the same signature as modin.pandas.DataFrame.rolling.

  • *args (iterable) –

  • **kwargs (dict) –

Returns:

New QueryCompiler containing minimum value for each window, built by the following rules:

  • Output QueryCompiler has the same shape and axes labels as the source.

  • Each element is the minimum value for the corresponding window.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Rolling.min for more information about parameters and output format.

rolling_quantile(fold_axis, rolling_kwargs, quantile, interpolation='linear', **kwargs)#

Create rolling window and compute quantile for each window over the given axis.

Parameters:
  • fold_axis ({0, 1}) –

  • rolling_kwargs (list) – Rolling windows arguments with the same signature as modin.pandas.DataFrame.rolling.

  • quantile (float) –

  • interpolation ({'linear', 'lower', 'higher', 'midpoint', 'nearest'}, default: 'linear') –

  • **kwargs (dict) –

Returns:

New QueryCompiler containing quantile for each window, built by the following rules:

  • Output QueryCompiler has the same shape and axes labels as the source.

  • Each element is the quantile for the corresponding window.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Rolling.quantile for more information about parameters and output format.
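
The interpolation options decide what is returned when the requested quantile falls between two sorted values. The sketch below is a simplified pure-Python illustration and not Modin's implementation; in particular, how 'nearest' breaks an exact tie is an assumption of this sketch.

```python
import math


def quantile(values, q, interpolation="linear"):
    """Quantile of a single window under the given interpolation rule."""
    data = sorted(values)
    pos = q * (len(data) - 1)            # fractional position in sorted data
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    if interpolation == "linear":
        return data[lo] + (data[hi] - data[lo]) * frac
    if interpolation == "lower":
        return data[lo]
    if interpolation == "higher":
        return data[hi]
    if interpolation == "midpoint":
        return (data[lo] + data[hi]) / 2
    if interpolation == "nearest":
        # Tie handling at frac == 0.5 is simplified here (assumption).
        return data[lo] if frac <= 0.5 else data[hi]
    raise ValueError(f"unknown interpolation: {interpolation!r}")


def rolling_quantile(values, window, q, interpolation="linear"):
    """Apply quantile() to every full window; None for incomplete windows."""
    return [
        quantile(values[i - window + 1:i + 1], q, interpolation)
        if i + 1 >= window else None
        for i in range(len(values))
    ]


medians = rolling_quantile([4, 1, 3, 2, 5], window=3, q=0.5)
```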

rolling_rank(fold_axis, rolling_kwargs, method='average', ascending=True, pct=False, numeric_only=False, *args, **kwargs)#

Create rolling window and compute rank for each window over the given axis.

Parameters:
  • fold_axis ({0, 1}) –

  • rolling_kwargs (list) – Rolling windows arguments with the same signature as modin.pandas.DataFrame.rolling.

  • method ({'average', 'min', 'max'}, default: 'average') –

  • ascending (bool, default: True) –

  • pct (bool, default: False) –

  • numeric_only (bool, default: False) –

  • *args (iterable) –

  • **kwargs (dict) –

Returns:

New QueryCompiler containing rank for each window, built by the following rules:

  • Output QueryCompiler has the same shape and axes labels as the source.

  • Each element is the rank for the corresponding window.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Rolling.rank for more information about parameters and output format.
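
As an illustration of the method, ascending and pct parameters, the sketch below assumes that the value reported for each window is the rank of the window's most recent element among all elements of that window, which is how pandas defines the rolling rank. The helpers are hypothetical and ignore numeric_only.

```python
def window_rank(window, method="average", ascending=True, pct=False):
    """Rank the last element of `window` among all elements of the window."""
    current = window[-1]
    if ascending:
        ahead = sum(v < current for v in window)
    else:
        ahead = sum(v > current for v in window)
    ties = sum(v == current for v in window)   # includes the element itself
    if method == "min":
        rank = ahead + 1
    elif method == "max":
        rank = ahead + ties
    elif method == "average":
        rank = ahead + (ties + 1) / 2
    else:
        raise ValueError(f"unknown method: {method!r}")
    return rank / len(window) if pct else rank


def rolling_rank(values, window, **rank_kwargs):
    """Rank of each value within the window that ends at it."""
    out = []
    for end in range(1, len(values) + 1):
        if end < window:
            out.append(None)   # incomplete window
        else:
            out.append(window_rank(values[end - window:end], **rank_kwargs))
    return out


ranks = rolling_rank([1, 4, 2, 3, 5, 3], window=3)
```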

rolling_sem(fold_axis, rolling_kwargs, *args, **kwargs)#

Create rolling window and compute standard error of the mean for each window over the given axis.

Parameters:
  • fold_axis ({0, 1}) –

  • rolling_kwargs (list) – Rolling windows arguments with the same signature as modin.pandas.DataFrame.rolling.

  • *args (iterable) –

  • **kwargs (dict) –

Returns:

New QueryCompiler containing standard error of the mean for each window, built by the following rules:

  • Output QueryCompiler has the same shape and axes labels as the source.

  • Each element is the standard error of the mean for the corresponding window.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Rolling.sem for more information about parameters and output format.

rolling_skew(fold_axis, rolling_kwargs, **kwargs)#

Create rolling window and compute unbiased skewness for each window over the given axis.

Parameters:
  • fold_axis ({0, 1}) –

  • rolling_kwargs (list) – Rolling windows arguments with the same signature as modin.pandas.DataFrame.rolling.

  • **kwargs (dict) –

Returns:

New QueryCompiler containing unbiased skewness for each window, built by the following rules:

  • Output QueryCompiler has the same shape and axes labels as the source.

  • Each element is the unbiased skewness for the corresponding window.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Rolling.skew for more information about parameters and output format.

rolling_std(fold_axis, rolling_kwargs, ddof=1, *args, **kwargs)#

Create rolling window and compute standard deviation for each window over the given axis.

Parameters:
  • fold_axis ({0, 1}) –

  • rolling_kwargs (list) – Rolling windows arguments with the same signature as modin.pandas.DataFrame.rolling.

  • ddof (int, default: 1) –

  • *args (iterable) –

  • **kwargs (dict) –

Returns:

New QueryCompiler containing standard deviation for each window, built by the following rules:

  • Output QueryCompiler has the same shape and axes labels as the source.

  • Each element is the standard deviation for the corresponding window.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Rolling.std for more information about parameters and output format.

rolling_sum(fold_axis, rolling_kwargs, *args, **kwargs)#

Create rolling window and compute sum for each window over the given axis.

Parameters:
  • fold_axis ({0, 1}) –

  • rolling_kwargs (list) – Rolling windows arguments with the same signature as modin.pandas.DataFrame.rolling.

  • *args (iterable) –

  • **kwargs (dict) –

Returns:

New QueryCompiler containing sum for each window, built by the following rules:

  • Output QueryCompiler has the same shape and axes labels as the source.

  • Each element is the sum for the corresponding window.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Rolling.sum for more information about parameters and output format.

rolling_var(fold_axis, rolling_kwargs, ddof=1, *args, **kwargs)#

Create rolling window and compute variance for each window over the given axis.

Parameters:
  • fold_axis ({0, 1}) –

  • rolling_kwargs (list) – Rolling windows arguments with the same signature as modin.pandas.DataFrame.rolling.

  • ddof (int, default: 1) –

  • *args (iterable) –

  • **kwargs (dict) –

Returns:

New QueryCompiler containing variance for each window, built by the following rules:

  • Output QueryCompiler has the same shape and axes labels as the source.

  • Each element is the variance for the corresponding window.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Rolling.var for more information about parameters and output format.

round(**kwargs)#

Round every numeric value to the specified number of decimals.

Parameters:
  • decimals (int or list-like) – Number of decimals to round each column to.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

QueryCompiler with rounded values.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.round for more information about parameters and output format.

rowwise_query(expr, **kwargs)#

Query columns of the QueryCompiler with a boolean expression row-wise.

Parameters:
  • expr (str) –

  • **kwargs (dict) –

Returns:

New QueryCompiler containing the rows where the boolean expression is satisfied.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.query for more information about parameters and output format.

rpow(other, **kwargs)#

Perform element-wise exponentiation (other ** self).

If axes are not equal, perform frames alignment first.

Parameters:
  • other (BaseQueryCompiler, scalar or array-like) – Other operand of the binary operation.

  • broadcast (bool, default: False) – If other is a one-column query compiler, indicates whether it is a Series or not. Frames and Series have to be processed differently, however we can’t distinguish them at the query compiler level, so this parameter is a hint that is passed from a high-level API.

  • level (int or label) – In case of MultiIndex match index values on the passed level.

  • axis ({0, 1}) – Axis to match indices along for 1D other (list or QueryCompiler that represents Series). 0 is for index, 1 is for columns.

  • fill_value (float or None) – Value to fill missing elements during frame alignment.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

Result of binary operation.

Return type:

BaseQueryCompiler

rsub(other, **kwargs)#

Perform element-wise subtraction (other - self).

If axes are not equal, perform frames alignment first.

Parameters:
  • other (BaseQueryCompiler, scalar or array-like) – Other operand of the binary operation.

  • broadcast (bool, default: False) – If other is a one-column query compiler, indicates whether it is a Series or not. Frames and Series have to be processed differently, however we can’t distinguish them at the query compiler level, so this parameter is a hint that is passed from a high-level API.

  • level (int or label) – In case of MultiIndex match index values on the passed level.

  • axis ({0, 1}) – Axis to match indices along for 1D other (list or QueryCompiler that represents Series). 0 is for index, 1 is for columns.

  • fill_value (float or None) – Value to fill missing elements during frame alignment.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

Result of binary operation.

Return type:

BaseQueryCompiler

rtruediv(other, **kwargs)#

Perform element-wise division (other / self).

If axes are not equal, perform frames alignment first.

Parameters:
  • other (BaseQueryCompiler, scalar or array-like) – Other operand of the binary operation.

  • broadcast (bool, default: False) – If other is a one-column query compiler, indicates whether it is a Series or not. Frames and Series have to be processed differently, however we can’t distinguish them at the query compiler level, so this parameter is a hint that is passed from a high-level API.

  • level (int or label) – In case of MultiIndex match index values on the passed level.

  • axis ({0, 1}) – Axis to match indices along for 1D other (list or QueryCompiler that represents Series). 0 is for index, 1 is for columns.

  • fill_value (float or None) – Value to fill missing elements during frame alignment.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

Result of binary operation.

Return type:

BaseQueryCompiler

searchsorted(**kwargs)#

Find positions in a sorted self where value should be inserted to maintain order.

Parameters:
  • value (list-like) –

  • side ({"left", "right"}) –

  • sorter (list-like, optional) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

One-column QueryCompiler which contains indices to insert.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.searchsorted for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.
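
The side parameter corresponds to the difference between bisect_left and bisect_right in Python's standard library. The following sketch works on a plain list and is only meant to show the semantics of value, side and sorter. The wrapper function is an assumption, not the query compiler method.

```python
from bisect import bisect_left, bisect_right


def searchsorted(values, value, side="left", sorter=None):
    """Insertion positions that keep `values` sorted."""
    if sorter is not None:
        # `sorter` holds the indices that would sort `values`.
        values = [values[i] for i in sorter]
    find = bisect_left if side == "left" else bisect_right
    return [find(values, v) for v in value]


left = searchsorted([1, 2, 2, 3], [2, 4], side="left")
right = searchsorted([1, 2, 2, 3], [2, 4], side="right")
via_sorter = searchsorted([3, 1, 2], [2], sorter=[1, 2, 0])
```

With side="left" the position before any equal entries is returned, with side="right" the position after them.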

sem(**kwargs)#

Get the standard error of the mean for each column or row.

Parameters:
  • axis ({0, 1}) –

  • numeric_only (bool, optional) –

  • skipna (bool, default: True) –

  • ddof (int) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

One-column QueryCompiler with index labels of the specified axis, where each row contains the standard error of the mean for the corresponding row or column.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.sem for more information about parameters and output format.
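
The statistic computed here is the standard error of the mean, that is, the standard deviation (with the given ddof) divided by the square root of the number of observations. A minimal sketch for a single column, assuming NaN marks missing values and that skipna=False propagates them. The function is illustrative, not Modin's implementation.

```python
import math


def sem(values, ddof=1, skipna=True):
    """Standard error of the mean of one column or row."""
    if skipna:
        values = [v for v in values if not math.isnan(v)]
    elif any(math.isnan(v) for v in values):
        return math.nan
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - ddof)
    return math.sqrt(variance / n)


column = [2.0, 4.0, math.nan, 4.0, 6.0]
result = sem(column)                 # NaN is skipped, four values remain
strict = sem(column, skipna=False)   # NaN propagates
```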

series_corr(**kwargs)#

Compute correlation with other Series, excluding missing values.

The two Series objects are not required to be the same length and will be aligned internally before the correlation function is applied.

Returns:

Correlation with other.

Return type:

float

Notes

Please refer to modin.pandas.Series.corr for more information about parameters and output format.

series_to_dict(into=<class 'dict'>)#

Convert the Series to a dictionary.

Return type:

dict or into instance

Notes

Please refer to modin.pandas.Series.to_dict for more information about parameters and output format.

series_update(other, **kwargs)#

Update values of self using values of other at the corresponding indices.

Parameters:
  • other (BaseQueryCompiler) – One-column query compiler with updated values.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

New QueryCompiler with updated values.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.update for more information about parameters and output format.
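The update semantics can be sketched with plain dictionaries standing in for labeled one-column data. The sketch assumes the behavior pandas documents for Series.update: only labels already present in self are touched, and missing values in other are skipped. update_sketch is a hypothetical helper, not part of Modin.

```python
def update_sketch(current, other):
    """Return a new mapping with values of `current` replaced from `other`."""
    return {
        # Keep the old value when `other` has no (non-missing) entry for the label.
        label: other[label] if other.get(label) is not None else value
        for label, value in current.items()
    }


current = {"a": 1, "b": 2, "c": 3}
other = {"b": 20, "c": None, "d": 40}
updated = update_sketch(current, other)  # {'a': 1, 'b': 20, 'c': 3}
```

Label "d" is ignored because it does not exist in the original data, and "c" keeps its value because the replacement is missing.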

series_view(**kwargs)#

Reinterpret underlying data with new dtype.

Parameters:
  • dtype (dtype) – Data type to reinterpret underlying data with.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

New QueryCompiler of the same data in memory, with reinterpreted values.

Return type:

BaseQueryCompiler

Notes

  • Be aware that if this method falls back to pandas, the newly created QueryCompiler will be a copy of the original data.

  • Please refer to modin.pandas.Series.view for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.
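To make "reinterpret" concrete: the bytes stay the same and only the dtype used to read them changes, so no numeric conversion happens. The sketch below mimics this with the standard struct module; view_sketch is a hypothetical helper, and the format codes "d" (float64) and "q" (int64) are struct codes, not Modin or pandas dtypes.

```python
import struct


def view_sketch(values, src_fmt, dst_fmt):
    """Reinterpret the raw bytes of `values` as another fixed-size type."""
    raw = struct.pack(f"<{len(values)}{src_fmt}", *values)
    count = len(raw) // struct.calcsize(f"<{dst_fmt}")
    return list(struct.unpack(f"<{count}{dst_fmt}", raw))


as_ints = view_sketch([1.0, -2.0], "d", "q")
# 1.0 has the bit pattern 0x3FF0000000000000; -2.0 reads back as -(2 ** 62).
back = view_sketch(as_ints, "q", "d")
```

Reinterpreting back with the original type recovers the starting values exactly.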

set_index_from_columns(keys: List[Hashable], drop: bool = True, append: bool = False)#

Create new row labels from a list of columns.

Parameters:
  • keys (list of hashable) – The list of column names that will become the new index.

  • drop (bool, default: True) – Whether or not to drop the columns provided in the keys argument.

  • append (bool, default: False) – Whether or not to add the columns in keys as new levels appended to the existing index.

Returns:

A new QueryCompiler with updated index.

Return type:

BaseQueryCompiler

set_index_name(name, axis=0)#

Set index name for the specified axis.

Parameters:
  • name (hashable) – New index name.

  • axis ({0, 1}, default: 0) – Axis to set name along.

set_index_names(names, axis=0)#

Set index names for the specified axis.

Parameters:
  • names (list) – New index names.

  • axis ({0, 1}, default: 0) – Axis to set names along.

setitem(axis, key, value)#

Set the row/column defined by key to the value provided.

Parameters:
  • axis ({0, 1}) – Axis to set value along. 0 means set row, 1 means set column.

  • key (label) – Row/column label to set value in.

  • value (BaseQueryCompiler, list-like or scalar) – Define new row/column value.

Returns:

New QueryCompiler with updated key value.

Return type:

BaseQueryCompiler

setitem_bool(row_loc, col_loc, item)#

Set an item to the given location based on row_loc and col_loc.

Parameters:
  • row_loc (BaseQueryCompiler) – Query Compiler holding a Series of booleans.

  • col_loc (label) – Column label in self.

  • item (scalar) – An item to be set.

Returns:

New QueryCompiler with the inserted item.

Return type:

BaseQueryCompiler

Notes

Currently, this method is only used to set a scalar to the given location.
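A minimal sketch of the scalar assignment described above, using a dictionary of lists as a stand-in for a frame. setitem_bool_sketch and the data layout are assumptions made for illustration; a real query compiler operates on its own frame representation.

```python
def setitem_bool_sketch(frame, row_mask, col_label, item):
    """Return a copy of `frame` with `item` written where `row_mask` is True."""
    updated = {label: list(column) for label, column in frame.items()}
    updated[col_label] = [
        item if flag else old
        for old, flag in zip(frame[col_label], row_mask)
    ]
    return updated


frame = {"x": [1, 2, 3], "y": [10, 20, 30]}
result = setitem_bool_sketch(frame, [True, False, True], "y", 0)
# result["y"] is [0, 20, 0]; the input frame is left unchanged.
```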

sizeof()#

Compute the total memory usage for self.

Returns:

Result that holds either a value or Series of values.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.sizeof for more information about parameters and output format.

skew(**kwargs)#

Get the unbiased skew for each column or row.

Parameters:
  • axis ({0, 1}) –

  • numeric_only (bool, optional) –

  • skipna (bool, default: True) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

One-column QueryCompiler with index labels of the specified axis, where each row contains the unbiased skew for the corresponding row or column.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.skew for more information about parameters and output format.
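For reference, the sketch below computes a bias-corrected skew for one column, assuming "unbiased skew" means the adjusted Fisher-Pearson estimator that pandas uses. unbiased_skew is a hypothetical helper and does not handle missing values or fewer than three observations.

```python
import math


def unbiased_skew(values):
    """Adjusted Fisher-Pearson skewness of a sequence of numbers."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return math.sqrt(n * (n - 1)) / (n - 2) * m3 / m2 ** 1.5


symmetric = unbiased_skew([1, 2, 3, 4, 5])      # 0.0
right_tailed = unbiased_skew([1, 2, 3, 4, 10])  # 1.2 * sqrt(2)
```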

sort_columns_by_row_values(rows, ascending=True, **kwargs)#

Reorder the columns based on the lexicographic order of the given rows.

Parameters:
  • rows (label or list of labels) – The row or rows to sort by.

  • ascending (bool, default: True) – Sort in ascending order (True) or descending order (False).

  • kind ({"quicksort", "mergesort", "heapsort"}) –

  • na_position ({"first", "last"}) –

  • ignore_index (bool) –

  • key (callable(pandas.Index) -> pandas.Index, optional) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

New QueryCompiler that contains result of the sort.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.sort_values for more information about parameters and output format.

sort_index(**kwargs)#

Sort data by index or column labels.

Parameters:
  • axis ({0, 1}) –

  • level (int, label or list of such) –

  • ascending (bool) –

  • inplace (bool) –

  • kind ({"quicksort", "mergesort", "heapsort"}) –

  • na_position ({"first", "last"}) –

  • sort_remaining (bool) –

  • ignore_index (bool) –

  • key (callable(pandas.Index) -> pandas.Index, optional) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

New QueryCompiler containing the data sorted by columns or indices.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.sort_index for more information about parameters and output format.

sort_rows_by_column_values(columns, ascending=True, **kwargs)#

Reorder the rows based on the lexicographic order of the given columns.

Parameters:
  • columns (label or list of labels) – The column or columns to sort by.

  • ascending (bool, default: True) – Sort in ascending order (True) or descending order (False).

  • kind ({"quicksort", "mergesort", "heapsort"}) –

  • na_position ({"first", "last"}) –

  • ignore_index (bool) –

  • key (callable(pandas.Index) -> pandas.Index, optional) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

New QueryCompiler that contains result of the sort.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.sort_values for more information about parameters and output format.
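"Lexicographic order of the given columns" means rows are compared by the first column, with ties broken by the next one, and so on. A pure-Python sketch of that ordering, where rows are dictionaries and sort_rows_sketch is a hypothetical helper (kind, na_position, ignore_index and key are not modeled):

```python
def sort_rows_sketch(rows, columns, ascending=True):
    """Sort a list of row dictionaries by one or more column labels."""
    if not isinstance(columns, list):
        columns = [columns]
    # Tuples compare element by element, which gives lexicographic order.
    return sorted(
        rows,
        key=lambda row: tuple(row[c] for c in columns),
        reverse=not ascending,
    )


rows = [
    {"city": "B", "pop": 2},
    {"city": "A", "pop": 3},
    {"city": "A", "pop": 1},
]
by_city_pop = sort_rows_sketch(rows, ["city", "pop"])
descending = sort_rows_sketch(rows, ["city", "pop"], ascending=False)
```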

stack(level, dropna)#

Stack the prescribed level(s) from columns to index.

Parameters:
  • level (int or label) –

  • dropna (bool) –

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.stack for more information about parameters and output format.

std(**kwargs)#

Get the standard deviation for each column or row.

Parameters:
  • axis ({0, 1}) –

  • numeric_only (bool, optional) –

  • skipna (bool, default: True) –

  • ddof (int) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

One-column QueryCompiler with index labels of the specified axis, where each row contains the standard deviation for the corresponding row or column.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.std for more information about parameters and output format.

str___getitem__(key)#

Apply “__getitem__” function to each string value in QueryCompiler.

Parameters:

key (object) –

Returns:

New QueryCompiler containing the result of execution of the “__getitem__” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.__getitem__ for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

str_capitalize()#

Apply “capitalize” function to each string value in QueryCompiler.

Returns:

New QueryCompiler containing the result of execution of the “capitalize” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.capitalize for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

str_casefold()#

Apply “casefold” function to each string value in QueryCompiler.

Returns:

New QueryCompiler containing the result of execution of the “casefold” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.casefold for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

str_cat(others, sep=None, na_rep=None, join='left')#

Apply “cat” function to each string value in QueryCompiler.

Parameters:
  • others (Series, Index, DataFrame, np.ndarray or list-like) –

  • sep (str, default: '') –

  • na_rep (str or None, default: None) –

  • join ({'left', 'right', 'outer', 'inner'}, default: 'left') –

Returns:

New QueryCompiler containing the result of execution of the “cat” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.cat for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

str_center(width, fillchar=' ')#

Apply “center” function to each string value in QueryCompiler.

Parameters:
  • width (int) –

  • fillchar (str, default: ' ') –

Returns:

New QueryCompiler containing the result of execution of the “center” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.center for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

str_contains(pat, case=True, flags=0, na=None, regex=True)#

Apply “contains” function to each string value in QueryCompiler.

Parameters:
  • pat (str) –

  • case (bool, default: True) –

  • flags (int, default: 0) –

  • na (object, default: None) –

  • regex (bool, default: True) –

Returns:

New QueryCompiler containing the result of execution of the “contains” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.contains for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.
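How case, regex and na interact can be sketched with the standard re module. This is an approximation of the element-wise behavior for illustration only; str_contains_sketch is a hypothetical helper, and None stands in for a missing value.

```python
import re


def str_contains_sketch(values, pat, case=True, flags=0, na=None, regex=True):
    """Element-wise 'does this string contain pat?' over a list of strings."""
    if not case:
        flags |= re.IGNORECASE
    # With regex=False the pattern is matched literally.
    pattern = re.compile(pat if regex else re.escape(pat), flags)
    return [na if v is None else bool(pattern.search(v)) for v in values]


values = ["a.c", "abc", None]
as_regex = str_contains_sketch(values, "a.c")                 # [True, True, None]
as_literal = str_contains_sketch(values, "a.c", regex=False)  # [True, False, None]
```

With regex=True the dot matches any character, so "abc" is a hit; as a literal it is not.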

str_count(pat, flags=0)#

Apply “count” function to each string value in QueryCompiler.

Parameters:
  • pat (str) –

  • flags (int, default: 0) –

Returns:

New QueryCompiler containing the result of execution of the “count” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.count for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

str_decode(encoding, errors)#

Apply “decode” function to each string value in QueryCompiler.

Parameters:
  • encoding (str) –

  • errors (str, default: 'strict') –

Returns:

New QueryCompiler containing the result of execution of the “decode” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.decode for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

str_encode(encoding, errors)#

Apply “encode” function to each string value in QueryCompiler.

Parameters:
  • encoding (str) –

  • errors (str, default: 'strict') –

Returns:

New QueryCompiler containing the result of execution of the “encode” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.encode for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.
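The encoding and errors parameters of str_encode and str_decode follow the conventions of Python's built-in str.encode and bytes.decode. The snippet below shows those conventions on a single value; it does not call Modin.

```python
text = "caf\u00e9"

utf8_bytes = text.encode("utf-8", errors="strict")        # b'caf\xc3\xa9'
round_trip = utf8_bytes.decode("utf-8", errors="strict")  # 'café'

# 'strict' raises when a character cannot be represented in the target encoding.
try:
    text.encode("ascii", errors="strict")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# 'replace' substitutes a placeholder instead of raising.
replaced = text.encode("ascii", errors="replace")  # b'caf?'
```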

str_endswith(pat, na=None)#

Apply “endswith” function to each string value in QueryCompiler.

Parameters:
  • pat (str) –

  • na (object, default: None) –

Returns:

New QueryCompiler containing the result of execution of the “endswith” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.endswith for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

str_extract(pat, flags=0, expand=True)#

Apply “extract” function to each string value in QueryCompiler.

Parameters:
  • pat (str) –

  • flags (int, default: 0) –

  • expand (bool, default: True) –

Returns:

New QueryCompiler containing the result of execution of the “extract” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.extract for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

str_extractall(pat, flags=0)#

Apply “extractall” function to each string value in QueryCompiler.

Parameters:
  • pat (str) –

  • flags (int, default: 0) –

Returns:

New QueryCompiler containing the result of execution of the “extractall” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.extractall for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.
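The difference between "extract" and "extractall" is that the former keeps the capture groups of the first match only, while the latter keeps them for every match. A sketch with the standard re module, where both helpers are hypothetical and return plain Python lists rather than a query compiler:

```python
import re


def extract_sketch(values, pat, flags=0):
    """Capture groups of the first match per string (None-filled if no match)."""
    pattern = re.compile(pat, flags)
    out = []
    for v in values:
        m = pattern.search(v)
        out.append(m.groups() if m else (None,) * pattern.groups)
    return out


def extractall_sketch(values, pat, flags=0):
    """Capture groups of every match per string."""
    pattern = re.compile(pat, flags)
    return [[m.groups() for m in pattern.finditer(v)] for v in values]


values = ["a1b2", "c3", "none"]
first = extract_sketch(values, r"([a-z])(\d)")
every = extractall_sketch(values, r"([a-z])(\d)")
```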

str_find(sub, start=0, end=None)#

Apply “find” function to each string value in QueryCompiler.

Parameters:
  • sub (str) –

  • start (int, default: 0) –

  • end (int, optional) –

Returns:

New QueryCompiler containing the result of execution of the “find” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.find for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

str_findall(pat, flags=0)#

Apply “findall” function to each string value in QueryCompiler.

Parameters:
  • pat (str) –

  • flags (int, default: 0) –

Returns:

New QueryCompiler containing the result of execution of the “findall” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.findall for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

str_fullmatch(pat, case=True, flags=0, na=None)#

Apply “fullmatch” function to each string value in QueryCompiler.

Parameters:
  • pat (str) –

  • case (bool, default: True) –

  • flags (int, default: 0) –

  • na (object, default: None) –

Returns:

New QueryCompiler containing the result of execution of the “fullmatch” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.fullmatch for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

str_get(i)#

Apply “get” function to each string value in QueryCompiler.

Parameters:

i (int) –

Returns:

New QueryCompiler containing the result of execution of the “get” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.get for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

str_get_dummies(sep)#

Apply “get_dummies” function to each string value in QueryCompiler.

Parameters:

sep (str) –

Returns:

New QueryCompiler containing the result of execution of the “get_dummies” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.get_dummies for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

str_index(sub, start=0, end=None)#

Apply “index” function to each string value in QueryCompiler.

Parameters:
  • sub (str) –

  • start (int, default: 0) –

  • end (int, optional) –

Returns:

New QueryCompiler containing the result of execution of the “index” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.index for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.
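str_find and str_index mirror Python's str.find and str.index, which differ only in how a missing substring is reported. The snippet below shows that difference with built-in string methods applied element by element:

```python
words = ["modin", "pandas"]

find_hits = [w.find("d") for w in words]    # [2, 3]
find_misses = [w.find("z") for w in words]  # [-1, -1], no exception

# index() raises ValueError instead of returning -1.
try:
    [w.index("z") for w in words]
    index_raised = False
except ValueError:
    index_raised = True
```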

str_isalnum()#

Apply “isalnum” function to each string value in QueryCompiler.

Returns:

New QueryCompiler containing the result of execution of the “isalnum” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.isalnum for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

str_isalpha()#

Apply “isalpha” function to each string value in QueryCompiler.

Returns:

New QueryCompiler containing the result of execution of the “isalpha” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.isalpha for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

str_isdecimal()#

Apply “isdecimal” function to each string value in QueryCompiler.

Returns:

New QueryCompiler containing the result of execution of the “isdecimal” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.isdecimal for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

str_isdigit()#

Apply “isdigit” function to each string value in QueryCompiler.

Returns:

New QueryCompiler containing the result of execution of the “isdigit” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.isdigit for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

str_islower()#

Apply “islower” function to each string value in QueryCompiler.

Returns:

New QueryCompiler containing the result of execution of the “islower” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.islower for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

str_isnumeric()#

Apply “isnumeric” function to each string value in QueryCompiler.

Returns:

New QueryCompiler containing the result of execution of the “isnumeric” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.isnumeric for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

str_isspace()#

Apply “isspace” function to each string value in QueryCompiler.

Returns:

New QueryCompiler containing the result of execution of the “isspace” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.isspace for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

str_istitle()#

Apply “istitle” function to each string value in QueryCompiler.

Returns:

New QueryCompiler containing the result of execution of the “istitle” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.istitle for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

str_isupper()#

Apply “isupper” function to each string value in QueryCompiler.

Returns:

New QueryCompiler containing the result of execution of the “isupper” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.isupper for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

str_join(sep)#

Apply “join” function to each string value in QueryCompiler.

Parameters:

sep (str) –

Returns:

New QueryCompiler containing the result of execution of the “join” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.join for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

str_len()#

Apply “len” function to each string value in QueryCompiler.

Returns:

New QueryCompiler containing the result of execution of the “len” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.len for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

str_ljust(width, fillchar=' ')#

Apply “ljust” function to each string value in QueryCompiler.

Parameters:
  • width (int) –

  • fillchar (str, default: ' ') –

Returns:

New QueryCompiler containing the result of execution of the “ljust” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.ljust for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

str_lower()#

Apply “lower” function to each string value in QueryCompiler.

Returns:

New QueryCompiler containing the result of execution of the “lower” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.lower for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

str_lstrip(to_strip=None)#

Apply “lstrip” function to each string value in QueryCompiler.

Parameters:

to_strip (str, optional) –

Returns:

New QueryCompiler containing the result of execution of the “lstrip” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.lstrip for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

str_match(pat, case=True, flags=0, na=None)#

Apply “match” function to each string value in QueryCompiler.

Parameters:
  • pat (str) –

  • case (bool, default: True) –

  • flags (int, default: 0) –

  • na (object, default: None) –

Returns:

New QueryCompiler containing the result of execution of the “match” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.match for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

str_normalize(form)#

Apply “normalize” function to each string value in QueryCompiler.

Parameters:

form ({'NFC', 'NFKC', 'NFD', 'NFKD'}) –

Returns:

New QueryCompiler containing the result of execution of the “normalize” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.normalize for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.
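The four accepted values of form are the Unicode normalization forms exposed by Python's unicodedata module. The snippet below shows their effect on individual strings:

```python
import unicodedata

decomposed = "e\u0301"  # 'e' followed by a combining acute accent
composed = unicodedata.normalize("NFC", decomposed)  # single code point U+00E9

ligature = "\ufb01"  # the 'fi' ligature
compat = unicodedata.normalize("NFKC", ligature)     # plain 'fi'
canonical = unicodedata.normalize("NFC", ligature)   # ligature is kept
```

The compatibility forms (NFKC, NFKD) fold characters such as ligatures, while the canonical forms (NFC, NFD) only compose or decompose equivalent sequences.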

str_pad(width, side='left', fillchar=' ')#

Apply “pad” function to each string value in QueryCompiler.

Parameters:
  • width (int) –

  • side ({'left', 'right', 'both'}, default: 'left') –

  • fillchar (str, default: ' ') –

Returns:

New QueryCompiler containing the result of execution of the “pad” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.pad for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.
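Note that side names the side on which padding is added, so side='left' right-justifies the text. The sketch below maps each side to the corresponding built-in string method, which is also how str_ljust, str_rjust and str_center relate to str_pad; pad_sketch is a hypothetical helper.

```python
def pad_sketch(value, width, side="left", fillchar=" "):
    """Pad `value` to `width` characters on the given side."""
    if side == "left":
        return value.rjust(width, fillchar)
    if side == "right":
        return value.ljust(width, fillchar)
    if side == "both":
        return value.center(width, fillchar)
    raise ValueError(f"invalid side: {side!r}")


left = pad_sketch("ab", 5, side="left", fillchar="*")    # '***ab'
right = pad_sketch("ab", 5, side="right", fillchar="*")  # 'ab***'
both = pad_sketch("ab", 6, side="both", fillchar="*")    # '**ab**'
```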

str_partition(sep=' ', expand=True)#

Apply “partition” function to each string value in QueryCompiler.

Parameters:
  • sep (str, default: ' ') –

  • expand (bool, default: True) –

Returns:

New QueryCompiler containing the result of execution of the “partition” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.partition for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

str_removeprefix(prefix)#

Apply “removeprefix” function to each string value in QueryCompiler.

Parameters:

prefix (str) –

Returns:

New QueryCompiler containing the result of execution of the “removeprefix” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.removeprefix for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

str_removesuffix(suffix)#

Apply “removesuffix” function to each string value in QueryCompiler.

Parameters:

suffix (str) –

Returns:

New QueryCompiler containing the result of execution of the “removesuffix” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.removesuffix for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

str_repeat(repeats)#

Apply “repeat” function to each string value in QueryCompiler.

Parameters:

repeats (int) –

Returns:

New QueryCompiler containing the result of execution of the “repeat” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.repeat for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

str_replace(pat, repl, n=-1, case=None, flags=0, regex=None)#

Apply “replace” function to each string value in QueryCompiler.

Parameters:
  • pat (str) –

  • repl (str or callable) –

  • n (int, default: -1) –

  • case (bool, optional) –

  • flags (int, default: 0) –

  • regex (bool, default: None) –

Returns:

New QueryCompiler containing the result of execution of the “replace” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.replace for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

str_rfind(sub, start=0, end=None)#

Apply “rfind” function to each string value in QueryCompiler.

Parameters:
  • sub (str) –

  • start (int, default: 0) –

  • end (int, optional) –

Returns:

New QueryCompiler containing the result of execution of the “rfind” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.rfind for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

str_rindex(sub, start=0, end=None)#

Apply “rindex” function to each string value in QueryCompiler.

Parameters:
  • sub (str) –

  • start (int, default: 0) –

  • end (int, optional) –

Returns:

New QueryCompiler containing the result of execution of the “rindex” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.rindex for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

str_rjust(width, fillchar=' ')#

Apply “rjust” function to each string value in QueryCompiler.

Parameters:
  • width (int) –

  • fillchar (str, default: ' ') –

Returns:

New QueryCompiler containing the result of execution of the “rjust” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.rjust for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

str_rpartition(sep=' ', expand=True)#

Apply “rpartition” function to each string value in QueryCompiler.

Parameters:
  • sep (str, default: ' ') –

  • expand (bool, default: True) –

Returns:

New QueryCompiler containing the result of execution of the “rpartition” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.rpartition for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

str_rsplit(pat=None, *, n=-1, expand=False)#

Apply “rsplit” function to each string value in QueryCompiler.

Parameters:
  • pat (str, optional) –

  • n (int, default: -1) –

  • expand (bool, default: False) –

Returns:

New QueryCompiler containing the result of execution of the “rsplit” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.rsplit for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

str_rstrip(to_strip=None)#

Apply “rstrip” function to each string value in QueryCompiler.

Parameters:

to_strip (str, optional) –

Returns:

New QueryCompiler containing the result of execution of the “rstrip” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.rstrip for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

str_slice(start=None, stop=None, step=None)#

Apply “slice” function to each string value in QueryCompiler.

Parameters:
  • start (int, optional) –

  • stop (int, optional) –

  • step (int, optional) –

Returns:

New QueryCompiler containing the result of execution of the “slice” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.slice for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

str_slice_replace(start=None, stop=None, repl=None)#

Apply “slice_replace” function to each string value in QueryCompiler.

Parameters:
  • start (int, optional) –

  • stop (int, optional) –

  • repl (str or callable, optional) –

Returns:

New QueryCompiler containing the result of execution of the “slice_replace” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.slice_replace for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

str_split(pat=None, *, n=-1, expand=False, regex=None)#

Apply “split” function to each string value in QueryCompiler.

Parameters:
  • pat (str, optional) –

  • n (int, default: -1) –

  • expand (bool, default: False) –

  • regex (bool, default: None) –

Returns:

New QueryCompiler containing the result of execution of the “split” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.split for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.
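The n parameter limits the number of splits, and str_split and str_rsplit differ in which end the limit is counted from. Python's built-in str.split and str.rsplit show the same behavior on a single value (expand and regex are not illustrated here):

```python
path = "a,b,c,d"

all_parts = path.split(",", -1)   # -1 means no limit: ['a', 'b', 'c', 'd']
first_split = path.split(",", 1)  # split once from the left: ['a', 'b,c,d']
last_split = path.rsplit(",", 1)  # split once from the right: ['a,b,c', 'd']
```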

str_startswith(pat, na=None)#

Apply “startswith” function to each string value in QueryCompiler.

Parameters:
  • pat (str) –

  • na (object, default: None) –

Returns:

New QueryCompiler containing the result of execution of the “startswith” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.startswith for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

str_strip(to_strip=None)#

Apply “strip” function to each string value in QueryCompiler.

Parameters:

to_strip (str, optional) –

Returns:

New QueryCompiler containing the result of execution of the “strip” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.strip for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

str_swapcase()#

Apply “swapcase” function to each string value in QueryCompiler.

Returns:

New QueryCompiler containing the result of execution of the “swapcase” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.swapcase for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

str_title()#

Apply “title” function to each string value in QueryCompiler.

Returns:

New QueryCompiler containing the result of execution of the “title” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.title for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

str_translate(table)#

Apply “translate” function to each string value in QueryCompiler.

Parameters:

table (dict) –

Returns:

New QueryCompiler containing the result of execution of the “translate” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.translate for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.
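
A translation table of the kind accepted here can be built with Python's str.maketrans. The sketch below applies it element by element with the built-in str.translate, which is an assumption about the per-element behaviour rather than a description of Modin internals; the sample strings are made up.

```python
# Map "a" -> "4" and "e" -> "3"; mapping a character to None deletes it.
table = str.maketrans({"a": "4", "e": "3", "!": None})

values = ["release!", "data"]
result = [v.translate(table) for v in values]
print(result)
```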

str_upper()#

Apply “upper” function to each string value in QueryCompiler.

Returns:

New QueryCompiler containing the result of execution of the “upper” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.upper for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

str_wrap(width, **kwargs)#

Apply “wrap” function to each string value in QueryCompiler.

Parameters:
  • width (int) –

  • **kwargs (dict) –

Returns:

New QueryCompiler containing the result of execution of the “wrap” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.wrap for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.
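
For a single string, “wrap” can be approximated with the standard textwrap module: the text is split into lines no longer than width and the lines are joined with newlines. The helper wrap_one is hypothetical, and forwarding **kwargs to textwrap.TextWrapper is an assumption about how the extra options are consumed.

```python
import textwrap


def wrap_one(value, width, **kwargs):
    """Wrap one string to the given width (hypothetical per-element helper)."""
    wrapper = textwrap.TextWrapper(width=width, **kwargs)
    # Each wrapped line becomes one line of the resulting string.
    return "\n".join(wrapper.wrap(value))


result = wrap_one("the quick brown fox", width=10)
print(result)
```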

str_zfill(width)#

Apply “zfill” function to each string value in QueryCompiler.

Parameters:

width (int) –

Returns:

New QueryCompiler containing the result of execution of the “zfill” function against each string element.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.str.zfill for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

struct_dtypes()#

Return the dtype object of each child field of the struct.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.struct.dtypes for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

struct_explode()#

Extract all child fields of a struct as a DataFrame.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.struct.explode for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

struct_field(name_or_index)#

Extract a child field of a struct as a Series.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Series.struct.field for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

sub(other, **kwargs)#

Perform element-wise subtraction (self - other).

If axes are not equal, perform frames alignment first.

Parameters:
  • other (BaseQueryCompiler, scalar or array-like) – Other operand of the binary operation.

  • broadcast (bool, default: False) – If other is a one-column query compiler, indicates whether it is a Series or not. Frames and Series have to be processed differently, but they cannot be distinguished at the query compiler level, so this parameter is a hint passed from the high-level API.

  • level (int or label) – In case of MultiIndex match index values on the passed level.

  • axis ({0, 1}) – Axis to match indices along for 1D other (list or QueryCompiler that represents Series). 0 is for index, 1 is for columns.

  • fill_value (float or None) – Value to fill missing elements during frame alignment.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

Result of binary operation.

Return type:

BaseQueryCompiler
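
The alignment step and the role of fill_value can be sketched with plain dictionaries standing in for labelled one-column data. The function sub_aligned is a hypothetical helper, not Modin code; it assumes the usual pandas convention that a label missing from one operand is filled with fill_value, and that the result is NaN when no value is available.

```python
import math


def sub_aligned(left, right, fill_value=None):
    """Subtract two label->value mappings after aligning their labels."""
    labels = sorted(set(left) | set(right))  # union of both axes
    out = {}
    for label in labels:
        a = left.get(label, fill_value)
        b = right.get(label, fill_value)
        # Without a usable value on either side the result is missing.
        out[label] = math.nan if a is None or b is None else a - b
    return out


left = {"x": 5, "y": 3}
right = {"y": 1, "z": 2}
filled = sub_aligned(left, right, fill_value=0)
unfilled = sub_aligned(left, right)
print(filled)
print(unfilled)
```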

sum(**kwargs)#

Get the sum for each column or row.

Parameters:
  • axis ({0, 1}) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

One-column QueryCompiler with index labels of the specified axis, where each row contains the sum for the corresponding row or column.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.sum for more information about parameters and output format.

sum_min_count(**kwargs)#

Get the sum for each column or row.

Parameters:
  • axis ({0, 1}) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

One-column QueryCompiler with index labels of the specified axis, where each row contains the sum for the corresponding row or column.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.sum for more information about parameters and output format.

support_materialization_in_worker_process() bool#

Whether it is possible to call the to_pandas function during the pickling process, at the moment the object is recreated.

Return type:

bool

take_2d_labels(index, columns)#

Take the given labels.

Parameters:
  • index (slice, scalar, list-like, or BaseQueryCompiler) – Labels of rows to grab.

  • columns (slice, scalar, list-like, or BaseQueryCompiler) – Labels of columns to grab.

Returns:

Subset of this QueryCompiler.

Return type:

BaseQueryCompiler

take_2d_positional(index=None, columns=None)#

Index QueryCompiler with passed keys.

Parameters:
  • index (list-like of ints, optional) – Positional indices of rows to grab.

  • columns (list-like of ints, optional) – Positional indices of columns to grab.

Returns:

New masked QueryCompiler.

Return type:

BaseQueryCompiler
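
Positional selection along both axes can be pictured on a list of rows. The helper take_positional below is a hypothetical stand-in that only mirrors the documented semantics (None means "take everything along that axis"); it is not how any Modin storage format implements the method.

```python
def take_positional(rows, index=None, columns=None):
    """Select rows and columns by integer position from a list of rows."""
    row_positions = range(len(rows)) if index is None else index
    picked = [rows[i] for i in row_positions]
    if columns is None:
        return [list(row) for row in picked]
    return [[row[j] for j in columns] for row in picked]


data = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
subset = take_positional(data, index=[2, 0], columns=[1])
print(subset)
```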

abstract to_dataframe(nan_as_null: bool = False, allow_copy: bool = True)#

Get a DataFrame exchange protocol object representing data of the Modin DataFrame.

See more about the protocol in https://data-apis.org/dataframe-protocol/latest/index.html.

Parameters:
  • nan_as_null (bool, default: False) – A keyword intended for the consumer to tell the producer to overwrite null values in the data with NaN (or NaT). This currently has no effect; once support for nullable extension dtypes is added, this value should be propagated to columns.

  • allow_copy (bool, default: True) – A keyword that defines whether or not the library is allowed to make a copy of the data. For example, copying data would be necessary if a library supports strided buffers, given that this protocol specifies contiguous buffers. Currently, if the flag is set to False and a copy is needed, a RuntimeError will be raised.

Returns:

A dataframe object following the DataFrame protocol specification.

Return type:

ProtocolDataframe

to_datetime(*args, **kwargs)#

Convert columns of the QueryCompiler to the datetime dtype.

Parameters:
  • *args (iterable) –

  • **kwargs (dict) –

Returns:

QueryCompiler with all columns converted to datetime dtype.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.to_datetime for more information about parameters and output format.

to_list()#

Return a list of the values.

These are each a scalar type, which is a Python scalar (for str, int, float) or a pandas scalar (for Timestamp/Timedelta/Interval/Period).

Return type:

list

to_numeric(*args, **kwargs)#

Convert underlying data to numeric dtype.

Parameters:
  • errors ({"ignore", "raise", "coerce"}) –

  • downcast ({"integer", "signed", "unsigned", "float", None}) –

  • *args (iterable) – Serves the compatibility purpose. Does not affect the result.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

New QueryCompiler with values converted to numeric.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.to_numeric for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

to_numpy(**kwargs)#

Convert the underlying query compiler's data to a NumPy array.

Parameters:
  • dtype (dtype) – The dtype of the resulting array.

  • copy (bool) – Whether to ensure that the returned value is not a view on another array.

  • na_value (object) – The value to replace missing values with.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

The QueryCompiler converted to NumPy array.

Return type:

np.ndarray

abstract to_pandas()#

Convert the underlying query compiler's data to pandas.DataFrame.

Returns:

The QueryCompiler converted to pandas.

Return type:

pandas.DataFrame

to_timedelta(unit='ns', errors='raise')#

Convert argument to timedelta.

Parameters:
  • unit (str, default: "ns") – Denotes the unit of the argument when the argument is numeric.

  • errors ({"ignore", "raise", "coerce"}, default: "raise") –

Returns:

New QueryCompiler with values converted to timedelta.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.to_timedelta for more information about parameters and output format.

Warning

This method is supported only by one-column query compilers.

transpose(*args, **kwargs)#

Transpose this QueryCompiler.

Parameters:
  • copy (bool) – Whether to copy the data after transposing.

  • *args (iterable) – Serves the compatibility purpose. Does not affect the result.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

Transposed new QueryCompiler.

Return type:

BaseQueryCompiler

truediv(other, **kwargs)#

Perform element-wise division (self / other).

If axes are not equal, perform frames alignment first.

Parameters:
  • other (BaseQueryCompiler, scalar or array-like) – Other operand of the binary operation.

  • broadcast (bool, default: False) – If other is a one-column query compiler, indicates whether it is a Series or not. Frames and Series have to be processed differently, but they cannot be distinguished at the query compiler level, so this parameter is a hint passed from the high-level API.

  • level (int or label) – In case of MultiIndex match index values on the passed level.

  • axis ({0, 1}) – Axis to match indices along for 1D other (list or QueryCompiler that represents Series). 0 is for index, 1 is for columns.

  • fill_value (float or None) – Value to fill missing elements during frame alignment.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

Result of binary operation.

Return type:

BaseQueryCompiler

tz_convert(tz, axis=0, level=None, copy=True)#

Convert tz-aware axis to target time zone.

Parameters:
  • tz (str or tzinfo object or None) – Target time zone. Passing None will convert to UTC and remove the timezone information.

  • axis (int, default: 0) – The axis to localize.

  • level (int, str, default: None) – If axis is a MultiIndex, convert a specific level. Otherwise must be None.

  • copy (bool, default: True) – Also make a copy of the underlying data.

Returns:

A new query compiler with the converted axis.

Return type:

BaseQueryCompiler

tz_localize(tz, axis=0, level=None, copy=True, ambiguous='raise', nonexistent='raise')#

Localize tz-naive index of a Series or DataFrame to target time zone.

Parameters:
  • tz (tzstr or tzinfo or None) – Time zone to localize. Passing None will remove the time zone information and preserve local time.

  • axis (int, default: 0) – The axis to localize.

  • level (int, str, default: None) – If axis is a MultiIndex, localize a specific level. Otherwise must be None.

  • copy (bool, default: True) – Also make a copy of the underlying data.

  • ambiguous (str, bool-ndarray, NaT, default: "raise") – Behaviour on ambiguous times.

  • nonexistent (str, default: "raise") – What to do with nonexistent times.

Returns:

A new query compiler with the localized axis.

Return type:

BaseQueryCompiler
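
The difference between localizing and converting can be shown on a single timestamp with the standard datetime module: localizing attaches a time zone to a naive value without changing the wall-clock time, while converting keeps the instant and changes the wall-clock time. The fixed-offset zone below is purely illustrative, and this sketch says nothing about how the ambiguous or nonexistent options are handled.

```python
from datetime import datetime, timedelta, timezone

naive = datetime(2024, 1, 1, 12, 0)
plus_two = timezone(timedelta(hours=2))  # illustrative fixed UTC+02:00 offset

# "Localize": a tz-naive value becomes tz-aware, wall-clock time unchanged.
localized = naive.replace(tzinfo=timezone.utc)

# "Convert": a tz-aware value is expressed in another zone, same instant.
converted = localized.astimezone(plus_two)

print(localized.isoformat())
print(converted.isoformat())
```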

unique(keep='first', ignore_index=True, subset=None)#

Get unique rows of self.

Parameters:
  • keep ({"first", "last", False}, default: "first") – Which duplicates to keep.

  • ignore_index (bool, default: True) – If True, the resulting axis will be labeled 0, 1, …, n - 1.

  • subset (list, optional) – Only consider certain columns for identifying duplicates; if None, use all of the columns.

Returns:

New QueryCompiler with unique values.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.drop_duplicates for more information about parameters and output format.
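
The three keep modes can be sketched on a list of row tuples. The helper unique_rows is hypothetical; in this sketch subset holds column positions instead of labels, and the returned list is implicitly relabelled 0, 1, …, n - 1, which corresponds to ignore_index=True.

```python
def unique_rows(rows, keep="first", subset=None):
    """Drop duplicated rows following the keep semantics described above."""
    def key(row):
        return tuple(row) if subset is None else tuple(row[c] for c in subset)

    if keep is False:
        # Drop every row whose key occurs more than once.
        counts = {}
        for row in rows:
            counts[key(row)] = counts.get(key(row), 0) + 1
        return [row for row in rows if counts[key(row)] == 1]

    # For "last", scan from the end and restore the original order afterwards.
    source = rows if keep == "first" else list(reversed(rows))
    seen, kept = set(), []
    for row in source:
        if key(row) not in seen:
            seen.add(key(row))
            kept.append(row)
    return kept if keep == "first" else list(reversed(kept))


rows = [(1, "a"), (2, "b"), (1, "a"), (3, "a")]
first = unique_rows(rows)
last = unique_rows(rows, keep="last")
none_kept = unique_rows(rows, keep=False)
by_second_column = unique_rows(rows, subset=[1])
```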

unstack(level, fill_value)#

Pivot a level of the (necessarily hierarchical) index labels.

Parameters:
  • level (int or label) –

  • fill_value (scalar or dict) –

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.unstack for more information about parameters and output format.

var(**kwargs)#

Get the variance for each column or row.

Parameters:
  • axis ({0, 1}) –

  • numeric_only (bool, optional) –

  • skipna (bool, default: True) –

  • ddof (int) –

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

One-column QueryCompiler with index labels of the specified axis, where each row contains the variance for the corresponding row or column.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.var for more information about parameters and output format.
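
The ddof parameter is the "delta degrees of freedom": the sum of squared deviations is divided by n - ddof. A minimal sketch for one column of numbers follows; the function name variance is illustrative, and missing values (skipna) are not handled.

```python
def variance(values, ddof=1):
    """Variance of a sequence of numbers with a configurable ddof."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return squared_deviations / (n - ddof)


column = [1, 2, 3, 4]
sample_var = variance(column)               # ddof=1: divide by n - 1
population_var = variance(column, ddof=0)   # ddof=0: divide by n
```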

where(cond, other, **kwargs)#

Update values of self using values from other at positions where cond is False.

Parameters:
  • cond (BaseQueryCompiler) – Boolean mask. True - keep the self value, False - replace by other value.

  • other (BaseQueryCompiler or pandas.Series) – Object to grab replacement values from.

  • axis ({0, 1}) – Axis to align frames along if axes of self, cond and other are not equal. 0 is for index, 1 is for columns.

  • level (int or label, optional) – Level of MultiIndex to align frames along if axes of self, cond and other are not equal. Currently level parameter is not implemented, so only None value is acceptable.

  • **kwargs (dict) – Serves the compatibility purpose. Does not affect the result.

Returns:

QueryCompiler with updated data.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.DataFrame.where for more information about parameters and output format.
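
Once the operands are aligned, the selection rule is element-wise: keep the value of self where cond is True and take the value from other where it is False. A plain-list sketch, with the hypothetical helper where_one_column standing in for a single column:

```python
def where_one_column(values, cond, other):
    """Keep values[i] where cond[i] is True, otherwise take other[i]."""
    return [v if c else o for v, c, o in zip(values, cond, other)]


values = [10, 20, 30, 40]
cond = [True, False, True, False]
other = [-1, -2, -3, -4]
result = where_one_column(values, cond, other)
print(result)
```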

wide_to_long(**kwargs)#

Unpivot a DataFrame from wide to long format.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.wide_to_long for more information about parameters and output format.

window_mean(fold_axis, window_kwargs, *args, **kwargs)#

Create window of the specified type and compute mean for each window over the given axis.

Parameters:
  • fold_axis ({0, 1}) –

  • window_kwargs (list) – Rolling windows arguments with the same signature as modin.pandas.DataFrame.rolling.

  • *args (iterable) –

  • **kwargs (dict) –

Returns:

New QueryCompiler containing mean for each window, built by the following rules:

  • Output QueryCompiler has the same shape and axes labels as the source.

  • Each element is the mean for the corresponding window.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Rolling.mean for more information about parameters and output format.
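
The "same shape as the source" rule can be seen in a small sketch of a fixed-size rolling mean over one column. The helper rolling_mean is hypothetical and assumes the common default where a window with fewer than window observations yields NaN; the real method receives the rolling arguments through window_kwargs.

```python
import math


def rolling_mean(values, window):
    """Rolling mean with a fixed window; incomplete windows give NaN."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(math.nan)  # not enough observations yet
        else:
            chunk = values[i + 1 - window : i + 1]
            out.append(sum(chunk) / window)
    return out


means = rolling_mean([1, 2, 3, 4], window=2)
print(means)
```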

window_std(fold_axis, window_kwargs, ddof=1, *args, **kwargs)#

Create window of the specified type and compute standard deviation for each window over the given axis.

Parameters:
  • fold_axis ({0, 1}) –

  • window_kwargs (list) – Rolling windows arguments with the same signature as modin.pandas.DataFrame.rolling.

  • ddof (int, default: 1) –

  • *args (iterable) –

  • **kwargs (dict) –

Returns:

New QueryCompiler containing standard deviation for each window, built by the following rules:

  • Output QueryCompiler has the same shape and axes labels as the source.

  • Each element is the standard deviation for the corresponding window.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Rolling.std for more information about parameters and output format.

window_sum(fold_axis, window_kwargs, *args, **kwargs)#

Create window of the specified type and compute sum for each window over the given axis.

Parameters:
  • fold_axis ({0, 1}) –

  • window_kwargs (list) – Rolling windows arguments with the same signature as modin.pandas.DataFrame.rolling.

  • *args (iterable) –

  • **kwargs (dict) –

Returns:

New QueryCompiler containing sum for each window, built by the following rules:

  • Output QueryCompiler has the same shape and axes labels as the source.

  • Each element is the sum for the corresponding window.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Rolling.sum for more information about parameters and output format.

window_var(fold_axis, window_kwargs, ddof=1, *args, **kwargs)#

Create window of the specified type and compute variance for each window over the given axis.

Parameters:
  • fold_axis ({0, 1}) –

  • window_kwargs (list) – Rolling windows arguments with the same signature as modin.pandas.DataFrame.rolling.

  • ddof (int, default: 1) –

  • *args (iterable) –

  • **kwargs (dict) –

Returns:

New QueryCompiler containing variance for each window, built by the following rules:

  • Output QueryCompiler has the same shape and axes labels as the source.

  • Each element is the variance for the corresponding window.

Return type:

BaseQueryCompiler

Notes

Please refer to modin.pandas.Rolling.var for more information about parameters and output format.

write_items(row_numeric_index, col_numeric_index, item, need_columns_reindex=True)#

Update QueryCompiler elements at the specified positions by passed values.

In contrast to setitem, this method allows 2D assignments.

Parameters:
  • row_numeric_index (list of ints) – Row positions to write value.

  • col_numeric_index (list of ints) – Column positions to write value.

  • item (Any) – Values to write. If not a scalar, it will be broadcast according to row_numeric_index and col_numeric_index.

  • need_columns_reindex (bool, default: True) – In the case of assigning columns to a dataframe (broadcasting is part of the flow), reindexing is not needed.

Returns:

New QueryCompiler with updated values.

Return type:

BaseQueryCompiler
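
A 2D positional assignment of this kind can be sketched on a list of rows. The helper write_positions is hypothetical: it accepts either a scalar, which is broadcast to every selected cell, or a nested list whose shape matches the selected rows and columns, and it returns a new structure instead of modifying its input.

```python
def write_positions(rows, row_positions, col_positions, item):
    """Return a copy of rows with item written at the given positions."""
    out = [list(row) for row in rows]  # leave the input untouched
    for i, r in enumerate(row_positions):
        for j, c in enumerate(col_positions):
            # A nested list supplies one value per cell; a scalar is broadcast.
            out[r][c] = item[i][j] if isinstance(item, list) else item
    return out


data = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
scalar_write = write_positions(data, [0, 2], [1], 9)
block_write = write_positions(data, [0, 1], [0, 2], [[1, 2], [3, 4]])
```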