Base pandas Dataset API

The BasePandasDataset class implements the functionality that is common to both the DataFrame and Series classes of Modin’s pandas API.

Public API

class modin.pandas.base.BasePandasDataset

Implement most of the common code that exists in DataFrame/Series.

Since DataFrame and Series share the same underlying representation and the same algorithms, this class defines their general behavior, and each of the two subclasses defines the output type.

Notes

See pandas API documentation for pandas.DataFrame, pandas.Series for more.
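The design described above, one base class holding the shared algorithms while each subclass fixes the output type, can be sketched in plain Python. This is only an illustration of the pattern, not Modin's code: the class names `BaseDataset`, `Frame`, and `Column` and the helper `_wrap` are hypothetical.

```python
class BaseDataset:
    """Hypothetical stand-in for a shared base class; not Modin's implementation."""

    def __init__(self, values):
        # Shared underlying representation: a flat list of numbers.
        self._values = list(values)

    def _wrap(self, values):
        # The concrete subclass of `self` decides the output type.
        return type(self)(values)

    def abs(self):
        # The algorithm is written once, in the base class.
        return self._wrap(abs(v) for v in self._values)


class Frame(BaseDataset):
    """Hypothetical analogue of DataFrame."""


class Column(BaseDataset):
    """Hypothetical analogue of Series."""


frame_result = Frame([-1, 2]).abs()
column_result = Column([-3]).abs()
```

Calling `abs()` on a `Frame` returns a `Frame`, and on a `Column` returns a `Column`, even though the method body exists only once.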

abs()

Return a BasePandasDataset with absolute numeric value of each element.

Notes

See pandas API documentation for pandas.DataFrame.abs, pandas.Series.abs for more.

add(other, axis='columns', level=None, fill_value=None)

Return addition of BasePandasDataset and other, element-wise (binary operator add).

Notes

See pandas API documentation for pandas.DataFrame.add, pandas.Series.add for more.
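A short sketch of how `fill_value` changes the result of `add` when the operands have different labels. It uses plain pandas so that it runs without a Modin engine; the assumption is that swapping the import for `import modin.pandas as pd` gives the same values.

```python
import pandas as pd  # assumption: `import modin.pandas as pd` behaves the same here

left = pd.Series([1.0, 2.0], index=["a", "b"])
right = pd.Series([10.0, 20.0], index=["b", "c"])

# The operator form leaves NaN wherever a label exists on only one side.
plain = left + right

# fill_value stands in for the missing side before the addition is done.
filled = left.add(right, fill_value=0)
```

Here `plain` is NaN at labels `a` and `c`, while `filled` keeps the one-sided values unchanged.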

agg(func=None, axis=0, *args, **kwargs)

Aggregate using one or more operations over the specified axis.

Notes

See pandas API documentation for pandas.DataFrame.aggregate, pandas.Series.aggregate for more.

aggregate(func=None, axis=0, *args, **kwargs)

Aggregate using one or more operations over the specified axis.

Notes

See pandas API documentation for pandas.DataFrame.aggregate, pandas.Series.aggregate for more.
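As an illustration of `agg` and its alias `aggregate`, the sketch below passes first a single operation and then a list of operations. It is written against plain pandas, on the assumption that `import modin.pandas as pd` can be substituted without changing the outcome.

```python
import pandas as pd  # assumption: `import modin.pandas as pd` behaves the same here

df = pd.DataFrame({"x": [1, 2, 3], "y": [10, 20, 30]})

# One operation: the result is a Series indexed by column name.
totals = df.agg("sum")

# Several operations: the result is a DataFrame with one row per operation.
summary = df.aggregate(["min", "max"])
```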

align(other, join='outer', axis=None, level=None, copy=True, fill_value=None, method=None, limit=None, fill_axis=0, broadcast_axis=None)

Align two objects on their axes with the specified join method.

Notes

See pandas API documentation for pandas.DataFrame.align, pandas.Series.align for more.

all(axis=0, bool_only=None, skipna=True, level=None, **kwargs)

Return whether all elements are True, potentially over an axis.

Notes

See pandas API documentation for pandas.DataFrame.all, pandas.Series.all for more.

any(axis=0, bool_only=None, skipna=True, level=None, **kwargs)

Return whether any element is True, potentially over an axis.

Notes

See pandas API documentation for pandas.DataFrame.any, pandas.Series.any for more.

apply(func, axis=0, broadcast=None, raw=False, reduce=None, result_type=None, convert_dtype=True, args=(), **kwds)

Apply a function along an axis of the BasePandasDataset.

Notes

See pandas API documentation for pandas.DataFrame.apply, pandas.Series.apply for more.

asfreq(freq, method=None, how=None, normalize=False, fill_value=None)

Convert time series to specified frequency.

Notes

See pandas API documentation for pandas.DataFrame.asfreq, pandas.Series.asfreq for more.

asof(where, subset=None)

Return the last row(s) without any NaNs before where.

Notes

See pandas API documentation for pandas.DataFrame.asof, pandas.Series.asof for more.

astype(dtype, copy=True, errors='raise')

Cast a Modin object to the dtype given by the dtype argument.

Notes

See pandas API documentation for pandas.DataFrame.astype, pandas.Series.astype for more.

property at

Get a single value for a row/column label pair.

Notes

See pandas API documentation for pandas.DataFrame.at, pandas.Series.at for more.

at_time(time, asof=False, axis=None)

Select values at particular time of day (e.g., 9:30AM).

Notes

See pandas API documentation for pandas.DataFrame.at_time, pandas.Series.at_time for more.

backfill(axis=None, inplace=False, limit=None, downcast=None)

Synonym for DataFrame.fillna with method='bfill'.

Notes

See pandas API documentation for pandas.DataFrame.backfill, pandas.Series.backfill for more.

between_time(start_time, end_time, include_start: bool_t | NoDefault = NoDefault.no_default, include_end: bool_t | NoDefault = NoDefault.no_default, inclusive: str | None = None, axis=None)

Select values between particular times of the day (e.g., 9:00-9:30 AM).

Notes

See pandas API documentation for pandas.DataFrame.between_time, pandas.Series.between_time for more.

bfill(axis=None, inplace=False, limit=None, downcast=None)

Synonym for DataFrame.fillna with method='bfill'.

Notes

See pandas API documentation for pandas.DataFrame.backfill, pandas.Series.backfill for more.
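The following sketch contrasts backward filling with its mirror image, forward filling, and shows the effect of `limit`. It runs on plain pandas; that Modin returns the same values after changing the import to `import modin.pandas as pd` is an assumption.

```python
import pandas as pd  # assumption: `import modin.pandas as pd` behaves the same here

s = pd.Series([None, 1.0, None, None, 4.0])

# bfill pulls the next valid value backwards into each gap.
backward = s.bfill()

# ffill pushes the previous valid value forwards; a leading gap stays NaN.
forward = s.ffill()

# limit caps how many consecutive missing values are filled per gap.
limited = s.bfill(limit=1)
```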

bool()

Return the bool of a single element BasePandasDataset.

Notes

See pandas API documentation for pandas.DataFrame.bool, pandas.Series.bool for more.

clip(lower=None, upper=None, axis=None, inplace=False, *args, **kwargs)

Trim values at input threshold(s).

Notes

See pandas API documentation for pandas.DataFrame.clip, pandas.Series.clip for more.
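A minimal sketch of trimming with `clip`, using one and then both thresholds. Plain pandas is used so the block runs anywhere; it assumes Modin gives the same values when imported as `import modin.pandas as pd`.

```python
import pandas as pd  # assumption: `import modin.pandas as pd` behaves the same here

s = pd.Series([-5, 0, 5, 10])

# Values below `lower` are raised to it; values above `upper` are lowered to it.
both_sides = s.clip(lower=0, upper=5)

# Omitting a threshold leaves that side untouched.
floor_only = s.clip(lower=0)
```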

combine(other, func, fill_value=None, **kwargs)

Perform combination of BasePandasDataset-s according to func.

Notes

See pandas API documentation for pandas.DataFrame.combine, pandas.Series.combine for more.

combine_first(other)

Update null elements with value in the same location in other.

Notes

See pandas API documentation for pandas.DataFrame.combine_first, pandas.Series.combine_first for more.

convert_dtypes(infer_objects: bool = True, convert_string: bool = True, convert_integer: bool = True, convert_boolean: bool = True, convert_floating: bool = True)

Convert columns to best possible dtypes using dtypes supporting pd.NA.

Notes

See pandas API documentation for pandas.DataFrame.convert_dtypes, pandas.Series.convert_dtypes for more.

copy(deep=True)

Make a copy of the object’s metadata.

Notes

See pandas API documentation for pandas.DataFrame.copy, pandas.Series.copy for more.

count(axis=0, level=None, numeric_only=False)

Count non-NA cells for BasePandasDataset.

Notes

See pandas API documentation for pandas.DataFrame.count, pandas.Series.count for more.

cummax(axis=None, skipna=True, *args, **kwargs)

Return cumulative maximum over a BasePandasDataset axis.

Notes

See pandas API documentation for pandas.DataFrame.cummax, pandas.Series.cummax for more.

cummin(axis=None, skipna=True, *args, **kwargs)

Return cumulative minimum over a BasePandasDataset axis.

Notes

See pandas API documentation for pandas.DataFrame.cummin, pandas.Series.cummin for more.

cumprod(axis=None, skipna=True, *args, **kwargs)

Return cumulative product over a BasePandasDataset axis.

Notes

See pandas API documentation for pandas.DataFrame.cumprod, pandas.Series.cumprod for more.

cumsum(axis=None, skipna=True, *args, **kwargs)

Return cumulative sum over a BasePandasDataset axis.

Notes

See pandas API documentation for pandas.DataFrame.cumsum, pandas.Series.cumsum for more.
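The cumulative methods share the `skipna` argument. The sketch below shows its effect on `cumsum`, written against plain pandas on the assumption that `import modin.pandas as pd` would produce identical results.

```python
import pandas as pd  # assumption: `import modin.pandas as pd` behaves the same here

s = pd.Series([1.0, None, 2.0])

# Default skipna=True: the missing entry stays NaN but does not break the running sum.
skipping = s.cumsum()

# skipna=False: everything from the first missing entry onwards becomes NaN.
propagating = s.cumsum(skipna=False)
```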

describe(percentiles=None, include=None, exclude=None, datetime_is_numeric=False)

Generate descriptive statistics.

Notes

See pandas API documentation for pandas.DataFrame.describe, pandas.Series.describe for more.

diff(periods=1, axis=0)

First discrete difference of element.

Notes

See pandas API documentation for pandas.DataFrame.diff, pandas.Series.diff for more.

div(other, axis='columns', level=None, fill_value=None)

Get floating division of BasePandasDataset and other, element-wise (binary operator truediv).

Notes

See pandas API documentation for pandas.DataFrame.truediv, pandas.Series.truediv for more.

divide(other, axis='columns', level=None, fill_value=None)

Get floating division of BasePandasDataset and other, element-wise (binary operator truediv).

Notes

See pandas API documentation for pandas.DataFrame.truediv, pandas.Series.truediv for more.

drop(labels=None, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise')

Drop specified labels from BasePandasDataset.

Notes

See pandas API documentation for pandas.DataFrame.drop, pandas.Series.drop for more.

drop_duplicates(keep='first', inplace=False, **kwargs)

Return BasePandasDataset with duplicate rows removed.

Notes

See pandas API documentation for pandas.DataFrame.drop_duplicates, pandas.Series.drop_duplicates for more.

droplevel(level, axis=0)

Return BasePandasDataset with requested index / column level(s) removed.

Notes

See pandas API documentation for pandas.DataFrame.droplevel, pandas.Series.droplevel for more.

dropna(axis=0, how='any', thresh=None, subset=None, inplace=False)

Remove missing values.

Notes

See pandas API documentation for pandas.DataFrame.dropna, pandas.Series.dropna for more.
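To make the `how` and `subset` arguments of `dropna` concrete, here is a small sketch on a three-row frame. It uses plain pandas and assumes the same behavior under `import modin.pandas as pd`.

```python
import pandas as pd  # assumption: `import modin.pandas as pd` behaves the same here

df = pd.DataFrame({"x": [1.0, None, None], "y": [1.0, 2.0, None]})

# how='any' (the default) drops a row if any of its values is missing.
any_missing = df.dropna()

# how='all' drops a row only if every value in it is missing.
all_missing = df.dropna(how="all")

# subset restricts the check to the listed columns.
only_y = df.dropna(subset=["y"])
```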

eq(other, axis='columns', level=None)

Get equality of BasePandasDataset and other, element-wise (binary operator eq).

Notes

See pandas API documentation for pandas.DataFrame.eq, pandas.Series.eq for more.

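ewm(com=None, span=None, halflife=None, alpha=None, min_periods=0, adjust=True, ignore_na=False, axis=0, times=None, method='single')

Provide exponentially weighted (EW) calculations.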

Notes

See pandas API documentation for pandas.DataFrame.ewm, pandas.Series.ewm for more.

expanding(min_periods=1, center=None, axis=0, method='single')

Provide expanding window calculations.

Notes

See pandas API documentation for pandas.DataFrame.expanding, pandas.Series.expanding for more.

explode(column, ignore_index: bool = False)

Transform each element of a list-like to a row.

Notes

See pandas API documentation for pandas.DataFrame.explode, pandas.Series.explode for more.

ffill(axis=None, inplace=False, limit=None, downcast=None)

Synonym for DataFrame.fillna with method='ffill'.

Notes

See pandas API documentation for pandas.DataFrame.pad, pandas.Series.pad for more.

filter(items=None, like=None, regex=None, axis=None)

Subset the BasePandasDataset rows or columns according to the specified index labels.

Notes

See pandas API documentation for pandas.DataFrame.filter, pandas.Series.filter for more.

first(offset)

Select initial periods of time series data based on a date offset.

Notes

See pandas API documentation for pandas.DataFrame.first, pandas.Series.first for more.

first_valid_index()

Return index for first non-NA value or None, if no non-NA value is found.

Notes

See pandas API documentation for pandas.DataFrame.first_valid_index, pandas.Series.first_valid_index for more.

property flags

Get the properties associated with this BasePandasDataset.

Notes

See pandas API documentation for pandas.DataFrame.flags, pandas.Series.flags for more.

floordiv(other, axis='columns', level=None, fill_value=None)

Get integer division of BasePandasDataset and other, element-wise (binary operator floordiv).

Notes

See pandas API documentation for pandas.DataFrame.floordiv, pandas.Series.floordiv for more.

ge(other, axis='columns', level=None)

Get greater than or equal comparison of BasePandasDataset and other, element-wise (binary operator ge).

Notes

See pandas API documentation for pandas.DataFrame.ge, pandas.Series.ge for more.

get(key, default=None)

Get item from object for given key.

Notes

See pandas API documentation for pandas.DataFrame.get, pandas.Series.get for more.

gt(other, axis='columns', level=None)

Get greater than comparison of BasePandasDataset and other, element-wise (binary operator gt).

Notes

See pandas API documentation for pandas.DataFrame.gt, pandas.Series.gt for more.

head(n=5)

Return the first n rows.

Notes

See pandas API documentation for pandas.DataFrame.head, pandas.Series.head for more.

property iat

Get a single value for a row/column pair by integer position.

Notes

See pandas API documentation for pandas.DataFrame.iat, pandas.Series.iat for more.

idxmax(axis=0, skipna=True)

Return index of first occurrence of maximum over requested axis.

Notes

See pandas API documentation for pandas.DataFrame.idxmax, pandas.Series.idxmax for more.

idxmin(axis=0, skipna=True)

Return index of first occurrence of minimum over requested axis.

Notes

See pandas API documentation for pandas.DataFrame.idxmin, pandas.Series.idxmin for more.

property iloc

Purely integer-location based indexing for selection by position.

Notes

See pandas API documentation for pandas.DataFrame.iloc, pandas.Series.iloc for more.

property index

Get the index for this DataFrame.

Returns: The union of all indexes across the partitions.
Return type: pandas.Index

infer_objects()

Attempt to infer better dtypes for object columns.

Notes

See pandas API documentation for pandas.DataFrame.infer_objects, pandas.Series.infer_objects for more.

isin(values)

Whether elements in BasePandasDataset are contained in values.

Notes

See pandas API documentation for pandas.DataFrame.isin, pandas.Series.isin for more.

isna()

Detect missing values.

Notes

See pandas API documentation for pandas.DataFrame.isna, pandas.Series.isna for more.

isnull()

Detect missing values.

Notes

See pandas API documentation for pandas.DataFrame.isna, pandas.Series.isna for more.

kurt(axis: Axis | None | NoDefault = NoDefault.no_default, skipna=True, level=None, numeric_only=None, **kwargs)

Return unbiased kurtosis over requested axis.

Notes

See pandas API documentation for pandas.DataFrame.kurt, pandas.Series.kurt for more.

kurtosis(axis: Axis | None | NoDefault = NoDefault.no_default, skipna=True, level=None, numeric_only=None, **kwargs)

Return unbiased kurtosis over requested axis.

Notes

See pandas API documentation for pandas.DataFrame.kurt, pandas.Series.kurt for more.

last(offset)

Select final periods of time series data based on a date offset.

Notes

See pandas API documentation for pandas.DataFrame.last, pandas.Series.last for more.

last_valid_index()

Return index for last non-NA value or None, if no non-NA value is found.

Notes

See pandas API documentation for pandas.DataFrame.last_valid_index, pandas.Series.last_valid_index for more.

le(other, axis='columns', level=None)

Get less than or equal comparison of BasePandasDataset and other, element-wise (binary operator le).

Notes

See pandas API documentation for pandas.DataFrame.le, pandas.Series.le for more.

property loc

Get a group of rows and columns by label(s) or a boolean array.

Notes

See pandas API documentation for pandas.DataFrame.loc, pandas.Series.loc for more.
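The four indexers on this page (`loc`, `iloc`, `at`, `iat`) differ in whether they address by label or by position. The sketch below puts them side by side using plain pandas, assuming `import modin.pandas as pd` behaves identically.

```python
import pandas as pd  # assumption: `import modin.pandas as pd` behaves the same here

df = pd.DataFrame({"x": [10, 20, 30]}, index=["a", "b", "c"])

# Label slices include both endpoints.
by_label = df.loc["a":"b", "x"]

# Positional slices exclude the stop position, as in ordinary Python slicing.
by_position = df.iloc[0:1, 0]

# at and iat fetch a single scalar by label pair or by integer position.
scalar_by_label = df.at["c", "x"]
scalar_by_position = df.iat[2, 0]
```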

lt(other, axis='columns', level=None)

Get less than comparison of BasePandasDataset and other, element-wise (binary operator lt).

Notes

See pandas API documentation for pandas.DataFrame.lt, pandas.Series.lt for more.

mad(axis=None, skipna=True, level=None)

Return the mean absolute deviation of the values over the requested axis.

Notes

See pandas API documentation for pandas.DataFrame.mad, pandas.Series.mad for more.

mask(cond, other=nan, inplace=False, axis=None, level=None, errors='raise', try_cast=NoDefault.no_default)

Replace values where the condition is True.

Notes

See pandas API documentation for pandas.DataFrame.mask, pandas.Series.mask for more.

max(axis: int | None | NoDefault = NoDefault.no_default, skipna=True, level=None, numeric_only=None, **kwargs)

Return the maximum of the values over the requested axis.

Notes

See pandas API documentation for pandas.DataFrame.max, pandas.Series.max for more.

mean(axis: int | None | NoDefault = NoDefault.no_default, skipna=True, level=None, numeric_only=None, **kwargs)

Return the mean of the values over the requested axis.

Notes

See pandas API documentation for pandas.DataFrame.mean, pandas.Series.mean for more.

median(axis: int | None | NoDefault = NoDefault.no_default, skipna=True, level=None, numeric_only=None, **kwargs)

Return the median of the values over the requested axis.

Notes

See pandas API documentation for pandas.DataFrame.median, pandas.Series.median for more.

memory_usage(index=True, deep=False)

Return the memory usage of the BasePandasDataset.

Notes

See pandas API documentation for pandas.DataFrame.memory_usage, pandas.Series.memory_usage for more.

min(axis: int | None | NoDefault = NoDefault.no_default, skipna=True, level=None, numeric_only=None, **kwargs)

Return the minimum of the values over the requested axis.

Notes

See pandas API documentation for pandas.DataFrame.min, pandas.Series.min for more.

mod(other, axis='columns', level=None, fill_value=None)

Get modulo of BasePandasDataset and other, element-wise (binary operator mod).

Notes

See pandas API documentation for pandas.DataFrame.mod, pandas.Series.mod for more.

mode(axis=0, numeric_only=False, dropna=True)

Get the mode(s) of each element along the selected axis.

Notes

See pandas API documentation for pandas.DataFrame.mode, pandas.Series.mode for more.

mul(other, axis='columns', level=None, fill_value=None)

Get multiplication of BasePandasDataset and other, element-wise (binary operator mul).

Notes

See pandas API documentation for pandas.DataFrame.mul, pandas.Series.mul for more.

multiply(other, axis='columns', level=None, fill_value=None)

Get multiplication of BasePandasDataset and other, element-wise (binary operator mul).

Notes

See pandas API documentation for pandas.DataFrame.mul, pandas.Series.mul for more.

ne(other, axis='columns', level=None)

Get not equal comparison of BasePandasDataset and other, element-wise (binary operator ne).

Notes

See pandas API documentation for pandas.DataFrame.ne, pandas.Series.ne for more.

notna()

Detect existing (non-missing) values.

Notes

See pandas API documentation for pandas.DataFrame.notna, pandas.Series.notna for more.

notnull()

Detect existing (non-missing) values.

Notes

See pandas API documentation for pandas.DataFrame.notna, pandas.Series.notna for more.

nunique(axis=0, dropna=True)

Return number of unique elements in the BasePandasDataset.

Notes

See pandas API documentation for pandas.DataFrame.nunique, pandas.Series.nunique for more.

pad(axis=None, inplace=False, limit=None, downcast=None)

Synonym for DataFrame.fillna with method='ffill'.

Notes

See pandas API documentation for pandas.DataFrame.pad, pandas.Series.pad for more.

pct_change(periods=1, fill_method='pad', limit=None, freq=None, **kwargs)

Percentage change between the current and a prior element.

Notes

See pandas API documentation for pandas.DataFrame.pct_change, pandas.Series.pct_change for more.

pipe(func, *args, **kwargs)

Apply chainable functions that expect BasePandasDataset.

Notes

See pandas API documentation for pandas.DataFrame.pipe, pandas.Series.pipe for more.

pop(item)

Return item and drop from frame. Raise KeyError if not found.

Notes

See pandas API documentation for pandas.DataFrame.pop, pandas.Series.pop for more.

pow(other, axis='columns', level=None, fill_value=None)

Get exponential power of BasePandasDataset and other, element-wise (binary operator pow).

Notes

See pandas API documentation for pandas.DataFrame.pow, pandas.Series.pow for more.

quantile(q=0.5, axis=0, numeric_only=True, interpolation='linear')

Return values at the given quantile over requested axis.

Notes

See pandas API documentation for pandas.DataFrame.quantile, pandas.Series.quantile for more.
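A brief sketch of how `q` and `interpolation` interact in `quantile`. It is written for plain pandas; the assumption is that `import modin.pandas as pd` returns the same numbers.

```python
import pandas as pd  # assumption: `import modin.pandas as pd` behaves the same here

s = pd.Series([1, 2, 3, 4])

# With the default 'linear' interpolation the median falls between 2 and 3.
median_linear = s.quantile(0.5)

# 'lower' picks the nearest data point at or below the requested position.
median_lower = s.quantile(0.5, interpolation="lower")

# A list of quantiles returns a Series indexed by the requested q values.
quartiles = s.quantile([0.25, 0.75])
```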

radd(other, axis='columns', level=None, fill_value=None)

Return addition of BasePandasDataset and other, element-wise, with the operands reversed (binary operator radd).

Notes

See pandas API documentation for pandas.DataFrame.add, pandas.Series.add for more.

rank(axis=0, method: str = 'average', numeric_only: bool_t | None | NoDefault = NoDefault.no_default, na_option: str = 'keep', ascending: bool_t = True, pct: bool_t = False)

Compute numerical data ranks (1 through n) along axis.

Notes

See pandas API documentation for pandas.DataFrame.rank, pandas.Series.rank for more.
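The `method` argument of `rank` only matters when values tie. The sketch below compares three settings on a Series with one tie, using plain pandas under the assumption that `import modin.pandas as pd` gives the same ranks.

```python
import pandas as pd  # assumption: `import modin.pandas as pd` behaves the same here

s = pd.Series([10, 20, 20, 30])

# 'average' (the default): tied values share the mean of their positions.
average_rank = s.rank()

# 'min': tied values all take the lowest position in the tie.
min_rank = s.rank(method="min")

# 'dense': like 'min', but ranks increase by exactly 1 between groups.
dense_rank = s.rank(method="dense")
```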

rdiv(other, axis='columns', level=None, fill_value=None)

Get floating division of BasePandasDataset and other, element-wise (binary operator rtruediv).

Notes

See pandas API documentation for pandas.DataFrame.rtruediv, pandas.Series.rtruediv for more.

reindex(index=None, columns=None, copy=True, **kwargs)

Conform BasePandasDataset to new index with optional filling logic.

Notes

See pandas API documentation for pandas.DataFrame.reindex, pandas.Series.reindex for more.
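As an example of the "optional filling logic" that `reindex` offers, this sketch reorders a Series and introduces a label that was not present. It runs on plain pandas and assumes `import modin.pandas as pd` yields the same result.

```python
import pandas as pd  # assumption: `import modin.pandas as pd` behaves the same here

s = pd.Series([1, 2], index=["a", "b"])

# Labels missing from the original index are filled with NaN by default.
with_gap = s.reindex(["b", "a", "z"])

# fill_value replaces that NaN with a chosen constant.
with_zero = s.reindex(["b", "a", "z"], fill_value=0)
```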

reindex_like(other, method=None, copy=True, limit=None, tolerance=None)

Return an object with matching indices as other object.

Notes

See pandas API documentation for pandas.DataFrame.reindex_like, pandas.Series.reindex_like for more.

rename_axis(mapper=None, index=None, columns=None, axis=None, copy=True, inplace=False)

Set the name of the axis for the index or columns.

Notes

See pandas API documentation for pandas.DataFrame.rename_axis, pandas.Series.rename_axis for more.

reorder_levels(order, axis=0)

Rearrange index levels using input order.

Notes

See pandas API documentation for pandas.DataFrame.reorder_levels, pandas.Series.reorder_levels for more.

resample(rule, axis=0, closed=None, label=None, convention='start', kind=None, loffset=None, base: Optional[int] = None, on=None, level=None, origin: Union[str, Timestamp, datetime.datetime, numpy.datetime64, int, numpy.int64, float] = 'start_day', offset: Optional[Union[Timedelta, datetime.timedelta, numpy.timedelta64, int, numpy.int64, float, str]] = None)

Resample time-series data.

Notes

See pandas API documentation for pandas.DataFrame.resample, pandas.Series.resample for more.

reset_index(level=None, drop=False, inplace=False, col_level=0, col_fill='')

Reset the index, or a level of it.

Notes

See pandas API documentation for pandas.DataFrame.reset_index, pandas.Series.reset_index for more.

rfloordiv(other, axis='columns', level=None, fill_value=None)

Get integer division of BasePandasDataset and other, element-wise (binary operator rfloordiv).

Notes

See pandas API documentation for pandas.DataFrame.rfloordiv, pandas.Series.rfloordiv for more.

rmod(other, axis='columns', level=None, fill_value=None)

Get modulo of BasePandasDataset and other, element-wise (binary operator rmod).

Notes

See pandas API documentation for pandas.DataFrame.rmod, pandas.Series.rmod for more.

rmul(other, axis='columns', level=None, fill_value=None)

Get multiplication of BasePandasDataset and other, element-wise, with the operands reversed (binary operator rmul).

Notes

See pandas API documentation for pandas.DataFrame.mul, pandas.Series.mul for more.

rolling(window, min_periods=None, center=False, win_type=None, on=None, axis=0, closed=None, method='single')

Provide rolling window calculations.

Notes

See pandas API documentation for pandas.DataFrame.rolling, pandas.Series.rolling for more.
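A minimal rolling-window sketch showing what `window` and `min_periods` control. It uses plain pandas, on the assumption that the same calls work unchanged with `import modin.pandas as pd`.

```python
import pandas as pd  # assumption: `import modin.pandas as pd` behaves the same here

s = pd.Series([1.0, 2.0, 3.0, 4.0])

# A window of 2 needs two observations, so the first result is NaN.
strict = s.rolling(2).mean()

# min_periods=1 lets the first, incomplete window produce a value.
relaxed = s.rolling(2, min_periods=1).mean()
```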

round(decimals=0, *args, **kwargs)

Round a BasePandasDataset to a variable number of decimal places.

Notes

See pandas API documentation for pandas.DataFrame.round, pandas.Series.round for more.

rpow(other, axis='columns', level=None, fill_value=None)

Get exponential power of BasePandasDataset and other, element-wise (binary operator rpow).

Notes

See pandas API documentation for pandas.DataFrame.rpow, pandas.Series.rpow for more.

rsub(other, axis='columns', level=None, fill_value=None)

Get subtraction of BasePandasDataset and other, element-wise (binary operator rsub).

Notes

See pandas API documentation for pandas.DataFrame.rsub, pandas.Series.rsub for more.

rtruediv(other, axis='columns', level=None, fill_value=None)

Get floating division of BasePandasDataset and other, element-wise (binary operator rtruediv).

Notes

See pandas API documentation for pandas.DataFrame.rtruediv, pandas.Series.rtruediv for more.

sample(n=None, frac=None, replace=False, weights=None, random_state=None, axis=None, ignore_index=False)

Return a random sample of items from an axis of object.

Notes

See pandas API documentation for pandas.DataFrame.sample, pandas.Series.sample for more.

sem(axis=None, skipna=True, level=None, ddof=1, numeric_only=None, **kwargs)

Return unbiased standard error of the mean over requested axis.

Notes

See pandas API documentation for pandas.DataFrame.sem, pandas.Series.sem for more.

set_axis(labels, axis=0, inplace=False)

Assign desired index to given axis.

Notes

See pandas API documentation for pandas.DataFrame.set_axis, pandas.Series.set_axis for more.

set_flags(*, copy: bool = False, allows_duplicate_labels: Optional[bool] = None)

Return a new BasePandasDataset with updated flags.

Notes

See pandas API documentation for pandas.DataFrame.set_flags, pandas.Series.set_flags for more.

shift(periods=1, freq=None, axis=0, fill_value=NoDefault.no_default)

Shift index by desired number of periods with an optional time freq.

Notes

See pandas API documentation for pandas.DataFrame.shift, pandas.Series.shift for more.

property size

Return an int representing the number of elements in this BasePandasDataset object.

Notes

See pandas API documentation for pandas.DataFrame.size, pandas.Series.size for more.

skew(axis: int | None | NoDefault = NoDefault.no_default, skipna=True, level=None, numeric_only=None, **kwargs)

Return unbiased skew over requested axis.

Notes

See pandas API documentation for pandas.DataFrame.skew, pandas.Series.skew for more.

sort_index(axis=0, level=None, ascending=True, inplace=False, kind='quicksort', na_position='last', sort_remaining=True, ignore_index: bool = False, key: Optional[Callable[[Index], Union[Index, ExtensionArray, numpy.ndarray, Series]]] = None)

Sort object by labels (along an axis).

Notes

See pandas API documentation for pandas.DataFrame.sort_index, pandas.Series.sort_index for more.

sort_values(by, axis=0, ascending=True, inplace: bool = False, kind='quicksort', na_position='last', ignore_index: bool = False, key: Optional[Callable[[Index], Union[Index, ExtensionArray, numpy.ndarray, Series]]] = None)

Sort by the values along either axis.

Notes

See pandas API documentation for pandas.DataFrame.sort_values, pandas.Series.sort_values for more.
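The sketch below shows where missing values end up under different `sort_values` settings, tracking the original row labels to make the ordering visible. It is written for plain pandas and assumes `import modin.pandas as pd` sorts the same way.

```python
import pandas as pd  # assumption: `import modin.pandas as pd` behaves the same here

df = pd.DataFrame({"x": [2.0, None, 1.0]})

# Ascending sort; NaN goes last by default (na_position='last').
ascending_order = df.sort_values(by="x").index.tolist()

# na_position='first' moves the NaN row to the top instead.
nan_first_order = df.sort_values(by="x", na_position="first").index.tolist()

# ignore_index=True discards the original labels and renumbers from 0.
renumbered = df.sort_values(by="x", ignore_index=True)
```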

std(axis=None, skipna=True, level=None, ddof=1, numeric_only=None, **kwargs)

Return sample standard deviation over requested axis.

Notes

See pandas API documentation for pandas.DataFrame.std, pandas.Series.std for more.

sub(other, axis='columns', level=None, fill_value=None)

Get subtraction of BasePandasDataset and other, element-wise (binary operator sub).

Notes

See pandas API documentation for pandas.DataFrame.sub, pandas.Series.sub for more.

subtract(other, axis='columns', level=None, fill_value=None)

Get subtraction of BasePandasDataset and other, element-wise (binary operator sub).

Notes

See pandas API documentation for pandas.DataFrame.sub, pandas.Series.sub for more.

swapaxes(axis1, axis2, copy=True)

Interchange axes and swap values axes appropriately.

Notes

See pandas API documentation for pandas.DataFrame.swapaxes, pandas.Series.swapaxes for more.

swaplevel(i=-2, j=-1, axis=0)

Swap levels i and j in a MultiIndex.

Notes

See pandas API documentation for pandas.DataFrame.swaplevel, pandas.Series.swaplevel for more.

tail(n=5)

Return the last n rows.

Notes

See pandas API documentation for pandas.DataFrame.tail, pandas.Series.tail for more.

take(indices, axis=0, is_copy=None, **kwargs)

Return the elements in the given positional indices along an axis.

Notes

See pandas API documentation for pandas.DataFrame.take, pandas.Series.take for more.

to_clipboard(excel=True, sep=None, **kwargs)

Copy object to the system clipboard.

Notes

See pandas API documentation for pandas.DataFrame.to_clipboard, pandas.Series.to_clipboard for more.

to_csv(path_or_buf=None, sep=',', na_rep='', float_format=None, columns=None, header=True, index=True, index_label=None, mode='w', encoding=None, compression='infer', quoting=None, quotechar='"', line_terminator=None, chunksize=None, date_format=None, doublequote=True, escapechar=None, decimal='.', errors: str = 'strict', storage_options: Optional[Dict[str, Any]] = None)

Write object to a comma-separated values (csv) file.

Notes

See pandas API documentation for pandas.DataFrame.to_csv, pandas.Series.to_csv for more.

to_dict(orient='dict', into=<class 'dict'>)

Convert the BasePandasDataset to a dictionary.

Notes

See pandas API documentation for pandas.DataFrame.to_dict, pandas.Series.to_dict for more.

to_excel(excel_writer, sheet_name='Sheet1', na_rep='', float_format=None, columns=None, header=True, index=True, index_label=None, startrow=0, startcol=0, engine=None, merge_cells=True, encoding=None, inf_rep='inf', verbose=True, freeze_panes=None, storage_options: Optional[Dict[str, Any]] = None)

Write object to an Excel sheet.

Notes

See pandas API documentation for pandas.DataFrame.to_excel, pandas.Series.to_excel for more.

to_hdf(path_or_buf, key, format='table', **kwargs)

Write the contained data to an HDF5 file using HDFStore.

Notes

See pandas API documentation for pandas.DataFrame.to_hdf, pandas.Series.to_hdf for more.

to_json(path_or_buf=None, orient=None, date_format=None, double_precision=10, force_ascii=True, date_unit='ms', default_handler=None, lines=False, compression='infer', index=True, indent=None, storage_options: Optional[Dict[str, Any]] = None)

Convert the object to a JSON string.

Notes

See pandas API documentation for pandas.DataFrame.to_json, pandas.Series.to_json for more.

to_latex(buf=None, columns=None, col_space=None, header=True, index=True, na_rep='NaN', formatters=None, float_format=None, sparsify=None, index_names=True, bold_rows=False, column_format=None, longtable=None, escape=None, encoding=None, decimal='.', multicolumn=None, multicolumn_format=None, multirow=None, caption=None, label=None, position=None)

Render object to a LaTeX tabular, longtable, or nested table.

Notes

See pandas API documentation for pandas.DataFrame.to_latex, pandas.Series.to_latex for more.

to_markdown(buf=None, mode: str = 'wt', index: bool = True, storage_options: Optional[Dict[str, Any]] = None, **kwargs)

Print BasePandasDataset in Markdown-friendly format.

Notes

See pandas API documentation for pandas.DataFrame.to_markdown, pandas.Series.to_markdown for more.

to_numpy(dtype=None, copy=False, na_value=NoDefault.no_default)

Convert the BasePandasDataset to a NumPy array.

Notes

See pandas API documentation for pandas.DataFrame.to_numpy, pandas.Series.to_numpy for more.

to_period(freq=None, axis=0, copy=True)

Convert BasePandasDataset from DatetimeIndex to PeriodIndex.

Notes

See pandas API documentation for pandas.DataFrame.to_period, pandas.Series.to_period for more.

to_pickle(path, compression: Optional[Union[Literal['infer', 'gzip', 'bz2', 'zip', 'xz', 'zstd'], Dict[str, Any]]] = 'infer', protocol: int = 5, storage_options: Optional[Dict[str, Any]] = None)

Pickle (serialize) object to file.

Notes

See pandas API documentation for pandas.DataFrame.to_pickle, pandas.Series.to_pickle for more.

to_sql(name, con, schema=None, if_exists='fail', index=True, index_label=None, chunksize=None, dtype=None, method=None)

Write records stored in a BasePandasDataset to a SQL database.

Notes

See pandas API documentation for pandas.DataFrame.to_sql, pandas.Series.to_sql for more.

to_string(buf=None, columns=None, col_space=None, header=True, index=True, na_rep='NaN', formatters=None, float_format=None, sparsify=None, index_names=True, justify=None, max_rows=None, min_rows=None, max_cols=None, show_dimensions=False, decimal='.', line_width=None, max_colwidth=None, encoding=None)

Render a BasePandasDataset to a console-friendly tabular output.

Notes

See pandas API documentation for pandas.DataFrame.to_string, pandas.Series.to_string for more.

to_timestamp(freq=None, how='start', axis=0, copy=True)

Cast to DatetimeIndex of timestamps, at beginning of period.

Notes

See pandas API documentation for pandas.DataFrame.to_timestamp, pandas.Series.to_timestamp for more.

to_xarray()

Return an xarray object from the BasePandasDataset.

Notes

See pandas API documentation for pandas.DataFrame.to_xarray, pandas.Series.to_xarray for more.

transform(func, axis=0, *args, **kwargs)

Call func on self producing a BasePandasDataset with the same axis shape as self.

Notes

See pandas API documentation for pandas.DataFrame.transform, pandas.Series.transform for more.

truediv(other, axis='columns', level=None, fill_value=None)

Get floating division of BasePandasDataset and other, element-wise (binary operator truediv).

Notes

See pandas API documentation for pandas.DataFrame.truediv, pandas.Series.truediv for more.

truncate(before=None, after=None, axis=None, copy=True)

Truncate a BasePandasDataset before and after some index value.

Notes

See pandas API documentation for pandas.DataFrame.truncate, pandas.Series.truncate for more.

tshift(periods=1, freq=None, axis=0)

Shift the time index, using the index’s frequency if available.

Notes

See pandas API documentation for pandas.DataFrame.tshift, pandas.Series.tshift for more.

tz_convert(tz, axis=0, level=None, copy=True)

Convert tz-aware axis to target time zone.

Notes

See pandas API documentation for pandas.DataFrame.tz_convert, pandas.Series.tz_convert for more.

tz_localize(tz, axis=0, level=None, copy=True, ambiguous='raise', nonexistent='raise')

Localize tz-naive index of a BasePandasDataset to target time zone.

Notes

See pandas API documentation for pandas.DataFrame.tz_localize, pandas.Series.tz_localize for more.

value_counts(subset: Optional[Sequence[Hashable]] = None, normalize: bool = False, sort: bool = True, ascending: bool = False, dropna: bool = True)

Count unique values in the BasePandasDataset.

Notes

See pandas API documentation for pandas.DataFrame.value_counts, pandas.Series.value_counts for more.
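A short sketch of `value_counts` on a Series, with and without `normalize`. Plain pandas is used here; that `import modin.pandas as pd` returns the same counts is an assumption.

```python
import pandas as pd  # assumption: `import modin.pandas as pd` behaves the same here

s = pd.Series(["a", "b", "a", "a"])

# Counts are sorted in descending order by default (sort=True, ascending=False).
counts = s.value_counts()

# normalize=True returns relative frequencies instead of raw counts.
shares = s.value_counts(normalize=True)
```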

property values

Return a NumPy representation of the BasePandasDataset.

Notes

See pandas API documentation for pandas.DataFrame.values, pandas.Series.values for more.

var(axis=None, skipna=True, level=None, ddof=1, numeric_only=None, **kwargs)

Return unbiased variance over requested axis.

Notes

See pandas API documentation for pandas.DataFrame.var, pandas.Series.var for more.
