pd.DataFrame supported APIs

The table below lists DataFrame methods, both implemented and not implemented. If you need an operation that is listed as not implemented, feel free to open an issue on the GitHub repository or give a thumbs up to an existing issue. Contributions are also welcome!

The table is structured as follows. The first column contains the method name, and the second links to the pandas documentation for that method. The third column flags whether Modin implements the method: Y stands for yes, N stands for no, P stands for partial (meaning some parameters may not be supported yet), and D stands for default to pandas. The last column holds notes on the current implementation.
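As a rough illustration of how to read the legend, the sketch below encodes the four flags and a handful of (method, flag) pairs copied from the table, then picks out the methods that carry a given flag. The names `STATUS_MEANING`, `SAMPLE_ROWS`, and `methods_with_status` are hypothetical helpers made up for this example; they are not part of Modin or pandas.

```python
# Hypothetical helper for reading the support table; not part of Modin.
STATUS_MEANING = {
    "Y": "implemented in Modin",
    "N": "not implemented",
    "P": "partially implemented (some parameters may not be supported yet)",
    "D": "defaults to pandas",
}

# A few (method, flag) pairs copied from the table.
SAMPLE_ROWS = [
    ("abs", "Y"),
    ("agg", "P"),
    ("align", "D"),
    ("fillna", "P"),
    ("sparse", "N"),
    ("to_json", "D"),
]


def methods_with_status(rows, flag):
    """Return the method names whose flag matches, in table order."""
    if flag not in STATUS_MEANING:
        raise ValueError(f"unknown flag: {flag!r}")
    return [name for name, status in rows if status == flag]


if __name__ == "__main__":
    # Print each flag, its meaning, and the sample methods that carry it.
    for flag, meaning in STATUS_MEANING.items():
        print(f"{flag} ({meaning}): {methods_with_status(SAMPLE_ROWS, flag)}")
```

Filtering on D or P this way gives a quick list of calls that may fall back to pandas, fully or for some parameters.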

| DataFrame method | pandas doc link | Implemented? (Y/N/P/D) | Notes for current implementation |
|---|---|---|---|
| T | T | Y | |
| abs | abs | Y | |
| add | add | Y | Shuffles data in operations between DataFrames |
| add_prefix | add_prefix | Y | |
| add_suffix | add_suffix | Y | |
| agg / aggregate | agg / aggregate | P | Dictionary func parameter defaults to pandas; NumPy operations default to pandas |
| align | align | D | |
| all | all | Y | |
| any | any | Y | |
| append | append | Y | |
| apply | apply | Y | See agg |
| applymap | applymap | Y | |
| as_blocks | as_blocks | D | Becomes a non-parallel object |
| as_matrix | as_matrix | D | Becomes a non-parallel object |
| asfreq | asfreq | D | |
| asof | asof | Y | |
| assign | assign | Y | |
| astype | astype | Y | |
| at | at | Y | |
| at_time | at_time | Y | |
| axes | axes | Y | |
| between_time | between_time | Y | |
| bfill | bfill | Y | |
| blocks | blocks | D | |
| bool | bool | Y | |
| boxplot | boxplot | D | |
| clip | clip | Y | |
| clip_lower | clip_lower | Y | |
| clip_upper | clip_upper | Y | |
| combine | combine | Y | |
| combine_first | combine_first | Y | |
| compare | compare | Y | |
| copy | copy | Y | |
| corr | corr | Y | Correlation floating point precision may differ slightly from pandas. Only the pearson method is currently available; other methods default to pandas. |
| corrwith | corrwith | D | |
| count | count | Y | |
| cov | cov | Y | Covariance floating point precision may differ slightly from pandas. |
| cummax | cummax | Y | |
| cummin | cummin | Y | |
| cumprod | cumprod | Y | |
| cumsum | cumsum | Y | |
| describe | describe | Y | |
| diff | diff | Y | |
| div | div | Y | See add |
| divide | divide | Y | See add |
| dot | dot | Y | |
| drop | drop | Y | |
| droplevel | droplevel | Y | |
| drop_duplicates | drop_duplicates | D | |
| dropna | dropna | Y | |
| dtypes | dtypes | Y | |
| duplicated | duplicated | Y | |
| empty | empty | Y | |
| eq | eq | Y | See add |
| equals | equals | Y | Requires shuffle, can be further optimized |
| eval | eval | Y | |
| ewm | ewm | D | |
| expanding | expanding | D | |
| explode | explode | D | |
| ffill | ffill | Y | |
| fillna | fillna | P | value parameter of type DataFrame defaults to pandas |
| filter | filter | Y | |
| first | first | Y | |
| first_valid_index | first_valid_index | Y | |
| floordiv | floordiv | Y | See add |
| from_dict | from_dict | D | |
| from_items | from_items | Y | |
| from_records | from_records | D | |
| ftypes | ftypes | Y | |
| ge | ge | Y | See add |
| get | get | Y | |
| groupby | groupby | Y | Not yet optimized for all operations |
| gt | gt | Y | See add |
| head | head | Y | |
| hist | hist | D | |
| iat | iat | Y | |
| idxmax | idxmax | Y | |
| idxmin | idxmin | Y | |
| iloc | iloc | Y | |
| infer_objects | infer_objects | D | |
| info | info | Y | |
| insert | insert | Y | |
| interpolate | interpolate | D | |
| isin | isin | Y | |
| isna | isna | Y | |
| isnull | isnull | Y | |
| items | items | Y | |
| iteritems | iteritems | P | Modin does not parallelize iteration in Python |
| iterrows | iterrows | P | Modin does not parallelize iteration in Python |
| itertuples | itertuples | P | Modin does not parallelize iteration in Python |
| join | join | P | When on is set to right or outer, it defaults to pandas |
| keys | keys | Y | |
| kurt | kurt | Y | |
| kurtosis | kurtosis | Y | |
| last | last | Y | |
| last_valid_index | last_valid_index | Y | |
| le | le | Y | See add |
| loc | loc | Y | Not supported: boolean array, callable |
| lookup | lookup | D | |
| lt | lt | Y | See add |
| mad | mad | Y | |
| mask | mask | D | |
| max | max | Y | |
| mean | mean | Y | |
| median | median | Y | |
| melt | melt | Y | |
| memory_usage | memory_usage | Y | |
| merge | merge | P | Implemented the following cases: left_index=True and right_index=True, how=left and how=inner for all values of parameters except left_index=True and right_index=False or left_index=False and right_index=True. Defaults to pandas otherwise. |
| min | min | Y | |
| mod | mod | Y | |
| mode | mode | Y | |
| mul | mul | Y | See add |
| multiply | multiply | Y | See add |
| ndim | ndim | Y | |
| ne | ne | Y | See add |
| nlargest | nlargest | Y | |
| notna | notna | Y | |
| notnull | notnull | Y | |
| nsmallest | nsmallest | Y | |
| nunique | nunique | Y | |
| pct_change | pct_change | D | |
| pipe | pipe | Y | |
| pivot | pivot | Y | |
| pivot_table | pivot_table | Y | |
| plot | plot | D | |
| pop | pop | Y | |
| pow | pow | Y | See add |
| prod | prod | Y | |
| product | product | Y | |
| quantile | quantile | Y | |
| query | query | P | Local variables not yet supported |
| radd | radd | Y | See add |
| rank | rank | Y | |
| rdiv | rdiv | Y | See add |
| reindex | reindex | Y | Shuffles data |
| reindex_like | reindex_like | D | |
| rename | rename | Y | |
| rename_axis | rename_axis | Y | |
| reorder_levels | reorder_levels | Y | |
| replace | replace | Y | |
| resample | resample | Y | |
| reset_index | reset_index | Y | |
| rfloordiv | rfloordiv | Y | See add |
| rmod | rmod | Y | See add |
| rmul | rmul | Y | See add |
| rolling | rolling | Y | |
| round | round | Y | |
| rpow | rpow | Y | See add |
| rsub | rsub | Y | See add |
| rtruediv | rtruediv | Y | See add |
| sample | sample | Y | |
| select_dtypes | select_dtypes | Y | |
| sem | sem | Y | |
| set_axis | set_axis | Y | |
| set_index | set_index | Y | |
| shape | shape | Y | |
| shift | shift | Y | |
| size | size | Y | |
| skew | skew | Y | |
| slice_shift | slice_shift | Y | |
| sort_index | sort_index | Y | |
| sort_values | sort_values | Y | Shuffles data |
| sparse | sparse | N | |
| squeeze | squeeze | Y | |
| stack | stack | Y | |
| std | std | Y | |
| style | style | D | |
| sub | sub | Y | See add |
| subtract | subtract | Y | See add |
| sum | sum | Y | |
| swapaxes | swapaxes | Y | |
| swaplevel | swaplevel | Y | |
| tail | tail | Y | |
| take | take | Y | |
| to_clipboard | to_clipboard | D | |
| to_csv | to_csv | Y | |
| to_dense | to_dense | D | |
| to_dict | to_dict | D | |
| to_excel | to_excel | D | |
| to_feather | to_feather | D | |
| to_gbq | to_gbq | D | |
| to_hdf | to_hdf | D | |
| to_html | to_html | D | |
| to_json | to_json | D | |
| to_latex | to_latex | D | |
| to_msgpack | to_msgpack | D | |
| to_parquet | to_parquet | D | |
| to_period | to_period | D | |
| to_pickle | to_pickle | D | |
| to_records | to_records | D | |
| to_sparse | to_sparse | D | |
| to_sql | to_sql | Y | |
| to_stata | to_stata | D | |
| to_string | to_string | D | |
| to_timestamp | to_timestamp | D | |
| to_xarray | to_xarray | D | |
| transform | transform | Y | |
| transpose | transpose | Y | |
| truediv | truediv | Y | See add |
| truncate | truncate | Y | |
| tshift | tshift | Y | |
| tz_convert | tz_convert | Y | |
| tz_localize | tz_localize | Y | |
| unstack | unstack | Y | |
| update | update | Y | |
| values | values | Y | |
| value_counts | value_counts | D | |
| var | var | Y | |
| where | where | Y | |