Config Module Overview
This module lets the user tune Modin's behavior. To see all available configs, run `python -m modin.config`; the command prints every Modin config with its description.
Public API
In principle, configs could come from any source, but currently only environment variables are implemented. Every environment-variable config derives from EnvironmentVariable, which contains most of the config API implementation.
- class modin.config.envvars.EnvironmentVariable
  Base class for environment-variable-based configuration.
  - classmethod get()
    Get config value.
    - Returns: Decoded and verified config value.
    - Return type: Any
  - classmethod get_help() -> str
    Generate user-presentable help for the config.
    - Return type: str
  - classmethod get_value_source()
    Get value source of the config.
    - Return type: int
  - classmethod once(onvalue, callback)
    Execute callback if config value matches onvalue value. Otherwise accumulate callbacks associated with the given onvalue in the _once container.
    - Parameters:
      - onvalue (Any) – Config value that triggers the callback.
      - callback (callable) – Callable that should be executed if config value matches onvalue.
  - classmethod put(value)
    Set config value.
    - Parameters:
      - value (Any) – Config value to set.
  - classmethod subscribe(callback)
    Add callback to the _subs list and then execute it.
    - Parameters:
      - callback (callable) – Callable to execute.
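
The behavior of `get`, `put`, `subscribe`, and `once` described above can be illustrated with a minimal, self-contained sketch. The classes `SketchConfig` and `DemoPartitions` and the variable name `DEMO_NPARTITIONS` are hypothetical stand-ins written for this example, not Modin's implementation. The sketch also assumes that `put` notifies subscribers and passes the config class to each callback, which the reference above does not state.

```python
import os


class SketchConfig:
    """Hypothetical stand-in for an environment-variable-backed config."""

    varname = None  # name of the environment variable (set by subclasses)
    default = None  # value used when the variable is not set
    decode = str    # converts the raw string into a typed value

    @classmethod
    def _state(cls):
        # Lazily create per-class state so subclasses do not share it.
        if "_data" not in cls.__dict__:
            cls._data = {"value": None, "is_set": False, "subs": [], "once": {}}
        return cls._data

    @classmethod
    def get(cls):
        state = cls._state()
        if not state["is_set"]:
            # First access: read and decode the environment variable.
            raw = os.environ.get(cls.varname)
            state["value"] = cls.default if raw is None else cls.decode(raw)
            state["is_set"] = True
        return state["value"]

    @classmethod
    def put(cls, value):
        state = cls._state()
        state["value"], state["is_set"] = value, True
        # Assumption of this sketch: every subscriber is notified on change.
        for callback in state["subs"]:
            callback(cls)
        # Fire and discard one-shot callbacks registered for this value.
        for callback in state["once"].pop(value, []):
            callback(cls)

    @classmethod
    def subscribe(cls, callback):
        # Register the callback, then execute it right away.
        cls._state()["subs"].append(callback)
        callback(cls)

    @classmethod
    def once(cls, onvalue, callback):
        # Run now if the value already matches, otherwise defer.
        if cls.get() == onvalue:
            callback(cls)
        else:
            cls._state()["once"].setdefault(onvalue, []).append(callback)


class DemoPartitions(SketchConfig):
    varname = "DEMO_NPARTITIONS"
    default = 4
    decode = int


os.environ["DEMO_NPARTITIONS"] = "8"
seen = []
DemoPartitions.subscribe(lambda cfg: seen.append(cfg.get()))    # runs immediately
DemoPartitions.once(16, lambda cfg: seen.append("reached 16"))  # deferred
DemoPartitions.put(16)
print(seen)
```

Keeping the state on each subclass rather than on the base class means two configs never share subscribers or deferred callbacks.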
Modin Configs List
Config Name | Env. Variable Name | Default Value | Description | Options |
---|---|---|---|---|
AsvDataSizeConfig | MODIN_ASV_DATASIZE_CONFIG | | Allows overriding the default size of data (shapes). | |
AsvImplementation | MODIN_ASV_USE_IMPL | modin | Selects the library used for performance testing. | ('modin', 'pandas') |
Backend | MODIN_BACKEND | Pandas | Engine to run on a single node of distribution. | ('Pandas', 'OmniSci', 'Pyarrow', 'Cudf') |
BenchmarkMode | MODIN_BENCHMARK_MODE | False | Whether or not to perform computations synchronously. | |
CpuCount | MODIN_CPUS | 2 | How many CPU cores to use during initialization of the Modin engine. | |
DoLogRpyc | MODIN_LOG_RPYC | | Whether to gather RPyC logs (applicable for remote context). | |
DoTraceRpyc | MODIN_TRACE_RPYC | | Whether to trace RPyC calls (applicable for remote context). | |
DoUseCalcite | MODIN_USE_CALCITE | True | Whether to use Calcite for OmniSci query execution. | |
Engine | MODIN_ENGINE | Ray | Distribution engine to run queries by. | ('Ray', 'Dask', 'Python', 'Native') |
GpuCount | MODIN_GPUS | | How many GPU devices to utilize across the whole distribution. | |
IsDebug | MODIN_DEBUG | | Force Modin engine to be "Python" unless specified by $MODIN_ENGINE. | |
IsExperimental | MODIN_EXPERIMENTAL | | Whether to turn on experimental features. | |
IsRayCluster | MODIN_RAY_CLUSTER | | Whether Modin is running on a pre-initialized Ray cluster. | |
Memory | MODIN_MEMORY | | | |
NPartitions | MODIN_NPARTITIONS | 2 | How many partitions to use for a Modin DataFrame (along each axis). | |
OmnisciFragmentSize | MODIN_OMNISCI_FRAGMENT_SIZE | | How big a fragment in OmniSci should be when creating a table (in rows). | |
OmnisciLaunchParameters | MODIN_OMNISCI_LAUNCH_PARAMETERS | {'enable_union': 1, 'enable_columnar_output': 1, 'enable_lazy_fetch': 0, 'null_div_by_zero': 1, 'enable_watchdog': 0} | | |
PersistentPickle | MODIN_PERSISTENT_PICKLE | False | Whether serialization should be persistent. | |
ProgressBar | MODIN_PROGRESS_BAR | False | Whether or not to show the progress bar. | |
RayRedisAddress | MODIN_REDIS_ADDRESS | | Redis address to connect to when running in a Ray cluster. | |
RayRedisPassword | MODIN_REDIS_PASSWORD | random string | What password to use for connecting to Redis. | |
SocksProxy | MODIN_SOCKS_PROXY | | SOCKS proxy address if it is needed for SSH to work. | |
TestDatasetSize | MODIN_TEST_DATASET_SIZE | | Dataset size for running some tests. | ('Small', 'Normal', 'Big') |
TestRayClient | MODIN_TEST_RAY_CLIENT | False | Set to true to start and connect Ray client before a testing session starts. | |
TrackFileLeaks | MODIN_TEST_TRACK_FILE_LEAKS | True | Whether to track leaked open file handles during testing. | |
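
The configs listed above hold values of different kinds: booleans, integers, and strings restricted to a fixed set of options. As a rough illustration of how a raw environment string might be turned into such a value, the sketch below decodes and validates input with two hypothetical helpers, `parse_bool` and `parse_choice`. The accepted boolean spellings and the case-insensitive matching are assumptions of this sketch, not documented Modin behavior.

```python
def parse_bool(raw):
    """Decode a raw string as a boolean (accepted spellings are an assumption)."""
    text = raw.strip().lower()
    if text in {"true", "yes", "1"}:
        return True
    if text in {"false", "no", "0"}:
        return False
    raise ValueError(f"not a boolean: {raw!r}")


def parse_choice(raw, options):
    """Return the option matching ``raw`` case-insensitively, or raise."""
    text = raw.strip().lower()
    for option in options:
        if option.lower() == text:
            return option
    raise ValueError(f"{raw!r} is not one of {options}")


# Options for the Engine config, copied from the list above.
engine_options = ("Ray", "Dask", "Python", "Native")
print(parse_choice("dask", engine_options))
print(parse_bool("True"))
```

Rejecting unknown values with an error, rather than silently falling back to a default, makes a mistyped environment variable visible at startup.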
Usage Guide
The example below shows how to interact with Modin configs. A config value can be set either through the corresponding environment variable or through the config API.
```python
import os

# Set the `MODIN_BACKEND` environment variable.
# It can also be set outside the script.
os.environ["MODIN_BACKEND"] = "OmniSci"

import modin.config
import modin.pandas as pd

# Check the initial `Backend` config value, which comes from
# the `MODIN_BACKEND` environment variable.
print(modin.config.Backend.get())  # prints 'Omnisci'

# Check the default value of `NPartitions`.
print(modin.config.NPartitions.get())  # prints '8' here; the default can differ per machine

# Change the value of `NPartitions`.
modin.config.NPartitions.put(16)
print(modin.config.NPartitions.get())  # prints '16'
```
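
When a config should change only within a limited scope (for example, inside a single test), the `get`/`put` pair shown above is enough to build a small save-and-restore helper. The `override` context manager below is a hypothetical utility, not part of the API documented here. `FakeConfig` is a stand-in so that the sketch runs without Modin installed; a class such as `modin.config.NPartitions` could be passed instead, assuming it behaves as described in this guide.

```python
from contextlib import contextmanager


@contextmanager
def override(config, value):
    """Temporarily set ``config`` to ``value`` and restore the old value on exit."""
    previous = config.get()
    config.put(value)
    try:
        yield
    finally:
        # Restore even if the body of the ``with`` block raises.
        config.put(previous)


class FakeConfig:
    """Stand-in exposing get/put classmethods (illustration only)."""

    _value = 8

    @classmethod
    def get(cls):
        return cls._value

    @classmethod
    def put(cls, value):
        cls._value = value


with override(FakeConfig, 16):
    inside = FakeConfig.get()
print(inside, FakeConfig.get())
```

Restoring the value in a `finally` clause keeps one failing test from leaking its setting into the next.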