Modin Logging

Modin logging gives users greater insight into their queries by logging internal Modin API calls and partition metadata and by profiling system memory. Logging is disabled by default. When it is enabled, log files are written to a local .modin directory at the same directory level as the notebook or script used to run Modin. Users can choose whether to log system memory and additional metadata alongside Modin API calls (see the usage examples below).

The logs generated by Modin logging are written to a .modin/logs/job_<uuid> directory, uniquely named after the job UUID. Files named trace.log contain the Modin API stack traces, and files named memory.log contain the memory utilization metrics. By default, when a log file exceeds 10 MB (configurable with LogFileSize), that file is saved and a separate log file is created. For instance, users with 20 MB of Modin API logs can expect to find trace.log.1 and trace.log.2 in the .modin/logs/job_<uuid> directory. After 10 * LogFileSize MB of logs (100 MB by default), the logs roll over and the original log files, beginning with trace.log.1, are overwritten with the new log lines.
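
To make the directory layout above concrete, here is a minimal sketch that collects the log files for each job using only the Python standard library. The helper name find_modin_logs is hypothetical and not part of Modin; the sketch assumes only the .modin/logs/job_<uuid> layout and the trace.log / memory.log naming described above.

```python
from pathlib import Path


def find_modin_logs(base_dir="."):
    """Map each job directory name to the sorted log file names it contains.

    Hypothetical helper (not a Modin API). It assumes the layout
    <base_dir>/.modin/logs/job_<uuid>/ described in this section.
    """
    logs_root = Path(base_dir) / ".modin" / "logs"
    found = {}
    if not logs_root.is_dir():
        # No logs directory: logging was not enabled from this location.
        return found
    for job_dir in sorted(logs_root.glob("job_*")):
        if job_dir.is_dir():
            # Matches trace.log, trace.log.1, memory.log, memory.log.1, ...
            found[job_dir.name] = sorted(p.name for p in job_dir.glob("*.log*"))
    return found
```

Calling find_modin_logs() from the directory that holds the notebook or script returns an empty dictionary if no logs have been written there yet.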

Developer Warning: In some cases, running services like JupyterLab in the modin/modin directory may result in circular dependency issues. This is due to a naming conflict between the modin/logging directory and the Python logging module, which may be used as a default in such environments. To resolve this, run JupyterLab or similar services from a directory other than modin/modin.

Usage examples

In the example below, we enable logging for internal Modin API calls.

import modin.pandas as pd
from modin.config import LogMode
LogMode.enable_api_only()

# User code goes here

In the next example, we enable logging not only for internal Modin API calls but also for partition metadata and memory profiling. LogMemoryInterval sets the interval (in seconds) at which system memory utilization is logged, and LogFileSize sets the maximum size (in MB) of each log file.

import modin.pandas as pd
from modin.config import LogMode, LogMemoryInterval, LogFileSize
LogMode.enable()
LogMemoryInterval.put(2) # Defaults to 5 seconds, new interval is 2 seconds
LogFileSize.put(5) # Defaults to 10 MB per log file, new size is 5 MB

# User code goes here
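
The effect of a per-file size cap can be illustrated with Python's standard logging.handlers.RotatingFileHandler. This is an analogy only: it is not presented as Modin's implementation, the function and logger names are hypothetical, and the exact file names Modin produces may differ from what this sketch writes. The tiny byte limit stands in for the megabyte limit set by LogFileSize.

```python
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler


def write_rotating_log(directory, max_bytes, backup_count, n_lines):
    """Write n_lines to trace.log under a size cap; return the files left on disk."""
    path = os.path.join(directory, "trace.log")
    handler = RotatingFileHandler(path, maxBytes=max_bytes, backupCount=backup_count)
    logger = logging.getLogger("rotation_demo")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep demo lines out of the root logger
    logger.addHandler(handler)
    try:
        for i in range(n_lines):
            logger.info("line %06d", i)
    finally:
        # Detach and close so the files can be listed and cleaned up safely.
        logger.removeHandler(handler)
        handler.close()
    return sorted(os.listdir(directory))


with tempfile.TemporaryDirectory() as tmp:
    # Far more text than the cap allows, so older files are overwritten.
    print(write_rotating_log(tmp, max_bytes=200, backup_count=3, n_lines=500))
```

However many lines are written, at most backup_count + 1 files remain, so the total space used stays bounded and the oldest lines are the ones discarded.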

To disable Modin logging:

import modin.pandas as pd
from modin.config import LogMode
LogMode.disable()

# User code goes here