IO details in cuDF backend

IO on the cuDF backend is implemented on top of the base classes BaseIO and CSVDispatcher.

cuDFOnRayIO

The cuDFOnRayIO class implements the BaseIO base class using cuDF-backend entities (cuDFOnRayFrame, cuDFOnRayFramePartition, etc.).

Public API

class modin.engines.ray.cudf_on_ray.io.io.cuDFOnRayIO

The class implements the BaseIO class using cuDF entities.

frame_cls

alias of modin.engines.ray.cudf_on_ray.frame.data.cuDFOnRayFrame

query_compiler_cls

alias of modin.backends.cudf.query_compiler.cuDFQueryCompiler

classmethod read_csv(*args, **kwargs)

Read data according to the passed args and kwargs.

Parameters
  • *args (iterable) – Positional arguments to be passed into the _read function.

  • **kwargs (dict) – Keyword arguments to be passed into the _read function.

Returns

query_compiler – Query compiler with imported data for further processing.

Return type

BaseQueryCompiler

Notes

read is a high-level function that calls the _read function specific to the defined backend, engine, and dispatcher class with the passed parameters, then performs some postprocessing work on the resulting query_compiler object.
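The delegation described in the note can be illustrated with a minimal, self-contained sketch. It does not use Modin or cuDF: ToyIO, ToyCSVDispatcher, and ToyQueryCompiler are hypothetical stand-ins, and the "postprocessing" step is a placeholder flag, since the source does not say what the real postprocessing does.

```python
import csv
import io


class ToyQueryCompiler:
    """Hypothetical stand-in for a query compiler wrapping imported data."""

    def __init__(self, rows):
        self.rows = rows
        self.postprocessed = False


class ToyCSVDispatcher:
    """Hypothetical dispatcher: `_read` does the format-specific work."""

    @classmethod
    def _read(cls, filepath_or_buffer, **kwargs):
        # Format-specific reading; here the stdlib csv module stands in
        # for the backend-specific reader.
        reader = csv.DictReader(filepath_or_buffer, **kwargs)
        return ToyQueryCompiler(list(reader))

    @classmethod
    def read(cls, *args, **kwargs):
        # High-level entry point: delegate to `_read`, then postprocess.
        query_compiler = cls._read(*args, **kwargs)
        # Placeholder for the real postprocessing (an assumption).
        query_compiler.postprocessed = True
        return query_compiler


class ToyIO:
    """Hypothetical IO class that forwards `read_csv` to its dispatcher."""

    @classmethod
    def read_csv(cls, *args, **kwargs):
        return ToyCSVDispatcher.read(*args, **kwargs)


# All positional and keyword arguments flow through unchanged to `_read`.
qc = ToyIO.read_csv(io.StringIO("a,b\n1,2\n3,4\n"), delimiter=",")
```

The point of the split is that the IO class exposes one stable entry point per format, while the dispatcher owns both the format-specific reading and the shared postprocessing.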

cuDFCSVDispatcher

The cuDFCSVDispatcher class implements CSVDispatcher using cuDF backend.

class modin.engines.ray.cudf_on_ray.io.text.csv_dispatcher.cuDFCSVDispatcher

The class implements CSVDispatcher using cuDF backend.

This class provides utilities for reading .csv files.

classmethod build_partition(partition_ids, row_lengths, column_widths)

Build an array of partitions of the cls.frame_partition_cls class.

Parameters
  • partition_ids (list) – Array with references to the partition data.

  • row_lengths (list) – Row lengths of the partitions.

  • column_widths (list) – Number of columns in each partition.

Returns

Array with the same shape as partition_ids, filled with partition objects.

Return type

np.ndarray
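As a rough illustration of the contract above, the following sketch builds an object array with the same shape as partition_ids. It is not the Modin implementation: ToyPartition is a hypothetical stand-in for cls.frame_partition_cls, and the sketch assumes that row_lengths is indexed by grid row and column_widths by grid column.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class ToyPartition:
    """Hypothetical stand-in for cls.frame_partition_cls."""

    ref: object  # reference to the partition data
    length: int  # number of rows in the partition
    width: int   # number of columns in the partition


def build_partition_sketch(partition_ids, row_lengths, column_widths):
    """Wrap each data reference in a partition object, keeping the 2D layout."""
    n_rows = len(partition_ids)
    n_cols = len(partition_ids[0]) if n_rows else 0
    # An explicit object array keeps the grid shape regardless of element type.
    grid = np.empty((n_rows, n_cols), dtype=object)
    for i in range(n_rows):
        for j in range(n_cols):
            # Assumption: row i shares row_lengths[i], column j shares column_widths[j].
            grid[i, j] = ToyPartition(
                partition_ids[i][j], row_lengths[i], column_widths[j]
            )
    return grid


# A 3x2 grid of placeholder references (plain strings here).
partition_ids = [["ref-00", "ref-01"], ["ref-10", "ref-11"], ["ref-20", "ref-21"]]
row_lengths = [100, 100, 50]
column_widths = [4, 3]
parts = build_partition_sketch(partition_ids, row_lengths, column_widths)
```

Storing the lengths and widths alongside each reference means later code can learn a partition's dimensions without fetching the underlying data.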