pyspark.sql.datasource.DataSourceStreamReader.partitions

DataSourceStreamReader.partitions(start, end)

Returns a list of InputPartition objects for the given start and end offsets. Each InputPartition represents a data split that can be processed by one Spark task. This method may be called with an empty offset range, where start == end; in that case it should return an empty sequence of InputPartition.

Parameters
start : dict

The start offset of the microbatch for which to plan partitions.

end : dict

The end offset of the microbatch for which to plan partitions.

Returns
sequence of InputPartitions

A sequence of partitions for this data source. Each partition value must be an instance of InputPartition or a subclass of it.
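
Examples

The following is a minimal sketch of one way to implement partitions, not a reference implementation. It assumes offsets of the form {"offset": <int>} and splits each microbatch into fixed-size ranges. The names RangePartition, CounterStreamReader, and rows_per_partition are hypothetical and chosen only for illustration. The fallback classes in the except branch are stand-ins so the sketch can run where pyspark.sql.datasource is not available.

```python
try:
    from pyspark.sql.datasource import DataSourceStreamReader, InputPartition
except ImportError:
    # Stand-ins (assumption): used only when pyspark.sql.datasource is missing.
    class InputPartition:
        def __init__(self, value):
            self.value = value

    class DataSourceStreamReader:
        pass


class RangePartition(InputPartition):
    """Hypothetical partition covering the half-open range [start, end)."""

    def __init__(self, start, end):
        self.start = start
        self.end = end


class CounterStreamReader(DataSourceStreamReader):
    """Hypothetical reader whose offsets look like {"offset": <int>}."""

    def __init__(self, rows_per_partition=100):
        self.rows_per_partition = rows_per_partition
        self.current = 0

    def initialOffset(self):
        return {"offset": 0}

    def latestOffset(self):
        # Pretend 250 new rows arrive each time the latest offset is requested.
        self.current += 250
        return {"offset": self.current}

    def partitions(self, start, end):
        lo, hi = start["offset"], end["offset"]
        if lo == hi:
            # Empty offset range: nothing to plan.
            return []
        step = self.rows_per_partition
        # One partition per chunk; the last chunk may be shorter.
        return [RangePartition(s, min(s + step, hi)) for s in range(lo, hi, step)]

    def read(self, partition):
        # Emit one single-column row per offset in the partition's range.
        for i in range(partition.start, partition.end):
            yield (i,)


reader = CounterStreamReader(rows_per_partition=100)
start = reader.initialOffset()
end = reader.latestOffset()
planned = reader.partitions(start, end)
bounds = [(p.start, p.end) for p in planned]
empty = reader.partitions(end, end)
```

In this sketch, every element of planned is a RangePartition, which satisfies the requirement that each partition be an InstancePartition-compatible object, that is, an instance of InputPartition or a subclass of it. Calling partitions with identical start and end offsets yields an empty list. The chunk size is an arbitrary choice here; a real source would more likely derive its splits from the underlying system, for example one partition per shard or file.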