RDD.sumApprox
Approximate operation that returns the sum of the RDD's elements, stopping when the timeout elapses or the requested confidence is met.
Examples
>>> rdd = sc.parallelize(range(1000), 10)
>>> r = sum(range(1000))
>>> abs(rdd.sumApprox(1000) - r) / r < 0.05
True
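The relative-error check in the example can be wrapped in a small helper. The following is a minimal sketch, not part of PySpark: the names `within_tolerance`, `check_sum_approx`, `timeout_ms`, and `tol` are hypothetical, and it assumes the first argument of `sumApprox` is a timeout in milliseconds and the second is the confidence, with the result convertible to `float`.

```python
def within_tolerance(approx, exact, tol=0.05):
    """Return True if approx is within a relative error of tol from exact."""
    if exact == 0:
        raise ValueError("relative error is undefined when the exact value is 0")
    return abs(approx - exact) / abs(exact) < tol


def check_sum_approx(rdd, exact, timeout_ms=1000, confidence=0.95, tol=0.05):
    """Call rdd.sumApprox and compare the estimate with a known exact sum.

    rdd is assumed to be any object exposing sumApprox(timeout, confidence),
    such as a pyspark.RDD of numbers.
    """
    approx = rdd.sumApprox(timeout_ms, confidence)
    # The result is assumed to behave like a float (hypothetical helper, see lead-in).
    return within_tolerance(float(approx), exact, tol)
```

With a live SparkContext `sc`, this could be called as `check_sum_approx(sc.parallelize(range(1000), 10), sum(range(1000)))`, which mirrors the doctest above.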