pyspark.RDD.cartesian
RDD.cartesian(other: pyspark.rdd.RDD[U]) → pyspark.rdd.RDD[Tuple[T, U]]

Return the Cartesian product of this RDD and another one, that is, the RDD of all pairs of elements (a, b) where a is in self and b is in other.

New in version 0.7.0.
See also
Examples
>>> rdd = sc.parallelize([1, 2])
>>> sorted(rdd.cartesian(rdd).collect())
[(1, 1), (1, 2), (2, 1), (2, 2)]
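
For intuition about which pairs are produced, the following is a minimal local sketch that runs without a Spark context. It assumes plain Python lists stand in for the two RDDs; the helper name local_cartesian is hypothetical and is not part of the PySpark API. It only mirrors the pairing semantics described above, not the distributed execution or the order of the collected results.

```python
from itertools import product


def local_cartesian(left, right):
    """Hypothetical local stand-in for RDD.cartesian.

    Returns every pair (a, b) with a taken from left and b from right.
    """
    return [(a, b) for a, b in product(left, right)]


# Two inputs of different element types, mirroring RDD[T] and RDD[U].
pairs = local_cartesian([1, 2], ["x", "y", "z"])

# Sort before comparing, as the doctest above does with collect().
print(sorted(pairs))

# The number of pairs is len(left) * len(right).
print(len(pairs))
```

The result holds one pair for every combination of elements, so its size is the product of the two input sizes. That growth is worth keeping in mind before calling cartesian on large RDDs.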