pyspark.sql.DataFrame.toLocalIterator
DataFrame.toLocalIterator(prefetchPartitions: bool = False) → Iterator[pyspark.sql.types.Row]
Returns an iterator that contains all of the rows in this DataFrame. The iterator will consume as much memory as the largest partition in this DataFrame. With prefetch it may consume up to the memory of the 2 largest partitions.
New in version 2.0.0.
Changed in version 3.4.0: Supports Spark Connect.
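The memory bound described above can be illustrated with a toy model. This is only a sketch of the stated behaviour, not Spark's implementation: the function name `peak_resident_rows` and the idea of measuring memory in row counts are assumptions made for illustration. It assumes that without prefetch only the current partition is held, and that with prefetch the next partition is held alongside it.

```python
from typing import Sequence


def peak_resident_rows(partition_sizes: Sequence[int], prefetch: bool = False) -> int:
    """Return the largest number of rows held at once in this toy model."""
    peak = 0
    for i, size in enumerate(partition_sizes):
        resident = size  # the partition currently being iterated
        if prefetch and i + 1 < len(partition_sizes):
            # Hypothetical prefetch: the next partition is already in memory.
            resident += partition_sizes[i + 1]
        peak = max(peak, resident)
    return peak


sizes = [3, 1, 2]  # made-up partition sizes, in rows
print(peak_resident_rows(sizes))                 # 3
print(peak_resident_rows(sizes, prefetch=True))  # 4
```

In this model the prefetch peak (4) stays below the sum of the two largest partitions (5) because those two partitions are not adjacent, which is consistent with the "up to" wording of the bound.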
- Parameters
- prefetchPartitions : bool, optional
Whether Spark should pre-fetch the next partition before it is needed. Defaults to False.
Changed in version 3.4.0: This argument does not take effect for Spark Connect.
- Returns
- Iterator
Iterator of Row objects (pyspark.sql.types.Row).
Examples
>>> df = spark.createDataFrame(
...     [(14, "Tom"), (23, "Alice"), (16, "Bob")], ["age", "name"])
>>> list(df.toLocalIterator())
[Row(age=14, name='Tom'), Row(age=23, name='Alice'), Row(age=16, name='Bob')]
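Wrapping the iterator in list(), as in the example, materializes every row on the driver at once. A minimal sketch of consuming the iterator row by row instead is shown here. The helper `sum_column` and the stand-in `sample_rows` are hypothetical names introduced for illustration; plain dicts replace Row objects so that the sketch runs without a SparkSession.

```python
from typing import Any, Iterable


def sum_column(rows: Iterable[Any], column: str) -> int:
    """Add up one column, touching a single row at a time."""
    total = 0
    for row in rows:
        total += row[column]  # Row supports lookup by field name
    return total


# With a real DataFrame `df` and an active SparkSession (not created here):
#   sum_column(df.toLocalIterator(), "age")
#   sum_column(df.toLocalIterator(prefetchPartitions=True), "age")

# Stand-in rows so the sketch runs without Spark; plain dicts mimic Row lookup.
sample_rows = [{"age": 14}, {"age": 23}, {"age": 16}]
print(sum_column(iter(sample_rows), "age"))  # 53
```

Passing prefetchPartitions=True may reduce waiting between partitions at the cost of the higher memory bound described above; per the parameter notes, it has no effect under Spark Connect.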