Returns all the records in the DataFrame as a list of Row objects.
Syntax
collect()
Returns
list: A list of Row objects, each representing a row in the DataFrame.
Notes
Use this method only when the resulting list is expected to be small, because all of the data is loaded into the driver's memory.
Examples
# Assumes an active SparkSession bound to the name `spark`, as in a notebook or the pyspark shell.
df = spark.createDataFrame([(14, "Tom"), (23, "Alice"), (16, "Bob")], ["age", "name"])
df.collect()
# [Row(age=14, name='Tom'), Row(age=23, name='Alice'), Row(age=16, name='Bob')]
df.filter(df.age > 15).collect()
# [Row(age=23, name='Alice'), Row(age=16, name='Bob')]
df.select("name").collect()
# [Row(name='Tom'), Row(name='Alice'), Row(name='Bob')]
from pyspark.sql.functions import upper
df.select(upper(df.name)).collect()
# [Row(upper(name)='TOM'), Row(upper(name)='ALICE'), Row(upper(name)='BOB')]
rows = df.collect()
[row["name"] for row in rows]
# ['Tom', 'Alice', 'Bob']
[row.asDict() for row in rows]
# [{'age': 14, 'name': 'Tom'}, {'age': 23, 'name': 'Alice'}, {'age': 16, 'name': 'Bob'}]