Azure Databricks Python for loop, read row
Hi guys,
How do we loop through each row in a DataFrame that holds a set of files?
storage_account_name = "storacct"
storage_account_access_key = ""
spark.conf.set("fs.azure.account.key.storacct0001.dfs.core.windows.net", storage_account_access_key)

# store the blob file information to a list
DBFileList = dbutils.fs.ls("abfss://databrickstg@storacct0001.dfs.core.windows.net/STG")

# convert the list to a DataFrame
df = spark.createDataFrame(DBFileList)
I want to loop through each file name and store it into a different table. I tried the code below, but it only prints the column names and no row data.

for fi in df:
    print(fi)
Regards,
Navin
3 answers
-
Dondapati, Navin 281 Reputation points
2020-11-20T03:09:55.793+00:00
Got it:

for row in df.collect():
    print(row.path)
-
Evan Chatter 16 Reputation points
2021-04-21T06:10:55.827+00:00 Iterating through pandas DataFrame objects is generally slow. Row-by-row iteration defeats the whole purpose of using a DataFrame; it is an anti-pattern and something you should only do when you have exhausted every other option. It is better to look for a list comprehension, a vectorized solution, or the DataFrame.apply() method instead of looping through the DataFrame.
List comprehensions example
result = [(x, y, z) for x, y, z in zip(df['column1'], df['column2'], df['column3'])]
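As a runnable sketch of the list-comprehension approach next to the DataFrame.apply() alternative mentioned above, assuming a toy pandas frame (the column names here are made up, not from the original question):

```python
import pandas as pd

# Toy frame standing in for a DataFrame with three columns of interest
df = pd.DataFrame({
    "column1": [1, 2],
    "column2": [3, 4],
    "column3": [5, 6],
})

# List comprehension over zipped columns: no per-row Python overhead from pandas
result = [(x, y, z) for x, y, z in zip(df["column1"], df["column2"], df["column3"])]

# apply() alternative: row-wise, usually slower but sometimes more readable
result_apply = df.apply(
    lambda r: (r["column1"], r["column2"], r["column3"]), axis=1
).tolist()
```

Both produce the same list of row tuples; the zip() version avoids constructing a pandas Series per row.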
-
Anonymous
2022-05-02T18:38:41.48+00:00 Basically, this will help:

for row in df.collect():
    print(row.path)

That said, iterating row by row is an anti-pattern and something you should only do when you have exhausted every other option; prefer a vectorized solution or the DataFrame.apply() method where possible. Anything else, let me know. Thanks guys!