Azure Databricks Python for loop, read row

Dondapati, Navin 281 Reputation points
2020-11-20T02:02:19.683+00:00

Hi Guys,

How do we loop through each row in a DataFrame that holds a set of files?

storage_account_name = "storacct"
storage_account_access_key = ""

spark.conf.set("fs.azure.account.key.storacct0001.dfs.core.windows.net",storage_account_access_key)

Store the file information from the blob container in a list:

DBFileList=dbutils.fs.ls("abfss://databrickstg@storacct0001.dfs.core.windows.net/STG")

Convert the list to a DataFrame:

df=spark.createDataFrame(DBFileList)

I want to loop through each file name and store each file in a different table. I tried the loop below, but it only gives the column names; no row information is displayed.

for fi in df:
    print(fi)

Regards,
Navin

Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.

3 answers

  1. Dondapati, Navin 281 Reputation points
    2020-11-20T03:09:55.793+00:00

    Got it:

    for row in df.collect():
        print(row.path)
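For the original goal of storing each file in a different table, the loop above could be extended roughly as follows. This is only a sketch under assumptions: the `stg_` table prefix, the naming rule in `table_name_for`, the `csv` format, and the helper names `plan_loads` and `load_files` are all hypothetical, and the `FileInfo` namedtuple is just a stand-in for the entries that `dbutils.fs.ls` returns (which expose `path`, `name`, and `size`).

```python
import re
from collections import namedtuple

# Stand-in for the entries returned by dbutils.fs.ls (path, name, size).
FileInfo = namedtuple("FileInfo", ["path", "name", "size"])


def table_name_for(file_name):
    """Derive a table name from a file name (hypothetical naming rule)."""
    stem = file_name.rsplit(".", 1)[0]               # drop the extension, if any
    cleaned = re.sub(r"[^0-9a-zA-Z_]", "_", stem)    # replace unsafe characters
    return "stg_" + cleaned.lower()


def plan_loads(file_infos):
    """Pair each file path with a target table name, skipping directories."""
    return [
        (fi.path, table_name_for(fi.name))
        for fi in file_infos
        if not fi.name.endswith("/")  # assumption: directory names end with "/"
    ]


def load_files(spark, file_infos, fmt="csv"):
    """Read each file and save it as its own table (file format is an assumption)."""
    for path, table in plan_loads(file_infos):
        data = spark.read.format(fmt).option("header", "true").load(path)
        data.write.mode("overwrite").saveAsTable(table)


# Sample listing that mimics what dbutils.fs.ls might return.
base = "abfss://databrickstg@storacct0001.dfs.core.windows.net/STG/"
sample = [
    FileInfo(base + "sales-2020.csv", "sales-2020.csv", 10),
    FileInfo(base + "archive/", "archive/", 0),
]
print(plan_loads(sample))
```

In a notebook this would be called as `load_files(spark, DBFileList)`. Since `dbutils.fs.ls` already returns a plain Python list, the loop can run over it directly, without converting it to a DataFrame and calling `collect()` first.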

    1 person found this answer helpful.

  2. Evan Chatter 16 Reputation points
    2021-04-21T06:10:55.827+00:00

    Iterating through pandas DataFrame objects is generally slow. Iteration defeats the whole purpose of using a DataFrame. It is an anti-pattern and something you should do only when you have exhausted every other option. It is better to look for a list comprehension, a vectorized solution, or the DataFrame.apply() method to loop through a DataFrame.

    List comprehension example:

    result = [(x, y, z) for x, y, z in zip(df['column1'], df['column2'], df['column3'])]
    

  3. Anonymous
    2022-05-02T18:38:41.48+00:00

    Basically, for row in df.collect(): print(row.path)

    That will help. It is an anti-pattern and something you should do only when you have exhausted every other option. A vectorized solution or the DataFrame.apply() method can also work. Let me know if you need anything else. I also have a Pilatus PC-12 and a Cessna Citation for sale, if anyone is interested. Thanks, guys!

