Reading Parquet files from Azure Blob Storage

Rohit Kulkarni 731 Reputation points
2022-12-08T07:28:13.277+00:00

Hello Team,

I am trying to read the files from an existing folder in Azure Blob Storage. I am using the query below:

# read content of file

filepath="abfss://******@storagename.dfs.core.windows.net/temp/bronze/salesforce/account/"
df = spark.read.parquet(filepath)

df.show(10)

But I am getting this error:

A transaction log for Delta was found at abfss://******@storagename.dfs.core.windows.net/temp/bronze/salesforce/account/_delta_log,
but you are trying to read from abfss://******@storagename.dfs.core.windows.net/temp/bronze/salesforce/account/ using format("parquet"). You must use
'format("delta")' when reading and writing to a delta table.

Please advise.

Regards
Rohit

Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.

1 answer

  1. HimanshuSinha 19,547 Reputation points Microsoft Employee Moderator
    2022-12-09T00:30:19.337+00:00

    Hello @Rohit Kulkarni,
    Thanks for the question and for using the MS Q&A platform.
    As we understand it, you want to read the Parquet files stored in cloud storage. Please let us know if that is not accurate.

    Delta stores the data as Parquet files and adds a layer on top with advanced features: a transaction log that provides a history of events, and more flexibility in changing the content through update, delete, and merge capabilities. This link (delta) explains quite well how the files are organized. The error above calls out the location of the transaction log: because a _delta_log folder exists under the path, the folder is a Delta table and must be read with format("delta") rather than as plain Parquet.

    So if you try the code below, you should be able to read the files:

    filepath="abfss://******@storagename.dfs.core.windows.net/temp/bronze/salesforce/account/"
    df = spark.read.format("delta").load(filepath)
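If the same code has to handle both plain Parquet folders and Delta tables, one option is to look for the _delta_log folder named in the error before choosing the format. The following is only a minimal sketch: `pick_reader_format` is a hypothetical helper (not part of Spark or Delta), the file names in the example listing are made up, and the commented Databricks lines assume `dbutils.fs.ls` is available in the notebook and that `filepath` is the variable defined above.

```python
def pick_reader_format(entry_names):
    """Return "delta" if the folder listing contains _delta_log, else "parquet".

    Hypothetical helper for illustration; entry_names is a list of the
    file and folder names found directly under the target path.
    """
    # Directory listings often report folders with a trailing slash, so strip it.
    names = {name.rstrip("/") for name in entry_names}
    return "delta" if "_delta_log" in names else "parquet"


# Example listing for a Delta table folder (illustrative file names).
listing = ["_delta_log/", "part-00000.snappy.parquet"]
fmt = pick_reader_format(listing)
print(fmt)

# On Databricks the listing could come from dbutils.fs.ls, for example:
#   listing = [f.name for f in dbutils.fs.ls(filepath)]
#   df = spark.read.format(pick_reader_format(listing)).load(filepath)
```

Keeping the decision in a small pure function means it can be checked without a cluster; the Spark read itself stays exactly as in the snippet above.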

    Please let me know if you have any questions.
    Thanks
    Himanshu

