Read parquet file from a blob storage

Question


Keerthana J 71

How can I read a snappy.parquet file from my blob container (already mounted) in Azure Databricks?

2 answers


Answer 1

Hello @KEERTHANA JAYADEVAN

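You can use the spark.read.parquet() method to read the Parquet file from a mounted blob container in Azure Databricks. Snappy-compressed files need no extra options, because the compression codec is recorded in the Parquet file metadata.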

Here is an example:

# Mount the container (skip this step if it is already mounted).
# The storage account name must be the same in the source URL and in the config key.
dbutils.fs.mount(
    source = "wasbs://******@blobstorageaccount.blob.core.windows.net/",
    mount_point = "/mnt/nyctrip",
    extra_configs = {"fs.azure.account.key.blobstorageaccount.blob.core.windows.net": "key"})

# Define the path to your Parquet file
parquet_file_path = "/mnt/nyctrip/NYCTripSmall.parquet"

# Read the Parquet file into a DataFrame
df = spark.read.parquet(parquet_file_path)

# Show the DataFrame
df.show()
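One easy mistake in the mount call is spelling the storage account name differently in the source URL and in the config key. A minimal sketch of one way to avoid that is to build both strings from a single variable. The helper name build_mount_args and the container name "mycontainer" are hypothetical and used only for illustration:

```python
def build_mount_args(container, storage_account, account_key, mount_point):
    """Build keyword arguments for dbutils.fs.mount (hypothetical helper).

    The blob host name is derived once, so the source URL and the
    account-key config entry always refer to the same storage account.
    """
    host = f"{storage_account}.blob.core.windows.net"
    return {
        "source": f"wasbs://{container}@{host}/",
        "mount_point": mount_point,
        "extra_configs": {f"fs.azure.account.key.{host}": account_key},
    }


# "mycontainer" and "key" are placeholders, not real values.
args = build_mount_args("mycontainer", "blobstorageaccount", "key", "/mnt/nyctrip")
print(args["source"])
print(list(args["extra_configs"]))
```

In a Databricks notebook the result could then be passed on as dbutils.fs.mount(**args), assuming the mount point is not already in use.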


I hope this helps. If this answers your question, please consider selecting Accept answer and up-voting, as it helps the community find answers to similar questions.

Bhargava-MSFT 31,261 Reputation points Microsoft Employee Moderator

2024-02-05T23:24:15.8366667+00:00

Hello @KEERTHANA JAYADEVAN, I am checking to see if you had a chance to look into the above answer.

Answer 2

Hi, thanks for the reply. I am facing an issue while reading the file. I completed the mounting process, and I can see the list of files under the container as well. I tried Parquet, CSV, and Excel files. Attaching the Excel code:

storage_account_name = "abcd"
storage_account_key = "xxxx"
container = "lmnop"

spark.conf.set("fs.azure.account.key.{0}.blob.core.windows.net".format(storage_account_name), storage_account_key)

dbutils.fs.mount(
    source = "wasbs://{0}@{1}.blob.core.windows.net".format(container, storage_account_name),
    mount_point = "/mnt/lmnop",
    extra_configs = {"fs.azure.account.key.{0}.blob.core.windows.net".format(storage_account_name): storage_account_key}
)

CAL = pd.read_excel('/mnt/lmnop/demo.xlsx')
CAL.head()

Error:

FileNotFoundError: [Errno 2] No such file or directory: '/mnt/lmnop/demo.xlsx'
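A likely cause of this FileNotFoundError: pandas reads through the driver's local file system, while /mnt/lmnop is a DBFS path. Spark and dbutils resolve DBFS paths directly, but local file APIs usually need the /dbfs prefix. This assumes the cluster exposes DBFS through the /dbfs FUSE mount, which may not hold for every cluster configuration. A minimal sketch of the path conversion, with to_local_path as a hypothetical helper name:

```python
def to_local_path(dbfs_path):
    """Map a DBFS path (/mnt/... or dbfs:/mnt/...) to its /dbfs/... local form.

    Assumes DBFS is exposed to local file APIs under /dbfs.
    """
    # Drop the optional "dbfs:" scheme prefix.
    if dbfs_path.startswith("dbfs:"):
        dbfs_path = dbfs_path[len("dbfs:"):]
    # Leave paths that already use the local form unchanged.
    if dbfs_path.startswith("/dbfs/"):
        return dbfs_path
    return "/dbfs/" + dbfs_path.lstrip("/")


local_path = to_local_path("/mnt/lmnop/demo.xlsx")
print(local_path)  # /dbfs/mnt/lmnop/demo.xlsx
```

In the notebook, the pandas call would then become CAL = pd.read_excel(local_path). Paths passed to spark.read.parquet() stay in the /mnt/... form.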
