Issue reading delta in storage account gen-2

Ferreira, Jeniffer (NAZ-V) 0 Reputation points
2023-03-22T19:37:19.9766667+00:00

I have been reading a Delta table from an Azure Data Lake Storage Gen2 account on Azure Databricks with Spark. Today the command stopped working: it just keeps running and I eventually get a timeout error:
spark.read.format('delta').load('<path to my table>')


2 answers

  1. BhargavaGunnam-MSFT 26,136 Reputation points Microsoft Employee
    2023-03-28T15:23:06.1533333+00:00

    Hello Ferreira, Jeniffer (NAZ-V),

    Welcome to the MS Q&A platform.

    Below are the potential root causes for your issue.

    • Network connectivity issues: a timeout can be caused by connectivity problems between your Databricks cluster and the storage account (for example, a storage firewall, virtual network rule, or private endpoint blocking the cluster). Check whether any network issue could be causing the timeout; a quick reachability check is sketched after this list.
    • Large Delta table size: if the Delta table is large, loading it can take a long time and hit the timeout. In this case, you can try increasing the timeout settings for the Spark read operation.
    • Insufficient resources: if your Databricks cluster does not have enough resources, it may not be able to load the Delta table within the timeout period. Try increasing the cluster size or using a more powerful instance type.
    • Permissions: if the credentials used to access the storage account lack sufficient permissions, the read can fail or hang. Verify the storage account permissions and the credentials configured on the cluster; an example service-principal configuration is included in the sketch below.
    • Version incompatibility: the version of the Delta Lake library you are using may be incompatible with the version of Spark you are running, causing the command to fail.
    • Transient issue: try running the command again after a few minutes to see if the issue resolves itself.
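
    To narrow down whether the problem is connectivity/permissions or the Delta read itself, a quick check like the one below can help. This is a minimal sketch for a Databricks notebook; the storage account, container, tenant, secret scope, and path names are placeholders you would replace with your own values, and the service-principal block is only needed if your cluster is not already configured for the storage account.

    # Minimal connectivity/permission check from a Databricks notebook.
    # <storage-account>, <container>, <tenant-id> and the secret scope/key names
    # are placeholders, not values from this thread.

    # Optional: authenticate to ADLS Gen2 with a service principal
    # (standard Hadoop ABFS OAuth settings).
    spark.conf.set("fs.azure.account.auth.type.<storage-account>.dfs.core.windows.net", "OAuth")
    spark.conf.set("fs.azure.account.oauth.provider.type.<storage-account>.dfs.core.windows.net",
                   "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
    spark.conf.set("fs.azure.account.oauth2.client.id.<storage-account>.dfs.core.windows.net",
                   dbutils.secrets.get(scope="<secret-scope>", key="<sp-client-id-key>"))
    spark.conf.set("fs.azure.account.oauth2.client.secret.<storage-account>.dfs.core.windows.net",
                   dbutils.secrets.get(scope="<secret-scope>", key="<sp-client-secret-key>"))
    spark.conf.set("fs.azure.account.oauth2.client.endpoint.<storage-account>.dfs.core.windows.net",
                   "https://login.microsoftonline.com/<tenant-id>/oauth2/token")

    # If listing the table directory fails or hangs, the problem is connectivity or
    # permissions rather than the Delta read itself.
    path = "abfss://<container>@<storage-account>.dfs.core.windows.net/<path-to-delta-table>"
    dbutils.fs.ls(path)

    # If listing succeeds quickly, retry the Delta read.
    df = spark.read.format("delta").load(path)
    df.printSchema()

    If dbutils.fs.ls also hangs, focus on networking (firewall rules, private endpoints, VNet configuration) and on the credentials the cluster uses; if it succeeds, the bottleneck is more likely table size or cluster resources.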

    If the issue still persists, you can try increasing the relevant timeout values. Note that spark.sql.execution.arrow.maxRecordsPerBatch controls the Arrow batch size used when converting between Spark and pandas; it does not change any timeout. Network-related timeouts are instead governed by settings such as spark.network.timeout, as sketched below.
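
    These timeout-related properties must be set before the Spark application starts, so on Databricks they go into the cluster's Spark configuration (cluster > Advanced options > Spark) rather than into the notebook. The values below are illustrative assumptions, not recommendations:

    # Cluster Spark config (applied at cluster start); values are examples only.
    # spark.network.timeout 600s
    # spark.executor.heartbeatInterval 60s   (must stay well below spark.network.timeout)

    # After restarting the cluster, confirm the values and retry the read:
    print(spark.conf.get("spark.network.timeout", "120s"))            # Spark default is 120s
    print(spark.conf.get("spark.executor.heartbeatInterval", "10s"))

    df = spark.read.format("delta").load(
        "abfss://<container>@<storage-account>.dfs.core.windows.net/<path-to-delta-table>"
    )
    df.show(10)  # small action to verify the read completes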

    I hope this helps. Please let us know if you have any further questions.


  2. Deleted

    This answer has been deleted due to a violation of the Code of Conduct.

