Azure Databricks / ADF / Azure Blob Storage connectivity issue

Chris Kief 0 Reputation points
2023-11-03T01:45:20.9266667+00:00

When trying to use a Data Factory pipeline to copy from Azure Databricks Delta Lake to Azure Blob Storage (DelimitedText), I'm getting the following error when using the copy data activity:

ErrorCode=AzureDatabricksCommandError,Hit an error when running the command in Azure Databricks. Error details: shaded.databricks.org.apache.hadoop.fs.azure.AzureException: shaded.databricks.org.apache.hadoop.fs.azure.AzureException: Unable to access container <my-container> in account <my-storage-account>.blob.core.windows.net using anonymous credentials, and no credentials found for them in the configuration. Caused by: shaded.databricks.org.apache.hadoop.fs.azure.AzureException: Unable to access container <my-container> in account <my-storage-account>.blob.core.windows.net using anonymous credentials, and no credentials found for them in the configuration. Caused by: hadoop_azure_shaded.com.microsoft.azure.storage.StorageException: Caused by: java.net.UnknownHostException: <my-storage-account>.blob.core.windows.net.

If I switch the source dataset to Azure SQL Database, the pipeline runs without issue. I can also read from and write to the same blob storage container without issue in the same pipeline. The credentials are saved with the underlying Azure Blob Storage linked service, and the connection validates just fine. So the issue can't be with the blob storage container itself.

Inputs:

{
    "source": {
        "type": "AzureDatabricksDeltaLakeSource",
        "exportSettings": {
            "type": "AzureDatabricksDeltaLakeExportCommand"
        }
    },
    "sink": {
        "type": "DelimitedTextSink",
        "storeSettings": {
            "type": "AzureBlobStorageWriteSettings"
        },
        "formatSettings": {
            "type": "DelimitedTextWriteSettings",
            "quoteAllText": true,
            "fileExtension": ".csv"
        }
    },
    "enableStaging": false
}

It seems like either the above error message isn't accurate, or there's a bug somewhere.

Has anyone else run into this?


1 answer

  1. Ramya Harinarthini_MSFT 5,311 Reputation points Microsoft Employee
    2023-11-03T06:26:57.3266667+00:00

    @Chris Kief

    Welcome to Microsoft Q&A, and thank you for posting your question here!

    The above error mainly happens because staging is not enabled. You need to enable staging to copy data from Delta Lake.

    In your Azure Databricks cluster, go to Advanced Options and edit the Spark config using the following format:

    spark.hadoop.fs.azure.account.key.<storage_account_name>.blob.core.windows.net <Access Key>
    
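    In the Copy activity itself, staged copy is turned on by setting `enableStaging` to `true` and pointing `stagingSettings` at a staging-capable linked service. A minimal sketch of how your inputs could look with staging enabled (the linked service name `StagingBlobStorage` and the `path` value are placeholders, not taken from your pipeline):

    ```json
    {
        "source": {
            "type": "AzureDatabricksDeltaLakeSource",
            "exportSettings": {
                "type": "AzureDatabricksDeltaLakeExportCommand"
            }
        },
        "sink": {
            "type": "DelimitedTextSink",
            "storeSettings": {
                "type": "AzureBlobStorageWriteSettings"
            },
            "formatSettings": {
                "type": "DelimitedTextWriteSettings",
                "quoteAllText": true,
                "fileExtension": ".csv"
            }
        },
        "enableStaging": true,
        "stagingSettings": {
            "linkedServiceName": {
                "referenceName": "StagingBlobStorage",
                "type": "LinkedServiceReference"
            },
            "path": "mystagingcontainer/staging"
        }
    }
    ```

    The staging linked service provides the interim storage that the Databricks cluster reads from and writes to, which is why the cluster-side Spark config above also needs the storage account key.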

    After that, you can follow the official documentation, which has a detailed explanation of the copy activity with Delta Lake.

    You can also refer to this article by RishShah-4592.

    Hope this helps!
    Kindly let us know if the above helps or if you need further assistance on this issue.


    Please do not forget to "Accept the answer" and "up-vote" wherever the information provided helps you, as this can be beneficial to other community members.

    1 person found this answer helpful.