Hi Rajeev,
To copy data from Blob Storage to Databricks Delta Lake, you can use an Azure Data Factory (ADF) pipeline with a Delta Lake sink.
Here are the high-level steps to perform this task:
1. Create a new Azure Data Factory in the Azure portal.
2. Create a new pipeline in the Data Factory.
3. Add a Blob Storage source to the pipeline and configure it to read the data you want to copy.
4. Add a Delta Lake sink to the pipeline and configure it to write the data to your Delta Lake table or database. You'll need to provide the connection details for your Databricks workspace (for ADF's Azure Databricks Delta Lake connector these are the workspace URL, an existing cluster ID, and an access token) along with the target database and table name.
5. Optionally, use a staging directory to improve performance. In this case, the pipeline needs one copy activity that first copies the data from Blob Storage to the staging directory, and a second one that then copies it from the staging directory to Delta Lake.

Here's an example pipeline JSON that copies data from Blob Storage to Delta Lake:
```json
{
  "name": "CopyData",
  "properties": {
    "activities": [
      {
        "name": "CopyFromBlob",
        "type": "Copy",
        "inputs": [ { "name": "BlobInput" } ],
        "outputs": [ { "name": "StagingOutput" } ],
        "typeProperties": {
          "source": { "type": "BlobSource", "recursive": true },
          "sink": {
            "type": "FileSystemSink",
            "rootFolderPath": "wasbs://<your-container-name>@<your-storage-account-name>.blob.core.windows.net/staging/",
            "copyBehavior": "FlattenHierarchy",
            "writeBatchSize": 0,
            "writeBatchTimeout": "00:00:00"
          }
        },
        "policy": {
          "timeout": "7.00:00:00",
          "concurrency": 1,
          "executionPriorityOrder": "NewestFirst"
        }
      },
      {
        "name": "CopyToDeltaLake",
        "type": "Copy",
        "dependsOn": [
          { "activity": "CopyFromBlob", "dependencyConditions": [ "Succeeded" ] }
        ],
        "inputs": [ { "name": "StagingOutput" } ],
        "outputs": [ { "name": "DeltaLakeOutput" } ],
        "typeProperties": {
          "source": { "type": "FileSystemSource", "recursive": true },
          "sink": {
            "type": "DeltaLakeSink",
            "writeBatchSize": 0,
            "writeBatchTimeout": "00:00:00",
            "enableAutoCompaction": true,
            "maxBytesCompacted": "5368709120"
          },
          "enableStaging": true,
          "stagingLinkedService": {
            "referenceName": "<your-storage-account-name>",
            "type": "AzureBlobFS",
            "typeProperties": {
              "url": "wasbs://<your-container-name>@<your-storage-account-name>.blob.core.windows.net/staging/",
              "recursive": true,
              "enableServerSideEncryption": true,
              "encryptionType": "ServiceManaged",
              "fileName": ""
            }
          }
        },
        "policy": {
          "timeout": "7.00:00:00",
          "concurrency": 1,
          "executionPriorityOrder": "NewestFirst"
        }
      }
    ],
    "parameters": {
      "BlobInput": {
        "type": "String
```
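
ADF's Copy activity also has a built-in staged-copy option, so the two-step flow described in the steps can often be expressed as a single activity. The Python sketch below builds such a definition and prints it as JSON. Treat it as an illustration under assumptions rather than a tested pipeline: the helper `build_staged_copy_pipeline`, the dataset and linked-service names (`BlobInput`, `DeltaLakeOutput`, `StagingBlobStorage`), and the staging path are hypothetical, the source is assumed to be delimited text, and the property names (`AzureDatabricksDeltaLakeSink`, `enableStaging`, `stagingSettings`) reflect my reading of the ADF connector schema, so please check them against the current documentation.

```python
import json


def build_staged_copy_pipeline(
    source_dataset: str,
    sink_dataset: str,
    staging_linked_service: str,
    staging_path: str,
) -> dict:
    """Return a pipeline definition with one Copy activity that uses staged copy."""
    copy_activity = {
        "name": "CopyBlobToDeltaLake",
        "type": "Copy",
        "inputs": [{"referenceName": source_dataset, "type": "DatasetReference"}],
        "outputs": [{"referenceName": sink_dataset, "type": "DatasetReference"}],
        "typeProperties": {
            # Assumption: the blob data is delimited text (CSV).
            "source": {
                "type": "DelimitedTextSource",
                "storeSettings": {
                    "type": "AzureBlobStorageReadSettings",
                    "recursive": True,
                },
            },
            # Assumption: sink type name as used by the ADF Delta Lake connector.
            "sink": {"type": "AzureDatabricksDeltaLakeSink"},
            # Staged copy: ADF writes to the staging store first, then loads Delta Lake.
            "enableStaging": True,
            "stagingSettings": {
                "linkedServiceName": {
                    "referenceName": staging_linked_service,
                    "type": "LinkedServiceReference",
                },
                "path": staging_path,
            },
        },
    }
    return {"name": "CopyData", "properties": {"activities": [copy_activity]}}


# Hypothetical names; replace them with the datasets and linked service in your factory.
pipeline = build_staged_copy_pipeline(
    source_dataset="BlobInput",
    sink_dataset="DeltaLakeOutput",
    staging_linked_service="StagingBlobStorage",
    staging_path="<your-container-name>/staging",
)
print(json.dumps(pipeline, indent=2))
```

Generating the definition from a small function like this keeps the placeholders in one place, which helps if you need the same pipeline for several containers or tables.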