Share via

How to Read Password-Protected Excel from Azure Blob Storage and Load into Azure SQL Using Azure Function + ADF

SYN 125 Reputation points
2025-07-15T14:44:54.3433333+00:00

Hi everyone,

I'm currently working on a data integration project where I need to copy data from a password-protected Excel file stored in Azure Blob Storage into an Azure SQL Database using Azure Data Factory (ADF).

Since ADF does not support reading password-protected Excel files directly, I am planning to use an Azure Function to first decrypt and convert the Excel file to a readable format (like CSV or unprotected Excel) and then use an ADF pipeline to load the data into Azure SQL.

Could someone please guide me through the step-by-step process to:

Create an Azure Function that:

Connects to Azure Blob Storage

Downloads and decrypts the password-protected Excel file

Converts it to a format that ADF can work with (e.g., CSV or unprotected Excel)

Saves the converted file back to Blob Storage

Design an ADF pipeline that:

Triggers the Azure Function

Reads the decrypted file from Blob Storage

Loads the data into Azure SQL Database

Detailed steps and best practices for implementation would be greatly appreciated!

Thanks in advance!

Azure Data Factory
Azure Data Factory

An Azure service for ingesting, preparing, and transforming data at scale.

0 comments No comments

1 answer

Sort by: Most helpful
  1. Venkat Reddy Navari 5,840 Reputation points Microsoft External Staff Moderator
    2025-07-15T19:16:22.8733333+00:00

    Hi YERNAIDU SIRAPARAPU You're absolutely right Azure Data Factory (ADF) doesn't natively support reading password-protected Excel files. Your approach using an Azure Function to preprocess the file is the correct and scalable solution.

    Below is a complete step-by-step guide to help you achieve this:

    Step 1: Azure Function – Decrypt and Convert Excel File

    1 Create the Azure Function

    • Use either a Timer or HTTP Trigger (HTTP is ideal if you want ADF to trigger it).
    • Languages like Python or C# work well. Here's a Python-based approach.

    2 Connect to Azure Blob Storage:

    Install SDKs:

    
    pip install azure-storage-blob pandas msoffcrypto-tool openpyxl
    

    Python code snippet:

    
    from azure.storage.blob import BlobServiceClient
    import msoffcrypto
    import pandas as pd
    
    # Connect to Blob
    blob_service_client = BlobServiceClient.from_connection_string("Your_Connection_String")
    blob_client = blob_service_client.get_blob_client(container="your-container", blob="protected.xlsx")
    
    # Download protected file
    with open("/tmp/protected.xlsx", "wb") as file:
        file.write(blob_client.download_blob().readall())
    
    # Decrypt Excel
    with open("/tmp/protected.xlsx", "rb") as f:
        office_file = msoffcrypto.OfficeFile(f)
        office_file.load_key(password="YourPassword")  # Ideally fetch from Azure Key Vault
        with open("/tmp/decrypted.xlsx", "wb") as decrypted:
            office_file.decrypt(decrypted)
    
    # Convert to CSV
    df = pd.read_excel("/tmp/decrypted.xlsx")
    df.to_csv("/tmp/output.csv", index=False)
    
    # Upload CSV back to Blob
    output_blob = blob_service_client.get_blob_client(container="your-container", blob="output.csv")
    with open("/tmp/output.csv", "rb") as data:
        output_blob.upload_blob(data, overwrite=True)
    

    Best Practice: Store your Excel password in Azure Key Vault and access it securely using managed identity.

    Step 2: ADF Pipeline – Orchestrate the Process

    1 Trigger Azure Function

    • Use a Web Activity to call your HTTP-triggered Azure Function.
    • Pass parameters like blob name or password if needed (you can also use secure system variables or key vault references).

    2 Copy Data from Blob to Azure SQL

    • Use Copy Data Activity: Source: Blob Storage (CSV file) Sink: Azure SQL Database
    • Define or auto-map columns in dataset schema mapping.

    Recommended Best Practices

    • Use Managed Identity for secure, keyless authentication to Azure resources.
    • Clean up temp files (/tmp/ folder in Azure Functions has limited space).
    • Enable Application Insights for logging and error monitoring.
    • Make the Azure Function idempotent to handle retries gracefully.

    Hope this helps. If this answers your query, do click Accept Answer and Yes for was this answer helpful. And, if you have any further query do let us know.

    Was this answer helpful?


Your answer

Answers can be marked as 'Accepted' by the question author and 'Recommended' by moderators, which helps users know the answer solved the author's problem.