Want to load table data into an Excel file using a sink in ADF

Anonymous
2023-06-16T15:24:39.5466667+00:00

Hi,

My requirement is to take table data and load it into an Excel file. How can we achieve this in ADF?

I ask because Excel is not supported as a sink in ADF. Please help me with detailed steps on how to implement this requirement.

Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

Accepted answer
  1. Subashri Vasudevan 11,226 Reputation points
    2023-06-19T09:44:05.8+00:00

    Bommisetty, Rakesh (Hudson IT Consultant)

    Hi Rakesh,

    The solution is twofold:

    1. Create the file as a delimited (CSV) file in ADLS, using a Copy activity or a data flow.
    2. Then use an Azure Function in Python to convert the CSV to XLSX. Here is the code you can use in the Azure Function:
    import logging
    import os

    import azure.functions as func
    import pandas as pd
    from azure.storage.blob import BlobServiceClient

    STORAGE_ACCOUNT_URL = 'https://storageacctname.blob.core.windows.net'
    STORAGE_ACCOUNT_KEY = 'SA Key'
    CONTAINER_NAME = 'yourcontainer'
    INPUT_FOLDER = 'yourinputfolder'
    OUTPUT_FOLDER = 'youroutputfolder'

    def main(req: func.HttpRequest) -> func.HttpResponse:
        logging.info('Python HTTP trigger function processed a request.')
        filename = req.params.get('name')
        if not filename:
            return func.HttpResponse('Pass the CSV file name in the "name" query parameter.', status_code=400)
        target_filename = filename.replace('.csv', '.xlsx')
        local_csv = '/tmp/' + filename
        local_xlsx = '/tmp/' + target_filename

        # Download the CSV from the input folder. Note that the folder path is part
        # of the blob name, not the container name.
        blob_service_client = BlobServiceClient(account_url=STORAGE_ACCOUNT_URL, credential=STORAGE_ACCOUNT_KEY)
        blob_client = blob_service_client.get_blob_client(CONTAINER_NAME, INPUT_FOLDER + '/' + filename)
        with open(local_csv, 'wb') as f:
            blob_client.download_blob().readinto(f)

        # Convert CSV to XLSX (pandas uses the openpyxl package to write .xlsx files).
        df = pd.read_csv(local_csv)
        df.to_excel(local_xlsx, header=True, index=False)

        # Upload the converted XLSX file (not the CSV) to the output folder.
        container_client = blob_service_client.get_container_client(container=CONTAINER_NAME)
        with open(local_xlsx, mode='rb') as data:
            container_client.upload_blob(name=OUTPUT_FOLDER + '/' + target_filename, data=data, overwrite=True)
        return func.HttpResponse('success')
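
    Note that pandas needs the openpyxl engine to write .xlsx files, so the function app's requirements.txt should list it alongside the other packages imported above (versions omitted here; pin them as appropriate for your app):

        azure-functions
        azure-storage-blob
        pandas
        openpyxl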
    
    
    

    What the above code does:

    It downloads the CSV file from ADLS, converts it to XLSX locally in the function's /tmp directory, and uploads the result back to the ADLS output folder.
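
    One caveat in the code above: filename.replace('.csv', '.xlsx') also rewrites a '.csv' that happens to occur in the middle of the name (for example my.csv.backup.csv). A safer sketch using only the standard library (the helper name xlsx_name is illustrative, not part of the original answer):

```python
import os

def xlsx_name(csv_name: str) -> str:
    """Swap a trailing .csv extension for .xlsx, leaving other names untouched."""
    stem, ext = os.path.splitext(csv_name)
    if ext.lower() == '.csv':
        return stem + '.xlsx'
    return csv_name

print(xlsx_name('input.csv'))          # input.xlsx
print(xlsx_name('my.csv.backup.csv'))  # my.csv.backup.xlsx
```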

    Then, using an Azure Function activity in the pipeline, you can call the function. In my case the function is HttpTrigger3, and I am passing input.csv as the filename.
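
    For reference, the Azure Function activity definition in the pipeline JSON would look roughly like the sketch below. The activity name and the linked service name are assumptions, and the query string appended to the function name follows the input.csv example above:

        {
            "name": "ConvertCsvToXlsx",
            "type": "AzureFunctionActivity",
            "typeProperties": {
                "functionName": "HttpTrigger3?name=input.csv",
                "method": "GET"
            },
            "linkedServiceName": {
                "referenceName": "AzureFunctionLinkedService",
                "type": "LinkedServiceReference"
            }
        }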

    [Screenshot: Azure Function activity settings in the pipeline, calling HttpTrigger3]

    Please let me know if you have questions

    1 person found this answer helpful.
