Azure Functions v2 (Python): for loop that chunks data does not output to blob storage

jim01011 0 Reputation points
2024-04-22T09:14:36.15+00:00

I'm trying to write a simple for loop inside an Azure Functions v2 (Python) app that cuts a list into chunks and writes each chunk to blob storage as a CSV.

import logging
import random

import azure.functions as func
import pandas as pd


app = func.FunctionApp()

@app.blob_trigger(arg_name="myblob", path="old/{name}", connection="Storage")
@app.blob_output(arg_name="outputblob", path="new/{name}", connection="Storage")
def BlobTrigger(myblob: func.InputStream, outputblob: func.Out[str]):
    results = random.sample(range(1, 500), 7)
    step = 10
    for i in range(0, len(results), step): 
        x = i 
        results_chunked = results[x:x+step]
        df = pd.DataFrame(results_chunked)
        filename = '/tmp/'+ str(i) +'.csv'
        output_csv = df.to_csv(filename,index = False)
        outputblob.set(output_csv)

The function runs without errors ("Blob function was executed successfully" in the logs), but the chunked data (output_csv) never arrives at path="new/{name}". Any assistance with this?


1 answer

  1. SaravananGanesan-3177 1,665 Reputation points
    2024-04-28T17:46:06.99+00:00

    Hi,

    In your code you're trying to output CSV files to Azure Blob Storage from an Azure Function, but there are a couple of issues:

    1. The outputblob parameter in your function is of type func.Out[str], which expects a string. You're setting it with the return value of df.to_csv(filename), which is None whenever a file path is passed. Call df.to_csv(index=False) without a path and it returns the CSV content as a string you can hand to the binding (see the snippet after this list).
    2. You're writing to a hard-coded local path (/tmp/). Azure Functions only provide a small, ephemeral temporary area, and creating a file there does not by itself upload anything to blob storage.
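
    For reference, here's the to_csv() difference in isolation (plain pandas, outside any function app):

    import pandas as pd

    df = pd.DataFrame([1, 2, 3])
    print(df.to_csv('out.csv', index=False))  # None: the CSV was written to the file
    print(df.to_csv(index=False))             # '0\n1\n2\n3\n': returned as a string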

    Here's a revised version of your function:

    import logging
    import os
    import random
    import tempfile

    import azure.functions as func
    import pandas as pd

    app = func.FunctionApp()

    @app.blob_trigger(arg_name="myblob", path="old/{name}", connection="Storage")
    @app.blob_output(arg_name="outputblob", path="new/{name}", connection="Storage")
    def BlobTrigger(myblob: func.InputStream, outputblob: func.Out[str]):
        results = random.sample(range(1, 500), 7)
        step = 3  # Adjust the chunk size as needed
        for i in range(0, len(results), step):
            results_chunked = results[i:i+step]
            df = pd.DataFrame(results_chunked)
            # to_csv() with no path returns the CSV content as a string
            csv_data = df.to_csv(index=False)
            # Write the CSV data to a temporary file
            file_name = str(i) + '.csv'
            temp_file_path = os.path.join(tempfile.gettempdir(), file_name)
            with open(temp_file_path, 'w') as temp_file:
                temp_file.write(csv_data)
            # Read the file back and hand its contents to the output binding
            with open(temp_file_path, 'r') as temp_file:
                outputblob.set(temp_file.read())
            # Clean up the temporary file
            os.remove(temp_file_path)
    

    This generates the CSV chunks and pushes them through the function's outputblob binding; adjust the step variable to your chunk size requirements. One caveat: a single blob_output binding writes exactly one blob per invocation (here new/{name}), so each outputblob.set() call overwrites the previous one and only the last chunk ends up in storage. To persist one blob per chunk, write the blobs yourself with the storage SDK.
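
    Here's a minimal sketch of that SDK approach using the azure-storage-blob package. It assumes the connection string is exposed through an app setting named Storage and that the target container new already exists; adjust both to your setup:

    import os
    import random

    import azure.functions as func
    import pandas as pd
    from azure.storage.blob import BlobServiceClient

    app = func.FunctionApp()

    @app.blob_trigger(arg_name="myblob", path="old/{name}", connection="Storage")
    def BlobTrigger(myblob: func.InputStream):
        # Connect to the same storage account the trigger uses
        service = BlobServiceClient.from_connection_string(os.environ["Storage"])
        container = service.get_container_client("new")

        results = random.sample(range(1, 500), 7)
        step = 3
        for i in range(0, len(results), step):
            df = pd.DataFrame(results[i:i+step])
            csv_data = df.to_csv(index=False)
            # Each chunk becomes its own blob: new/0.csv, new/3.csv, ...
            container.upload_blob(name=f"{i}.csv", data=csv_data, overwrite=True)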
