Hi @AzeemK ,
Thanks for the question and for using this forum.
I believe you are looking for this tutorial, which explains how to run Python scripts through Azure Data Factory using a Custom Activity (Azure Batch).
Here is the tutorial link: Run Python scripts through Azure Data Factory using Azure Batch
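For context, the tutorial configures the Custom Activity through the ADF portal, but the same activity can also be defined from Python with the azure-mgmt-datafactory package. Below is a minimal sketch of that approach; the names (RunPythonScript, AzureBatchLinkedService, main.py, PythonBatchPipeline) and the resource identifiers are illustrative placeholders, not values from the tutorial.

# A minimal sketch of defining the Custom Activity programmatically,
# assuming the azure-mgmt-datafactory and azure-identity packages.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    CustomActivity,
    LinkedServiceReference,
    PipelineResource,
)

# Hypothetical identifiers - replace with your own values.
subscription_id = "<subscription-id>"
resource_group = "<resource-group>"
factory_name = "<data-factory-name>"

adf_client = DataFactoryManagementClient(DefaultAzureCredential(), subscription_id)

# The Custom Activity hands a command line to the Azure Batch pool;
# "main.py" stands in for whatever script Batch stages on the task node.
activity = CustomActivity(
    name="RunPythonScript",
    command="python main.py",
    linked_service_name=LinkedServiceReference(
        type="LinkedServiceReference",
        reference_name="AzureBatchLinkedService",
    ),
)

adf_client.pipelines.create_or_update(
    resource_group,
    factory_name,
    "PythonBatchPipeline",
    PipelineResource(activities=[activity]),
)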
The tutorial uses the following Python script, which loads the iris.csv dataset from the input container, performs some data manipulation, and saves the results back to the output container.
# Load libraries
from azure.storage.blob import BlobServiceClient
import pandas as pd
# Define parameters
storageAccountURL = "<storage-account-url>"
storageKey = "<storage-account-key>"
containerName = "output"
# Establish connection with the blob storage account
blob_service_client = BlobServiceClient(account_url=storageAccountURL,
                                        credential=storageKey)
# Load iris dataset from the task node
df = pd.read_csv("iris.csv")
# Subset records
df = df[df['Species'] == "setosa"]
# Save the subset of the iris dataframe locally on the task node
df.to_csv("iris_setosa.csv", index=False)
# Upload the subset to the output container
container_client = blob_service_client.get_container_client(containerName)
with open("iris_setosa.csv", "rb") as data:
    blob_client = container_client.upload_blob(name="iris_setosa.csv", data=data)
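Note that the script reads iris.csv from the working directory of the Batch task node; in the tutorial, I believe the file is staged there via the folder path configured on the Custom Activity. If you wanted to pull the file straight from Blob Storage instead, a minimal sketch (assuming the input container is named "input") would be:

# A minimal sketch, assuming an input container named "input",
# that downloads iris.csv from Blob Storage instead of relying on
# Azure Batch to stage it on the task node.
input_container_client = blob_service_client.get_container_client("input")
with open("iris.csv", "wb") as file:
    file.write(input_container_client.download_blob("iris.csv").readall())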
Hope this info helps.
----------
Thank you
Please consider clicking "Accept Answer" and "Upvote" on the post that helps you, as it can be beneficial to other community members.