Hello Babs,
Thanks for reaching out on Microsoft Q&A!
I understand that you want to run Python scripts through Azure Data Factory (ADF). Below are the steps you can follow to get this working:
- Ensure you have an active Azure subscription. This is required to create and use Azure services like Storage, Batch, and ADF.
- Create a Storage Account and a Batch Account
  - Sign in to the Azure Portal.
  - Navigate to "Create a resource", then search for and create both:
    - A Storage Account (where your Python scripts can be uploaded)
    - An Azure Batch Account (used to run your scripts on virtual machines)
- Set up your ADF pipeline
  - Go to Azure Data Factory Studio.
  - Create a new pipeline.
  - In the Activities pane, expand "Batch Service", and drag the Custom activity onto the pipeline canvas.
  - Select the Azure Batch tab, and then select New.
  - Complete the New linked service form as follows:
    - Name: Enter a name for the linked service, such as AzureBatch1.
    - Access key: Enter the primary access key you copied from your Batch account.
    - Account name: Enter your Batch account name.
    - Batch URL: Enter the account endpoint you copied from your Batch account, such as https://batchdotnet.eastus.batch.azure.com
    - Pool name: Enter the name of your Batch pool, such as custom-activity-pool, the pool created in Batch Explorer in the tutorial linked below.
    - Storage account linked service name: Select New. On the next screen, enter a Name for the linked storage service, such as AzureBlobStorage1, select your Azure subscription and linked storage account, and then select Create.
  - Select the Settings tab, and enter the following settings:
    - Command: Enter cmd /C python <<Your file name>>.py. The cmd /C prefix applies to Windows pool nodes.
    - Resource linked service: Select the linked storage service you created, such as AzureBlobStorage1, and test the connection to make sure it's successful.
    - Folder path: Select the folder icon, select the container that holds your script and input files, and then select OK. The files in this folder are downloaded from the container to the pool nodes before the Python script runs.
- Once the pipeline is set up, validate it and run it in debug mode to confirm that the script executes successfully.
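
If it helps, the settings above roughly correspond to the JSON that ADF stores for a Custom activity, which you can inspect in the pipeline's code view. The Python sketch below builds that structure as a dictionary so you can see how the Command, Resource linked service, and Folder path fields map. Treat it as an illustration only: the property names reflect my understanding of the Custom activity schema and should be checked against the documentation, and the activity name RunPythonScript, the script name main.py, the folder input, and the build_custom_activity helper are all hypothetical.

```python
import json


def build_custom_activity(script_name, batch_linked_service,
                          storage_linked_service, folder_path,
                          windows_nodes=True):
    """Return a dict shaped like an ADF Custom activity definition.

    This is a sketch, not an official schema: verify the property
    names against the Azure Data Factory Custom activity docs.
    """
    # "cmd /C" is the Windows form used in the steps above. For Linux
    # pool nodes we assume python3 is on the PATH (an assumption that
    # depends on the node image you chose).
    if windows_nodes:
        command = f"cmd /C python {script_name}"
    else:
        command = f"python3 {script_name}"

    return {
        "name": "RunPythonScript",  # hypothetical activity name
        "type": "Custom",
        # The Azure Batch linked service (e.g. AzureBatch1)
        "linkedServiceName": {
            "referenceName": batch_linked_service,
            "type": "LinkedServiceReference",
        },
        "typeProperties": {
            "command": command,
            # The storage linked service that holds the script
            "resourceLinkedService": {
                "referenceName": storage_linked_service,
                "type": "LinkedServiceReference",
            },
            # Container/folder whose files are copied to the node
            "folderPath": folder_path,
        },
    }


activity = build_custom_activity(
    "main.py", "AzureBatch1", "AzureBlobStorage1", "input"
)
print(json.dumps(activity, indent=2))
```

Comparing this output with what the code view shows for your own pipeline is a quick way to spot a mistyped command or a wrong linked service reference before you run a debug session.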
Please refer to this Microsoft documentation for more details:
Tutorial: Run a Batch job through Azure Data Factory - Azure Batch | Microsoft Learn
YouTube video: How to create etl/python pipeline in azure data factory | Azure Data Factory Execute Python script
Let me know if you require any additional information from our end. If this answers your query, please click "Upvote" and "Accept the answer", which might be beneficial to other community members reading this thread.
Thanks,
Kalyani