How to Execute a Shell Script from Azure Data Factory (ADF)

Firthouse M G 60 Reputation points
2024-11-26T09:02:19.2+00:00

Need assistance with executing a shell script located on a Linux batch server hosted on an Azure Virtual Machine. The requirement involves loading data into a target OLAP system, and after the data is loaded, a Linux shell script located on the batch Linux server needs to be executed. This script will contain complex logic to call APIs and transfer data from the OLAP to the OLTP database.

What steps are needed to achieve this? What are the different methods available, and what prerequisites are required?

Hoping to get answers as soon as possible.

Azure Data Factory

2 answers

  1. Nandan Hegde 32,906 Reputation points MVP
    2024-11-26T09:22:35.1033333+00:00

    Based on my understanding, ADF cannot directly trigger a shell script located on a VM.

    You can use an Azure Automation runbook for that purpose and trigger the runbook from an ADF pipeline by calling the runbook's webhook.

    1 person found this answer helpful.

  2. Ganesh Gurram 1,845 Reputation points Microsoft Vendor
    2024-11-29T11:57:52.2033333+00:00

    @Firthouse M G - Thanks for the question and using MS Q&A forum.

    Here’s a step-by-step guide to creating an Azure Automation runbook that uses PowerShell to run a Linux shell script on an Azure VM, and then calling this runbook from Azure Data Factory (ADF) with a Web activity that posts to the runbook's webhook:

    Create an Azure Automation Account:

    • Navigate to the Azure portal and create a new Automation Account.
    • Create a Runbook within the Automation Account.

    Add the PowerShell Script:

    param (
        # Populated by Azure Automation when the runbook is started through a webhook
        [object]$WebhookData
    )
    
    # The JSON body posted by ADF arrives as a string in RequestBody
    $body = $WebhookData.RequestBody | ConvertFrom-Json
    $vmName            = $body.vmName
    $resourceGroupName = $body.resourceGroupName
    $scriptPath        = $body.scriptPath
    $sshUsername       = $body.sshUsername
    $sshPassword       = $body.sshPassword
    
    # Authenticate with a managed identity that can read the resource group
    # (Run As accounts such as AzureRunAsConnection were retired in September 2023)
    Connect-AzAccount -Identity | Out-Null
    
    # Get the public IP of the VM (assumes the public IP resource is named "<vmName>-ip")
    $publicIP = (Get-AzPublicIpAddress -ResourceGroupName $resourceGroupName -Name "$vmName-ip").IpAddress
    
    # Run the script that already lives on the Linux VM over SSH.
    # Assumes the runbook executes on a Linux Hybrid Runbook Worker with ssh and sshpass installed.
    & sshpass -p $sshPassword ssh -o StrictHostKeyChecking=no "$sshUsername@$publicIP" "bash '$scriptPath'"
    if ($LASTEXITCODE -ne 0) {
        throw "Remote script failed with exit code $LASTEXITCODE"
    }
    
    
    
    • Save and publish the runbook.

    Create a Webhook for the Runbook

    1. Create a Webhook:
      • In the Automation Account, navigate to the runbook you just created.
      • Click on Webhooks under Resources.
      • Click Add a webhook.
      • Fill in the details, such as the webhook name and expiry date. The values the runbook needs (e.g., vmName, resourceGroupName, scriptPath, sshUsername, sshPassword) are sent later in the request body from ADF.
      • Copy the webhook URL before you finish creating the webhook. It is shown only once, and you will need it for ADF.
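
    Before wiring the webhook into ADF, it can help to call it once by hand. The following is a minimal Python sketch of such a smoke test, not part of the original answer. The function names `build_webhook_request` and `start_runbook` are hypothetical, and the `JobIds` field reflects my understanding of the JSON that Azure Automation returns when it accepts a webhook call.

```python
import json
import urllib.request


def build_webhook_request(webhook_url: str, params: dict) -> urllib.request.Request:
    """Build a POST request that carries the runbook parameters as a JSON body."""
    body = json.dumps(params).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def start_runbook(webhook_url: str, params: dict) -> list:
    """POST to the webhook and return the job IDs reported in the response.

    Assumption: the response body is JSON with a "JobIds" array.
    """
    request = build_webhook_request(webhook_url, params)
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.loads(response.read().decode("utf-8"))
    return payload.get("JobIds", [])
```

    Calling `start_runbook(url, {"vmName": "your-vm-name", ...})` with the copied webhook URL should queue a runbook job, which you can then inspect under Jobs in the Automation Account.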

    Use a Web Activity in Azure Data Factory (ADF)

    1. Create a new Pipeline:
      • In ADF, create a new pipeline.
    2. Add a Web Activity:
      • From the Activities pane, drag a Web activity to the pipeline canvas.
      • Click on the Web activity to configure it.
    3. Configure the Web Activity:
      • In the Settings tab, set the following:
        • URL: Paste the webhook URL you copied earlier.
        • Method: POST
        • Headers: Set Content-Type to application/json.
        • Body: Construct the JSON body with the parameters required by your runbook, e.g.:
              {
                  "vmName": "your-vm-name",
                  "resourceGroupName": "your-resource-group",
                  "scriptPath": "/path/to/your/script.sh",
                  "sshUsername": "your-ssh-username",
                  "sshPassword": "your-ssh-password"
              }
              
        
    • Debug the pipeline to ensure it works correctly. Once confirmed, you can trigger the pipeline based on your requirements.
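
    For reference, the Web activity settings above correspond roughly to an activity definition in the pipeline's JSON code view. The Python sketch below builds and prints such a definition so the shape is easy to see. The activity name `TriggerRunbook` and the helper `web_activity_definition` are hypothetical, and the exact JSON that ADF generates may contain additional fields.

```python
import json


def web_activity_definition(webhook_url: str, runbook_params: dict) -> dict:
    """Return a dict shaped like an ADF Web activity that POSTs to a webhook."""
    return {
        "name": "TriggerRunbook",  # hypothetical activity name
        "type": "WebActivity",
        "typeProperties": {
            "url": webhook_url,
            "method": "POST",
            "headers": {"Content-Type": "application/json"},
            "body": runbook_params,
        },
    }


# Placeholder values only: replace them with your own resources.
example = web_activity_definition(
    "https://example.invalid/webhooks?token=REPLACE_ME",
    {
        "vmName": "your-vm-name",
        "resourceGroupName": "your-resource-group",
        "scriptPath": "/path/to/your/script.sh",
    },
)
print(json.dumps(example, indent=2))
```

    Comparing the printed output with the code view of your own pipeline is a quick way to confirm that the URL, method, header, and body were entered as intended.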

    This setup involves creating an Azure Automation runbook with PowerShell to execute a Linux shell script on an Azure VM and using an ADF Web activity to call the runbook's webhook. The Web activity completes as soon as Azure Automation accepts the request, so it does not wait for the script to finish. Make sure to replace placeholder values with your actual Azure resources and credentials.

    Hope this helps. Do let us know if you have any further issues!


    If this answers your query, please click `Accept Answer` and `Yes` for "Was this answer helpful". If you have any further queries, do let us know.

    1 person found this answer helpful.
