Run batch endpoints from Event Grid events in storage
APPLIES TO: Azure CLI ml extension v2 (current), Python SDK azure-ai-ml v2 (current)
Event Grid is a fully managed service that enables you to easily manage events across many different Azure services and applications. It simplifies building event-driven and serverless applications. In this tutorial, we learn how to trigger a batch endpoint's job to process files as soon as they are created in a storage account. In this architecture, we use a Logic App to subscribe to those events and trigger the endpoint.
The workflow looks as follows:
1. A file created event is triggered when a new blob is created in a specific storage account.
2. The event is sent to Event Grid, which dispatches it to all the subscribers.
3. A Logic App subscribes to listen to those events. Since the storage account can contain multiple data assets, event filtering is applied to react only to events happening in a specific folder inside of it. Further filtering can be done if needed (for instance, based on file extensions).
4. The Logic App is triggered, which in turn:
   - Gets an authorization token to invoke batch endpoints using the credentials of a service principal.
   - Triggers the batch endpoint (default deployment) using the newly created file as input.
5. The batch endpoint returns the name of the job that was created to process the file.
Important
When you use a Logic App connected with Event Grid to invoke a batch endpoint, you generate one job for each blob file created in the storage account. Keep in mind that, since batch endpoints distribute the work at the file level, there will not be any parallelization happening. Instead, you'll be taking advantage of batch endpoints' capability of executing multiple jobs under the same compute cluster. If you need to run jobs on entire folders in an automatic fashion, we recommend switching to Invoking batch endpoints from Azure Data Factory.
Prerequisites
- This example assumes that you have a model correctly deployed as a batch endpoint. This architecture can be extended to work with pipeline component deployments if needed.
- This example assumes that your batch deployment runs in a compute cluster called `batch-cluster`.
- The Logic App we're creating communicates with Azure Machine Learning batch endpoints using REST. To learn more about how to use the REST API of batch endpoints, read Create jobs and input data for batch endpoints.
Authenticating against batch endpoints
Azure Logic Apps can invoke the REST APIs of batch endpoints by using the HTTP activity. Batch endpoints support Microsoft Entra ID for authorization, and hence the requests made to the APIs require proper authentication handling.
We recommend using a service principal for authentication and interaction with batch endpoints in this scenario.
1. Create a service principal following the steps at Register an application with Microsoft Entra ID and create a service principal. (A CLI sketch of these steps follows this list.)
2. Create a secret to use for authentication, as explained at Option 3: Create a new client secret.
3. Take note of the client secret Value that is generated. It's only displayed once.
4. Take note of the `client ID` and the `tenant ID` in the Overview pane of the application.
5. Grant access for the service principal you created to your workspace, as explained at Grant access. In this example, the service principal requires:
   - Permission in the workspace to read batch deployments and perform actions over them.
   - Permission to read and write in data stores.
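If you prefer to script these steps, the following Azure CLI sketch shows one possible way to do it. The service principal name, role, and scope here are illustrative assumptions; adjust them to your environment and your organization's policies.

```azurecli
# Create the service principal. The output contains the values needed later:
# "appId" (client ID), "password" (client secret), and "tenant" (tenant ID).
az ad sp create-for-rbac --name "batch-endpoint-invoker"

# Grant the service principal access to the workspace. "AzureML Data Scientist"
# is one built-in role that allows invoking batch deployments; pick the role
# that matches the permissions listed above.
az role assignment create \
    --assignee "<client_id>" \
    --role "AzureML Data Scientist" \
    --scope "/subscriptions/<subscription_id>/resourceGroups/<resource_group>/providers/Microsoft.MachineLearningServices/workspaces/<workspace_name>"
```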
Enabling data access
We use the cloud URIs provided by Event Grid to indicate the input data to send to the deployment job. Batch endpoints use the identity of the compute to mount the data, while keeping the identity of the job to read it once mounted. Hence, we need to assign a user-assigned managed identity to the compute cluster to ensure it has access to mount the underlying data. Follow these steps to ensure data access:
Create a managed identity resource:
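For instance, with the Azure CLI (the identity name here is just an example):

```azurecli
# Create a user-assigned managed identity. Take note of the "id" (resource ID)
# and "principalId" values in the output; they're used in the following steps.
az identity create --name uai-batch-data-access --resource-group <resource_group>
```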
Update the compute cluster to use the managed identity we created:
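A minimal sketch with the Azure CLI ml extension v2, assuming the identity created in the previous step:

```azurecli
# Attach the user-assigned managed identity to the compute cluster.
az ml compute update --name cpu-cluster \
    --resource-group <resource_group> --workspace-name <workspace_name> \
    --identity-type user_assigned \
    --user-assigned-identities "/subscriptions/<subscription_id>/resourcegroups/<resource_group>/providers/Microsoft.ManagedIdentity/userAssignedIdentities/uai-batch-data-access"
```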
Note
This example assumes you have a compute cluster named `cpu-cluster` and that it's used for the default deployment in the endpoint.

Go to the Azure portal and ensure the managed identity has the right permissions to read the data. To access storage services, you must have at least Storage Blob Data Reader access to the storage account. Only storage account owners can change your access level via the Azure portal.
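You can also grant this role from the CLI. The scope below targets the whole storage account as an assumption; narrow it to a specific container if you prefer.

```azurecli
# Grant the managed identity read access to blobs in the storage account.
az role assignment create \
    --assignee "<managed_identity_principal_id>" \
    --role "Storage Blob Data Reader" \
    --scope "/subscriptions/<subscription_id>/resourceGroups/<resource_group>/providers/Microsoft.Storage/storageAccounts/<storage_account_name>"
```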
Create a Logic App
In the Azure portal, sign in with your Azure account.
On the Azure home page, select Create a resource.
On the Azure Marketplace menu, select Integration > Logic App.
On the Create Logic App pane, on the Basics tab, provide the following information about your logic app resource.
| Property | Required | Value | Description |
| --- | --- | --- | --- |
| Subscription | Yes | <Azure-subscription-name> | Your Azure subscription name. This example uses Pay-As-You-Go. |
| Resource Group | Yes | LA-TravelTime-RG | The Azure resource group where you create your logic app resource and related resources. This name must be unique across regions and can contain only letters, numbers, hyphens (`-`), underscores (`_`), parentheses (`(`, `)`), and periods (`.`). |
| Name | Yes | LA-TravelTime | Your logic app resource name, which must be unique across regions and can contain only letters, numbers, hyphens (`-`), underscores (`_`), parentheses (`(`, `)`), and periods (`.`). |

Before you continue making selections, go to the Plan section. For Plan type, select Consumption to show only the settings for a Consumption logic app workflow, which runs in multi-tenant Azure Logic Apps.
The Plan type property also specifies the billing model to use.
| Plan type | Description |
| --- | --- |
| Standard | This logic app type is the default selection. It runs in single-tenant Azure Logic Apps and uses the Standard billing model. |
| Consumption | This logic app type runs in global, multi-tenant Azure Logic Apps and uses the Consumption billing model. |

Important
For private-link enabled workspaces, you need to use the Standard plan for Logic Apps, with the allow private networking configuration.
Now continue with the following selections:
| Property | Required | Value | Description |
| --- | --- | --- | --- |
| Region | Yes | West US | The Azure datacenter region for storing your app's information. This example deploys the sample logic app to the West US region in Azure. Note: If your subscription is associated with an integration service environment, this list includes those environments. |
| Enable log analytics | Yes | No | This option appears and applies only when you select the Consumption logic app type. Change this option only when you want to enable diagnostic logging. For this tutorial, keep the default selection. |

When you're done, select Review + create. After Azure validates the information about your logic app resource, select Create.
After Azure deploys your app, select Go to resource.
Azure opens the workflow template selection pane, which shows an introduction video, commonly used triggers, and workflow template patterns.
Scroll down past the video and common triggers sections to the Templates section, and select Blank Logic App.
Configure the workflow parameters
This Logic App uses parameters to store the specific pieces of information that you need to run the batch deployment.
On the workflow designer, under the tool bar, select the option Parameters and configure them as follows:
To create a parameter, use the Add parameter option:
Create the following parameters.
| Parameter | Description | Sample value |
| --- | --- | --- |
| `tenant_id` | Tenant ID where the endpoint is deployed. | `00000000-0000-0000-00000000` |
| `client_id` | The client ID of the service principal used to invoke the endpoint. | `00000000-0000-0000-00000000` |
| `client_secret` | The client secret of the service principal used to invoke the endpoint. | `ABCDEFGhijkLMNOPQRstUVwz` |
| `endpoint_uri` | The endpoint scoring URI. | `https://<endpoint_name>.<region>.inference.ml.azure.com/jobs` |
Important
`endpoint_uri` is the URI of the endpoint you're trying to execute. The endpoint must have a default deployment configured.

Tip
Use the values configured at Authenticating against batch endpoints.
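If you don't have the scoring URI at hand, one way to query it is with the Azure CLI:

```azurecli
# Query the scoring URI of the batch endpoint.
az ml batch-endpoint show --name <endpoint_name> \
    --resource-group <resource_group> --workspace-name <workspace_name> \
    --query scoring_uri --output tsv
```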
Add the trigger
We want to trigger the Logic App each time a new file is created in a given folder (data asset) of a Storage Account. The Logic App uses the information of the event to invoke the batch endpoint and pass the specific file to be processed.
On the workflow designer, under the search box, select Built-in.
In the search box, enter event grid, and select the trigger named When a resource event occurs.
Configure the trigger as follows:
| Property | Value | Description |
| --- | --- | --- |
| Subscription | Your subscription name | The subscription where the Azure Storage Account is placed. |
| Resource Type | `Microsoft.Storage.StorageAccounts` | The resource type emitting the events. |
| Resource Name | Your storage account name | The name of the Storage Account where the files will be generated. |
| Event Type Item | `Microsoft.Storage.BlobCreated` | The event type. |

Click on Add new parameter, select Prefix Filter, and add the value `/blobServices/default/containers/<container_name>/blobs/<path_to_data_folder>`.

Important
Prefix Filter allows Event Grid to notify the workflow only when a blob is created in the specific path we indicated. In this case, we're assuming that files will be created by some external process in the folder `<path_to_data_folder>` inside the container `<container_name>` in the selected Storage Account. Configure this parameter to match the location of your data; otherwise, the event will fire for any file created at any location of the Storage Account. See Event filtering for Event Grid for more details.

The trigger will look as follows:
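For reference, the trigger receives an array of Event Grid events. A trimmed, illustrative example of a blob-created payload is shown below; the blob URL that the workflow uses later lives at `data.url`:

```json
[
  {
    "topic": "/subscriptions/<subscription_id>/resourceGroups/<resource_group>/providers/Microsoft.Storage/storageAccounts/<storage_account>",
    "subject": "/blobServices/default/containers/<container_name>/blobs/<path_to_data_folder>/newfile.csv",
    "eventType": "Microsoft.Storage.BlobCreated",
    "data": {
      "api": "PutBlob",
      "contentType": "text/csv",
      "blobType": "BlockBlob",
      "url": "https://<storage_account>.blob.core.windows.net/<container_name>/<path_to_data_folder>/newfile.csv"
    }
  }
]
```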
Configure the actions
Click on + New step.
On the workflow designer, under the search box, select Built-in and then click on HTTP:
Configure the action as follows:
| Property | Value | Notes |
| --- | --- | --- |
| Method | `POST` | The HTTP method. |
| URI | `concat('https://login.microsoftonline.com/', parameters('tenant_id'), '/oauth2/token')` | Click on Add dynamic context, then Expression, to enter this expression. |
| Headers | `Content-Type` with value `application/x-www-form-urlencoded` | |
| Body | `concat('grant_type=client_credentials&client_id=', parameters('client_id'), '&client_secret=', parameters('client_secret'), '&resource=https://ml.azure.com')` | Click on Add dynamic context, then Expression, to enter this expression. |

The action will look as follows:
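For reference, this action performs the OAuth 2.0 client credentials flow. A rough, standalone equivalent looks like the following sketch (the placeholders map to the workflow parameters):

```bash
# Request an access token for the service principal against the Azure ML resource.
curl -X POST "https://login.microsoftonline.com/<tenant_id>/oauth2/token" \
    -H "Content-Type: application/x-www-form-urlencoded" \
    -d "grant_type=client_credentials&client_id=<client_id>&client_secret=<client_secret>&resource=https://ml.azure.com"
```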
Click on + New step.
On the workflow designer, under the search box, select Built-in and then click on HTTP:
Configure the action as follows:
| Property | Value | Notes |
| --- | --- | --- |
| Method | `POST` | The HTTP method. |
| URI | `endpoint_uri` | Click on Add dynamic context, then select it under parameters. |
| Headers | `Content-Type` with value `application/json` | |
| Headers | `Authorization` with value `concat('Bearer ', body('Authorize')['access_token'])` | Click on Add dynamic context, then Expression, to enter this expression. |

In the parameter Body, click on Add dynamic context, then Expression, to enter the following expression:

`replace('{ "properties": { "InputData": { "mnistinput": { "JobInputType" : "UriFile", "Uri" : "<JOB_INPUT_URI>" } } } }', '<JOB_INPUT_URI>', triggerBody()?[0]['data']['url'])`
Tip
The previous payload corresponds to a model deployment. If you're working with a pipeline component deployment, adapt the format to match the expectations of the pipeline's inputs. Learn more about how to structure the input in REST calls at Create jobs and input data for batch endpoints (REST).
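For reference, once the expression is resolved at run time, the request sent by this action looks roughly like the following standalone sketch (the file URL is illustrative, and `<access_token>` stands for the token obtained by the previous action):

```bash
# Invoke the batch endpoint (default deployment) with the newly created blob
# as a single-file (UriFile) input.
curl -X POST "<endpoint_uri>" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer <access_token>" \
    -d '{
      "properties": {
        "InputData": {
          "mnistinput": {
            "JobInputType": "UriFile",
            "Uri": "https://<storage_account>.blob.core.windows.net/<container_name>/<path_to_data_folder>/newfile.csv"
          }
        }
      }
    }'
```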
The action will look as follows:
Note
Notice that this last action triggers the batch job, but it doesn't wait for its completion. Azure Logic Apps isn't designed for long-running applications. If you need to wait for the job to complete, we recommend switching to Run batch endpoints from Azure Data Factory.
Click on Save.
The Logic App is ready to be executed, and it will trigger automatically each time a new file is created under the indicated path. You can verify that the app has successfully received an event by checking its Run history: