Tutorial: Trigger a Batch job using Azure Functions

In this tutorial, you'll learn how to trigger a Batch job using Azure Functions. We'll walk through an example in which documents added to an Azure Storage blob container have optical character recognition (OCR) applied to them via Azure Batch. To streamline the OCR processing, we'll configure an Azure function that runs a Batch OCR job each time a file is added to the blob container. You'll learn how to:

  • Use Batch Explorer to create pools and jobs
  • Use Storage Explorer to create blob containers and a shared access signature (SAS)
  • Create a blob-triggered Azure Function
  • Upload input files to Storage
  • Monitor task execution
  • Retrieve output files


Sign in to Azure

Sign in to the Azure portal.

Create a Batch pool and Batch job using Batch Explorer

In this section, you'll use Batch Explorer to create the Batch pool and Batch job that will run OCR tasks.

Create a pool

  1. Sign in to Batch Explorer using your Azure credentials.
  2. Create a pool by selecting Pools on the left side bar, then the Add button above the search form.
    1. Choose an ID and display name. We'll use ocr-pool for this example.
    2. Set the scale type to Fixed size, and set the dedicated node count to 3.
    3. Select Ubuntuserver > 18.04-lts as the operating system.
    4. Choose Standard_f2s_v2 as the virtual machine size.
    5. Enable the start task and add the command /bin/bash -c "sudo update-locale LC_ALL=C.UTF-8 LANG=C.UTF-8; sudo apt-get update; sudo apt-get -y install ocrmypdf". Be sure to set the user identity as Task user (Admin), which allows start tasks to include commands with sudo.
    6. Select OK.
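The start task above is wrapped in /bin/bash -c because Batch does not run a command line under a shell by itself, so the three steps have to be passed to bash as a single quoted argument. The following Python sketch shows how such a command line can be assembled and checked before you paste it into Batch Explorer. The helper name wrap_for_batch and its stop_on_error flag are hypothetical and not part of any Batch SDK.

```python
import shlex

# The three setup steps from the start task in this tutorial.
START_TASK_STEPS = [
    "sudo update-locale LC_ALL=C.UTF-8 LANG=C.UTF-8",
    "sudo apt-get update",
    "sudo apt-get -y install ocrmypdf",
]


def wrap_for_batch(steps, stop_on_error=False):
    """Join shell steps and wrap them in `/bin/bash -c` as one command line.

    Hypothetical helper for illustration only.
    """
    # ";" runs every step regardless of failures (as in the tutorial);
    # "&&" stops at the first step that fails.
    separator = " && " if stop_on_error else "; "
    script = separator.join(steps)
    # shlex.quote single-quotes the script so bash receives it as one argument.
    return "/bin/bash -c " + shlex.quote(script)


if __name__ == "__main__":
    print(wrap_for_batch(START_TASK_STEPS))
```

The printed command uses single quotes instead of the double quotes shown above. Both forms hand the same script to bash. Joining with && instead of ; is a design choice worth considering, since it stops the start task as soon as one step fails.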

Create a job

  1. Create a job on the pool by selecting Jobs on the left side bar, then the Add button above the search form.
    1. Choose an ID and display name. We'll use ocr-job for this example.
    2. Set the pool to ocr-pool, or whatever name you chose for your pool.
    3. Select OK.

Create blob containers

Here you'll create blob containers that will store your input and output files for the OCR Batch job. In this example, the input container is named input and is where all documents without OCR are initially uploaded for processing. The output container is named output and is where the Batch job writes processed documents with OCR.

  1. Sign in to Storage Explorer using your Azure credentials.
  2. Using the storage account linked to your Batch account, create two blob containers (one for input files, one for output files) by following the steps at Create a blob container.
  3. Create a shared access signature for your output container in Storage Explorer by right-clicking the output container and selecting Get Shared Access Signature.... Under Permissions, select Write. No other permissions are necessary.
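Before pasting the shared access signature into the function, it can help to confirm that it points at the output container and grants only Write. A SAS URL carries its settings as query parameters: sp holds the permissions, se the expiry time, and sig the signature. The Python sketch below reads those fields using only the standard library. The account name, token values, and the helper name sas_summary are made-up placeholders, not values from this tutorial.

```python
from urllib.parse import parse_qs, urlsplit


def sas_summary(sas_url):
    """Extract the container name and key SAS fields from a container SAS URL.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(sas_url)
    query = parse_qs(parts.query)  # also decodes %3A back to ":"
    return {
        "container": parts.path.strip("/"),
        "permissions": query.get("sp", [""])[0],  # for example "w" for Write
        "expiry": query.get("se", [""])[0],       # expiry time as written in the URL
        "signed": "sig" in query,                 # True if a signature is present
    }


if __name__ == "__main__":
    # Placeholder URL: the account, dates, and signature are invented.
    example = (
        "https://mystorageaccount.blob.core.windows.net/output"
        "?sv=2018-03-28&sr=c&sp=w&se=2019-06-29T03%3A45%3A00Z&sig=REDACTED"
    )
    summary = sas_summary(example)
    print(summary)
    print("write-only:", summary["permissions"] == "w")
```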

Create an Azure Function

In this section, you'll create the Azure Function that triggers the OCR Batch job whenever a file is uploaded to your input container.

  1. Follow the steps in Create a function triggered by Azure Blob storage to create a function.
    1. For runtime stack, choose .NET. We'll write our function in C# to leverage the Batch .NET SDK.
    2. When prompted for a storage account under Hosting, use the same storage account that you linked to your Batch account.
    3. When creating the Azure Blob storage trigger, set the path to input/{name} so that it matches the name of your input container.
  2. Once the blob-triggered function is created, select Code + Test. Use the run.csx and function.proj files from GitHub in the function. Because function.proj doesn't exist by default, select the Upload button to upload it into your development workspace.
    • run.csx is run when a new blob is added to your input blob container.
    • function.proj lists the external libraries in your Function code, for example, the Batch .NET SDK.
  3. Change the placeholder values of the variables in the Run() function of the run.csx file to reflect your Batch and storage credentials. You can find your Batch and storage account credentials in the Azure portal, in the Keys section of your Batch account.
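To make the function's job more concrete, here is a rough Python sketch of the per-file plan it has to produce: one input blob, one output PDF, one output text file, and a command line for the Batch task. This is not the tutorial's run.csx, which is written in C#. The _ocr naming scheme and the helper name plan_ocr_task are assumptions made for illustration. The --sidecar option of ocrmypdf, which writes the recognized text to a separate file, is a real option.

```python
import shlex
from pathlib import PurePosixPath


def plan_ocr_task(blob_name):
    """Derive output names and an ocrmypdf command line for one input blob.

    Hypothetical helper; the "_ocr" suffix is an assumed naming scheme.
    """
    stem = PurePosixPath(blob_name).stem  # "report.pdf" -> "report"
    output_pdf = f"{stem}_ocr.pdf"
    output_txt = f"{stem}_ocr.txt"
    # shlex.join quotes any argument containing spaces or shell metacharacters.
    command_line = shlex.join(
        ["ocrmypdf", "--sidecar", output_txt, blob_name, output_pdf]
    )
    return {
        "resource_file": blob_name,
        "output_pdf": output_pdf,
        "output_txt": output_txt,
        "command_line": command_line,
    }


if __name__ == "__main__":
    print(plan_ocr_task("report.pdf")["command_line"])
```

In the real function, the input blob would be attached to the task as a resource file and the two outputs would be uploaded to the output container through the SAS created earlier.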

Trigger the function and retrieve results

Upload any or all of the scanned files from the input_files directory on GitHub to your input container. Monitor Batch Explorer to confirm that a task is added to ocr-job, which runs on ocr-pool, for each file. After a few seconds, the file with OCR applied is added to the output container. The file is then visible and retrievable in Storage Explorer.

Additionally, you can watch the log output at the bottom of the Azure Functions web editor window, where you'll see messages like this for every file you upload to your input container:

2019-05-29T19:45:25.846 [Information] Creating job...
2019-05-29T19:45:25.847 [Information] Accessing input container <inputContainer>...
2019-05-29T19:45:25.847 [Information] Adding <fileName> as a resource file...
2019-05-29T19:45:25.848 [Information] Name of output text file: <outputTxtFile>
2019-05-29T19:45:25.848 [Information] Name of output PDF file: <outputPdfFile>
2019-05-29T19:45:26.200 [Information] Adding OCR task <taskID> for <fileName> <size of fileName>...
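If you upload many files at once, it can be easier to check a copy of the log text programmatically than by eye. The Python sketch below assumes that log lines follow the shape shown above (timestamp, level in square brackets, message) and that file names contain no spaces. The helper name tasks_added and the sample values are hypothetical.

```python
import re

# "<timestamp> [<level>] <message>", matching the sample log lines.
LOG_LINE = re.compile(r"^(?P<timestamp>\S+) \[(?P<level>\w+)\] (?P<message>.*)$")
# Message emitted when a task is queued for a file.
TASK_ADDED = re.compile(r"^Adding OCR task (?P<task_id>\S+) for (?P<file_name>\S+)")


def tasks_added(log_text):
    """Return (task_id, file_name) pairs for every 'Adding OCR task' line."""
    found = []
    for line in log_text.splitlines():
        entry = LOG_LINE.match(line.strip())
        if entry is None:
            continue  # skip blank or unrecognized lines
        task = TASK_ADDED.match(entry.group("message"))
        if task is not None:
            found.append((task.group("task_id"), task.group("file_name")))
    return found


if __name__ == "__main__":
    # Invented sample values in the same layout as the log above.
    sample = "\n".join(
        [
            "2019-05-29T19:45:25.846 [Information] Creating job...",
            "2019-05-29T19:45:26.200 [Information] Adding OCR task ocr-task-1 for scan1.pdf 20480...",
        ]
    )
    print(tasks_added(sample))
```

Comparing the returned file names with the list of files you uploaded is a quick way to spot an upload that did not produce a task.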

To download the output files from Storage Explorer to your local machine, first select the files you want, and then select Download on the top ribbon.


The downloaded files are searchable if opened in a PDF reader.

Clean up resources

You are charged for the pool while the nodes are running, even if no jobs are scheduled. When you no longer need the pool, delete it with the following steps:

  1. In the account view, select Pools and the name of the pool.
  2. Select Delete.

When you delete the pool, all task output on the nodes is deleted. However, the output files remain in the storage account. When you no longer need them, you can also delete the Batch account and the storage account.

Next steps

For more examples of using the .NET API to schedule and process Batch workloads, see the samples on GitHub.