Azure Compute: Introduction To Azure Batch

Introduction

In this post, we talk about the Azure Batch service. It was created to run parallel batch workloads without the burden of managing the resources those jobs need, because Azure Batch itself creates and manages the pools of compute nodes. Imagine many parallel batch workloads, each independent of the others. At first glance it resembles another Azure service, Virtual Machine Scale Sets (VMSS).

We can start processing parallel workloads in the Azure Batch service ONLY through its APIs and tools. With them we create and manage a pool of compute nodes, and then schedule different jobs and tasks to run on it.

With the Azure Batch service we pay for the resources we use (compute and storage).

 

Azure Batch Use Cases

The table below lists some Azure Batch use cases, each with an example of where the Batch service fits.

Use Case | Example
Financial Data Analysis | Financial data analysis for a bank
Software Testing | A software development team can run multiple parallel tests for an application
AI Solution | A face recognition service

Azure Batch Architecture

The image below shows a sample Azure Batch architecture.

 

Note

A job runs on a Pool, and each Node (VM) of the Pool executes one or more of the job's tasks.

 

Parallel Job Processing

What the Azure Batch service does well is break a job down into several tasks. With this technique the application runs independently on each VM of the pool, and each task's result completes part of the work until we have the final result. The following example shows exactly how this works.
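The split-and-combine idea can be sketched locally in Python. This is only an illustration under stated assumptions, not the Batch API: the function names `split_job`, `run_task` and `run_job` are hypothetical, a thread pool stands in for the pool of VMs, and summing numbers stands in for the real application.

```python
from concurrent.futures import ThreadPoolExecutor


def split_job(items, task_count):
    """Split the job's work items into task_count chunks (one chunk per task)."""
    return [items[i::task_count] for i in range(task_count)]


def run_task(chunk):
    """Hypothetical stand-in for the application running on one node."""
    return sum(chunk)


def run_job(items, node_count):
    """Run every task independently, then combine the partial results."""
    tasks = split_job(items, node_count)
    # The thread pool plays the role of the pool of compute nodes.
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        partial_results = list(pool.map(run_task, tasks))
    return sum(partial_results)


if __name__ == "__main__":
    print(run_job(list(range(1, 101)), node_count=5))
```

Each chunk is processed without knowledge of the others, which is what allows the tasks to land on different nodes.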

 


Azure Batch Features

Azure Batch service features are divided into three major categories:

  1. Pools Management
  2. Jobs And Tasks Management
  3. Batch Solutions Monitoring

Pools Management

Pool management covers Nodes, Auto-Scaling, Low Priority Nodes and Application Packages.

Nodes 

Nodes fall into three categories: Cloud Service (Web and Worker Roles), Virtual Machines, and Custom Image VMs.

Auto-Scaling

In the Azure Batch service we can define parameters, based on the needs of the deployment, that enable the pool to scale automatically.
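The kind of decision such parameters drive can be sketched in Python. This is a hedged illustration only: the function and parameter names are hypothetical, and the Batch service evaluates scaling rules in its own formula syntax rather than in Python.

```python
def target_node_count(pending_tasks, tasks_per_node, min_nodes, max_nodes):
    """Return how many nodes the pool should have for the current backlog.

    All parameters are hypothetical inputs chosen for this sketch.
    """
    if tasks_per_node < 1:
        raise ValueError("tasks_per_node must be at least 1")
    # Ceiling division: nodes needed to run every pending task at once.
    needed = -(-pending_tasks // tasks_per_node)
    # Clamp the result between the configured floor and ceiling.
    return max(min_nodes, min(max_nodes, needed))
```

For example, with 17 pending tasks, 4 tasks per node and a ceiling of 10 nodes, the function returns 5.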

Low Priority Nodes

Azure Batch also offers Low Priority VMs, which are intended for low-priority workloads. This feature is NOT for critical workloads and is recommended when we need to reduce the cost of a Batch workload.

Application Packages

The Application Packages feature lets us upload app packages to the Batch account; they are then deployed automatically to one or more nodes in the pool.

Jobs And Tasks Management

Azure Batch manages the scheduling and execution of Jobs and Tasks. In some cases the input of task A is the output of task B, which means that tasks can depend on one another. Another great benefit is that the tasks can also run on multiple compute nodes.
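How dependent tasks can still be scheduled in parallel can be sketched with Python's standard `graphlib` module (Python 3.9 or later). This is an illustration, not the Batch scheduler: the function name `execution_waves` and the task names in the example are hypothetical.

```python
from graphlib import TopologicalSorter


def execution_waves(dependencies):
    """Group tasks into waves; tasks in the same wave can run in parallel.

    dependencies maps each task id to the set of task ids it must wait for.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # every task whose inputs are done
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    plan = {
        "merge": {"encode_1", "encode_2"},
        "encode_1": {"download"},
        "encode_2": {"download"},
    }
    print(execution_waves(plan))
```

In this example the two encode tasks share a wave, so they could run on different nodes while `merge` waits for both.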

Batch Solutions Monitoring

There are several tools with which we can monitor the nodes, jobs and tasks in the Azure Batch service. These are:


Azure Batch Concepts

Service Quotas And Limits

In this part of the post we look at the quotas and limits of the Azure Batch service. The following tables deserve a careful read: if we misunderstand the quota and limit values, we might run into problems with our workloads later.

 

Resource Quotas

Service quotas matter for Azure Batch workloads because a rough design can quite easily reach these limits.

Resource | Default Limit | Maximum Limit
Batch accounts per region per subscription | 1-3 | 50
Dedicated cores per Batch account | 10-100 | N/A
Low-priority cores per Batch account | 10-100 | N/A
Active jobs and job schedules per Batch account | 100-300 | 1000
Pools per Batch account | 20-100 | 500
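A design can be checked against these quotas before anything is deployed. The sketch below is a plain Python illustration under stated assumptions: `PoolPlan` and `quota_violations` are hypothetical names, and the quota values are passed in by the caller because the actual defaults vary per account.

```python
from dataclasses import dataclass


@dataclass
class PoolPlan:
    """Hypothetical description of one planned pool."""
    name: str
    dedicated_nodes: int
    low_priority_nodes: int
    cores_per_node: int


def quota_violations(plans, dedicated_core_quota, low_priority_core_quota, pool_quota):
    """Return a list of human-readable quota problems (empty if the plan fits)."""
    problems = []
    dedicated = sum(p.dedicated_nodes * p.cores_per_node for p in plans)
    low_priority = sum(p.low_priority_nodes * p.cores_per_node for p in plans)
    if len(plans) > pool_quota:
        problems.append(f"pools: {len(plans)} > {pool_quota}")
    if dedicated > dedicated_core_quota:
        problems.append(f"dedicated cores: {dedicated} > {dedicated_core_quota}")
    if low_priority > low_priority_core_quota:
        problems.append(f"low-priority cores: {low_priority} > {low_priority_core_quota}")
    return problems
```

A single pool of 4 dedicated quad-core nodes already needs 16 dedicated cores, which exceeds a dedicated core quota of 10.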

Pool Size Limits

The Pool Size is the number of nodes in the pool, where a single node is one virtual machine.

The next table shows the limits for the pool size.

Resource | Maximum Limit
Compute nodes in inter-node communication enabled pool |
  Batch service pool allocation mode | 100
  Batch subscription pool allocation mode | 80
Compute nodes in |
  Dedicated nodes | 2000
  Low-priority nodes | 1000

 

Note

We can choose between high-priority nodes, which are dedicated VMs, and low-priority nodes. Some limitations apply, and they can be found in this section of the post.

Other Limits

The remaining limits relate to the details of the Azure Batch workloads.

Resource | Maximum Limit
Concurrent tasks per compute node | 4 x number of node cores
Applications per Batch account | 20
Application packages per application | 40
Maximum task lifetime | 180 days
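The first row of the table is simple arithmetic, sketched here in Python. The function names are hypothetical; only the factor of 4 comes from the table.

```python
TASK_SLOT_FACTOR = 4  # from the table: 4 x number of node cores


def max_concurrent_tasks_per_node(node_cores):
    """Upper bound on tasks that may run at the same time on one node."""
    return TASK_SLOT_FACTOR * node_cores


def pool_task_capacity(node_count, node_cores, tasks_per_node):
    """Tasks the whole pool can run at once for a chosen per-node setting."""
    limit = max_concurrent_tasks_per_node(node_cores)
    if not 1 <= tasks_per_node <= limit:
        raise ValueError(f"tasks_per_node must be between 1 and {limit}")
    return node_count * tasks_per_node
```

A 2-core node can therefore run at most 8 tasks concurrently, and a pool of 5 such nodes at most 40.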

If a workload needs a higher quota on an Azure Batch account, we can follow the directions at this link.

Supported VM Sizes 

When we create an Azure Batch Pool, it is very important to select the correct VM size for the nodes of the Pool.

The tables below show the VM sizes that an Azure Batch Pool DOES NOT support.

Family | Unsupported sizes
Basic A-series | Basic_A0 (A0)
A-series | Standard_A0
B-series | All
DC series | All
Extreme memory optimized | All
Hb-series* | All
Hc-series* | All
Lsv2-series* | All
NDv2-series* | All
NVv2-series* | All
SAP-HANA | All

 

Note

* Not currently supported, but will be supported in the future

VM sizes which are supported for Low-Priority nodes

Family | Supported Sizes
M-Series | Standard_M64ms
M-Series | Standard_M128ms

Virtual Machine Image Type

There are two types of image to choose between: Pre-configured and Custom. Of course, there are some differences between the two types:

Pre-configured Image | Custom Image
The image already exists | Need to create a new one
No need for updating & patching | Need patching & updating
All custom software needs to be installed via pool config | No need for large changes in the pool config

 


How It Works

In the following steps, we will see a quick demo of the Azure Batch Service.

Prerequisites

To proceed further with the demo we must be sure that we have all the following:

Create The Azure Batch Account

Search for the Azure Batch Service

From the left main blade of the Azure Portal, select + Create a resource, type [Batch Service], and select Create to create the Batch Service.

Basics Tab

In the Basics tab we have to fill in a few fields and then move to the "Advanced" tab.

Setting | Value
Subscription | Select a valid subscription
Resource group | Select an existing Resource group or create a new one
Account name | Type a name for the Azure Batch Account, MUST be unique
Location | Select a Location for the Batch Instance
Select a Storage account | Select an existing storage account or, after the deployment completes, create a storage account and link it to the Azure Batch account.

Advanced Tab

In the Advanced tab, we must choose a Pool allocation mode. There are two choices, Batch service and User subscription; for the purposes of the demo we select Batch service.

Batch service | The pool VMs are created using behind-the-scenes Batch service subscriptions.
User subscription | The pool VMs are created directly in the same subscription as the Batch account.

Check this blog post about Azure Batch capabilities for more details. 

Review + Create Tab

In the Review + create tab, we just need to check if the validation passed and click Create to start the Azure Batch Account deployment.

After the Azure Batch account is created we are ready to see how this works. And that is the juicy part of this post.

Note

For this part of the demo we will use an existing project from GitHub, written by the user dlepow.

 


Azure Batch Sample

First, we go to GitHub and open the "Azure-Samples/batch-dotnet-ffmpeg-tutorial" repository, by clicking here.

Open the file BatchDotnetTutorialFfmpeg.sln in Visual Studio.

As we can see, some Dependencies are missing; for that reason we select Build - Rebuild Project.

We must download and install the .NET Core SDK, and then Build the Project (Build - Build Solution).

After the Build completes successfully, the Dependencies look fine.

Note

If we get the following error after the build, we should close and reopen Visual Studio.

Download & Install Application Packages

One of the prerequisites for this Azure Batch app is to download ffmpeg 3.4 and upload it to the Azure Batch Service. This can be done with the following steps:

  1. Download the 64-bit ffmpeg 3.4 zip file from here.
  2. Upload the zip file "ffmpeg-3.4-win64-static.zip" to the Azure Batch Service: from the left menu blade, select Features - Applications - + Add

 

The Code Part

After the solution builds successfully, we must make some changes to the code.

public class Program
{
    // Update the Batch and Storage account credential strings below with the values unique to your accounts.
    // These are used when constructing connection strings for the Batch and Storage client objects.

    // Batch account credentials
    private const string BatchAccountName = "xxxxxxxxx";
    private const string BatchAccountKey = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx==";
    private const string BatchAccountUrl = "https://xxxxxxxxxxxx.westeurope.batch.azure.com";

    // Storage account credentials
    private const string StorageAccountName = "xxxxxxxxxxxxxxxxxxxxx";
    private const string StorageAccountKey = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx==";

    // ... the rest of the sample's Program class is unchanged ...
}

The Batch and Storage account credentials can be found in the Batch Account dashboard in the Azure Portal (see the image below).
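The sample keeps these five values as constants in the source file, which is fine for a demo. A common alternative is to read them from environment variables so that keys never end up in source control. The sketch below shows that pattern in Python rather than in the sample's C#; the variable names are hypothetical and are not part of the sample.

```python
import os

# Hypothetical environment variable names, mirroring the sample's five constants.
REQUIRED_SETTINGS = (
    "BATCH_ACCOUNT_NAME",
    "BATCH_ACCOUNT_KEY",
    "BATCH_ACCOUNT_URL",
    "STORAGE_ACCOUNT_NAME",
    "STORAGE_ACCOUNT_KEY",
)


def load_settings(environ=None):
    """Return the five settings as a dict, failing fast if any is missing."""
    environ = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_SETTINGS if not environ.get(name)]
    if missing:
        raise KeyError("missing settings: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED_SETTINGS}
```

Failing on the first missing value gives a clearer error than an authentication failure later in the run.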

 


Running The App

We have completed all the necessary steps: downloads, installations, configuration and so on. The next thing to do is simply run the application.

 

Note

This App processes media files in parallel using the ffmpeg tool.

The first thing we see when the app runs is the following command prompt console.

Create Containers [Input] - [Output]

The Console App creates two Storage Containers (Input, Output).

Upload The Media Files

The second step is to upload the media files.

Create The Batch Pool

The App creates 5 low-priority nodes inside the Batch Pool, along with 5 Tasks that run in parallel across those nodes. In the next two images we can see the Pool and its nodes in the Azure Portal.
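The one-task-per-file pattern can be sketched in Python as follows. This is an illustration, not the sample's code: `build_task_commands` is a hypothetical helper, the `.mp3` output extension is an assumption, and the only ffmpeg option used is the standard `-i` input flag.

```python
from pathlib import PurePosixPath


def build_task_commands(input_files, output_extension=".mp3"):
    """Return one ffmpeg command line per input file, keyed by a task id."""
    commands = {}
    for index, name in enumerate(input_files):
        # Keep the file's base name and swap only the extension.
        output_name = PurePosixPath(name).with_suffix(output_extension).name
        commands[f"task{index}"] = f'ffmpeg -i "{name}" "{output_name}"'
    return commands
```

Because every command touches only its own input and output file, the tasks have no ordering constraints and can be spread freely over the nodes.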

The Running Tasks

In the image below we can clearly see how the Azure Batch parallel workload runs.

The Final Results

For the final step there is nothing left for us to do: all 5 files have been processed and created in the Output folder.

Conclusion

In this post we made a quick introduction to Azure Batch, a service that is aimed mainly at developers but can also be a useful and very important tool for other groups such as IT or, nowadays more fittingly, DevOps.
