Introducing Linux support on Azure Batch
Today we are excited to announce a preview of Azure Batch for Linux virtual machines. This brings the power of Batch “scheduling-as-a-service” to customers running Linux applications and workflows in industry and scientific research.
Under the covers, Batch uses Virtual Machine Scale Sets to deploy and manage Linux virtual machines. The Batch agent that manages job and task execution on compute nodes is written in Python for portability. This complements the existing Batch support for Cloud Services. VM Scale Sets will also give us additional features down the road, such as custom VM images.
This is a short guide to using Linux on Batch. A detailed document will be available on the Azure Batch documentation site shortly.
- Prerequisite
- Using Linux on Batch
- Command Line Support (Azure xplat CLI)
- Troubleshooting - Using SSH
- Linux Distribution Support Matrix
- Pricing
- References
Prerequisite
Batch Account
A Batch account can be created in the Azure management portal. See here for detailed instructions.
Client SDK
The Batch service provides a set of SDKs for interacting with it.
- The .NET SDK can be downloaded from NuGet.
- The Python SDK is part of the Azure Python SDK on PyPI.
- The Node.js SDK can be found on npm.
- Java support is coming soon.
Using Linux on Batch
If you are not familiar with the Azure Batch service, here is a good step-by-step tutorial on using Batch.
A typical Batch application involves four steps: creating a client, creating a pool, creating a job, and adding tasks to the job. The behavior is the same as with a normal Batch client; the only difference is how you specify the OS of the pool's virtual machines.
With Linux on Batch, instead of an OS family and version, the client must specify the publisher, offer, SKU, and version of the VM image, plus the node agent SKU ID. The supported Linux distributions are listed in a later section of this post. For example, the code in section #2 creates a pool based on the latest version of Ubuntu 14.04-LTS.
1. Initializing Batch Client
creds = SharedKeyCredentials(account, key)
config = BatchServiceConfiguration(creds, base_url = batch_url)
client = BatchService(config)
2. Create a pool using Python SDK.
new_pool = PoolAddParameter(id = "<pool name>", vm_size = "e.g. STANDARD_A4")
new_pool.target_dedicated = 4

# Start task: runs on every node when it joins the pool
start_task = StartTask()
start_task.run_elevated = True
start_task.command_line = '<pool prep task cli>'
new_pool.start_task = start_task

# VM image and the matching node agent
ir = ImageReference(
    publisher = "Canonical",
    offer = "UbuntuServer",
    sku = "14.04.2-LTS",
    version = "latest")
vmc = VirtualMachineConfiguration(
    image_reference = ir,
    node_agent_sku_id = "Batch.Node.Ubuntu 14.04")
new_pool.virtual_machine_configuration = vmc

# Submit the pool to the Batch service
client.pool.add(new_pool)
3. Submit a job using Python SDK.
pool_info = PoolInformation(pool_id = pool_id)
job = JobAddParameter(id = jobId, pool_info = pool_info)

job_preparation_task = JobPreparationTask()
job_preparation_task.command_line = "<start up cli> "
job_preparation_task.run_elevated = True # If specified, task will run as su
job.job_preparation_task = job_preparation_task

client.job.add(job)
4. Add a task to the job using Python SDK.
taskId = '<task name>'
taskCmdLine = '<task cli>'
task = TaskAddParameter(id = taskId, command_line=taskCmdLine)
task.run_elevated = True # If specified, task will run as su
client.task.add(job.id, task) # job is the JobAddParameter from step 3
For advanced features, refer to the MSDN and Python API reference documentation.
Command Line Support (Azure xplat CLI)
Azure Batch support has been added to the latest version of the Azure Command-Line Interface (xplat CLI). Install the Azure CLI via npm to interact with the Batch service from the console.
The initial release provides account management and job/task submission through the "azure batch" subcommands. Support for Linux pool management will follow shortly.
Troubleshooting - Using SSH
RDP cannot be used against Linux VMs. Instead, Batch exposes SSH access to all nodes in the pool. The following Python code shows how to find the SSH connection information for each node.
nodes = client.compute_node.list(pool_id)
for node in nodes:
    # Look up the public IP address and SSH port of this node
    login = client.compute_node.get_remote_login_settings(pool_id,
                                                          node.id)
    print("{0} {1} {2} {3}".format(node.id,
                                   node.state,
                                   login.remote_login_ip_address,
                                   login.remote_login_port))
Sample output, showing the IP address and SSH port of each node:
tvm-3436469628_1-20160320t055249z ComputeNodeState.idle 52.160.94.74 50002
tvm-3436469628_2-20160320t055249z ComputeNodeState.idle 52.160.94.74 50003
tvm-3436469628_3-20160320t055249z ComputeNodeState.idle 52.160.94.74 50000
tvm-3436469628_4-20160320t055249z ComputeNodeState.idle 52.160.94.74 50001
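With the IP address and port from the listing, a node can be reached with a standard OpenSSH client. The sketch below only assembles the command line from those two values. The helper name `ssh_command` and the account name `batchuser` are illustrative assumptions, not part of the Batch SDK.

```python
def ssh_command(username, ip_address, port):
    """Build an OpenSSH command line for a Batch compute node.

    Hypothetical helper for illustration; it is not part of the Batch SDK.
    """
    # -p selects the non-default SSH port that Batch maps to the node
    return "ssh -p {0} {1}@{2}".format(int(port), username, ip_address)


# IP address and port taken from the first row of the sample output;
# "batchuser" is a placeholder for a user added to the node beforehand.
print(ssh_command("batchuser", "52.160.94.74", 50002))
```

Because all nodes in the sample share one public IP address, the port is what distinguishes one node from another.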
Remember to add a user to the nodes beforehand so that you have credentials for the SSH logon. Once logged on, you can set up key-based authentication just as on a normal Linux VM. The following Python code adds a user. It prompts for a password, but it is also possible to specify a public key for key-based authentication.
import datetime
import getpass

pool_id = ...
username = ...
password = getpass.getpass()  # prompt for the password without echoing it

user = ComputeNodeUser()
user.name = username
user.password = password
user.is_admin = True
user.expiry_time = (datetime.datetime.today() + datetime.timedelta(days=30)).isoformat()

# Add the same user to every node in the pool
nodes = client.compute_node.list(pool_id)
for node in nodes:
    client.compute_node.add_user(pool_id, node.id, user)
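For key-based authentication, the user can carry a public key instead of a password. The following is a minimal sketch under stated assumptions: it assumes the Python SDK's ComputeNodeUser exposes an `ssh_public_key` attribute, and `public_key_user_fields` is a hypothetical helper that only prepares the values to assign. Check the SDK reference before relying on the attribute name.

```python
import datetime


def public_key_user_fields(username, public_key, days_valid=30):
    """Return attribute values for a key-based compute node user.

    Hypothetical helper. The 'ssh_public_key' key assumes that the SDK's
    ComputeNodeUser has an attribute with that name.
    """
    public_key = public_key.strip()
    if not public_key:
        raise ValueError("public_key must not be empty")
    expiry = datetime.datetime.today() + datetime.timedelta(days=days_valid)
    return {
        "name": username,
        "is_admin": True,
        "expiry_time": expiry.isoformat(),
        "ssh_public_key": public_key,
    }


# Usage (commented out because it needs the SDK and a live Batch client):
# user = ComputeNodeUser()
# for field, value in public_key_user_fields(username, key_text).items():
#     setattr(user, field, value)
```

No password is set in this variant, so the private key is the only way to log on as that user.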
Linux Distribution Support Matrix
Distro | Publisher | Offer | SKU | NodeAgentSKUId |
Ubuntu | Canonical | UbuntuServer | 14.04.0-LTS | batch.node.ubuntu 14.04 |
Ubuntu | Canonical | UbuntuServer | 14.04.1-LTS | batch.node.ubuntu 14.04 |
Ubuntu | Canonical | UbuntuServer | 14.04.2-LTS | batch.node.ubuntu 14.04 |
Ubuntu | Canonical | UbuntuServer | 14.04.3-LTS | batch.node.ubuntu 14.04 |
Ubuntu | Canonical | UbuntuServer | 14.04.4-LTS | batch.node.ubuntu 14.04 |
Ubuntu | Canonical | UbuntuServer | 15.10 | batch.node.debian 8 |
Debian | Credativ | Debian | 8 | batch.node.debian 8 |
SUSE | SUSE | openSUSE | 13.2 | batch.node.opensuse 13.2 |
SUSE | SUSE | openSUSE-Leap | 42.1 | batch.node.opensuse 42.1 |
SUSE | SUSE | SLES | 12 | batch.node.opensuse 42.1 |
SUSE | SUSE | SLES | 12-SP1 | batch.node.opensuse 42.1 |
SUSE | SUSE | SLES-HPC | 12 | batch.node.opensuse 42.1 |
CentOS | OpenLogic | CentOS | 7.0 | batch.node.centos 7 |
CentOS | OpenLogic | CentOS | 7.1 | batch.node.centos 7 |
CentOS | OpenLogic | CentOS | 7.2 | batch.node.centos 7 |
Oracle Linux | Oracle | Oracle-Linux-7 | OL70 | batch.node.centos 7 |
Note that this is not an exhaustive list and it may be subject to change. Clients should use the ListNodeAgentSKU REST API call to get the full list of supported images for the account.
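As a rough illustration of how the matrix pairs an image with a node agent, the sketch below hard-codes a few of its rows and looks up the NodeAgentSKUId for a given offer and SKU. The names `NODE_AGENT_BY_IMAGE` and `node_agent_sku_for` are hypothetical. A real client should query the service, since a hard-coded copy goes stale.

```python
# A few rows copied from the support matrix: (offer, sku) -> NodeAgentSKUId.
# Hypothetical lookup table for illustration only.
NODE_AGENT_BY_IMAGE = {
    ("UbuntuServer", "14.04.4-LTS"): "batch.node.ubuntu 14.04",
    ("Debian", "8"): "batch.node.debian 8",
    ("openSUSE", "13.2"): "batch.node.opensuse 13.2",
    ("CentOS", "7.2"): "batch.node.centos 7",
    ("Oracle-Linux-7", "OL70"): "batch.node.centos 7",
}


def node_agent_sku_for(offer, sku):
    """Return the node agent SKU ID for an image, or raise ValueError."""
    try:
        return NODE_AGENT_BY_IMAGE[(offer, sku)]
    except KeyError:
        raise ValueError(
            "no node agent known for offer={0!r}, sku={1!r}".format(offer, sku))


print(node_agent_sku_for("CentOS", "7.2"))
```

Note that several images share one node agent: the Oracle Linux image, for example, uses the CentOS 7 agent.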
Pricing
Azure Batch is built on Azure Cloud Services and Azure Virtual Machines technology. Batch itself is offered in a free tier, which means you are charged only for the compute resources you use. When creating a pool (either through https://portal.azure.com or the API), you choose the type of pool to create. If you choose Cloud Services, which is Windows only, you are charged according to the Cloud Services pricing meters. If you choose Virtual Machines, which provides Linux, you are charged according to the Linux Virtual Machines pricing meters. At the time of writing, the Linux VM price is lower than the Cloud Services/Windows VM price.
References
MSDN API reference (Coming soon.)
Samples (Coming soon.)
Comments
- Anonymous
March 30, 2016
Do the Linux batch instances cost less than Windows instances, just like they do in IaaS?
- Anonymous
March 30, 2016
@ImanA - That's correct. Batch is free tier so user is only charged for the VM one uses. If they create a Linux node pool, the unit price will be lower than Windows node pool.
- Anonymous
March 30, 2016
Thanks Yiding. Are the prices based on the VM or Cloud Services? Because on the Azure Pricing Calculator there are no prices for Linux instances of Cloud Services (only those of VMs).
- Anonymous
March 30, 2016
@ImanA, The price is based on VM. Batch Linux support is built on top of Azure Virtual Machine Scale Set technology and thus VM utilization will be charged based on Virtual Machine price.
See this page for detail - https://azure.microsoft.com/en-us/pricing/details/batch/
- Anonymous
March 30, 2016
@ImanA, I updated the blog to add a pricing section. Thanks for your comments.
- Anonymous
March 30, 2016
Thanks Yiding, it is pretty clear now.
- Anonymous
April 01, 2016
The comment has been removed
- Anonymous
April 07, 2016
Is there any location I could go to post questions about this? I'm trying to work with the .NET Azure Library and getting some errors. Guessing this isn't the best place to post them. Thanks for the awesome work!
- Anonymous
April 07, 2016
Is it possible to use the PowerShell for Batch with Linux VMs (similar to https://azure.microsoft.com/en-us/documentation/articles/batch-powershell-cmdlets-get-started)?
- Anonymous
April 12, 2016
@Samuel Bryfczynski - Yes. You can use our forum - https://social.msdn.microsoft.com/forums/azure/en-US/home?forum=azurebatch
- Anonymous
April 12, 2016
@ImanA - Yes. You can use PowerShell for Batch to create Linux pool.
- Anonymous
April 12, 2016
@TimJRoberts1 - Thanks. Fixed.
- Anonymous
April 21, 2016
Hi Yiding. I am trying to create a Linux pool using New-AzureBatchPool. How can I specify the Image Reference and the sku id? It doesn't seem to allow such options.
- Anonymous
April 25, 2016
@ImanA, the PowerShell support for creating a Linux pool will not be available until the next Azure PowerShell release, currently scheduled for early May. Since you're familiar with PowerShell, maybe our .NET library would be an acceptable substitute if you need this functionality now?
.NET library available here: https://www.nuget.org/packages/Azure.Batch/
- Anonymous
May 07, 2016
The link here is still broken "here is a good step by step tutorial on using Batch."
- Anonymous
June 09, 2016
Thanks. Fixed.
- Anonymous
May 12, 2016
How can I get root access for my tasks? I get the "sudo: sorry, you must have a tty to run sudo" error message when I sudo.
- Anonymous
May 30, 2016
Hi! Is there any update on the root-access issue? Thanks!
- Anonymous
June 06, 2016
Hi, You don't need sudo to run your task as root. You can simply set the "RunElevated" flag on the task when you submit it and the task will be run elevated automatically.
- Anonymous
June 14, 2016
Great, thank you!
- Anonymous
June 07, 2016
Can we run hive scripts on any of the supported VM images?
- Anonymous
June 09, 2016
Nothing stops you from doing that. But you need to prepare Hadoop/Java/Hive by yourself on the VM.
- Anonymous
June 08, 2016
Hey all of the links for the HPC Pack SDK no longer work on the Microsoft web site. Attempting to download any of the SDK versions results in an HTML 404 error. Is this product no longer being developed or supported?