Introduction

You're an IT operations professional at Contoso, an organization that helps client organizations deploy and operate high-performance computing (HPC) technologies. Recent projects include economic forecasting, financial services, industrial design, and artificial intelligence. Contoso relies heavily on Slurm (Simple Linux Utility for Resource Management) as the job scheduler and resource manager for the Linux HPC clusters on which these projects run. As its existing hardware ages and requires replacement, Contoso is exploring the feasibility of moving some of its HPC workloads into Azure by using the Azure CycleCloud HPC management platform. As the IT professional responsible for managing Contoso's HPC technologies, you want to understand how to integrate Slurm with Azure CycleCloud to meet your organization's HPC project needs.

Learning objectives

By the end of this module, you'll be able to:

  • Describe the Slurm job scheduler and resource manager.
  • Explain how Slurm integrates with Azure CycleCloud.
  • Troubleshoot common problems for Slurm-managed jobs that run in Azure CycleCloud.

Prerequisites

To get the most out of this module, you should already have the following knowledge and experience:

  • Basic understanding of Azure CycleCloud.
  • Basic understanding of HPC job management.