Managing Your Azure Data Lake Analytics Compute Resources (Overview)
Azure Data Lake Analytics (ADLA) is a powerful job service that allows organizations to run small or large U-SQL analytics jobs, on demand. You only pay for the compute capacity that you request for those jobs. Because the capacity is automatically scaled to fulfil the job requirements, there is no need to provision capacity for your peak requirements and then worry about underutilization.
That said, as part of our ongoing feedback loop, customers have asked us to provide more ways to ensure that their most business-critical jobs run efficiently side by side with experimental and ad-hoc jobs, without a substantial increase in cost.
ADLA has now introduced two levels of policies that are designed to help you manage your compute resources:
- Account level policies control how many jobs can run simultaneously, how many AUs are available to these jobs, etc. These policies apply to all jobs.
- Job level policies control the maximum AUs and priority of each job being submitted based on different users or security groups.
You can find the policy settings in the Azure portal. Navigate to your Data Lake Analytics account, then click Properties.
You can find more details about account level policies and job level policies in our blog postings. Here, let's look at how ADLA policies can be used to solve some common scenarios.
Typical Use Cases
Scenario 1: Prevent One Job from Blocking All Other Jobs
A customer has 10 developers sharing the same ADLA account. To ensure that they can develop in parallel and that no single job takes all the resources, the customer wants to limit each job to a maximum of 25 AUs. On an account with 250 AUs, for example, this lets at least 10 jobs run concurrently. This way, developers can quickly and iteratively develop and debug their jobs.
Default Policy:
- Job AU limit: 25
By configuring the job AU limit, you can make sure that no single job takes up all the AUs available in the account. Otherwise, a single low-priority job running with all the available AUs in the account can block higher-priority jobs.
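The arithmetic behind this scenario can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and the account total of 250 AUs is an assumption that the scenario does not state.

```python
def max_concurrent_jobs(account_aus: int, job_au_limit: int) -> int:
    """Worst-case number of jobs that can run at once when every job
    requests the full per-job AU limit."""
    if account_aus <= 0 or job_au_limit <= 0:
        raise ValueError("AU counts must be positive")
    return account_aus // job_au_limit


# Assumed account-level AU total; not given in the scenario.
ACCOUNT_AUS = 250

print(max_concurrent_jobs(ACCOUNT_AUS, 25))   # 10
print(max_concurrent_jobs(ACCOUNT_AUS, 250))  # 1
```

With no per-job limit below the account total, a single job can occupy every AU; with a limit of 25, ten developers can each run a job at the same time.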
Scenario 2: Set One Specific Group to Different Limits
New members are joining the team and sharing the same ADLA account. To prevent new members, who are still learning ADLA, from mistakenly submitting a job that consumes too many compute resources (increasing cost and blocking other jobs), the customer wants to cap jobs from new employees at 30 AUs, while everyone else can submit jobs with up to 100 AUs.
Default Policy:
- Job AU limit: 100
- Priority limit: 1
Exception Policy: New Employee Policy
- Job AU limit: 30
- Priority limit: 200
- Group: New Employee Group
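First, add the new employees to the New Employee Group. With the settings above, new employees can only submit jobs that use at most 30 AUs and have a priority no higher than 200. In ADLA, a lower number means a higher priority, so 200 is a much lower priority than the default limit of 1.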
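To make the interaction between the default policy and the exception policy concrete, here is a minimal Python model of this scenario. It is a sketch and not ADLA's implementation or API: the class and function names are hypothetical, and treating an out-of-policy request as simply "not allowed" is an assumption about how the limits are enforced.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class JobPolicy:
    name: str
    max_aus: int                 # largest AU count a job may request
    priority_limit: int          # highest allowed priority (lower number = higher priority)
    group: Optional[str] = None  # None marks the default policy


# Values taken from Scenario 2.
DEFAULT = JobPolicy("Default Policy", max_aus=100, priority_limit=1)
NEW_EMPLOYEE = JobPolicy("New Employee Policy", max_aus=30,
                         priority_limit=200, group="New Employee Group")


def policy_for(user_groups: Iterable[str]) -> JobPolicy:
    """Pick the exception policy if the user is in its group, else the default."""
    return NEW_EMPLOYEE if NEW_EMPLOYEE.group in set(user_groups) else DEFAULT


def is_allowed(policy: JobPolicy, requested_aus: int, requested_priority: int) -> bool:
    """A request fits the policy if it stays within the AU cap and does not
    ask for a priority higher (numerically lower) than the limit."""
    return (requested_aus <= policy.max_aus
            and requested_priority >= policy.priority_limit)


newcomer = policy_for(["New Employee Group"])
print(is_allowed(newcomer, 30, 200))       # True
print(is_allowed(newcomer, 50, 200))       # False: more than 30 AUs
print(is_allowed(newcomer, 30, 100))       # False: priority 100 is higher than 200
print(is_allowed(policy_for([]), 100, 1))  # True: default policy applies
```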
Scenario 3: Guarantee AUs for High Priority Jobs
The ADLA account is used for both critical production jobs and development or ad-hoc experimental jobs. The customer wants to ensure that production jobs can be submitted with up to 100 AUs and with a priority as high as 10. Development jobs can use up to 30 AUs per job with a priority no higher than 200. All other jobs can use only 10 AUs each with a priority no higher than 500. This ensures that production jobs are executed as soon as possible (because nobody else can submit higher-priority jobs) with the appropriate number of AUs.
Default Policy:
- Job AU limit: 10
- Priority limit: 500
Exception Policy 1: Production Jobs Policy
- Job AU limit: 100
- Priority limit: 10
- Group: Production Job Submission Machines
Exception Policy 2: Development Jobs Policy
- Job AU limit: 30
- Priority limit: 200
- Group: Engineers in the team
With the above policy, production jobs can be submitted with many AUs (up to 100) and with a high priority (as high as 10; in ADLA, a lower number means a higher priority). Development jobs submitted by engineers on the team can have a reasonable number of AUs (up to 30) but a lower priority (no higher than 200). All other ad-hoc jobs are given very limited resources: no more than 10 AUs and a priority no higher than 500.
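The effect of these three tiers on queue order can be illustrated with a short Python sketch. The job names and submitter labels are hypothetical, each submitter is assumed to fall under exactly one policy, and rejecting out-of-policy jobs and sorting by priority number is a simplified model rather than a description of the ADLA scheduler.

```python
from typing import List, NamedTuple


class Job(NamedTuple):
    name: str
    submitter: str  # "production", "engineer", or "other" (hypothetical labels)
    aus: int
    priority: int   # lower number = higher priority


# (job AU limit, priority limit) per submitter class, from Scenario 3.
LIMITS = {
    "production": (100, 10),
    "engineer": (30, 200),
    "other": (10, 500),
}


def violates_policy(job: Job) -> bool:
    """True if the job asks for more AUs or a higher priority than allowed."""
    max_aus, priority_limit = LIMITS[job.submitter]
    return job.aus > max_aus or job.priority < priority_limit


def queue_order(jobs: List[Job]) -> List[Job]:
    """Drop out-of-policy jobs, then order the rest by priority number."""
    accepted = [job for job in jobs if not violates_policy(job)]
    return sorted(accepted, key=lambda job: job.priority)


jobs = [
    Job("adhoc-report", "other", 10, 500),
    Job("dev-test", "engineer", 30, 200),
    Job("nightly-etl", "production", 100, 10),
    Job("queue-jumper", "engineer", 30, 5),  # priority 5 exceeds the engineer limit of 200
]

print([job.name for job in queue_order(jobs)])
# ['nightly-etl', 'dev-test', 'adhoc-report']
```

Because only the production group may use priorities as high as 10, its jobs sort ahead of development and ad-hoc jobs regardless of submission order.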