Analysing Enterprise Azure spend by Tags
Azure Resource Manager allows you to assign custom "tags" to resources such as VMs or storage accounts. This lets you put custom metadata onto your resources, such as which application a resource belongs to, which department owns it and which environment it is part of. While the tag information is available through Azure's billing APIs, the format of this information makes it tricky to analyse your overall Azure spend by tag values. This post shows one way to put the tag data into a structured format that will allow you to easily analyse billing data by tags using Excel or Power BI.
Note : Azure’s capabilities and APIs change rapidly. This information is current as of March 2016. If you’re reading this in the distant future, you may want to check with other sources on current capabilities and best practices.
Scenarios
If you've deployed just one or two solutions to the cloud, your spend is probably pretty predictable - it may go up or down over time, but you should be able to correlate this with things like usage of your application and scaling up of resources. However, once you have a lot of different teams deploying to the cloud, each with different applications in different stages of their lifecycle and multiple environments, it can be very difficult to understand what's deployed and what it is all costing.
To help you find some order in this chaos, any Azure resources deployed using the Azure Resource Manager model can be annotated with Tags. Tags are just name-value pairs of text strings and can be used to convey whatever you want - but the most obvious uses are to categorise resources according to some of the dimensions I listed above, such as application name, owning team/department or environment. Because you can apply multiple tags to a single resource, you can categorise your resources in multiple ways - for example you could view all the resources that make up Project Vulcan, all resources owned by Q-Branch or all resources deployed to the Staging environment.
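As an illustration, tags in an ARM template are just a JSON object of name-value pairs attached to a resource. The resource name and tag names/values below are invented, and the resource definition is abbreviated to show only the relevant property:

```json
{
  "type": "Microsoft.Storage/storageAccounts",
  "name": "projectvulcanstorage",
  "tags": {
    "Application": "Project Vulcan",
    "Department": "Q-Branch",
    "Environment": "Staging"
  }
}
```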
Of course, tagging is only useful if the tags are applied consistently to your resources. To facilitate this, you can define Policies that will require all resources to be tagged in accordance with your chosen taxonomy, or the deployment will fail.
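As a rough sketch, a policy rule that denies deployments of resources missing a given tag follows the if/then shape of an Azure Policy definition. The tag name here is just an example:

```json
{
  "if": {
    "field": "tags.Department",
    "exists": "false"
  },
  "then": {
    "effect": "deny"
  }
}
```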
Accessing Azure Billing Data
To enable people to analyse their Azure usage costs, the billing data is exposed through a couple of different APIs. If you have a Pay-as-you-go subscription you can use the Resource Usage and Resource Ratecard APIs. However organisations with a large number of Azure resources deployed are more likely to have Enterprise Agreement (EA) subscriptions, in which case you can use the EA Billing and Usage API to get this data. The sample in this post uses the EA approach.
Once you have the data downloaded, you can analyse it using tools like Excel or Power BI. In fact if you're using Power BI you can avoid directly accessing the API altogether, as the Microsoft Azure Enterprise content pack for Power BI can automatically connect to the API, download your data and show it on a pretty interactive dashboard.
So, problem solved, right? Alas no... while the data returned from the APIs does include the resource tags, the complete set of tags assigned to each resource is trapped within a single field in JSON format, as shown below. So while we technically have all the data required to analyse usage and cost by tag, it's not in a format that tools like Excel or Power BI can make use of.
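For example, a single usage row's Tags field might contain a blob along these lines (the tag names and values are illustrative):

```json
{ "Application": "Project Vulcan", "Department": "Q-Branch", "Environment": "Staging" }
```

Excel and Power BI treat this whole blob as one opaque text value, so you can't slice or pivot on the individual tags.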
Making this Work
To allow for meaningful analysis by tag, we need to take that single blob of JSON in the Tags field and split it into its component elements. To do this, I started with a script that Christer Ljung put together which uses the EA Usage API to download usage data and save it to CSV. I then fixed a few things to cater for changes in the API since he wrote his script and added some additional logic to split the tags into individual columns. The complete PowerShell is below; to run it, save it to a .ps1 file and invoke it with .\Download_BillingData.ps1 <EnrollmentNbr> <Key> <Month>, where you can get your enrollment number and key from https://ea.azure.com and <Month> is formatted like 2016-03.
Param(
    [string]$EnrollmentNbr = "",
    [string]$Key = "",
    [string]$Month = ""
)
# access token is "Bearer " and then the long string of garbage
$AccessToken = "Bearer $Key"
$urlbase = 'https://ea.azure.com'
$csvAll = @()
Write-Verbose "$(Get-Date -format 's'): Azure Enrollment $EnrollmentNbr"
# function to invoke the api, download the data, import it, and merge it to the global array
# function to invoke the API, download the data, import it, and merge it into the global array
Function DownloadUsageReport( [string]$LinkToDownloadDetailReport, $csvAll )
{
    Write-Verbose "$(Get-Date -format 's'): $urlbase/$LinkToDownloadDetailReport"
    $webClient = New-Object System.Net.WebClient
    $webClient.Headers.add('api-version','1.0')
    $webClient.Headers.add('Authorization', "$AccessToken")
    $data = $webClient.DownloadString("$urlbase/$LinkToDownloadDetailReport")
    # remove the funky stuff in the leading rows - skip to the first header column value
    $pos = $data.IndexOf("AccountOwnerId")
    $data = $data.Substring($pos-1)
    # convert from CSV into a PowerShell variable
    $csvM = ($data | ConvertFrom-Csv)
    # merge with previous months
    $csvAll = $csvAll + $csvM
    Write-Verbose "Rows = $($csvM.length)"
    return $csvAll
}
if ( $Month -eq "" )
{
    # if no month specified, invoke the API to get all available months
    Write-Verbose "$(Get-Date -format 's'): Downloading available months list"
    $webClient = New-Object System.Net.WebClient
    $webClient.Headers.add('api-version','1.0')
    $webClient.Headers.add('Authorization', "$AccessToken")
    $months = ($webClient.DownloadString("$urlbase/rest/$EnrollmentNbr/usage-reports") | ConvertFrom-Json)
    # loop through the available months and download data.
    # The list is sorted most recent month first, so start at the end to get the oldest
    # month first and avoid sorting in Excel
    for ($i = $months.AvailableMonths.length - 1; $i -ge 0; $i--) {
        $csvAll = DownloadUsageReport $($months.AvailableMonths.LinkToDownloadDetailReport[$i]) $csvAll
    }
}
else
{
    # Month was specified as a parameter, so just download that month
    $csvAll = DownloadUsageReport "rest/$EnrollmentNbr/usage-report?month=$Month&type=detail" $csvAll
}
Write-Host "Total Rows = $($csvAll.length)"
# data is in US format wrt Date (MM/DD/YYYY) and decimal values (3.14)
# so loop through and convert columns to local format so that Excel can be happy
Write-verbose "$(Get-Date -format 's'): Fixing datatypes..."
for ($i = 0; $i -lt $csvAll.length; $i++) {
    $csvAll[$i].Date = [datetime]::ParseExact( $csvAll[$i].Date, 'MM/dd/yyyy', $null).ToString("d")
    $csvAll[$i].ExtendedCost = [float]$csvAll[$i].ExtendedCost
    $csvAll[$i].ResourceRate = [float]$csvAll[$i].ResourceRate
    $csvAll[$i].'Consumed Quantity' = [float]$csvAll[$i].'Consumed Quantity'
    # Expand the JSON Tags blob into separate Tag-* properties
    $tags = $csvAll[$i].Tags | ConvertFrom-Json
    if ($tags -ne $null) {
        $tags.psobject.properties | ForEach {
            $tagName = "Tag-$($_.Name)"
            Add-Member -InputObject $csvAll[$i] $tagName $_.Value
            # Also add the property to the first row, as that's what Export-Csv
            # uses to determine the output columns
            if ($csvAll[0].psobject.Properties[$tagName] -eq $null) {
                Add-Member -InputObject $csvAll[0] $tagName $null -Force
            }
        }
    }
}
# save the data to a CSV file
$filename = ".\$($EnrollmentNbr)_UsageDetail$($Month)_$(Get-Date -format 'yyyyMMdd').csv"
Write-Host "$(Get-Date -format 's'): Saving to file $filename"
$csvAll | Export-Csv $filename -NoTypeInformation -Delimiter ","
The Result
After running the script, you should have a nice neat CSV file with your Azure usage data for the chosen month.
If you open this file in Excel, you'll see that, in addition to the original JSON-formatted Tags column, there is a new column called Tag-something for each tag that was found on at least one Azure resource.
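Using the illustrative tag names from earlier in the post, the extra columns in the output would look something like this (values invented):

```
Tags                                  Tag-Application   Tag-Department   Tag-Environment
{"Application":"Project Vulcan",...}  Project Vulcan    Q-Branch         Staging
{"Department":"Q-Branch"}                               Q-Branch
```

Rows for resources that lack a given tag simply have an empty value in that tag's column.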
While there's unfortunately no way to get the Microsoft Azure Enterprise content pack for Power BI to do the same, you can load the same CSV file into Power BI and create your own dashboards. You should see the same Tag-something columns available to put into your charts.
Happy analysing!
Comments
- Anonymous
June 21, 2016
You can now use Power BI desktop and achieve the same. Parse the tags column using JSON.