Estimate the cost of archiving data

The archive tier is an offline tier for storing data that is rarely accessed. The archive access tier has the lowest storage cost. However, this tier has higher data retrieval costs and higher latency than the hot, cool, and cold tiers.

This article explains how to calculate the cost of using archive storage and then presents a few example scenarios.

Calculate costs

The cost to archive data is derived from these three components:

  • Cost to write data to the archive tier
  • Cost to store data in the archive tier
  • Cost to rehydrate data from the archive tier

The following sections show you how to calculate each component.

This article uses fictitious prices in all calculations. You can find these sample prices in the Sample prices section at the end of this article. These prices are meant only as examples, and shouldn't be used to calculate your costs.

For official prices, see Azure Blob Storage pricing or Azure Data Lake Storage pricing. For more information about how to choose the correct pricing page, see Understand the full billing model for Azure Blob Storage.

The cost to write

You can calculate the cost of writing to the archive tier by multiplying the number of write operations by the price of each operation. The price of an operation depends on which operation you use to write data to the archive tier.

Put Blob

If you use the Put Blob operation, then the number of operations is the same as the number of blobs. For example, if you plan to write 30,000 blobs to the archive tier, then that requires 30,000 operations. Each operation is charged the price of an archive write operation.

Tip

Operations are billed per 10,000. Therefore, if the price per 10,000 operations is $0.10, then the price of a single operation is $0.10 / 10,000 = $0.00001.
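As a minimal sketch of the Put Blob calculation above, the following Python snippet converts a per-10,000 price into a per-operation price and multiplies it by the number of blobs. The function name `put_blob_write_cost` is hypothetical, and the $0.10 price is the example value from the tip, not an official price.

```python
def put_blob_write_cost(blob_count: int, price_per_10k: float) -> float:
    """Cost of writing blobs with Put Blob: one archive write operation per blob."""
    price_per_operation = price_per_10k / 10_000
    return blob_count * price_per_operation


# 30,000 blobs at an example price of $0.10 per 10,000 operations
cost = put_blob_write_cost(30_000, 0.10)
print(f"${cost:.2f}")  # prints $0.30
```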

Put Block and Put Block List

If you upload a blob by using the Put Block and Put Block List operations, then an upload requires multiple operations, and each of those operations is charged separately. Each Put Block operation is charged at the price of a write operation for the account's default access tier. The number of Put Block operations that you need depends on the block size that you specify to upload the data. For example, if the blob size is 100 MiB and you set the block size to 10 MiB when you upload that blob, you would use 10 Put Block operations. Blocks are written (committed) to the archive tier by using the Put Block List operation. That operation is charged the price of an archive write operation. Therefore, if the account's default access tier is hot, the cost to upload a single blob is (number of blocks * price of a hot write operation) + price of an archive write operation.
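The block upload calculation can be sketched as follows. This example assumes that the account's default access tier is hot and uses the fictitious sample prices from the Sample prices section ($0.055 per 10,000 hot write operations and $0.11 per 10,000 archive write operations). The function name `block_upload_cost` and its parameters are hypothetical.

```python
import math


def block_upload_cost(
    blob_size_mib: float,
    block_size_mib: float,
    default_tier_write_price_per_10k: float,
    archive_write_price_per_10k: float,
) -> tuple[int, float]:
    """Return (number of Put Block operations, total cost to upload one blob)."""
    # One Put Block operation per block; a partial final block still needs an operation.
    put_block_operations = math.ceil(blob_size_mib / block_size_mib)
    # Put Block is charged at the write price of the account's default access tier.
    put_block_cost = put_block_operations * default_tier_write_price_per_10k / 10_000
    # A single Put Block List commits the blocks, charged as an archive write operation.
    put_block_list_cost = archive_write_price_per_10k / 10_000
    return put_block_operations, put_block_cost + put_block_list_cost


# 100 MiB blob, 10 MiB blocks, sample hot and archive write prices (assumed values)
operations, cost = block_upload_cost(100, 10, 0.055, 0.11)
print(operations, f"${cost:.6f}")  # prints 10 $0.000066
```

A smaller block size increases the number of Put Block operations, so the per-blob cost grows with the number of blocks rather than with the blob size alone.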

Note

If you're not using an SDK or the REST API directly, you might have to investigate which operations your data transfer tool is using to upload files. You might be able to determine this by reaching out to the tool provider or by using storage logs.

Set Blob Tier

If you use the Set Blob Tier operation to move a blob from the cool, cold, or hot tier to the archive tier, you're charged the price of an archive write operation.

The cost to store

You can calculate the storage costs by multiplying the size of the data in GB by the price of archive storage.

For example (assuming the sample pricing), if you plan to store 10 TB in the archive tier, the capacity cost is $0.002 * 10 * 1024 = $20.48 per month.
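The storage calculation is a single multiplication, sketched here in Python. The name `monthly_storage_cost` is hypothetical, and the $0.002 per GB price is the fictitious archive sample price used in this article.

```python
GB_PER_TB = 1024  # this article converts TB to GB with a factor of 1,024


def monthly_storage_cost(size_gb: float, price_per_gb_month: float) -> float:
    """Monthly capacity cost: data size in GB times the per-GB storage price."""
    return size_gb * price_per_gb_month


cost = monthly_storage_cost(10 * GB_PER_TB, 0.002)
print(f"${cost:.2f} per month")  # prints $20.48 per month
```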

The cost to rehydrate

Blobs in the archive tier are offline and can't be read or modified. To read or modify data in an archived blob, you must first rehydrate the blob to an online tier (either the hot, cool, or cold tier).

You can calculate the cost to rehydrate data by adding the cost to retrieve data to the cost of reading the data.

Assuming sample pricing, the cost of retrieving 1 GB of data from the archive tier would be 1 * $0.022 = $0.022.

Read operations are billed per 10,000. Therefore, if the cost per 10,000 operations is $5.50, then the cost of a single operation is $5.50 / 10,000 = $0.00055. The cost of reading 1,000 blobs at standard priority is 1,000 * $0.00055 = $0.55.

In this example, the total cost to rehydrate (retrieving + reading) would be $0.022 + $0.55 = $0.572, or about $0.57.

Note

If you set the rehydration priority to high, then the prices of data retrieval and read operations increase.
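The rehydration calculation can be sketched as the sum of a data retrieval charge and a read operation charge. The snippet below is illustrative only: the function name `rehydration_cost` and the `SAMPLE_PRICES` table are hypothetical, and the values are the fictitious standard and high priority archive prices listed in the Sample prices section for the Blob Service endpoint.

```python
# priority: (data retrieval price per GB, read price per 10,000 operations)
# Fictitious sample prices, not official prices.
SAMPLE_PRICES = {
    "standard": (0.022, 5.50),
    "high": (0.13, 65.00),
}


def rehydration_cost(retrieved_gb: float, read_operations: int, priority: str = "standard") -> float:
    """Cost to rehydrate = cost to retrieve the data + cost of the read operations."""
    retrieval_price_per_gb, read_price_per_10k = SAMPLE_PRICES[priority]
    retrieval_cost = retrieved_gb * retrieval_price_per_gb
    read_cost = read_operations * read_price_per_10k / 10_000
    return retrieval_cost + read_cost


# Compare both priorities for 1 GB of data spread over 1,000 blobs
for priority in SAMPLE_PRICES:
    print(f"{priority}: ${rehydration_cost(1, 1000, priority):.3f}")
```

With many small blobs, the read operation charge tends to dominate the retrieval charge, so the number of blobs matters as much as the amount of data.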

If you plan to rehydrate data, you should try to avoid an early deletion fee. To review your options, see Blob rehydration from the archive tier.

Scenario: One-time data backup

This scenario assumes that you plan to remove on-premises tapes or file servers by migrating backup data to cloud storage. If you don't expect users to access that data often, then it might make sense to migrate that data directly to the archive tier. In the first month, you'd assume the cost of writing data to the archive tier. In the remaining months, you'd pay only for the cost to store the data and the cost to rehydrate data as needed for the occasional read operation.

Using the Sample prices that appear in this article, the following table demonstrates three months of spending.

This scenario assumes an initial ingest of 2,000,000 files totaling 102,400 GB in size to archive. It also assumes a one-time read each month of about 1% of archived capacity. The operation used in this scenario is the Put Blob operation. This scenario also assumes that blobs are rehydrated by copying blobs instead of changing the blob's access tier.

| Cost factor | January | February | March | Projected annual |
|---|---|---|---|---|
| Write operations | 2,000,000 | 0 | 0 | 2,000,000 |
| Price of a single write operation | $0.000011 | $0.000011 | $0.000011 | $0.000011 |
| Cost to write (operations * price of a write operation) | $22.00 | $0.00 | $0.00 | $22.00 |
| Total file size (GB) | 102,400 | 102,400 | 102,400 | 1,228,800 |
| Data prices (pay-as-you-go) | $0.002 | $0.002 | $0.002 | $0.002 |
| Cost to store (file size * data price) | $204.80 | $204.80 | $204.80 | $2,457.60 |
| Data retrieval size (1% of file size) | 1,024 | 1,024 | 1,024 | 12,288 |
| Price of data retrieval | $0.022 | $0.022 | $0.022 | $0.022 |
| Cost to retrieve (data retrieval size * price of retrieval) | $22.53 | $22.53 | $22.53 | $270.34 |
| Number of read operations (file count * 1%) | 20,000 | 20,000 | 20,000 | 240,000 |
| Price of a single read operation | $0.00055 | $0.00055 | $0.00055 | $0.00055 |
| Cost to read (operations * price of a read operation) | $11.00 | $11.00 | $11.00 | $132.00 |
| Cost to rehydrate (cost to retrieve + cost to read) | $33.53 | $33.53 | $33.53 | $402.34 |
| Total cost (write + storage + rehydrate) | $260.33 | $238.33 | $238.33 | $2,881.94 |
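The table can be reproduced with a short model like the one below. It's a sketch under the scenario's stated assumptions (Put Blob writes in the first month only, a 1% read each month) and the fictitious archive sample prices. The function name `one_time_backup` and the constant names are hypothetical.

```python
# Fictitious sample prices for the archive tier (Blob Service endpoint).
WRITE_PRICE = 0.11 / 10_000   # per archive write operation
READ_PRICE = 5.50 / 10_000    # per archive read operation
STORAGE_PRICE = 0.002         # per GB per month
RETRIEVAL_PRICE = 0.022       # per GB retrieved


def one_time_backup(file_count: int, size_gb: float, read_fraction: float = 0.01, months: int = 12) -> list[float]:
    """Return the total cost for each month; the write cost applies to the first month only."""
    totals = []
    for month in range(months):
        write = file_count * WRITE_PRICE if month == 0 else 0.0
        store = size_gb * STORAGE_PRICE
        retrieve = size_gb * read_fraction * RETRIEVAL_PRICE
        read = file_count * read_fraction * READ_PRICE
        totals.append(write + store + retrieve + read)
    return totals


monthly = one_time_backup(2_000_000, 102_400)
print(f"January: ${monthly[0]:.2f}")    # January: $260.33
print(f"February: ${monthly[1]:.2f}")   # February: $238.33
print(f"Annual: ${sum(monthly):,.2f}")  # Annual: $2,881.94
```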

Tip

To model costs over 12 months, open the One-Time Backup tab of this workbook. You can update the prices and values in that worksheet to estimate your costs.

Scenario: Continuous tiering

This scenario assumes that you plan to periodically move data to the archive tier. Perhaps you're using Blob Storage inventory reports to gauge which blobs are accessed less frequently, and then using lifecycle management policies to automate the archival process.

Each month, you'd assume the cost of writing to the archive tier. The cost to store and then rehydrate data would increase over time as you archive more blobs.

Using the Sample prices that appear in this article, the following table demonstrates three months of spending.

This scenario assumes a monthly ingest of 200,000 files totaling 10,240 GB in size to archive. It also assumes a one-time read each month of about 1% of archived capacity. The operation used in this scenario is the Put Blob operation.

| Cost factor | January | February | March | Projected annual |
|---|---|---|---|---|
| Write operations | 200,000 | 200,000 | 200,000 | 2,400,000 |
| Price of a single write operation | $0.000011 | $0.000011 | $0.000011 | $0.000011 |
| Cost to write (operations * price of a write operation) | $2.20 | $2.20 | $2.20 | $26.40 |
| Number of files | 200,000 | 400,000 | 600,000 | 2,400,000 |
| Total file size (GB) | 10,240 | 20,480 | 30,720 | 122,880 |
| Data prices (pay-as-you-go) | $0.002 | $0.002 | $0.002 | $0.002 |
| Cost to store (file size * data price) | $20.48 | $40.96 | $61.44 | $1,597.44 |
| Data retrieval size (1% of file size) | 102 | 205 | 307 | 7,987 |
| Price of data retrieval | $0.022 | $0.022 | $0.022 | $0.022 |
| Cost to retrieve (data retrieval size * price of retrieval) | $2.25 | $4.51 | $6.76 | $175.72 |
| Number of read operations (file count * 1% storage read) | 2,000 | 4,000 | 6,000 | 156,000 |
| Price of a single read operation | $0.00055 | $0.00055 | $0.00055 | $0.00055 |
| Cost to read (operations * price to read) | $1.10 | $2.20 | $3.30 | $85.80 |
| Cost to rehydrate (cost to retrieve + cost to read) | $3.35 | $6.71 | $10.06 | $261.52 |
| Total cost | $26.03 | $49.87 | $73.70 | $1,885.36 |
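A minimal sketch of this scenario follows. Unlike a one-time backup, the archived capacity grows every month, so the storage and rehydration charges are computed on the cumulative total while the write charge stays flat. The function name `continuous_tiering` and the `PRICES` dictionary are hypothetical, and the prices are the fictitious archive sample prices from this article.

```python
# Fictitious archive sample prices (Blob Service endpoint).
PRICES = {
    "write_per_10k": 0.11,
    "read_per_10k": 5.50,
    "storage_per_gb": 0.002,
    "retrieval_per_gb": 0.022,
}


def continuous_tiering(files_per_month: int, gb_per_month: float, read_fraction: float = 0.01, months: int = 12) -> list[float]:
    """Return the total cost for each month as archived data accumulates."""
    totals = []
    for month in range(1, months + 1):
        archived_files = files_per_month * month  # cumulative file count
        archived_gb = gb_per_month * month        # cumulative capacity
        write = files_per_month * PRICES["write_per_10k"] / 10_000
        store = archived_gb * PRICES["storage_per_gb"]
        retrieve = archived_gb * read_fraction * PRICES["retrieval_per_gb"]
        read = archived_files * read_fraction * PRICES["read_per_10k"] / 10_000
        totals.append(write + store + retrieve + read)
    return totals


monthly = continuous_tiering(200_000, 10_240)
for name, total in zip(("January", "February", "March"), monthly):
    print(f"{name}: ${total:.2f}")      # $26.03, $49.87, $73.70
print(f"Annual: ${sum(monthly):,.2f}")  # Annual: $1,885.36
```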

Tip

To model costs over 12 months, open the Continuous Tiering tab of this workbook. You can update the prices and values in that worksheet to estimate your costs.

Archive versus cold and cool

Archive storage is the lowest cost tier. However, it can take up to 15 hours to rehydrate 10-GiB files. To learn more, see Blob rehydration from the archive tier. The archive tier might not be the best fit if your workloads must read data quickly. The cool tier offers near real-time read latency at a lower price than the hot tier. Understanding your access requirements helps you to choose between the cool, cold, and archive tiers.

The following table compares the cost of archive storage with the cost of cool and cold storage by using the Sample prices that appear in this article. This scenario assumes a monthly ingest of 200,000 files totaling 10,240 GB in size to archive. It also assumes a one-time read each month of about 10% of stored capacity (1,024 GB), which corresponds to 10% of the files (20,000 read operations).

| Cost factor | Archive | Cold | Cool |
|---|---|---|---|
| Write operations | 200,000 | 200,000 | 200,000 |
| Price of a single write operation | $0.000011 | $0.000018 | $0.00001 |
| Cost to write (operations * price of a write operation) | $2.20 | $3.60 | $2.00 |
| Total number of files | 200,000 | 200,000 | 200,000 |
| Total file size (GB) | 10,240 | 10,240 | 10,240 |
| Data prices (pay-as-you-go) | $0.0020 | $0.0045 | $0.0115 |
| Cost to store (file size * data price) | $20.48 | $46.08 | $117.76 |
| Data retrieval size (10% of file size) | 1,024 | 1,024 | 1,024 |
| Price of data retrieval per GB | $0.022 | $0.03 | $0.01 |
| Cost to retrieve (data retrieval size * price of retrieval) | $22.53 | $30.72 | $10.24 |
| Number of read operations (file count * 10% storage read) | 20,000 | 20,000 | 20,000 |
| Price of a single read operation | $0.00055 | $0.00001 | $0.000001 |
| Cost to read (operations * price to read) | $11.00 | $0.20 | $0.02 |
| Cost to rehydrate (cost to retrieve + cost to read) | $33.53 | $30.92 | $10.26 |
| Monthly cost (write + storage + rehydrate) | $56.21 | $80.60 | $130.02 |
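To experiment with different read percentages, the comparison can be sketched as one function applied to each tier's prices. The `TierPrices` class, the `SAMPLE` dictionary, and the `monthly_cost` function are hypothetical names, and the values are the fictitious Blob Service sample prices from this article, so the results are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class TierPrices:
    write_per_10k: float     # price of 10,000 write operations
    read_per_10k: float      # price of 10,000 read operations
    retrieval_per_gb: float  # price of data retrieval per GB
    storage_per_gb: float    # pay-as-you-go storage price per GB per month


# Fictitious sample prices, not official prices.
SAMPLE = {
    "archive": TierPrices(0.11, 5.50, 0.022, 0.002),
    "cold": TierPrices(0.18, 0.10, 0.03, 0.0045),
    "cool": TierPrices(0.10, 0.01, 0.01, 0.0115),
}


def monthly_cost(prices: TierPrices, files: int, size_gb: float, read_fraction: float) -> float:
    """Monthly cost = cost to write + cost to store + cost to retrieve + cost to read."""
    write = files * prices.write_per_10k / 10_000
    store = size_gb * prices.storage_per_gb
    retrieve = size_gb * read_fraction * prices.retrieval_per_gb
    read = files * read_fraction * prices.read_per_10k / 10_000
    return write + store + retrieve + read


# 200,000 files, 10,240 GB, 10% read each month
for tier, prices in SAMPLE.items():
    print(f"{tier}: ${monthly_cost(prices, 200_000, 10_240, 0.10):.2f}")
```

Changing `read_fraction` shows how quickly the archive read and retrieval charges erode the storage savings as the share of data read each month grows.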

Tip

To model your costs, open the Choose Tiers tab of this workbook. You can update the prices and values in that worksheet to estimate your costs.

The following chart shows the impact on monthly spending given various read percentages. This chart assumes a monthly ingest of 1,000,000 files totaling 10,240 GB in size. Assuming sample pricing, this chart shows a break-even point at or around the 25% read level. After that level, the cost of archive storage begins to rise relative to the cost of cool storage.

Figure: Cool versus archive monthly spending

Sample prices

The following table includes sample (fictitious) prices for each request to the Blob Service endpoint (blob.core.windows.net).

Important

These prices are meant only as examples, and shouldn't be used to calculate your costs. For official prices, see the Azure Blob Storage pricing or Azure Data Lake Storage pricing pages. For more information about how to choose the correct pricing page, see Understand the full billing model for Azure Blob Storage.

| Price factor | Hot | Cool | Cold | Archive |
|---|---|---|---|---|
| Price of write operations (per 10,000) | $0.055 | $0.10 | $0.18 | $0.11 |
| Price of read operations (per 10,000) | $0.0044 | $0.01 | $0.10 | $5.50 |
| List and container operations (per 10,000) | $0.055 | $0.055 | $0.065 | $0.055 |
| All other operations (per 10,000) | $0.0044 | $0.0044 | $0.0052 | $0.0044 |
| Price of data retrieval (per GB) | Free | $0.01 | $0.03 | $0.022 |
| Price of data storage, first 50 TB (pay-as-you-go) | $0.0208 | $0.0115 | $0.0045 | $0.002 |
| Price of data storage, next 450 TB (pay-as-you-go) | $0.020 | $0.0115 | $0.0045 | $0.002 |
| Price of 100 TB (one-year reserved capacity) | $1,747 | $966 | Not available | $183 |
| Price of 100 TB (three-year reserved capacity) | $1,406 | $872 | Not available | $168 |
| Network bandwidth between regions within North America (per GB) | $0.02 | $0.02 | $0.02 | $0.02 |
| Price of high priority read operations (per 10,000) | Not applicable | Not applicable | Not applicable | $65.00 |
| Price of high priority data retrieval (per GB) | Not applicable | Not applicable | Not applicable | $0.13 |

The following table includes sample (fictitious) prices for each request to the Data Lake Storage endpoint (dfs.core.windows.net). For official prices, see Azure Data Lake Storage pricing.

| Price factor | Hot | Cool | Cold | Archive |
|---|---|---|---|---|
| Price of write operations (every 4 MiB, per 10,000) | $0.07120 | $0.13 | $0.234 | $0.143 |
| Price of read operations (every 4 MiB, per 10,000) | $0.0057 | $0.013 | $0.13 | $7.15 |
| Iterative write operations (per 100) | $0.0715 | $0.0715 | $0.0715 | $0.0715 |
| Iterative read operations (per 10,000) | $0.0715 | $0.0715 | $0.0845 | $0.0715 |
| Price of data retrieval (per GB) | Free | $0.01 | $0.03 | $0.022 |
| Network bandwidth between regions within North America (per GB) | $0.02 | $0.02 | $0.02 | $0.02 |
| Data storage prices, first 50 TB (pay-as-you-go) | $0.021 | $0.012 | $0.0045 | $0.002 |
| Data storage prices, next 450 TB (pay-as-you-go) | $0.020 | $0.012 | $0.0045 | $0.002 |
| Price of 100 TB (one-year reserved capacity) | $1,747 | $966 | Not available | $183 |
| Price of 100 TB (three-year reserved capacity) | $1,406 | $872 | Not available | $168 |
| Price of high priority read operations (per 10,000) | Not applicable | Not applicable | Not applicable | $84.50 |
| Price of high priority data retrieval (per GB) | Not applicable | Not applicable | Not applicable | $0.13 |
| Index (GB / month) | $0.0297 | Not applicable | Not applicable | Not applicable |

Next steps