How to interpret Dataflow costs

AzureUser-9588 151 Reputation points
2024-07-12T10:40:36.92+00:00

How do I interpret the 'Azure Synapse Analytics - Dataflow' consumed quantity in the monthly usage report?

Are there any scenario-based examples for estimating the pricing of Azure Synapse Analytics Dataflow?

Azure Synapse Analytics

1 answer

  1. Chandra Boorla 2,200 Reputation points Microsoft Vendor
    2024-07-12T11:26:27.3566667+00:00

    Hi @AzureUser-9588

    Thanks for the question and for using the MS Q&A platform.
    To interpret the 'Azure Synapse Analytics - Dataflow' consumed quantity from a monthly usage report, you need to understand how Azure Synapse Analytics Dataflow is billed.

    Azure Data Factory is a serverless, elastic data integration service built for cloud scale. There is no fixed-size compute that you need to plan for peak load; instead, you specify how much resource to allocate on demand per operation, which lets you design ETL processes in a much more scalable way. In addition, ADF is billed on a consumption basis, which means you only pay for what you use. The same billing model applies to Data Flows run from Azure Synapse pipelines.
    For additional information, please refer to: How you're charged for Azure Data Factory.

    When you view your monthly usage report, the consumed quantity shown for 'Azure Synapse Analytics - Dataflow' represents the compute used by your Data Flow executions during that month, measured in vCore-hours.
    Data Flow is a powerful ETL tool in Data Factory. It lets you not only copy data from one place to another but also apply many transformations, as well as partitioning. Data Flows are executed as activities that run on scale-out Apache Spark clusters. The minimum cluster size to run a Data Flow is 8 vCores, and you are charged for cluster execution and debugging time per vCore-hour.
    For more details, please refer to: https://azure.microsoft.com/en-us/pricing/details/data-factory/data-pipeline/
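
    To make the report figure concrete, here is a minimal sketch (Python, purely illustrative) of how the consumed quantity maps to a charge. The rate constant and the 64 vCore-hour example are assumptions for illustration; use the rate for your own region and compute type from the pricing page.

    ```python
    # Minimal sketch: turning the usage report's consumed quantity for the
    # Data Flow meter (vCore-hours) into an estimated charge.
    # The rate below is the illustrative West Europe General Purpose price
    # used later in this answer; check the pricing page for your region.

    GENERAL_PURPOSE_RATE_PER_VCORE_HOUR = 0.268  # USD, assumed example rate


    def estimate_dataflow_cost(consumed_vcore_hours: float,
                               rate: float = GENERAL_PURPOSE_RATE_PER_VCORE_HOUR) -> float:
        """Estimated cost = consumed vCore-hours x per-vCore-hour rate."""
        return consumed_vcore_hours * rate


    # Example: a usage report line showing 64 vCore-hours consumed
    print(f"${estimate_dataflow_cost(64):.2f}")  # -> $17.15
    ```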

    It is recommended to create your own Azure Integration Runtime with a defined region, compute type, core count, and Time To Live. What is really useful is that you can dynamically adjust the Core Count and Compute Type properties based on the size of the incoming source dataset, simply by using activities such as Lookup and Get Metadata. This can be a helpful approach when you have to cope with datasets of very different sizes, as in the sketch below.
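
    For example, the sizing decision could look like the following sketch. The thresholds and tiers are hypothetical, not Microsoft guidance; in a real pipeline you would feed the size returned by Get Metadata into the Data Flow activity's compute settings via expressions.

    ```python
    # Hypothetical sizing rule: pick coreCount/computeType from the size of the
    # incoming source dataset (e.g. the "size" output of a Get Metadata activity).
    # Thresholds are illustrative only.

    def choose_compute(source_size_bytes: int) -> dict:
        """Return Data Flow compute settings for a given source size."""
        gib = source_size_bytes / (1024 ** 3)
        if gib < 1:
            return {"computeType": "General", "coreCount": 8}   # minimum cluster
        if gib < 10:
            return {"computeType": "General", "coreCount": 16}
        return {"computeType": "MemoryOptimized", "coreCount": 32}


    print(choose_compute(5 * 1024 ** 3))
    # -> {'computeType': 'General', 'coreCount': 16}
    ```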

    Taking West Europe as the region: for Data Flows in general, you are charged only for cluster execution and debugging time per vCore-hour, so it is important to configure these parameters optimally. If you run one basic (General Purpose) cluster with the minimum core count for one hour, the price of that execution is:

    $0.268 * 8 vCores * 1 hour = $2.144

    If you ran that same one-hour execution every day, the monthly price would be:

    $0.268 * 8 vCores * 1 hour/day * 30 days = $64.32
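
    If it helps, the same arithmetic as a small, parameterized Python snippet (the rate, hours per day, and days per month are the assumptions from the example above; adjust them to your own workload and region):

    ```python
    # Reproduces the figures above: one 8-vCore General Purpose cluster,
    # one hour per day, for a 30-day month, at the example West Europe rate.

    RATE_PER_VCORE_HOUR = 0.268   # USD, example West Europe General Purpose rate
    VCORES = 8                    # minimum Data Flow cluster size
    HOURS_PER_DAY = 1
    DAYS_PER_MONTH = 30

    per_run = RATE_PER_VCORE_HOUR * VCORES * HOURS_PER_DAY
    per_month = per_run * DAYS_PER_MONTH

    print(f"Per one-hour run: ${per_run:.3f}")    # $2.144
    print(f"Per month:        ${per_month:.2f}")  # $64.32
    ```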

    Hope this helps. Do let us know if you have any further queries.


    If this answers your query, do click Accept Answer and Yes for 'Was this answer helpful'. And, if you have any further query, do let us know.

