In this scenario, Dataflow Gen2 and a Virtual Network Data Gateway were used to load 2 GB of Parquet data stored in Azure Data Lake Storage (ADLS) Gen2 into a Lakehouse table in Microsoft Fabric. We used the NYC Taxi-green sample data as the Parquet data.
The prices used in the following example are hypothetical and aren't intended to reflect exact pricing. They only demonstrate how you can estimate, plan, and manage cost for Data Factory projects in Microsoft Fabric. Also, because Fabric capacity prices vary by region, we use the pay-as-you-go price for a Fabric capacity in US West 2 (a typical Azure region), $0.18 per CU per hour. See Microsoft Fabric - Pricing to explore other Fabric capacity pricing options.
Configuration
To accomplish this scenario, you need to create a dataflow with the following steps:
- Initialize Dataflow: Get 2 GB of Parquet file data from the ADLS Gen2 storage account.
- Set up a Virtual Network Data Gateway with 1 instance and a 30-minute time-to-live.
- Configure Power Query.
- Configure Lakehouse as the data output destination.
Cost estimation using the Fabric Metrics App
When running a dataflow to load data through the Virtual Network Data Gateway, the overall consumption is divided into two main components: dataflow refresh and Virtual Network Data Gateway uptime. Charges for the Virtual Network Data Gateway are based on its uptime, which includes both the workload execution time and its time-to-live whenever the gateway is active.
The load operation took about 2 minutes and consumed 970.6228 CU seconds on Dataflow Gen2 Refresh and 7480.6466 CU seconds on Virtual Network Data Gateway uptime.
Note
Although the Fabric Metrics App reports the duration of the run, you don't need it to calculate the effective CU hours, because the CU seconds metric that the app also reports already accounts for the duration.
Metric | Compute Consumption |
---|---|
Dataflow Gen2 Refresh | 970.6228 CU seconds |
Virtual Network Data Gateway Uptime | 7480.6466 CU seconds |
Total run cost at $0.18/CU hour = (970.6228 + 7480.6466) / (60 * 60) CU hours * ($0.18/CU hour) ≈ $0.42
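The calculation above can be reproduced with a short script. This is a minimal sketch: the function name `run_cost_usd` and the dictionary layout are illustrative assumptions, not part of any Fabric API. The CU seconds values and the hypothetical $0.18 per CU hour rate are the ones used in this scenario.

```python
SECONDS_PER_HOUR = 60 * 60


def run_cost_usd(cu_seconds_by_component, price_per_cu_hour):
    """Convert CU seconds per component into a total cost in USD.

    Hypothetical helper for illustration; it only mirrors the
    arithmetic shown in this article.
    """
    total_cu_seconds = sum(cu_seconds_by_component.values())
    cu_hours = total_cu_seconds / SECONDS_PER_HOUR
    return cu_hours * price_per_cu_hour


# CU seconds as reported by the Fabric Metrics App for this scenario.
consumption = {
    "Dataflow Gen2 Refresh": 970.6228,
    "Virtual Network Data Gateway Uptime": 7480.6466,
}

# Hypothetical pay-as-you-go rate used in this article (US West 2).
PRICE_PER_CU_HOUR = 0.18

cost = run_cost_usd(consumption, PRICE_PER_CU_HOUR)
print(f"Total run cost: ${cost:.2f}")
```

To estimate a different run, replace the values in `consumption` with the CU seconds reported for that run and set `PRICE_PER_CU_HOUR` to the rate for your capacity's region.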