question

MH-3502 asked PRADEEPCHEEKATLA-MSFT commented

Azure Databricks read from S3 - costs

Hi,

I know that one can directly use data stored in an AWS S3 bucket from an Azure Databricks notebook. What I wonder is: will doing so generate any additional costs, apart from the "normal" Databricks cluster costs? I'm thinking about data egress from AWS and/or transfer bandwidth, etc.

If yes, how could we estimate the amount of data that would be billed? Would only the data that is physically saved to e.g. an Azure Storage Account after processing in Databricks be taken into account, or everything that is originally read into Databricks?
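For context, the kind of access I mean looks roughly like this (a minimal PySpark sketch only; the bucket name and credential placeholders are made up, and it assumes the notebook's `spark` session is used to pass S3A credentials):

```python
# Minimal sketch of reading S3 data from an Azure Databricks notebook.
# Bucket name and credential placeholders are examples only.
# In a Databricks notebook, `spark` (the SparkSession) is already available.
spark.conf.set("fs.s3a.access.key", "<AWS_ACCESS_KEY_ID>")
spark.conf.set("fs.s3a.secret.key", "<AWS_SECRET_ACCESS_KEY>")

# Reading the files pulls the data from AWS into the Databricks cluster in Azure.
df = spark.read.parquet("s3a://my-example-bucket/path/to/data/")
display(df)
```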

Thank you!

azure-databricks · azure-cost-management

1 Answer

PRADEEPCHEEKATLA-MSFT answered PRADEEPCHEEKATLA-MSFT commented

Hello @MH-3502,

There will be no additional charge from the Azure Databricks end.

If you are saving the data into an Azure Storage Account, then you will be billed for the amount of data stored.

You do need to pay for data transferred out from Amazon S3 to the internet.

From the Amazon S3 pricing page, here is the data transfer cost:

[Attached image: Amazon S3 data transfer pricing table]
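If it helps, here is a rough way to estimate that component (a sketch only; the per-GB rate below is an assumption, and the real rate depends on your region and the tiered volume shown on the pricing page above):

```python
# Back-of-the-envelope estimate of the S3 "data transfer out to internet" component.
# ASSUMED_EGRESS_USD_PER_GB is a placeholder; look up the tiered rate for your
# region and volume on the Amazon S3 pricing page.
ASSUMED_EGRESS_USD_PER_GB = 0.09

def estimate_s3_egress_cost(gb_read_per_run, runs_per_month):
    # Data transferred out of S3 is billed per GB, each time it leaves AWS.
    return gb_read_per_run * runs_per_month * ASSUMED_EGRESS_USD_PER_GB

# Example: a 5 GB dataset read 20 times in a month.
print(estimate_s3_egress_cost(5, 20))  # 9.0 USD under the assumed rate
```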

Hope this helps. Do let us know if you have any further queries.


Do click on "Accept Answer" and Upvote on the post that helps you, this can be beneficial to other community members.




Hello @MH-3502,
Just checking in to see if the above answer helped. If this answers your query, do click "Accept Answer" and up-vote it. And if you have any further queries, do let us know.




Hello @MH-3502,
Following up to see if the above suggestion was helpful. And if you have any further queries, do let us know.


MH-3502 · PRADEEPCHEEKATLA-MSFT

Hi @PRADEEPCHEEKATLA-MSFT,
sorry for the late response - unfortunately I didn't receive a notification that there was an answer.

Thank you for the answer - does it mean that if I create a DataFrame from an S3 bucket containing e.g. 5 GB of parquet files:
- these 5 GB will be transferred to the Azure environment (into some kind of temp storage), and
- I will be billed for these 5 GB (every time I read the files again)?
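To put it concretely, would a pattern like this pay the S3 transfer only on the first read? (A sketch only; the bucket, storage account, and container names are placeholders, and it assumes S3 and ADLS credentials are already configured on the cluster.)

```python
# Sketch of the workflow I have in mind (placeholder paths, credentials assumed configured).

# First run: read ~5 GB of parquet directly from S3 (the AWS transfer-out part).
df = spark.read.parquet("s3a://my-example-bucket/sales/")

# Persist a copy on the Azure side (ADLS Gen2), which is then billed as stored data.
df.write.mode("overwrite").parquet(
    "abfss://data@myexamplestorage.dfs.core.windows.net/sales/"
)

# Later runs: read the Azure copy instead of going back to S3,
# so the S3 transfer is not paid again.
df_local = spark.read.parquet(
    "abfss://data@myexamplestorage.dfs.core.windows.net/sales/"
)
```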

Thank you!




Hi,

When using Azure Databricks with an Azure Storage account, I see that you said there will be storage costs ("If you are saving the data into an Azure Storage Account, then you will be billed for the amount of data stored."), but I was wondering if there are any egress costs for that. Would hosting Azure Databricks in a different region than the Storage Account cause an egress cost?


Hello @rajrao,

Every Azure Databricks workspace is associated with a managed resource group. This Azure-managed group of resources allows Azure to provide Databricks as a managed service. Initially this managed resource group contains only a few workspace resources (a virtual network, a security group and a storage account).

If you use this default storage account to store data, you will be charged for the amount of data stored, along with ingress/egress charges for accessing the data.
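For illustration, where you write determines which storage account the charges land on (a sketch with placeholder names; `spark` is the notebook's session):

```python
# Illustrative sketch only; the abfss:// account and container names are placeholders.
df = spark.range(10)  # stand-in for real data

# Writing under dbfs:/ stores the data in the workspace's managed storage account
# (the one inside the managed resource group).
df.write.mode("overwrite").parquet("dbfs:/tmp/example_output/")

# Writing to an abfss:// path stores it in a storage account you created yourself,
# which is billed separately for storage and for any cross-region access.
df.write.mode("overwrite").parquet(
    "abfss://data@yourownstorage.dfs.core.windows.net/example_output/"
)
```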

Note: The Azure Databricks workspace and its managed resource group are always in the same region.

Hope this helps.
