Azure Databricks read from S3 - costs

M_H 21 Reputation points
2020-07-24T06:05:37.95+00:00

Hi,

I know that one can directly use data stored in an AWS S3 bucket in an Azure Databricks notebook. What I wonder is: will doing so generate any additional costs, apart from the "normal" Databricks cluster costs? I'm thinking about data egress from AWS and/or transfer bandwidth, etc.

If yes, how could we estimate the amount of data that would be billed? Would only the data that is physically "saved" to e.g. an Azure Storage Account after processing it in Databricks be taken into account, or everything that is originally read into Databricks, or something else?

Thank you!

Azure Cost Management
Azure Databricks

Accepted answer
  1. PRADEEPCHEEKATLA-MSFT 76,746 Reputation points Microsoft Employee
    2020-07-24T11:03:32.447+00:00

    Hello @M_H ,

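    There is no additional charge on the Azure Databricks side.

    If you save the data to an Azure Storage account, you will be billed by Azure for the amount of data stored.

    You will need to pay AWS for data transfer out from Amazon S3 to the internet. That charge is based on the data read out of S3 into Databricks, not only on what you later save in Azure.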
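
To get a rough idea of how much data a job could pull out of S3, one option is to sum the object sizes under the prefix the notebook reads. The sketch below assumes `boto3` is installed and AWS credentials are configured; the function names `total_bytes` and `prefix_size_bytes` are illustrative and do not come from the original answer. Treat the result as an estimate for one full read of the prefix: column pruning or partition filters may transfer less, and re-reading the same data without caching will transfer it again.

```python
from typing import Any, Iterable, Mapping


def total_bytes(pages: Iterable[Mapping[str, Any]]) -> int:
    """Sum the object sizes found in list_objects_v2 response pages."""
    # Pages for an empty prefix carry no "Contents" key, hence the default.
    return sum(obj["Size"] for page in pages for obj in page.get("Contents", []))


def prefix_size_bytes(bucket: str, prefix: str) -> int:
    """Total size in bytes of all objects under s3://bucket/prefix.

    Requires boto3 and valid AWS credentials.
    """
    import boto3  # imported lazily so total_bytes works without boto3

    paginator = boto3.client("s3").get_paginator("list_objects_v2")
    return total_bytes(paginator.paginate(Bucket=bucket, Prefix=prefix))
```

For example, `prefix_size_bytes("my-bucket", "raw/2020/")` (hypothetical bucket and prefix) would return the number of bytes a single full scan of that prefix would transfer.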

    The data transfer rates are listed on the Amazon S3 pricing page.

    [Image: 13589-image.png, data transfer pricing from the Amazon S3 pricing page]
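
Once you have a byte count and a per-GB rate from the pricing page, the estimate itself is simple arithmetic. The following is a minimal sketch, not an official calculator: `estimate_egress_cost` is a hypothetical helper, it assumes a single flat rate (ignoring any free allowance or volume tiers), and it assumes 1 GB = 1024**3 bytes, which you should adjust if your bill uses a different definition.

```python
def estimate_egress_cost(
    bytes_read: int,
    price_per_gb: float,
    bytes_per_gb: int = 1024**3,  # assumption: binary gigabytes
) -> float:
    """Rough cost of transferring bytes_read out of S3 at a flat per-GB rate."""
    if bytes_read < 0 or price_per_gb < 0:
        raise ValueError("bytes_read and price_per_gb must be non-negative")
    return (bytes_read / bytes_per_gb) * price_per_gb
```

The `price_per_gb` value has to come from the current Amazon S3 pricing page for your region; no rate is hard-coded here because the figures change over time.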

    Hope this helps. Do let us know if you have any further queries.

