Azure Databricks read from S3 - costs

MH-3502 21 Reputation points
2020-07-24T06:05:37.95+00:00

Hi,

I know that one can directly use data stored in an AWS S3 bucket in an Azure Databricks notebook. What I wonder is: will doing so generate any additional costs (apart from the "normal" Databricks cluster costs)? I'm thinking about data egress from AWS and/or transfer bandwidth, etc.

If yes, how could we estimate the amount of data that would be billed? Would only the data that is physically saved to, e.g., an Azure Storage Account after processing in Databricks be taken into account, or everything that is originally read into Databricks?

Thank you!

Tags: Azure Cost Management, Azure Databricks

Accepted answer
  1. PRADEEPCHEEKATLA 90,646 Reputation points Moderator
    2020-07-24T11:03:32.447+00:00

    Hello @MH-3502 ,

There will be no additional charge from the Azure Databricks end.

    If you are saving the data into an Azure Storage Account, then you will be billed for the amount of data stored.

    You will also need to pay for data transfer out from Amazon S3 to the internet.

    From the Amazon S3 pricing page, here is the data transfer cost:

    [Image: Amazon S3 data-transfer-out (egress) pricing table from the Amazon S3 pricing page]
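    To the estimation question above: the figure that matters is the total volume Databricks physically reads out of S3, not only the subset you later persist to an Azure Storage Account. A minimal sketch of a tiered egress-cost estimate follows; the per-GB rates are illustrative assumptions modeled on the tiered structure of the S3 pricing table, so check the current Amazon S3 pricing page for real numbers.

    ```python
    # Rough S3 egress-cost estimator. The tier sizes and per-GB rates below are
    # ASSUMED example values, not official AWS prices.
    TIERS = [
        (10_240, 0.09),        # first 10 TB per month (GB, USD/GB)
        (40_960, 0.085),       # next 40 TB per month
        (102_400, 0.07),       # next 100 TB per month
        (float("inf"), 0.05),  # everything beyond 150 TB per month
    ]

    def estimate_egress_cost(gb_read_per_month: float) -> float:
        """Estimate monthly egress cost for the total GB read out of S3."""
        cost, remaining = 0.0, gb_read_per_month
        for tier_gb, rate_per_gb in TIERS:
            if remaining <= 0:
                break
            billed = min(remaining, tier_gb)  # fill this tier, then move on
            cost += billed * rate_per_gb
            remaining -= billed
        return round(cost, 2)

    print(estimate_egress_cost(500))  # 500 GB read per month -> 45.0 at the assumed rate
    ```

    So if your notebooks scan 500 GB of S3 data per month, the whole 500 GB is billed as egress even if you only write a few GB of results to Azure Storage afterwards.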

    Hope this helps. Do let us know if you have any further queries.

    ----------------------------------------------------------------------------------------

    Do click on "Accept Answer" and upvote the post that helps you; this can be beneficial to other community members.

