How can testers use Databricks to test scenarios?

Birajdar, Sujata 61 Reputation points
2021-11-18T07:07:48.047+00:00

Hi Team,

We created a PySpark notebook for development in which a DataFrame reads 300 million records. Nowhere in the development notebook do we display the data or register it as a temp view.

Now, however, our testers need to evaluate the data after the transformations using SQL. If we store the transformed data in a temp view for each notebook, what is the limit on how much data can be stored in the Hive metastore?
Can we store 300 million records, and what compute type and configuration would be preferable?

Is there any other way to store the data for testing purposes, apart from creating views on top of DataFrames?

And what is the best way to give data to testers so they can validate their scenarios?

Please add your comments.

Thank you

Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.

Accepted answer
  1. HimanshuSinha-msft 19,476 Reputation points Microsoft Employee
    2021-11-19T18:42:31.71+00:00

    Hello @Birajdar, Sujata,
    Thanks for the question and for using the Microsoft Q&A platform.
    Validating 300 million records is going to be tough without scripts. In any case, you can always write the transformed data to a blob, and the testers can then query it using the SQL API.
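As a rough sketch of that suggestion (not a definitive implementation): the helper below writes the transformed DataFrame to storage and registers a table over the files, so testers can query it with SQL even after the notebook session ends, which a temp view would not allow. The function names, the Delta format default, the storage path, and the table and column names are all assumptions made for illustration; none of them come from the original thread. The second helper returns a few aggregate checks, since scripted queries tend to scale better than inspecting 300 million rows by hand.

```python
def publish_for_testers(spark, df, path, table_name, fmt="delta"):
    """Persist a transformed DataFrame and expose it as a SQL table.

    `path` and `table_name` are hypothetical placeholders supplied by the caller.
    """
    # Write the full result to storage; unlike a temp view, the files
    # outlive the notebook session that produced them.
    df.write.format(fmt).mode("overwrite").save(path)
    # Register a table over the written files so testers can query it by name.
    spark.sql(
        f"CREATE TABLE IF NOT EXISTS {table_name} USING {fmt} LOCATION '{path}'"
    )


def validation_queries(table_name, key_column):
    """Return aggregate SQL checks a tester could run against the table."""
    return {
        "row_count": f"SELECT COUNT(*) AS n FROM {table_name}",
        "null_keys": (
            f"SELECT COUNT(*) AS n FROM {table_name} "
            f"WHERE {key_column} IS NULL"
        ),
        "duplicate_keys": (
            f"SELECT {key_column}, COUNT(*) AS n FROM {table_name} "
            f"GROUP BY {key_column} HAVING COUNT(*) > 1"
        ),
    }


# Example use in a Databricks notebook, where `spark` is predefined and
# `transformed_df` is the output of the development notebook (both the
# path and the table name below are placeholders):
#
# publish_for_testers(spark, transformed_df,
#                     "<storage-path-for-test-data>", "qa_orders_transformed")
# for name, sql in validation_queries("qa_orders_transformed", "order_id").items():
#     spark.sql(sql).show()
```

Testers would then only need access to the registered table and a SQL editor or notebook; they would not have to rerun the transformation themselves.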

    Please do let me know how it goes.
    Thanks,
    Himanshu

