
Reusable Spark frameworks

Bhagavan-Azure, Pavan 1 Reputation point
2022-11-08T15:37:39.273+00:00

Hi,

Are there any reusable PySpark/Scala frameworks (notebooks) for SCD Types 1, 2, and 3, so that a developer can simply pass parameters such as the source table name, target table name, primary keys, and so on, instead of building from scratch every time?

Thanks,
Pavan

Azure Synapse Analytics

An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.


1 answer

  1. HimanshuSinha 19,637 Reputation points Microsoft Employee Moderator
    2022-11-09T22:03:00.147+00:00

    Hello @Bhagavan-Azure, Pavan,
    Thanks for the question and for using the MS Q&A platform.
    As we understand it, you are asking whether there is a framework or library for implementing SCD. Please let us know if that is not accurate.
    At this time, I don't think there is anything like that which you can use. However, there are many resources on how to implement SCD on Spark.
    I have found that the code in these blogs is long mainly because it creates the DataFrames needed to demonstrate the implementation. The actual SCD code is very small and simple.

    https://towardsdatascience.com/processing-a-slowly-changing-dimension-type-2-using-pyspark-in-aws-9f5013a36902
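
To give a rough idea of how small the core logic can be, here is a minimal sketch of a parameterized SCD Type 1 helper. It is not an official framework or anything from the linked blog. It assumes the target is a Delta Lake table (so that `MERGE INTO` is available in Spark SQL) and that source and target share the same column names. The function name `build_scd1_merge` and the table and column names are hypothetical, chosen only for illustration.

```python
from typing import Sequence


def build_scd1_merge(source: str, target: str, keys: Sequence[str]) -> str:
    """Build a MERGE statement for SCD Type 1 (overwrite changed rows in place).

    Assumption: the target is a Delta Lake table and both tables have
    identical column names, so UPDATE SET * / INSERT * can be used.
    """
    if not keys:
        raise ValueError("at least one primary key column is required")

    # Match source rows (s) to target rows (t) on every primary key column.
    on_clause = " AND ".join(f"t.{k} = s.{k}" for k in keys)

    return (
        f"MERGE INTO {target} AS t\n"
        f"USING {source} AS s\n"
        f"ON {on_clause}\n"
        "WHEN MATCHED THEN UPDATE SET *\n"
        "WHEN NOT MATCHED THEN INSERT *"
    )


if __name__ == "__main__":
    # Hypothetical table and key names, for illustration only.
    print(build_scd1_merge("staging.customer", "dim.customer", ["customer_id"]))
```

In a notebook, the returned string could be passed to `spark.sql(...)`. This sketch covers only Type 1; Type 2 would additionally need validity columns (for example start and end dates or a current-row flag) and logic to close out the old row before inserting the new one.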

    Please let me know if you have any queries.
    Thanks
    Himanshu


