Hi @Anshal,
As I understand it, you are looking for suggestions on an approach for designing a Databricks workflow where the requirement is to create the schema based on the files that arrive, store the data in Azure Data Lake or a data warehouse, and process it for reporting and analysis. Please let me know if there is any gap in my understanding.
Since this is a question about the overall project architecture, I would suggest taking it one step at a time. Here are references for each of the use cases:
- How to handle a dynamically changing schema and implement a real-time data lake (see the Auto Loader sketch after this list): https://www.youtube.com/watch?v=No55ImP-Jic&t=205s
- Integrating ADLS with Databricks (see the connection sketch after this list): https://www.cloudiqtech.com/integrating-azure-data-lake-storage-with-databricks/
- Accessing Azure Data Lake Gen2 files using Power BI: https://www.youtube.com/watch?v=pwsOz05Ah_s
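
To make the first reference concrete, below is a minimal PySpark sketch of handling a dynamically changing schema with Databricks Auto Loader (schema inference plus schema evolution). All paths and the table name are hypothetical placeholders, and `spark` is the SparkSession that Databricks notebooks provide by default:

```python
# Minimal sketch: ingest files whose schema may change over time using
# Auto Loader. Paths and the target table name are hypothetical.
landing_path    = "abfss://raw@<storage-account>.dfs.core.windows.net/incoming/"      # placeholder
schema_path     = "abfss://raw@<storage-account>.dfs.core.windows.net/_schemas/"      # placeholder
checkpoint_path = "abfss://raw@<storage-account>.dfs.core.windows.net/_checkpoints/"  # placeholder

df = (
    spark.readStream.format("cloudFiles")                       # Auto Loader source
    .option("cloudFiles.format", "json")                        # format of the incoming files
    .option("cloudFiles.schemaLocation", schema_path)           # where the inferred schema is tracked
    .option("cloudFiles.schemaEvolutionMode", "addNewColumns")  # pick up new columns as they appear
    .load(landing_path)
)

(
    df.writeStream
    .option("checkpointLocation", checkpoint_path)  # required so the stream can restart safely
    .option("mergeSchema", "true")                  # let the Delta table absorb new columns
    .trigger(availableNow=True)                     # process the files available now, then stop
    .toTable("bronze_events")                       # hypothetical Delta table for the reporting layer
)
```

With `schemaEvolutionMode` set to `addNewColumns`, the stream stops when it detects a new column and picks the column up on restart, so the downstream Delta table grows with the incoming files instead of failing on them.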
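For the second reference, here is one minimal sketch (one of the approaches the linked article covers) of wiring a Databricks notebook to ADLS Gen2 with a service principal over OAuth. The storage account name and the secret-scope/key names are hypothetical, and `spark`/`dbutils` are the objects Databricks notebooks expose by default:

```python
# Minimal sketch: direct access to ADLS Gen2 from Databricks using a service
# principal. The secret scope "kv-scope" and its key names are hypothetical.
storage_account = "<storage-account>"  # placeholder

client_id     = dbutils.secrets.get(scope="kv-scope", key="sp-client-id")
client_secret = dbutils.secrets.get(scope="kv-scope", key="sp-client-secret")
tenant_id     = dbutils.secrets.get(scope="kv-scope", key="tenant-id")

# Standard ABFS OAuth configuration keys for the storage account.
base = f"{storage_account}.dfs.core.windows.net"
spark.conf.set(f"fs.azure.account.auth.type.{base}", "OAuth")
spark.conf.set(f"fs.azure.account.oauth.provider.type.{base}",
               "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set(f"fs.azure.account.oauth2.client.id.{base}", client_id)
spark.conf.set(f"fs.azure.account.oauth2.client.secret.{base}", client_secret)
spark.conf.set(f"fs.azure.account.oauth2.client.endpoint.{base}",
               f"https://login.microsoftonline.com/{tenant_id}/oauth2/token")

# Once set, abfss:// paths on that account can be read directly.
df = spark.read.json(f"abfss://raw@{base}/incoming/")
```

The article also shows the mount-point alternative (`dbutils.fs.mount`), which makes the container appear under `/mnt/...` for all notebooks in the workspace.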
Hope this helps. Please let us know if you have any further queries.
------------------------------
- Please don't forget to click the "Accept Answer" or upvote button whenever the information provided helps you.
- Original posters help the community find answers faster by identifying the correct answer. Want a reminder to come back and check responses? You can subscribe to a notification.
- If you are interested in joining the VM program and helping shape the future of Q&A, you can become one of the Q&A Volunteer Moderators.