Design and implement data modeling with Azure Databricks

Intermediate
Data Engineer
Azure Databricks

Effective data modeling forms the foundation of a performant and maintainable data platform. This module explores how to design ingestion logic, select appropriate tools and table formats, implement partitioning schemes, manage slowly changing dimensions, choose appropriate data granularity, and optimize table performance through clustering strategies in Azure Databricks with Unity Catalog.
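One of the topics above, data partitioning, can be pictured with a small pure-Python sketch. It shows the Hive-style `column=value` directory layout that partitioned Delta and Parquet tables use on storage; the function name `partition_path` and the `/mnt/sales` root are illustrative assumptions, not a Databricks API.

```python
from datetime import date

def partition_path(table_root: str, event_date: date) -> str:
    # Hypothetical helper: builds the Hive-style partition directory
    # (column=value) that partitioned Delta/Parquet tables use on storage.
    return f"{table_root}/event_date={event_date.isoformat()}"

# A row with event_date 2024-06-01 lands under this directory:
print(partition_path("/mnt/sales", date(2024, 6, 1)))
# -> /mnt/sales/event_date=2024-06-01
```

In practice the engine derives these paths for you when you declare a partition column; the sketch only illustrates why queries that filter on the partition column can skip entire directories.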

Learning objectives

By the end of this module, you'll be able to:

  • Design data ingestion logic and configure data source connections
  • Select the appropriate data ingestion tool for your scenario
  • Choose between Delta Lake, Apache Iceberg, and other table formats
  • Design and implement effective data partitioning schemes
  • Select and implement slowly changing dimension types
  • Design and implement temporal tables for change tracking and auditing
  • Choose appropriate data granularity for fact and dimension tables
  • Design and implement clustering strategies for query optimization
  • Evaluate when to use managed versus external tables

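As a preview of the slowly changing dimension objective above, here is a minimal pure-Python sketch of the Type 2 pattern: rather than overwriting a changed attribute, the current row is expired and a new version is appended. The `DimRow` shape and `apply_scd2` helper are illustrative assumptions; in Azure Databricks you would typically express this with a Delta Lake `MERGE INTO` statement instead.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import Optional

@dataclass
class DimRow:
    # Hypothetical customer dimension row with validity tracking
    customer_id: int
    city: str
    valid_from: date
    valid_to: Optional[date]   # None means "still current"
    is_current: bool

def apply_scd2(dim: list, customer_id: int, new_city: str,
               change_date: date) -> list:
    """SCD Type 2: expire the current row and append a new version."""
    out = []
    for row in dim:
        if row.customer_id == customer_id and row.is_current and row.city != new_city:
            # Close the old version instead of overwriting it...
            out.append(replace(row, valid_to=change_date, is_current=False))
            # ...and append the new version as the current row.
            out.append(DimRow(customer_id, new_city, change_date, None, True))
        else:
            out.append(row)
    return out

dim = [DimRow(1, "Seattle", date(2023, 1, 1), None, True)]
dim = apply_scd2(dim, 1, "Portland", date(2024, 6, 1))
# dim now holds two versions: the expired Seattle row and the current Portland row.
```

The key design point is that history is preserved: queries can reconstruct what the dimension looked like on any date by filtering on `valid_from`/`valid_to`.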
Prerequisites

Before starting this module, you should have:

  • Basic understanding of Azure Databricks workspaces and Unity Catalog
  • Familiarity with SQL and data warehouse concepts
  • Knowledge of Delta Lake fundamentals