Summary

In this module, you learned how Fabric notebooks provide an interactive environment for running Spark SQL and PySpark transformations, and how they connect to lakehouses, warehouses, KQL databases, and external sources.

You explored how notebooks work, what data stores they access, and common development patterns like interactive development, parameterized notebooks, and pipeline integration. You then applied core shaping techniques, including filtering rows, handling nulls, adding calculated columns, and converting data types. You combined data from multiple tables using joins, calculated summary metrics with aggregations, and applied window functions for rankings and running totals. Finally, you wrote your transformed results to Delta tables with appropriate write modes and sizing considerations.
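As a compact recap, the shaping, join, aggregation, and window patterns listed above can be sketched in SQL. This is only an illustrative sketch under stated assumptions: the `orders` and `customers` tables, their columns, and the sample rows are hypothetical, and the queries run against an in-memory SQLite database as a stand-in so the example is self-contained. In a Fabric notebook you would run comparable Spark SQL against your own lakehouse tables, and the exact syntax may differ in places. Writing the result to a Delta table is not shown, because it needs a Spark session.

```python
import sqlite3

# Stand-in engine: in-memory SQLite (an assumption made for a runnable sketch).
conn = sqlite3.connect(":memory:")

# Hypothetical sample tables; the names, columns, and values are illustrative only.
conn.executescript("""
CREATE TABLE orders (
    order_id INTEGER, customer_id INTEGER, order_date TEXT,
    quantity TEXT, unit_price REAL
);
CREATE TABLE customers (customer_id INTEGER, region TEXT);

INSERT INTO orders VALUES
    (1, 10, '2024-01-01', '2', 5.0),
    (2, 10, '2024-01-02', '1', 20.0),
    (3, 11, '2024-01-01', NULL, 8.0),
    (4, 11, '2024-01-03', '3', 4.0),
    (5, 12, '2024-01-02', '0', 9.0);
INSERT INTO customers VALUES (10, 'East'), (11, 'West'), (12, 'East');

-- Shaping: replace a null quantity with 1, convert the text quantity to an
-- integer, add a calculated column, join to customers, and filter out rows
-- whose quantity is zero.
CREATE VIEW shaped AS
SELECT o.order_id,
       c.region,
       o.order_date,
       CAST(COALESCE(o.quantity, '1') AS INTEGER) * o.unit_price AS line_total
FROM orders AS o
JOIN customers AS c ON c.customer_id = o.customer_id
WHERE CAST(COALESCE(o.quantity, '1') AS INTEGER) > 0;
""")

# Aggregation: one summary row per region.
totals = conn.execute("""
    SELECT region, COUNT(*) AS order_count, SUM(line_total) AS revenue
    FROM shaped
    GROUP BY region
    ORDER BY region
""").fetchall()

# Window functions: a rank within each region and a running total by date.
windowed = conn.execute("""
    SELECT region, order_id, line_total,
           RANK() OVER (PARTITION BY region ORDER BY line_total DESC) AS rank_in_region,
           SUM(line_total) OVER (
               PARTITION BY region ORDER BY order_date, order_id
           ) AS running_total
    FROM shaped
    ORDER BY region, order_date, order_id
""").fetchall()

for row in totals:
    print(row)
for row in windowed:
    print(row)
```

The view keeps the shaping logic in one place, so the aggregation and the window query both read from the same cleaned rows instead of repeating the null handling and type conversion.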

These skills give you the tools to build repeatable transformation pipelines that turn raw data into reliable, structured outputs. The Spark SQL and PySpark patterns you practiced work across any data store that Spark can reach. The clean Delta tables you produce serve as the foundation for reports, semantic models, and AI-powered experiences like Fabric IQ data agents that query your data using natural language.

Learn more