Episode

Learn Together Microsoft Fabric Ep202: Use Apache Spark in Microsoft Fabric

with Heini Ilmarinen, Nikola Ilic, Kay Sauter, Sunoj Kumar

Apache Spark is a core technology for large-scale data analytics. Microsoft Fabric provides support for Spark clusters, enabling you to analyze and process data in a lakehouse at scale.

Learning objectives

  • Configure Spark in a Microsoft Fabric workspace
  • Identify suitable scenarios for Spark notebooks and Spark jobs
  • Use Spark dataframes to analyze and transform data
  • Use Spark SQL to query data in tables and views
  • Visualize data in a Spark notebook

Chapters

  • 00:00 - Introduction
  • 05:12 - Fabric Career Hub
  • 10:10 - Learning objectives
  • 15:33 - Prepare to use Apache Spark
  • 18:04 - Overview of Spark integration
  • 33:47 - Demo - Create a lakehouse
  • 56:27 - Demo - Save a dataframe
  • 01:04:12 - Work with data using Spark SQL
  • 01:08:55 - Demo - Use Spark SQL
  • 01:22:14 - Summary

Intermediate
Data Analyst
Data Engineer
Microsoft Fabric