Episode

Microsoft Fabric Learn Together Ep02: Use Apache Spark in Microsoft Fabric

with Anupama Natarajan, Armando Lacerda, Andrei Khaidarov

Apache Spark is a core technology for large-scale data analytics. Microsoft Fabric provides support for Spark clusters, enabling you to analyze and process data in a Lakehouse at scale.

Learning objectives

  • Configure Spark in a Microsoft Fabric workspace
  • Identify suitable scenarios for Spark notebooks and Spark jobs
  • Use Spark dataframes to analyze and transform data
  • Use Spark SQL to query data in tables and views
  • Visualize data in a Spark notebook

Chapters

  • 00:00 - Introduction
  • 06:55 - Learning objectives
  • 12:44 - Prepare to use Apache Spark
  • 15:59 - Demo - Spark configuration
  • 25:21 - Run Spark code
  • 31:34 - Load data into a Spark dataframe
  • 37:09 - Transform data in a dataframe
  • 01:00:31 - Save a dataframe
  • 01:14:21 - Work with data using Spark SQL
  • 01:17:50 - Demo - Use Spark SQL
  • 01:27:17 - Summary and resources

Level: Intermediate
Roles: Data Analyst, Data Engineer
Product: Microsoft Fabric