Introduction

Linux Foundation Delta Lake is an open-source storage layer for Spark that enables relational database capabilities for batch and streaming data. By using Delta Lake, you can implement a data lakehouse architecture in Spark that provides SQL-based data manipulation semantics with support for transactions and schema enforcement. The result is an analytical data store that offers many of the advantages of a relational database system with the flexibility of data file storage in a data lake.

In this module, you'll learn how to:

  • Describe core features and capabilities of Delta Lake.
  • Create and use Delta Lake tables in a Synapse Analytics Spark pool.
  • Create Spark catalog tables for Delta Lake data.
  • Use Delta Lake tables for streaming data.
  • Query Delta Lake tables from a Synapse Analytics SQL pool.

Note

The version of Delta Lake available in an Azure Synapse Analytics Spark pool depends on the version of Spark specified in the pool configuration. The information in this module reflects Delta Lake version 1.0, which is installed with Spark 3.1.