Fabric Spark best practices overview

This series of articles outlines best practices for optimizing the performance, security, and cost of Spark jobs when running Spark Notebooks and Spark Job Definitions (SJDs) on Microsoft Fabric. The articles assume familiarity with basic data engineering concepts in Fabric. If you're new to Fabric, refer to the Fabric data engineering documentation first.

Articles in this series

Tip

If you're new to Spark, start with the Spark Basics article.

Acronyms

Here's a list of common acronyms used throughout the Fabric Spark best practices articles:

Acronym   Meaning
AKV       Azure Key Vault
AQE       Adaptive Query Execution
CDC       Change Data Capture
CU        Capacity Unit
DAG       Directed Acyclic Graph
HC        High Concurrency
JVM       Java Virtual Machine
MLV       Materialized Lake View
MPE       Managed Private Endpoint
NEE       Native Execution Engine
OOM       Out of Memory
ORC       Optimized Row Columnar
PL        Private Link
RDD       Resilient Distributed Dataset
SJDs      Spark Job Definitions
SPN       Service Principal Name
SRE       Site Reliability Engineer
UDF       User-Defined Function
UI        User Interface
VM        Virtual Machine
VNet      Virtual Network
WS OAP    Workspace Outbound Access Protection