This series of articles outlines best practices for optimizing the performance, security, and cost of Spark jobs when running Spark Notebooks and Spark Job Definitions (SJDs) on Microsoft Fabric. The articles assume that you're familiar with basic data engineering concepts in Fabric. If you're new to Fabric, refer to the Fabric data engineering documentation first.
Articles in this series
- Fabric Spark Capacity and Cluster Planning: Guidelines for Sizing
- Fabric Spark Security
- Development and Monitoring
- Spark Basics
Tip
If you're new to Spark, start with the Spark Basics article.
Acronyms
Here's a list of common acronyms used throughout the Fabric Spark best practices articles:
| Acronym | Meaning |
|---|---|
| AKV | Azure Key Vault |
| AQE | Adaptive Query Execution |
| CDC | Change Data Capture |
| CU | Capacity Unit |
| DAG | Directed Acyclic Graph |
| HC | High Concurrency |
| JVM | Java Virtual Machine |
| MLV | Materialized Lake View |
| MPE | Managed Private Endpoint |
| NEE | Native Execution Engine |
| OOM | Out of Memory |
| ORC | Optimized Row Columnar |
| PL | Private Link |
| RDD | Resilient Distributed Dataset |
| SJDs | Spark Job Definitions |
| SPN | Service Principal Name |
| SRE | Site Reliability Engineer |
| UDF | User Defined Function |
| UI | User Interface |
| VM | Virtual Machine |
| VNet | Virtual Network |
| WS OAP | Workspace Outbound Access Protection |