Note
Access to this page requires authorization. You can try signing in or changing directories.
Access to this page requires authorization. You can try changing directories.
High-end BI performance depends on how effectively data is prepared and delivered from the Lakehouse. By adopting architectural patterns, applying a semantic structure, and using targeted optimizations, you can reduce query complexity, improve dashboard responsiveness, and lower compute costs.
The following table summarizes recommended practices, their expected impact, related documentation, and associated action items. This content is intended for data engineers, BI developers, and dashboard authors who design, optimize, and maintain analytics workloads in the Lakehouse.
Data prep
| Best practice | Impact | Docs | Action items |
|---|---|---|---|
| Adopt a medallion architecture | Speeds up turning raw data into ready-to-use, reliable data products for easy consumption. | Review and implement medallion layers | |
| Use liquid clustering | Improves query performance with file and data skipping. | Apply to large tables with filter patterns | |
| Use managed tables | Azure Databricks auto-governs and optimizes the storage layer and query performance. | Create managed tables for your data | |
| Use predictive optimization or optimize tables manually | Enables better query performance by optimizing file sizes and layout, deleting old files, and updating statistics. | Enable for production tables or schedule regular optimization and analyze tables after data changes | |
| Model data in a star schema pattern | Makes the data easy to query and consume. | Design fact and dimension tables | |
| Avoid wide data types and high-cardinality columns | Optimizes data model size and memory consumption, and improves query efficiency. | Review data types and cardinality | |
| Declare Primary and Foreign Keys (with RELY) | Optimizes queries by eliminating unnecessary joins and aggregations. | Define keys on fact and dimension tables | |
| Use auto-generated columns | Reduces the need to calculate values at query time. | Identify frequently calculated fields | |
| Use materialized views and persisted tables | Improves performance by pre-aggregating data for the most common and resource-intensive queries. | Create aggregated views for common queries |