Note
This feature is currently in preview.
Materialized lake views are precomputed, stored results of SQL queries that can be refreshed on demand or on a schedule. Think of them as "smart tables" that contain the results of complex transformations, aggregations, or joins, with intelligent refresh strategies to keep data current.
Why use materialized lake views?
Materialized lake views solve common data engineering challenges:
- Performance: Instead of running expensive queries repeatedly, results are precomputed and stored
- Consistency: Everyone accesses the same transformed data, reducing discrepancies
- Efficiency: Only refresh when source data actually changes, saving compute resources
- Simplicity: Define transformations once using familiar SQL syntax
When should you use materialized lake views?
Consider materialized lake views when you have:
- Frequently accessed aggregations (daily sales totals, monthly metrics)
- Complex joins across multiple large tables that are queried often
- Data quality transformations that need to be applied consistently
- Reporting datasets that combine data from multiple sources
- Medallion architecture where you need bronze → silver → gold transformations
Don't use them for:
- One-time or rarely accessed queries
- Simple transformations that run quickly
- High-frequency streaming data (consider Real-Time Intelligence for sub-second updates)
How do materialized lake views work?
Materialized lake views use a declarative approach: you define what you want, not how to build it:
- Create: Write SQL defining your transformation
- Refresh: Fabric determines the optimal refresh strategy (incremental, full, or skip)
- Query: Applications query the materialized view like any table
- Monitor: Track data quality, lineage, and refresh status
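The first three steps above can be sketched in Spark SQL. This is an illustrative example; the table and view names (orders, order_totals) are assumptions:

```sql
-- 1. Create: define the transformation declaratively
CREATE MATERIALIZED LAKE VIEW order_totals AS
SELECT
    customer_id,
    SUM(amount) AS lifetime_value
FROM orders
GROUP BY customer_id;

-- 2. Refresh: trigger on demand; Fabric picks incremental, full, or skip
REFRESH MATERIALIZED LAKE VIEW order_totals;

-- 3. Query: read the view like any other table
SELECT customer_id, lifetime_value
FROM order_totals
WHERE lifetime_value > 1000;
```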
Key capabilities
Automatic refresh optimization
Fabric automatically determines when and how to refresh your views:
- Incremental refresh: Only processes new or changed data
- Full refresh: Rebuilds the entire view when needed
- Skip refresh: No refresh needed when source data hasn't changed
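You can also trigger a refresh manually and let Fabric choose the strategy, or force a complete rebuild with the FULL keyword. A minimal sketch, assuming a view named daily_sales:

```sql
-- Let Fabric choose the strategy (incremental, full, or skip)
REFRESH MATERIALIZED LAKE VIEW daily_sales;

-- Force a full rebuild of the view
REFRESH MATERIALIZED LAKE VIEW daily_sales FULL;
```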
Built-in data quality
Define rules directly in your SQL and specify how to handle violations:
CONSTRAINT valid_sales CHECK (sales_amount > 0) ON MISMATCH DROP
Dependency management
- Visualize how your views depend on each other
- Automatic refresh ordering based on dependencies
- Processing follows dependency chain to ensure data consistency
Monitoring and insights
- Track refresh performance and execution status
- View data quality metrics and violation counts in lineage
- Monitor job instances and refresh history
Common use cases
Sales reporting dashboard
-- Daily sales summary that refreshes automatically
CREATE MATERIALIZED LAKE VIEW daily_sales AS
SELECT
    DATE(order_date) AS sale_date,
    region,
    SUM(amount) AS total_sales,
    COUNT(*) AS order_count
FROM orders
GROUP BY DATE(order_date), region;
Data quality validation
-- Clean customer data with quality rules
CREATE MATERIALIZED LAKE VIEW clean_customers (
    CONSTRAINT valid_email CHECK (email IS NOT NULL) ON MISMATCH DROP
) AS
SELECT
    customer_id,
    TRIM(customer_name) AS customer_name,
    LOWER(email) AS email
FROM raw_customers
WHERE customer_name IS NOT NULL;
Medallion architecture
-- Bronze → Silver transformation
CREATE MATERIALIZED LAKE VIEW silver_products AS
SELECT
    product_id,
    product_name,
    category,
    CAST(price AS DECIMAL(10,2)) AS price
FROM bronze_products
WHERE price > 0;
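A gold-layer view can then be defined on top of the silver view. Because Fabric tracks dependencies, a refresh of this view is ordered after silver_products. The view name below is an illustrative assumption:

```sql
-- Silver → Gold aggregation; refreshed after silver_products
CREATE MATERIALIZED LAKE VIEW gold_category_prices AS
SELECT
    category,
    COUNT(*) AS product_count,
    AVG(price) AS avg_price
FROM silver_products
GROUP BY category;
```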
Note
This feature is currently not available in the South Central US region.
Current limitations
The following features are currently not available for materialized lake views in Microsoft Fabric:
- Declarative syntax support for PySpark; use Spark SQL syntax to create and refresh materialized lake views instead.
- Cross-lakehouse lineage and execution features.