Note
Access to this page requires authorization. You can try signing in or changing directories.
Access to this page requires authorization. You can try changing directories.
In this tutorial, you build a lakehouse, ingest sample data into the Delta table, apply transformation where required, and then create reports. Here's a checklist of the steps you complete:
If you don't have Microsoft Fabric, sign up for a free trial capacity.
Prerequisites
- Before you create a lakehouse, you must create a Fabric workspace.
- Before you ingest a CSV file, you must have OneDrive configured. If you don't have OneDrive configured, sign up for the Microsoft 365 free trial: Free Trial - Try Microsoft 365 for a month.
Why do I need OneDrive for this tutorial?
You need OneDrive for this tutorial because the data ingestion process relies on OneDrive as the underlying storage mechanism for file uploads. When you upload a CSV file to Fabric, it's temporarily stored in your OneDrive account before being ingested into the lakehouse. This integration ensures secure and seamless file transfer within the Microsoft 365 ecosystem.
The ingestion step doesn't work if you don't have OneDrive configured, because Fabric can't access the uploaded file. If you already have the data available in your lakehouse or another supported location, OneDrive isn't required.
Note
If you already have data in your lakehouse, you can use that data instead of the sample CSV file. To check if data is already associated with your lakehouse, use the Lakehouse Explorer or the SQL analytics endpoint to browse tables, files, and folders. For more information about how to check, see Lakehouse overview and Query lakehouse tables with SQL analytics endpoint.
Create a lakehouse
In this section, you create a lakehouse in Fabric.
In Fabric, select Workspaces from the navigation bar.
To open your workspace, enter its name in the search box located at the top and select it from the search results.
From the workspace, select New item, enter Lakehouse in the search box, then select Lakehouse.
In the New lakehouse dialog box, enter wwilakehouse in the Name field.
Select Create to create and open the new lakehouse.
Ingest sample data
In this section, you ingest sample customer data into the lakehouse.
Note
If you don't have OneDrive configured, sign up for the Microsoft 365 free trial: Free Trial - Try Microsoft 365 for a month.
Download the dimension_customer.csv file from the Fabric samples repo.
In the Home tab, under Get data in your lakehouse, you see options to load data into the lakehouse. Select New Dataflow Gen2.
In the Create a dataflow pane, enter Customer Dimension Data in the Name field and select Next.
On the new dataflow screen, select Import from a Text/CSV file.
On the Connect to data source screen, select the Upload file radio button. Drag and drop the dimension_customer.csv file that you downloaded in step 1. After the file is uploaded, select Next.
From the Preview file data page, preview the data and select Create to proceed and return back to the dataflow canvas.
Transform and load data into the lakehouse
In this section, you transform the data based on your business requirements and load it into the lakehouse.
In the Query settings pane, update the Name field to dimension_customer.
Note
Fabric adds a space and number at the end of the table name by default. Table names must be lowercase and must not contain spaces. Rename it appropriately and remove any spaces from the table name.
In this tutorial, you associated the customer data with a lakehouse. If you create a dataflow from the lakehouse, the uploaded data is automatically linked to the default lakehouse. If you're creating the dataflow separately, you can optionally associate it with a lakehouse by following these steps:
From the menu items, select Add data destination and select Lakehouse. From the Connect to data destination screen, sign in to your account if necessary and select Next.
Navigate to the wwilakehouse in your workspace.
If the dimension_customer table doesn't exist, select the New table setting and enter the table name dimension_customer. If the table already exists, select the Existing table setting and choose dimension_customer from the list of tables in the object explorer. Select Next.
On the Choose destination settings pane, select Replace as Update method. Select Save settings to return to the dataflow canvas.
From the dataflow canvas, you can easily transform the data based on your business requirements. For simplicity, we aren't making any changes in this tutorial. To proceed, select Save and Run in the tool bar.
Return to your workspace and hover over the Customer Dimension Data dataflow, select the ... menu, and then select Refresh now. This option runs the data flow and moves data from the source file to lakehouse table. While it's in progress, you see a spinning circle next to the dataflow's name.
Once the dataflow is refreshed, select your lakehouse in the top menu bar to view the dimension_customer Delta table.
Select the table to preview its data. You can also use the SQL analytics endpoint of the lakehouse to query the data with SQL statements. Select SQL analytics endpoint from the Lakehouse dropdown menu at the top right of the screen.
Select the dimension_customer table to preview its data or select New SQL query to write your SQL statements.
The following sample query aggregates the row count based on the BuyingGroup column of the dimension_customer table. SQL query files are saved automatically for future reference, and you can rename or delete these files based on your need.
To run the script, select the Run icon at the top of the script file.
SELECT BuyingGroup, Count(*) AS Total FROM dimension_customer GROUP BY BuyingGroup
Add tables to the semantic model
In this section, you add the tables to the semantic model so that you can use them to create reports.
Open your lakehouse and switch to the SQL analytics endpoint view, select New semantic model, name the semantic model, assign a workspace, and select the tables that you want to add to the semantic model. In this case, select the dimension_customer table.
Build a report
In this section, you build a report from the ingested data.
Select the semantic model in your workspace, select the dropdown Explore this data, and then select Auto-create a report. In the next tutorial, we create a report from scratch.
The table is a dimension and there are no measures in it. Power BI creates a measure for the row count, aggregates it across different columns, and creates different charts as shown in the following image.
You can save this report for the future by selecting Save from the top ribbon. You can make more changes to this report to meet your requirements by including or excluding other tables or columns.