Load sample data into a Data Warehouse

In this tutorial, you build a data pipeline to move a sample dataset into a Data Warehouse. It gives you a quick demonstration of how to use the pipeline Copy activity and how to load data into a Data Warehouse.

Prerequisites

To get started, you must complete the following prerequisites:

  • A Microsoft Fabric tenant account with an active subscription. Create an account for free.
  • Make sure you have a Microsoft Fabric enabled workspace: Create a workspace.
  • Make sure you have already created a Data Warehouse. To create one, refer to Create a Data Warehouse.

Create a data pipeline

  1. Navigate to Power BI.

  2. Select the Power BI icon in the bottom left of the screen, then select Data factory to open the Data Factory homepage.

  3. Navigate to your Microsoft Fabric workspace. If you created a new workspace in the Prerequisites section, use that one.

    Screenshot of the workspaces window where you navigate to your workspace.

  4. Select Data pipeline and then enter a pipeline name to create a new pipeline. (If you'd rather script this step, see the sketch that follows.)

    Screenshot showing the new data pipeline button in the newly created workspace.

    Screenshot showing the dialog for naming the new pipeline.
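
If you prefer to script this step instead of using the UI, Microsoft Fabric also exposes a REST API for creating workspace items. The following is a minimal sketch in Python, not an official recipe: the workspace ID and Microsoft Entra access token are placeholders you supply, and the items endpoint and "DataPipeline" item type reflect the public API at the time of writing.

    # Minimal sketch: create a data pipeline through the Fabric REST API.
    # Placeholders: WORKSPACE_ID and ACCESS_TOKEN are yours to supply; the
    # token needs a Fabric API scope. Creation may complete asynchronously
    # (HTTP 202), so a production script should handle that case too.
    import requests

    WORKSPACE_ID = "<your-workspace-id>"
    ACCESS_TOKEN = "<your-entra-access-token>"

    response = requests.post(
        f"https://api.fabric.microsoft.com/v1/workspaces/{WORKSPACE_ID}/items",
        json={"displayName": "Load_Sample_Data_Tutorial", "type": "DataPipeline"},
        headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
    )
    response.raise_for_status()
    print(response.status_code)  # 201 Created (or 202 if provisioning is async)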

Copy data using a pipeline

In this section, you build your pipeline by following the steps below, copying a sample dataset provided by the pipeline into the Data Warehouse.

Step 1: Start with the Copy assistant

  1. Select Copy data on the canvas to open the Copy assistant tool and get started.

    Screenshot showing the Copy data button on a new pipeline.

Step 2: Configure your source

  1. Choose the COVID-19 Data Lake from the Sample data options for your data source, and then select Next.

    Screenshot showing the COVID-19 Data Lake sample data selection in the Copy data assistant.

  2. In the Connect to data source section of the Copy data assistant, a preview of the Bing COVID-19 sample data is displayed. Select Next to move on to the data destination. (To inspect the same public dataset locally, see the sketch that follows these steps.)

    Screenshot showing a preview of the Bing COVID-19 sample data.
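
The Bing COVID-19 sample shown in the preview comes from the publicly readable COVID-19 Data Lake (Azure Open Datasets). If you'd like to inspect the same data locally, a minimal pandas sketch follows; it assumes the public URL below is still published, and that pandas plus a Parquet engine such as pyarrow are installed.

    # Minimal sketch: preview the public Bing COVID-19 dataset locally.
    # Assumption: this Azure Open Datasets URL is still current; it may move.
    import pandas as pd

    URL = (
        "https://pandemicdatalake.blob.core.windows.net/public/curated/"
        "covid-19/bing_covid-19_data/latest/bing_covid-19_data.parquet"
    )

    df = pd.read_parquet(URL)  # requires pyarrow (or fastparquet)
    print(df.shape)            # row and column counts
    print(df.head())           # the same kind of preview the assistant shows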

Step 3: Configure your destination

  1. Select the Workspace tab and choose Data warehouse. Then select Next.

    Screenshot showing the selection of the Data Warehouse destination.

  2. Select your Data Warehouse from the drop-down list, then select Next.

    Screenshot showing selecting Data warehouse.

  3. Configure and map your source data to the destination Data Warehouse table by entering a destination table name, then select Next one more time. (A sketch for verifying this table after the run follows these steps.)

    Screenshot showing the table name to create in the Data Warehouse destination.

  4. Configure other settings on the Settings page. For this tutorial, select Next directly, since you don't need to use staging or the COPY command.

    Screenshot showing the destination settings.
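
Once the pipeline has run (see the sections that follow), you can confirm that rows landed in the destination table by querying the warehouse over its SQL connection. Below is a minimal sketch using pyodbc; the server, database, and table names are placeholders for your own values, and it assumes ODBC Driver 18 for SQL Server is installed and that you sign in with Microsoft Entra.

    # Minimal sketch: count rows in the destination table after the run.
    # Placeholders: server (the warehouse's SQL connection string from its
    # settings page), database, and table name are all yours to fill in.
    import pyodbc

    conn = pyodbc.connect(
        "Driver={ODBC Driver 18 for SQL Server};"
        "Server=<your-warehouse-sql-connection-string>;"
        "Database=<your-warehouse-name>;"
        "Authentication=ActiveDirectoryInteractive;"  # Microsoft Entra sign-in
        "Encrypt=yes;"
    )
    cursor = conn.cursor()
    cursor.execute("SELECT COUNT(*) FROM dbo.<your_destination_table>")
    print("Rows loaded:", cursor.fetchone()[0])
    conn.close()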

Step 4: Review and create your copy activity

  1. Review your copy activity settings from the previous steps and select OK to finish. Or, if needed, you can revisit the previous steps in the tool to edit your settings.

    Screenshot of the Review + create page of the Copy data assistant highlighting source and destination.

  2. The Copy activity is added to your data pipeline canvas. All settings, including advanced settings for the activity, are available in the tabs below the pipeline canvas when the Copy activity is selected. (A rough sketch of the underlying activity definition follows these steps.)

    Screenshot showing the completed Copy activity in pipeline canvas.
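
For orientation, the assistant is generating a pipeline definition with a single Copy activity behind the scenes. The Python dict below sketches its rough shape only; the actual JSON Fabric emits uses its own type names and additional properties, so treat every key here, especially the source and sink types, as an illustrative assumption rather than the real schema.

    # Illustrative sketch only: the approximate shape of a Copy activity.
    # The source/sink type names are placeholders, not Fabric's real schema.
    copy_activity = {
        "name": "Copy_BingCovid19_to_Warehouse",
        "type": "Copy",
        "typeProperties": {
            "source": {
                # the Bing COVID-19 sample chosen in the assistant
                "type": "<sample-dataset-source-type>",
            },
            "sink": {
                # the warehouse table named in the assistant
                "type": "<data-warehouse-sink-type>",
                "tableOption": "autoCreate",  # assistant creates the table
            },
            # staging and the COPY command were skipped in this tutorial
            "enableStaging": False,
        },
    }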

Run and schedule your data pipeline

  1. Switch to the Home tab and select Run. In the confirmation dialog that's displayed, select Save and run to start the activity.

    Screenshot showing the Run button on the Home tab, and the Save and run prompt displayed.

  2. You can monitor the running process and check the results on the Output tab below the pipeline canvas. Select the run details button (the glasses icon) to view the run details.

    Screenshot showing the Output tab of the pipeline run in-progress with the Details button highlighted in the run status.

  3. The run details show how much data was read and written, along with various other information about the run.

    Screenshot showing the run details window.

  4. You can also schedule the pipeline to run with a specific frequency as required. Below is an example that schedules the pipeline to run every 15 minutes. You can also specify the Start time and End time for your schedule. If you don't specify a start time, the schedule takes effect as soon as you apply it. If you don't specify an end time, your pipeline run keeps recurring every 15 minutes. (A scripted alternative for running and scheduling the pipeline follows these steps.)

    Screenshot showing the schedule dialog for the pipeline with a 15-minute recurring schedule.
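
Running and scheduling are also exposed through the Fabric REST API's job scheduler, if you'd rather automate these steps. The sketch below triggers an on-demand run, polls its status, and creates a 15-minute recurring schedule. The endpoints reflect the public API at the time of writing, the schedule payload is an assumption to verify against the current API reference, and the IDs, token, and dates are placeholders.

    # Minimal sketch: run a pipeline on demand, poll it, then schedule it.
    # Endpoints and the schedule payload reflect the public Fabric API at
    # the time of writing; verify them against the current reference.
    import time
    import requests

    BASE = "https://api.fabric.microsoft.com/v1"
    WORKSPACE_ID = "<your-workspace-id>"
    PIPELINE_ID = "<your-pipeline-item-id>"
    HEADERS = {"Authorization": "Bearer <your-entra-access-token>"}

    # 1. Trigger an on-demand run (202 Accepted; Location points at the job).
    run = requests.post(
        f"{BASE}/workspaces/{WORKSPACE_ID}/items/{PIPELINE_ID}/jobs/instances",
        params={"jobType": "Pipeline"},
        headers=HEADERS,
    )
    run.raise_for_status()
    job_url = run.headers["Location"]

    # 2. Poll until the run reaches a terminal state.
    while True:
        status = requests.get(job_url, headers=HEADERS).json().get("status")
        print("run status:", status)
        if status in ("Completed", "Failed", "Cancelled"):
            break
        time.sleep(15)

    # 3. Create a recurring schedule: every 15 minutes between two dates.
    schedule = requests.post(
        f"{BASE}/workspaces/{WORKSPACE_ID}/items/{PIPELINE_ID}/jobs/Pipeline/schedules",
        json={
            "enabled": True,
            "configuration": {
                "type": "Cron",
                "interval": 15,                          # minutes between runs
                "startDateTime": "2024-01-01T00:00:00",  # placeholder start
                "endDateTime": "2024-12-31T00:00:00",    # placeholder end
                "localTimeZoneId": "UTC",
            },
        },
        headers=HEADERS,
    )
    schedule.raise_for_status()
    print("schedule created:", schedule.status_code)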

This tutorial showed you how to load sample data into a Data Warehouse using Data Factory in Microsoft Fabric. You learned how to:

  • Create a data pipeline.
  • Copy data using your pipeline.
  • Run and schedule your data pipeline.

Next, advance to learn more about monitoring your pipeline runs.