Use Azure Data Lake Storage Gen1 to capture data from Event Hubs

Learn how to use Azure Data Lake Storage Gen1 to capture data received by Azure Event Hubs.

Prerequisites

Assign permissions to Event Hubs

In this section, you create a folder within the account where you want to capture the data from Event Hubs. You also assign permissions to Event Hubs so that it can write data into a Data Lake Storage Gen1 account.

  1. Open the Data Lake Storage Gen1 account where you want to capture data from Event Hubs, and then click Data Explorer.

    Data Lake Storage Gen1 data explorer

  2. Click New Folder and then enter a name for the folder where you want to capture the data.

    Create a new folder in Data Lake Storage Gen1

  3. Assign permissions at the root of Data Lake Storage Gen1.

    a. Click Data Explorer, select the root of the Data Lake Storage Gen1 account, and then click Access.

    Screenshot of the Data explorer with the root of the account and the Access option called out.

    b. Under Access, click Add, click Select User or Group, and then search for Microsoft.EventHubs.

    Screenshot of the Access page with the Add option, Select User or Group option, and Microsoft Eventhubs option called out.

    Click Select.

    c. Under Assign Permissions, click Select Permissions. Set Permissions to Execute. For Add to, select This folder and all children. For Add as, select An access permission entry and a default permission entry.

    Important

    Applying the permission to this folder and all children is a convenient way to ensure access to the destination folder when you create a new folder hierarchy for capturing data received by Azure Event Hubs. However, adding permissions to all children of a top-level folder that has many child files and folders can take a long time. If your root folder contains a large number of files and folders, it may be faster to add Execute permissions for Microsoft.EventHubs individually to each folder in the path to your final destination folder.

    Screenshot of the Assign Permissions section with the Select Permissions option called out. The Select Permissions section is next to it with the Execute option, Add to option, and Add as option called out.

    Click OK.

  4. Assign permissions for the folder under the Data Lake Storage Gen1 account where you want to capture data.

    a. Click Data Explorer, select the folder in the Data Lake Storage Gen1 account, and then click Access.

    Screenshot of the Data explorer with a folder in the account and the Access option called out.

    b. Under Access, click Add, click Select User or Group, and then search for Microsoft.EventHubs.

    Screenshot of the Data explorer Access page with the Add option, Select User or Group option, and Microsoft Eventhubs option called out.

    Click Select.

    c. Under Assign Permissions, click Select Permissions. Set Permissions to Read, Write, and Execute. For Add to, select This folder and all children. Finally, for Add as, select An access permission entry and a default permission entry.

    Screenshot of the Assign Permissions section with the Select Permissions option called out. The Select Permissions section is next to it with the Read, Write, and Execute options, the Add to option, and the Add as option called out.

    Click OK.
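The per-folder alternative described in the Important note can be sketched as a small planning helper. This is a minimal sketch, not the portal's own logic: it lists the root and every intermediate folder that needs Execute, then the destination folder that needs Read, Write, and Execute, each with an access entry and a matching default entry. The function names and the principal_id placeholder (the object ID of the Microsoft.EventHubs principal in your tenant) are hypothetical, and the `user:<id>:<perms>` and `default:user:<id>:<perms>` strings assume the POSIX-style ACL spec that Data Lake Storage Gen1 tooling accepts.

```python
def ancestor_folders(destination: str) -> list[str]:
    """Return the root and every folder above `destination`, top down."""
    parts = [p for p in destination.split("/") if p]
    folders = ["/"]
    for depth in range(1, len(parts)):
        folders.append("/" + "/".join(parts[:depth]))
    return folders


def acl_spec(principal_id: str, permissions: str) -> str:
    """Build an access entry plus a matching default entry for one principal."""
    return (
        f"user:{principal_id}:{permissions},"
        f"default:user:{principal_id}:{permissions}"
    )


def plan_permissions(destination: str, principal_id: str) -> list[tuple[str, str]]:
    """Pair each folder on the path with the ACL spec it needs.

    Folders above the destination get Execute only (--x); the destination
    folder itself gets Read, Write, and Execute (rwx).
    """
    parts = [p for p in destination.split("/") if p]
    plan = [(folder, acl_spec(principal_id, "--x"))
            for folder in ancestor_folders(destination)]
    plan.append(("/" + "/".join(parts), acl_spec(principal_id, "rwx")))
    return plan


if __name__ == "__main__":
    # "<object-id>" is a placeholder, not a real identifier.
    for folder, spec in plan_permissions("capture/eventhub", "<object-id>"):
        print(folder, spec)
```

Each (folder, spec) pair can then be applied with whichever tool you already use to manage Data Lake Storage Gen1 ACLs. The sketch only computes the plan and does not call any Azure API.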

Configure Event Hubs to capture data to Data Lake Storage Gen1

In this section, you create an Event Hub within an Event Hubs namespace. You also configure the Event Hub to capture data to an Azure Data Lake Storage Gen1 account. This section assumes that you have already created an Event Hubs namespace.

  1. From the Overview pane of the Event Hubs namespace, click + Event Hub.

    Screenshot of the Overview pane with the Event Hub option called out.

  2. Provide the following values to configure Event Hubs to capture data to Data Lake Storage Gen1.

    Screenshot of the Create Event Hub dialog box with the Name text box, the Capture option, the Capture Provider option, the Select Data Lake Store option, and the Data Lake Path option called out.

    a. Provide a name for the Event Hub.

    b. For this tutorial, set Partition Count and Message Retention to the default values.

    c. Set Capture to On. Set the Time Window (how frequently to capture) and Size Window (data size to capture).

    d. For Capture Provider, select Azure Data Lake Store and then select the Data Lake Storage Gen1 account you created earlier. For Data Lake Path, enter the name of the folder you created in the Data Lake Storage Gen1 account. You only need to provide the relative path to the folder.

    e. Leave Sample capture file name formats at the default value. This option governs the folder structure that is created under the capture folder.

    f. Click Create.
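To make the effect of the capture file name format concrete, the following sketch expands a format string into the folder path that a single capture window would land in. The default format shown here is an assumption recalled from the Event Hubs Capture documentation, so compare it with the value displayed in the portal. The function name and the sample namespace, event hub, and folder names are hypothetical. The sketch stops at the name the format produces and does not assume a file extension.

```python
from datetime import datetime, timezone

# Assumed default format; verify against the value shown in the portal.
DEFAULT_NAME_FORMAT = (
    "{Namespace}/{EventHub}/{PartitionId}/"
    "{Year}/{Month}/{Day}/{Hour}/{Minute}/{Second}"
)


def capture_path(folder: str, namespace: str, event_hub: str,
                 partition_id: int, when: datetime,
                 name_format: str = DEFAULT_NAME_FORMAT) -> str:
    """Expand the name format beneath the capture folder (the Data Lake Path)."""
    relative = name_format.format(
        Namespace=namespace,
        EventHub=event_hub,
        PartitionId=partition_id,
        Year=f"{when.year:04d}",
        Month=f"{when.month:02d}",
        Day=f"{when.day:02d}",
        Hour=f"{when.hour:02d}",
        Minute=f"{when.minute:02d}",
        Second=f"{when.second:02d}",
    )
    return "/" + folder.strip("/") + "/" + relative


if __name__ == "__main__":
    window_start = datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc)
    print(capture_path("capture", "mynamespace", "myeventhub", 0, window_start))
```

Because the partition ID and timestamp are part of the path, every partition writes into its own subtree, and each capture window produces a separate file.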

Test the setup

You can now test the solution by sending data to the event hub. Follow the instructions at Send events to Azure Event Hubs. Once you start sending data, it appears in Data Lake Storage Gen1 under the folder structure you specified. For example, you see a folder structure in your Data Lake Storage Gen1 account like the one shown in the following screenshot.

Sample EventHub data in Data Lake Storage Gen1
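If you prefer to script the test, one possible way to send a handful of events is sketched below. It assumes the azure-eventhub Python package (version 5) is installed and that you have a connection string for the namespace. The function names, the payload shape, and the placeholder values are hypothetical and are not taken from the linked instructions.

```python
import json


def build_payloads(count: int) -> list[str]:
    """Create small JSON test messages; the shape is arbitrary."""
    return [json.dumps({"id": i, "message": f"test event {i}"})
            for i in range(count)]


def send_test_events(connection_string: str, event_hub_name: str,
                     count: int = 10) -> None:
    """Send `count` test events in one batch (assumes they fit in one batch)."""
    # Imported here so build_payloads works without the SDK installed.
    from azure.eventhub import EventData, EventHubProducerClient

    producer = EventHubProducerClient.from_connection_string(
        connection_string, eventhub_name=event_hub_name
    )
    with producer:
        batch = producer.create_batch()
        for payload in build_payloads(count):
            batch.add(EventData(payload))  # raises ValueError if the batch is full
        producer.send_batch(batch)


# Example call with placeholder values:
# send_test_events("<namespace-connection-string>", "<event-hub-name>")
```

After the next capture window closes, the events should appear as a new file under the capture folder.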

Note

Even if no messages are coming into Event Hubs, Event Hubs writes empty files that contain only the headers into the Data Lake Storage Gen1 account. The files are written at the time interval (Time Window) that you specified when you created the event hub.

Analyze data in Data Lake Storage Gen1

Once the data is in Data Lake Storage Gen1, you can run analytical jobs to process it. See U-SQL Avro Example for how to do this using Azure Data Lake Analytics.
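For a quick local check before you set up an analytics job, you can download a captured file and inspect it directly. The sketch below is an alternative to the U-SQL approach, not a replacement for it. It assumes the third-party fastavro package, that each captured record stores the event payload in a Body field as bytes, and that the payloads are UTF-8 JSON. The function names are hypothetical.

```python
import json


def decode_bodies(records) -> list:
    """Decode the Body of each captured record as UTF-8 JSON.

    Records without a body are skipped.
    """
    decoded = []
    for record in records:
        body = record.get("Body")
        if not body:
            continue
        decoded.append(json.loads(body.decode("utf-8")))
    return decoded


def read_capture_file(path: str) -> list:
    """Read a downloaded capture file and return its decoded payloads."""
    # Imported here so decode_bodies works without fastavro installed.
    from fastavro import reader

    with open(path, "rb") as handle:
        return decode_bodies(reader(handle))


# Example call with a placeholder path:
# events = read_capture_file("<downloaded-capture-file>")
```

A file written during a window with no incoming messages contains only the Avro header, so read_capture_file returns an empty list for it.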
