Ingest data from a container or Azure Data Lake Storage into Azure Data Explorer
The ingestion wizard enables you to quickly ingest data in JSON, CSV, and other formats into a table and easily create mapping structures. The data can be ingested from storage, from a local file, or from a container, as either a one-time or a continuous ingestion process.
This document describes using the intuitive ingestion wizard to ingest CSV data from a container into a new table. Ingestion can be done as a one-time operation, or as a continuous method by setting up an Event Grid ingestion pipeline that responds to new files in the source container and ingests qualifying data into your table. This process can be used with slight adaptations to cover a variety of different use cases.
For an overview of the ingestion wizard, see What is the ingestion wizard? For information about ingesting data into an existing table in Azure Data Explorer, see Ingest data to an existing table.
Prerequisites
- An Azure subscription. Create a free Azure account.
- An Azure Data Explorer cluster and database. Create a cluster and database.
- A storage account. An Event Grid notification subscription can be set on Azure Storage accounts for BlobStorage, StorageV2, or Data Lake Storage Gen2.
Note
To enable access between a cluster and a storage account without public access (restricted to private endpoint/service endpoint), see Create a Managed Private Endpoint.
Ingest data
In the left menu of the Azure Data Explorer web UI, select Data.
From the Quick actions section, select Ingest data. Alternatively, from the All section, select Ingest data and then Ingest.
In the Ingest data window, the Destination tab is selected. The Cluster and Database fields are automatically populated.
To add a new connection to a cluster, select Add cluster connection below the auto-populated cluster name.
In the popup window, enter the Connection URI for the cluster you're connecting.
Enter a Display Name that you want to use to identify this cluster, and select Add.
In Table, select New table and enter a name for the new table. You can use alphanumeric characters, hyphens, and underscores. Special characters aren't supported.
Note
Table names must be between 1 and 1024 characters.
Select Next: Source
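The table naming rules from the Table step can be checked before you open the wizard. The following is a minimal sketch, assuming that "alphanumeric" means ASCII letters and digits; `is_valid_table_name` is a hypothetical helper for illustration and isn't part of any Azure SDK.

```python
import re

# Assumption: "alphanumeric" means ASCII letters and digits only.
# Allowed: letters, digits, hyphens, underscores; length 1 to 1024.
_TABLE_NAME_RE = re.compile(r"[A-Za-z0-9_-]{1,1024}")


def is_valid_table_name(name: str) -> bool:
    """Return True if the name satisfies the rules quoted in the Table step."""
    return _TABLE_NAME_RE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["Storm_Events-2024", "bad name", ""]:
        print(repr(candidate), is_valid_table_name(candidate))
```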
Select an ingestion type
Under Source type, do the following steps:
Select From blob container (blob container, ADLS Gen2 container). You can ingest up to 5000 blobs from a single container.
For Select source, select Add URL.
Note
Alternatively, you can select Select container and choose information from the dropdown menus to connect to the container.
In the Link to source field, add the blob URI with SAS token or Account key of the container, and optionally enter the sample size. A list is populated with files from the container.
Note
The SAS URL can be created manually or automatically.
Tip
For ingestion from file, see Use the ingestion wizard to ingest JSON data from a local file to an existing table in Azure Data Explorer
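Before pasting a value into the Link to source field, it can help to confirm that the URL points at the container you expect and carries a SAS signature. The following is a rough sketch using only the Python standard library. The account name, container name, and token values are placeholders, and `describe_container_url` is a hypothetical helper. It only inspects the URL text and doesn't contact Azure or validate the token.

```python
from urllib.parse import parse_qs, urlsplit


def describe_container_url(url: str) -> dict:
    """Split a container URL into host, container name, and a SAS check."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    # The first path segment of a blob endpoint URL is the container name.
    container = parts.path.strip("/").split("/")[0]
    return {
        "host": parts.netloc,
        "container": container,
        # A SAS token carries its signature in the "sig" query parameter.
        "has_sas_signature": "sig" in query,
    }


if __name__ == "__main__":
    # Placeholder values, not a real account or token.
    example = (
        "https://mystorageaccount.blob.core.windows.net/mycontainer"
        "?sv=2022-11-02&sp=rl&sig=PLACEHOLDER"
    )
    print(describe_container_url(example))
```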
Filter data
Optionally, you can filter data to be ingested with File filters. You can filter by file extension, file location, or both.
Filter by file extension
You can filter data to ingest only files with a specific file extension.
For example, filter for all files with a CSV extension.
The system selects one of the files at random, and the schema is generated based on that Schema defining file. You can select a different file.
Filter by folder path
You can also filter files with the full or partial Folder path.
You can enter a partial folder path, or folder name.
Alternatively, enter the full folder path.
Go to the storage account, and select Storage Explorer > Blob Containers.
Browse to the selected folder, and select full folder path.
Copy the full folder path and paste it into a temporary file.
Insert / between each folder to create the folder path, and enter this path into the Folder path field to select this folder.
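To preview which files a combination of file filters might select, you can reproduce the idea locally against a list of blob names. The following sketch is an illustration only. It assumes the file extension is matched as a case-sensitive suffix and the folder path as a prefix of the blob name; the wizard's actual matching rules aren't stated here and may differ. `filter_blobs` and the sample names are hypothetical.

```python
from typing import Iterable, List, Optional


def filter_blobs(
    names: Iterable[str],
    extension: Optional[str] = None,
    folder_path: Optional[str] = None,
) -> List[str]:
    """Keep blob names that match the given extension and folder path."""
    selected = []
    for name in names:
        # Assumption: the extension is compared as a case-sensitive suffix.
        if extension and not name.endswith(extension):
            continue
        # Assumption: a full or partial folder path is matched as a prefix.
        if folder_path and not name.startswith(folder_path.lstrip("/")):
            continue
        selected.append(name)
    return selected


if __name__ == "__main__":
    blobs = ["logs/2024/a.csv", "logs/2024/b.json", "raw/c.csv"]
    print(filter_blobs(blobs, extension=".csv", folder_path="logs/2024"))
```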
Edit the schema
Select Next: Schema to view and edit your table column configuration. The service automatically identifies whether the source is compressed by looking at its name.
In the Schema tab:
Confirm the format selected in Data format:
In this case, the data format is CSV
Tip
If you want to use JSON files, see Use the ingestion wizard to ingest JSON data from a local file to an existing table in Azure Data Explorer.
You can select the check box Ignore the first record to ignore the heading row of the file.
In the Mapping name field, enter a mapping name. You can use alphanumeric characters and underscores. Spaces, special characters, and hyphens aren't supported.
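The effect of the Ignore the first record check box can be pictured with a small local example. The following sketch uses Python's standard csv module on made-up sample data. It illustrates the idea of skipping a heading row and isn't the service's own parsing logic; `read_records` is a hypothetical helper.

```python
import csv
import io
from typing import List


def read_records(text: str, ignore_first_record: bool) -> List[List[str]]:
    """Parse CSV text and optionally drop the first (heading) row."""
    rows = list(csv.reader(io.StringIO(text)))
    return rows[1:] if ignore_first_record else rows


if __name__ == "__main__":
    sample = "name,count\nalpha,1\nbeta,2\n"
    # With the option on, the heading row isn't treated as data.
    print(read_records(sample, ignore_first_record=True))
    print(read_records(sample, ignore_first_record=False))
```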
Edit the table
When ingesting to a new table, you can alter various aspects of the table as it's created.
The changes you can make in a table depend on the following parameters:
- Table type is new or existing
- Mapping type is new or existing
Table type | Mapping type | Available adjustments |
---|---|---|
New table | New mapping | Change data type, Rename column, New column, Delete column, Update column, Sort ascending, Sort descending |
Existing table | New mapping | New column (on which you can then change data type, rename, and update), Update column, Sort ascending, Sort descending |
Existing table | Existing mapping | Sort ascending, Sort descending |
Note
When adding a new column or updating a column, you can change mapping transformations. For more information, see Mapping transformations.
Note
For tabular formats, you can't map a column twice. To map to an existing column, first delete the new column.
Command editor
Above the Editor pane, select the v button to open the editor. In the editor, you can view and copy the automatic commands generated from your inputs.
Select Next: Summary to create a table and mapping and to begin data ingestion.
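If you want to script the same setup outside the wizard, table and CSV mapping commands can be assembled from a column list. The following sketch builds command strings of the general shape of the `.create table` and `.create table ... ingestion csv mapping` management commands. The exact text shown in the command editor may differ, and the table, mapping, and column names below are made up for illustration. `build_commands` is a hypothetical helper.

```python
import json
from typing import List, Tuple


def build_commands(
    table: str, mapping: str, columns: List[Tuple[str, str]]
) -> List[str]:
    """Build a create-table command and a CSV mapping command."""
    # Note: names containing hyphens or spaces would need bracket quoting,
    # which this sketch doesn't handle.
    schema = ", ".join(f"{name}:{kind}" for name, kind in columns)
    create_table = f".create table {table} ({schema})"

    # A CSV mapping ties each column to its position (ordinal) in the file.
    mapping_json = json.dumps(
        [
            {"Column": name, "Properties": {"Ordinal": str(index)}}
            for index, (name, _kind) in enumerate(columns)
        ]
    )
    create_mapping = (
        f".create table {table} ingestion csv mapping "
        f"'{mapping}' '{mapping_json}'"
    )
    return [create_table, create_mapping]


if __name__ == "__main__":
    cols = [("StartTime", "datetime"), ("State", "string")]
    for command in build_commands("StormEvents", "StormEvents_mapping", cols):
        print(command)
```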
Complete data ingestion
In the Data ingestion completed window, all three steps will be marked with green check marks when data ingestion finishes successfully.
Explore quick queries and tools
In the tiles below the ingestion progress, explore Quick queries or Tools:
Quick queries include links to the Azure Data Explorer web UI with example queries.
Tools includes links to Undo or Delete new data on the web UI, which enable you to troubleshoot issues by running the relevant .drop commands.
Note
You might lose data when you use .drop commands. Use them carefully. Drop commands will only revert the changes that were made by this ingestion flow (new extents and columns). Nothing else will be dropped.
Create continuous ingestion
Continuous ingestion enables you to create an Event Grid that listens for new files in the source container. Any new file that meets the criteria of the pre-defined parameters (prefix, suffix, and so on) will be automatically ingested into the destination table.
Select Event Grid in the Continuous ingestion tile to open the Azure portal. The data connection page opens with the Event Grid data connector opened and with source and target parameters already entered (source container, tables, and mappings).
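The prefix and suffix criteria can be reasoned about locally before you create the connection. Event Grid blob events carry a subject of the form `/blobServices/default/containers/<container>/blobs/<blob name>`, and filters compare against the beginning and end of that subject. The following sketch assumes plain string prefix and suffix matching; the container and blob names are made up, and `blob_subject` and `qualifies` are hypothetical helpers.

```python
def blob_subject(container: str, blob_name: str) -> str:
    """Build the Event Grid subject string for a blob in a container."""
    return f"/blobServices/default/containers/{container}/blobs/{blob_name}"


def qualifies(subject: str, prefix: str = "", suffix: str = "") -> bool:
    """Return True if the subject begins with prefix and ends with suffix."""
    return subject.startswith(prefix) and subject.endswith(suffix)


if __name__ == "__main__":
    subject = blob_subject("ingest", "logs/2024/a.csv")
    logs_prefix = "/blobServices/default/containers/ingest/blobs/logs/"
    print(qualifies(subject, prefix=logs_prefix, suffix=".csv"))
    print(qualifies(subject, prefix=logs_prefix, suffix=".json"))
```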
Data connection: Basics
- The Data connection blade opens with the Basics tab selected.
- Enter the Storage account.
- Choose the Event type that will trigger ingestion.
- Select Next: Ingest properties
Ingest properties
The Ingest properties tab opens with pre-filled routing settings. The target table name, format, and mapping name are taken from the table created above.
Select Next: Review + create
Review + create
Review the resources, and select Create.