Azure Data Explorer use cases
Azure Data Explorer (ADX) is a highly capable tool for exploring "hot data" using the intuitive and powerful Kusto Query Language (KQL). For an overview of its capabilities, see ADX overview.
This article discusses, at a high level, when you should consider using ADX and when you may want to use a different tool.
Optimal ADX use cases
Optimal use cases for ADX include:
- Using ADX to query time-series data in hot storage: ADX is a great choice if the data is queried by time and the data being queried fits in hot storage. ADX loads data into hot storage, such as local SSD. Which data is loaded is determined either by ingestion time or by a datetime column. By default, ADX loads the most recent data into hot storage. You can modify this loading behavior through hot windows.
- Exploring and visualizing data: ADX is well suited to ad-hoc exploration and visualization of data. Its intuitive query language, Kusto Query Language (KQL), makes it easy to analyze and visualize data.
- Querying time-series data: By default, ADX partitions data by time, which makes it well suited to queries that filter on a specific time range. KQL provides a wide variety of time-series functions. You can configure the partitioning of data in ADX using the partitioning policy, but partitioning has limitations for queries that are not based on time.
- Performing near real-time queries: One of the major advantages of ADX is that it's a near real-time data store. The intent is to make data available for querying shortly after ingestion. To do this, ADX initially places newly ingested data in a row store so that it's quickly available for querying, and stores data in a columnar store for fast querying. Both stores are ADX internals and are not exposed to the user.
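To make the hot storage behavior described above more concrete, the following is a minimal Python sketch of how one might reason about whether a query's time range is covered by hot storage. It is an illustrative model only, not an ADX API: the `CachePolicy` class, the `is_served_from_hot` function, and the example dates are all hypothetical, and the model assumes a range must fit entirely inside either the default recent period or a single hot window.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Tuple


@dataclass
class CachePolicy:
    """Hypothetical model of a hot-storage policy; not an ADX API."""
    hot_period: timedelta  # how far back from "now" data is kept hot
    # Extra (start, end) ranges of older data that are also kept hot.
    hot_windows: List[Tuple[datetime, datetime]] = field(default_factory=list)


def is_served_from_hot(policy: CachePolicy, start: datetime,
                       end: datetime, now: datetime) -> bool:
    """Return True if the whole [start, end] range is in hot storage.

    Simplification: a range that straddles two hot regions is
    treated as not covered.
    """
    if start > end:
        raise ValueError("start must not be after end")
    # Default behavior: the most recent data is hot.
    hot_start = now - policy.hot_period
    if start >= hot_start and end <= now:
        return True
    # Hot windows extend coverage to selected older ranges.
    return any(w_start <= start and end <= w_end
               for w_start, w_end in policy.hot_windows)


if __name__ == "__main__":
    now = datetime(2024, 6, 30)
    policy = CachePolicy(
        hot_period=timedelta(days=31),
        hot_windows=[(datetime(2023, 1, 1), datetime(2023, 2, 1))],
    )
    # Recent range: inside the default hot period.
    print(is_served_from_hot(policy, datetime(2024, 6, 1), datetime(2024, 6, 15), now))
    # Older range: inside the configured hot window.
    print(is_served_from_hot(policy, datetime(2023, 1, 10), datetime(2023, 1, 20), now))
    # Older range outside any hot window: would require cold storage.
    print(is_served_from_hot(policy, datetime(2023, 6, 1), datetime(2023, 6, 10), now))
```

In this model, only the third query falls outside hot storage, which is the kind of query for which ADX tends to be a less suitable choice.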
When to consider other solutions
Given its focus on processing data in hot storage with a simple partitioning scheme, ADX may not be optimal for these use cases:
- Querying cold storage: ADX is not a good choice if your data isn't filtered by time and doesn't fit in hot storage. ADX is typically not a good solution for querying data from cold storage, because doing so requires loading huge partitions from cold storage into hot storage, which is slow, inefficient, and costly.
- Long-running queries: ADX's primary purpose is exploration using ad-hoc queries. It is not designed for long-running batch analytical queries.
- Multi-column partitioning: ADX currently partitions data based on a single column. If you need partitioning by multiple columns other than time, ADX may not be the best choice.