IoT analytics with Azure Data Explorer

Azure Cosmos DB
Azure Data Explorer
Azure Digital Twins

Solution ideas

This article is a solution idea. If you'd like us to expand the content with more information, such as potential use cases, alternative services, implementation considerations, or pricing guidance, let us know by providing GitHub feedback.

This solution idea describes how Azure Data Explorer provides near real-time analytics for fast-flowing, high-volume streaming data from Internet of Things (IoT) devices and sensors. This analytics workflow is part of an overall IoT solution that integrates operational and analytical workloads with Azure Cosmos DB and Azure Data Explorer.

Jupyter is a trademark of its respective company. No endorsement is implied by the use of this mark. Apache® and Apache Kafka® are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. No endorsement by The Apache Software Foundation is implied by the use of these marks.

Architecture

Diagram showing IoT telemetry analytics with Azure Data Explorer.

Dataflow

  1. Azure Event Hubs, Azure IoT Hub, or Kafka ingest a wide variety of fast-flowing streaming data such as logs, business events, and user activities.

  2. Azure Functions or Azure Stream Analytics process the data in near real time.

  3. Azure Cosmos DB stores streamed messages in JSON format to serve a real-time operational application.

  4. Azure Data Explorer ingests data for analytics, using its connectors for Azure Event Hubs, Azure IoT Hub, or Kafka for low latency and high throughput.

    Alternatively, you can ingest blobs from your Azure Blob Storage or Azure Data Lake Storage account into Azure Data Explorer by using an Event Grid data connection.

    You can also continuously export data to Azure Storage in compressed, partitioned Apache Parquet format, and seamlessly query the data with Azure Data Explorer. For details, see Continuous data export overview.

  5. To serve both the operational and analytical use cases, data can either be routed to Azure Data Explorer and Azure Cosmos DB in parallel, or flow from Azure Cosmos DB to Azure Data Explorer.

    • Azure Cosmos DB transactions can trigger Azure Functions through the change feed. The functions then stream the data to Event Hubs for ingestion into Azure Data Explorer.

      or

    • Azure Functions can invoke Azure Digital Twins through its API, which then streams data to Event Hubs for ingestion into Azure Data Explorer.

  6. The following interfaces get insights from data stored in Azure Data Explorer:

  7. Azure Data Explorer integrates with Azure Databricks and Azure Machine Learning to provide machine learning (ML) services. You can also build ML models using other tools and services, and export them to Azure Data Explorer for scoring data.
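
The parallel routing option in step 5 can be illustrated with a minimal Python sketch. It is not an implementation of any Azure SDK: `OperationalStore` and `AnalyticsSink` are hypothetical in-memory stand-ins for an Azure Cosmos DB container and for an event hub that Azure Data Explorer ingests from, and the message fields (`deviceId`, `ts`, `temp`) are assumed for illustration only. The sketch shows the difference in intent between the two targets: the operational store keeps the latest state per device, while the analytics sink keeps every reading.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class OperationalStore:
    """Hypothetical stand-in for an Azure Cosmos DB container."""
    documents: Dict[str, Dict[str, Any]] = field(default_factory=dict)

    def upsert(self, doc: Dict[str, Any]) -> None:
        # Keep only the latest document per id (current device state).
        self.documents[doc["id"]] = doc


@dataclass
class AnalyticsSink:
    """Hypothetical stand-in for an event hub feeding Azure Data Explorer."""
    events: List[str] = field(default_factory=list)

    def send(self, doc: Dict[str, Any]) -> None:
        # Append-only: every reading is retained for later analysis.
        self.events.append(json.dumps(doc, sort_keys=True))


def process(raw: str) -> Dict[str, Any]:
    """Parse and normalize one raw message (assumed field names)."""
    msg = json.loads(raw)
    return {
        "id": msg["deviceId"],
        "deviceId": msg["deviceId"],
        "timestamp": msg["ts"],
        "temperatureC": round(float(msg["temp"]), 1),
    }


def route(raw: str, operational: OperationalStore, analytics: AnalyticsSink) -> None:
    """Fan one processed message out to both targets."""
    doc = process(raw)
    operational.upsert(doc)
    analytics.send(doc)


operational, analytics = OperationalStore(), AnalyticsSink()
for raw_message in [
    '{"deviceId": "dev-1", "ts": "2024-01-01T00:00:00Z", "temp": "21.46"}',
    '{"deviceId": "dev-2", "ts": "2024-01-01T00:00:05Z", "temp": "19.0"}',
    '{"deviceId": "dev-1", "ts": "2024-01-01T00:00:10Z", "temp": "22.04"}',
]:
    route(raw_message, operational, analytics)

print(len(operational.documents), "operational documents")
print(len(analytics.events), "analytics events")
```

In a real deployment the two writes would typically be independent outputs of the processing step rather than sequential calls in one function, so that a slow or failed write to one target does not block the other.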

Components

This solution idea uses the following Azure components:

Azure Data Explorer

Azure Data Explorer is a fast, fully managed, and highly scalable big data analytics service. Azure Data Explorer can analyze large volumes of streaming data from applications, websites, and IoT devices in near real time to serve analytics applications and dashboards.

Azure Data Explorer provides native advanced analytics for:

The Azure Data Explorer Web UI connects to Azure Data Explorer clusters to help write, run, and share Kusto Query Language commands and queries. Azure Data Explorer Dashboards are a feature in the Data Explorer Web UI that natively exports Kusto queries to optimized dashboards.
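
As a rough illustration of the kind of Kusto Query Language query that might sit behind a telemetry dashboard tile, the following Python sketch builds a per-device, time-binned average. The table name `Telemetry` and the columns `Timestamp`, `DeviceId`, and `Temperature` are assumptions made for this example and do not come from the solution. The helper only produces query text; running it requires a cluster and a client such as the Web UI.

```python
import re

# Conservative patterns for names and timespan literals such as 15m or 1h.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
_TIMESPAN = re.compile(r"^\d+[smhd]$")


def build_summary_query(
    table: str,
    value_column: str,
    lookback: str = "15m",
    bin_size: str = "1m",
) -> str:
    """Return KQL text that averages a column per device and time bin.

    Assumes the table has Timestamp and DeviceId columns (hypothetical schema).
    """
    for name in (table, value_column):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsupported identifier: {name!r}")
    for span in (lookback, bin_size):
        if not _TIMESPAN.match(span):
            raise ValueError(f"unsupported timespan: {span!r}")
    return "\n".join(
        [
            table,
            f"| where Timestamp > ago({lookback})",
            f"| summarize AvgValue = avg({value_column}) "
            f"by DeviceId, bin(Timestamp, {bin_size})",
            "| order by Timestamp asc",
        ]
    )


query = build_summary_query("Telemetry", "Temperature")
print(query)
```

Validating the names and timespans before interpolating them keeps the generated text predictable when the inputs come from configuration rather than from a trusted author.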

Other Azure components

  • Azure Cosmos DB is a fully managed, fast NoSQL database service for modern app development with open APIs for any scale.
  • Azure Digital Twins stores digital models of physical environments to help create next-generation IoT solutions that model the real world.
  • Azure Event Hubs is a fully managed, real-time data ingestion service.
  • Azure IoT Hub enables bi-directional communication between IoT devices and the Azure cloud.
  • Azure Synapse Link for Azure Cosmos DB runs near real-time analytics over operational data in Azure Cosmos DB, without any performance or cost impact on transactional workloads. Synapse Link uses the serverless SQL pool and Apache Spark pool analytics engines from the Azure Synapse workspace.
  • Kafka on HDInsight is an easy, cost-effective, enterprise-grade service for open-source analytics with Apache Kafka.

Scenario details

This solution uses Azure Data Explorer to get near real-time IoT telemetry analytics on fast-flowing, high-volume streaming data from a wide variety of IoT devices.

Potential use cases

Contributors

This article is maintained by Microsoft. It was originally written by the following contributors.

Principal author:

Next steps