Design an Azure Stream Analytics solution for data analysis

Completed

The process of consuming data streams, analyzing them, and deriving actionable insights is called stream processing. Azure Stream Analytics is a fully managed (PaaS offering), real-time analytics and complex event-processing engine. It offers the possibility to perform real-time analytics on multiple streams of data from sources like IoT device data, sensors, clickstreams, and social media feeds.

Things to know about Azure Stream Analytics

Azure Stream Analytics works on the following concepts:

  • Data streams: Data streams are continuous data generated by applications, IoT devices, or sensors. The data streams are analyzed and actionable insights are extracted. Some examples are monitoring data streams from industrial and manufacturing equipment and monitoring water pipeline data by utility providers. Data streams help us understand change over time.
  • Event processing: Event processing refers to consumption and analysis of a continuous data stream to extract actionable insights from the events happening within that stream. An example is how a car passing through a tollbooth should include temporal information like a timestamp that indicates when the event occurred.

Important

Azure Stream Analytics supports processing events in three data formats: CSV, JSON, and Avro.

The following illustration shows the Stream Analytics pipeline, and how data is ingested, analyzed, and sent for presentation or action.

Diagram that shows the Stream Analytics pipeline, and how data is ingested, analyzed, and sent for presentation or action.

Key features

Stream Analytics ingests data from Azure Event Hubs (including Azure Event Hubs from Apache Kafka), Azure IoT Hub, or Azure Blob Storage. The query, which is based on SQL query language, can be used to easily filter, sort, aggregate, and join streaming data over a period. You can also extend this SQL language with JavaScript and C# user-defined functions (UDFs).

An Azure Stream Analytics job consists of an input, query, and an output. You can do the following tasks with the job output:

  • Route data to storage systems like Azure Blob Storage, Azure SQL Database, Azure Data Lake Store, and Azure Cosmos DB.
  • Send data to Power BI for real-time visualization.
  • Store data in a Data Warehouse service like Azure Synapse Analytics to train a machine learning model based on historical data or perform batch analytics.
  • Trigger custom downstream workflows by sending the data to services like Azure Functions, Azure Service Bus Topics, or Azure Queues.

Business scenario

Tailwind Traders is using digital transformation for their applications and services to help with the growth of the company. They need to support accessing, storing, and analyzing sensor data from the GPS on their delivery trucks that are on the road delivering goods. You're looking for a solution to provide real time analytics on GPS streaming data from the trucks to enable administrators to make decisions in real time. On further analysis, you learn that the team would like this data present in an existing Power BI visualization dashboard. Azure Stream Analytics can help fulfill the requirements of this scenario.

Azure Stream Analytics is an ideal solution for other common enterprise data requirements. Consider the following scenarios:

Requirement Description
Analyze real-time telemetry streams from IoT devices. Gather real-time sensor data in Azure Stream Analytics by building automation systems that relay temperature, humidity, fan runtimes. You can make adjustments to maintain optimum building temperature and reduce costs.
Build web logs and clickstream analytics. A consumer goods retailer can offer real-time product suggestions to users based on e-commerce analytics.
Create geospatial analytics. Prepare analytics for geospatial data sources like sensors, social media, satellite imagery, and mobile devices. You can predict extreme weather events like wildfires and hurricanes to help airlines with routing. You can send out mobile alerts to customers for adverse weather conditions based on their geolocation.
Execute remote monitoring and predictive maintenance of high value assets. Monitor high value assets such as Industrial equipment by gathering operational data in Azure Stream Analytics. You can maximize the useful life of your equipment through predictive maintenance. Data gathered from electrical power transformers can be used by utility companies to avoid disruption of operation.
Perform real-time analytics on point of sale data. Detect fraudulent credit card transactions, and identify suspicious activity at point of sale. You can spot unusually large transactions or unusual location activity based on the credit card holder's contact information. Alert triggers can be set up on data gathered in Azure Stream Analytics.

In the Tailwind Traders scenario, we can apply Azure Stream Analytics to visualize real-time locations of the trucks through Power BI. For management decisions on analytical workloads, data can be stored in a data warehouse like Azure Cosmos DB or Azure Data Lake for future analysis.

Things to consider when using Azure Stream Analytics

Azure Stream Analytics can be a valuable component in your data integration plan for Tailwind Traders. Review the following benefits of the service.

  • Consider provisioning requirements. Azure Stream Analytics is a fully managed service. It's offered as a PaaS (Platform as a Service) offering, so there's no overhead of provisioning any hardware or infrastructure. Azure Stream Analytics fully manages your job, so you can focus on your business logic and not on the infrastructure.
  • Consider costs. Stream Analytics is low cost. Billing is done by Streaming Units (SUs) consumed that represents the amount of CPU and memory resources allocated. Scaling up and down are based on business needs, which can also lower costs. No maintenance charges are involved.
  • Consider implementation. You can run Azure Stream Analytics in the cloud for large-scale analytics. For ultra-low latency analytics, run Stream Analytics on IoT Edge or Azure Stack.
  • Consider performance. Stream Analytics offers reliable performance guarantees. It supports higher performance by partitioning, which allows complex queries to be parallelized and executed on multiple streaming nodes. Stream Analytics can process millions of events every second. It can deliver results with ultra-low latencies.
  • Consider security. Stream Analytics encrypts all incoming and outgoing communications and supports TLS 1.2. Built-in checkpoints are also encrypted. Stream Analytics doesn't store the incoming data because all processing is done in-memory.