Azure Kubernetes in event stream processing

Azure Kubernetes Service (AKS)
Azure IoT Hub
Azure Event Hubs
Azure Functions
Azure Cosmos DB

Solution ideas

This article is a solution idea. If you'd like us to expand the content with more information, such as potential use cases, alternative services, implementation considerations, or pricing guidance, let us know by providing GitHub feedback.

This article describes a variation of a serverless event-driven architecture that runs on Azure Kubernetes Service (AKS) with the KEDA scaler. The solution ingests a stream of data, processes the data, and then writes the results to a back-end database.

Architecture

Architecture diagram showing the data flow described in this article.

Download a Visio file of this architecture.

Dataflow

  1. AKS with the KEDA scaler is used to autoscale Azure Functions containers based on the number of events needing to be processed.
  2. Events arrive at the Input Event Hub.
  3. The De-batching and Filtering Azure Function is triggered to handle the events. This step filters out unwanted events and de-batches the received events before submitting them to the Output Event Hub.
  4. If the De-batching and Filtering Azure Function fails to store the event successfully, the event is submitted to the Deadletter Event Hub 1.
  5. Events arriving at the Output Event Hub trigger the Transforming Azure Function. This Azure Function transforms the event into a message for the Azure Cosmos DB instance.
  6. The event is stored in an Azure Cosmos DB database.
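
The dataflow steps above can be sketched as a small in-memory simulation. This is only an illustration under stated assumptions, not the Azure Functions or Event Hubs API: the lists and dictionary stand in for the event hubs and the Azure Cosmos DB container, and every name (`is_wanted`, `debatch_and_filter`, `transform`, `store`, `run_pipeline`, and the event fields `id`, `type`, `deviceId`, `value`) is hypothetical. It also assumes that each incoming message carries a JSON array of events and that any failure while handling an event sends it to the deadletter hub.

```python
import json
from typing import Any

# In-memory stand-ins for the Output Event Hub, Deadletter Event Hub 1,
# and the Azure Cosmos DB container. Hypothetical placeholders, not SDK objects.
output_hub: list = []
deadletter_hub_1: list = []
cosmos_container: dict = {}


def is_wanted(event: dict) -> bool:
    """Hypothetical filter rule: keep only events of type 'telemetry'."""
    return event.get("type") == "telemetry"


def debatch_and_filter(batch_body: str) -> None:
    """Steps 3 and 4: split a batched message into single events, drop
    unwanted ones, forward the rest, and deadletter any event that fails."""
    for event in json.loads(batch_body):
        try:
            if not is_wanted(event):
                continue  # filtered out, not an error
            if "id" not in event:
                raise ValueError("event has no id")
            output_hub.append(event)
        except Exception as exc:
            deadletter_hub_1.append({"event": event, "error": str(exc)})


def transform(event: dict) -> dict:
    """Step 5: shape the event into a document for the database."""
    return {
        "id": str(event["id"]),
        "deviceId": event.get("deviceId", "unknown"),
        "value": event.get("value"),
    }


def store(document: dict) -> None:
    """Step 6: upsert the document, keyed by its id."""
    cosmos_container[document["id"]] = document


def run_pipeline(batch_body: str) -> None:
    """Run steps 3 to 6 for one incoming batched message."""
    debatch_and_filter(batch_body)
    while output_hub:
        store(transform(output_hub.pop(0)))


if __name__ == "__main__":
    sample = json.dumps([
        {"id": 1, "type": "telemetry", "deviceId": "d1", "value": 20.5},
        {"id": 2, "type": "heartbeat"},        # filtered out
        {"type": "telemetry", "value": 3},     # no id, so it is deadlettered
    ])
    run_pipeline(sample)
    print(cosmos_container)
    print(deadletter_hub_1)
```

In a real deployment, each stage would be a separate event hub-triggered function, so the two stages can scale independently of each other.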

Components

  • Azure Kubernetes Service (AKS) simplifies deploying a managed Kubernetes cluster in Azure by offloading the operational overhead to Azure. Because AKS is a hosted Kubernetes service, Azure handles critical tasks, like health monitoring and maintenance.
  • KEDA is an event-driven autoscaler used to scale containers in the Kubernetes cluster based on the number of events needing to be processed.
  • Event Hubs ingests the data stream. Event Hubs is designed for high-throughput data streaming scenarios.
  • Azure Functions is a serverless compute option. It uses an event-driven model, where a piece of code (a function) is invoked by a trigger.
  • Azure Cosmos DB is a multi-model database service that is available in a serverless, consumption-based mode. For this scenario, the event-processing function stores JSON records by using Azure Cosmos DB for NoSQL.
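
To make the idea of scaling "based on the number of events needing to be processed" concrete, here is a simplified model of an event-driven scaling decision. It is a sketch for illustration, not KEDA's actual implementation: the function name `desired_replicas` and its parameters (`events_per_replica`, `min_replicas`, `max_replicas`) are hypothetical, and the model assumes the replica count is the backlog divided by a per-replica target, rounded up and clamped to configured limits.

```python
import math


def desired_replicas(unprocessed_events: int,
                     events_per_replica: int,
                     min_replicas: int = 0,
                     max_replicas: int = 10) -> int:
    """Return how many function containers a backlog would call for.

    Simplified model: one replica per `events_per_replica` unprocessed
    events, rounded up, then clamped to [min_replicas, max_replicas].
    """
    if events_per_replica <= 0:
        raise ValueError("events_per_replica must be positive")
    if unprocessed_events <= 0:
        # Nothing to process: fall back to the configured floor,
        # which may be zero.
        return min_replicas
    wanted = math.ceil(unprocessed_events / events_per_replica)
    return max(min_replicas, min(wanted, max_replicas))


if __name__ == "__main__":
    for backlog in (0, 200, 10_000):
        print(backlog, desired_replicas(backlog, events_per_replica=64))
```

With a target of 64 events per replica, an empty backlog yields the floor of 0 replicas, a backlog of 200 yields 4, and a backlog of 10,000 is capped at the maximum of 10.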

Note

For Internet of Things (IoT) scenarios, we recommend Azure IoT Hub. IoT Hub has a built-in endpoint that's compatible with the Azure Event Hubs API, so you can use either service in this architecture with no major changes in the back-end processing. For more information, see Connecting IoT Devices to Azure: IoT Hub and Event Hubs.

Scenario details

This article describes a serverless event-driven architecture that runs on AKS with the KEDA scaler. The solution ingests a stream of data, processes the data, and then writes the results to a back-end database.

To learn more about the basic concepts, considerations, and approaches for serverless event processing, see the Serverless event processing reference architecture.

Potential use case

A popular use case for this end-to-end event stream processing pattern uses the Event Hubs streaming ingestion service to receive a high volume of events per second, and then processes those events with de-batching and transformation logic implemented in highly scalable, event hub-triggered functions.

Contributors

This article is maintained by Microsoft. It was originally written by the following contributors.

Principal author:


Next steps