Message queues and stream processing

Beginner
Developer
Student
Azure

As the volume of available data has grown, so has the need to process continuous streams of real-time data. Learn about different systems and techniques for consuming and processing real-time data streams.

Learning objectives

In this module, you will:

  • Define a message queue and recall a basic architecture
  • Recall the characteristics, and present the advantages and disadvantages, of a message queue
  • Explain the basic architecture of Apache Kafka
  • Discuss the roles of topics and partitions, and how Kafka achieves scalability and fault tolerance
  • Discuss general requirements of stream processing systems
  • Recall the evolution of stream processing
  • Explain the basic components of Apache Samza
  • Discuss how Apache Samza achieves stateful stream processing
  • Discuss the differences between the Lambda and Kappa architectures
  • Discuss the motivation for the adoption of message queues and stream processing in the LinkedIn use case

In partnership with Dr. Majd Sakr and Carnegie Mellon University.

Prerequisites

  • Understand what cloud computing is, including cloud service models and common cloud providers
  • Know the technologies that enable cloud computing
  • Understand how cloud service providers pay for the cloud and bill for its use
  • Know what datacenters are and why they exist
  • Know how datacenters are set up, powered, and provisioned
  • Understand how cloud resources are provisioned and metered
  • Be familiar with the concept of virtualization
  • Know the different types of virtualization
  • Understand CPU virtualization
  • Understand memory virtualization
  • Understand I/O virtualization
  • Know about the different types of data and how they're stored
  • Be familiar with distributed file systems and how they work
  • Be familiar with NoSQL databases and object storage, and how they work
  • Know what distributed programming is and why it's useful for the cloud
  • Understand MapReduce and how it enables big-data computing
  • Understand Spark and how it differs from MapReduce
  • Understand GraphLab and how it differs from MapReduce and Spark
