This article contains recommendations to configure production incremental processing workloads with Structured Streaming on Azure Databricks to fulfill latency and cost requirements for real-time or batch applications.
See examples of using Spark Structured Streaming with Cassandra, Azure Synapse Analytics, Python notebooks, and Scala notebooks in Azure Databricks.
Azure Databricks has specific features for working with semi-structured data fields contained in Avro, protocol buffers, and JSON data payloads. To learn more, see:
Demonstrate understanding of common data engineering tasks to implement and manage data engineering workloads on Microsoft Azure, using a number of Azure services.