Use Apache Kafka and Azure Databricks for streaming

Cunwei zhao 0 Reputation points
2023-08-29T08:23:28.0833333+00:00

I'm using Apache Kafka with Azure Databricks for streaming. The code is as follows:


# Read data from Apache Kafka
from pyspark.sql.functions import *
from pyspark.sql.types import StructType, StructField
from pyspark.sql.types import *

#import pdb; pdb.set_trace() 
kafka_df_zcw = (spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "10.10.173.17:9092")
    .option("kafka.security.protocol", "PLAINTEXT")
    .option("subscribe", "zcw")
    .load())

schema = StructType([
    StructField("speed", IntegerType()),
    StructField("volkswagen", StringType()),   
    StructField("version", StringType()),
    StructField("ts", LongType())
])

kafka_df_zcw_temp = kafka_df_zcw.selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")

# Write the raw key/value strings to the console sink
query_zcw = kafka_df_zcw_temp.writeStream \
    .outputMode("append") \
    .format("console") \
    .start()

query_zcw.awaitTermination()
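
The schema above is declared but never applied in the snippet. Presumably the intent is to parse the JSON payload carried in the Kafka value column; a minimal sketch of that step is below, assuming each message value is a JSON object with the fields named in the schema (the example payload in the comment is made up):

# Sketch (assumption): parse the JSON value column with the schema defined above,
# e.g. a value such as {"speed": 80, "volkswagen": "golf", "version": "1.0", "ts": 1693296000000}
parsed_zcw = kafka_df_zcw_temp.select(
    from_json(col("value"), schema).alias("data")
).select("data.*")

# The parsed columns could then be written to the sink instead of the raw strings
query_parsed = parsed_zcw.writeStream \
    .outputMode("append") \
    .format("console") \
    .start()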

It always shows the following error message:

[Attached screenshot of the error: Snipaste_2023-08-29_16-22-58]

