Apache Spark streaming job that reads Apache Kafka data fails with NoClassDefFoundError in HDInsight
This article describes troubleshooting steps and possible resolutions for issues when using Apache Spark components in Azure HDInsight clusters.
Issue
The Apache Spark cluster runs a Spark streaming job that reads data from an Apache Kafka cluster. The Spark streaming job fails if Kafka stream compression is turned on. In this case, the Spark streaming YARN application application_1525986016285_0193 failed with the following error:
18/05/17 20:01:33 WARN YarnAllocator: Container marked as failed: container_e25_1525986016285_0193_01_000032 on host: wn87-Scaled.2ajnsmlgqdsutaqydyzfzii3le.cx.internal.cloudapp.net. Exit status: 50. Diagnostics: Exception from container-launch.
Container id: container_e25_1525986016285_0193_01_000032
Exit code: 50
Stack trace: ExitCodeException exitCode=50:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:944)
Cause
This error can be caused by specifying a version of the spark-streaming-kafka jar file that is different from the version of the Kafka cluster you're running.
For example, if you're running Kafka cluster version 0.10.1, the following command results in an error:
spark-submit \
--packages org.apache.spark:spark-streaming-kafka-0-8_2.11:2.2.0 \
--conf spark.executor.instances=16 \
...
~/Kafka_Spark_SQL.py <bootstrap server details>
Resolution
Use the spark-submit command with the --packages option, and ensure that the version of the spark-streaming-kafka jar file matches the version of the Kafka cluster that you're running.
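As a minimal sketch of how that matching might be automated, the following shell function builds the --packages coordinate from the Kafka broker version. The function name kafka_package is hypothetical, and the mapping rule (brokers at 0.10.0 or later use the 0-10 connector, older brokers use the 0-8 connector) is an assumption for illustration; confirm the right connector, and whether it supports your job's language API, against the Spark Kafka integration guide for your Spark version.

```shell
# Hypothetical helper: choose a spark-streaming-kafka artifact for a broker version.
# Assumption: brokers >= 0.10.0 pair with the "0-10" connector, older ones with "0-8".
# Arguments: <kafka broker version> <scala binary version> <spark version>
kafka_package() {
  kafka_version="$1"
  scala_version="$2"
  spark_version="$3"

  # Split "0.10.1" into major=0 and minor=10 using POSIX parameter expansion.
  major="${kafka_version%%.*}"
  rest="${kafka_version#*.}"
  minor="${rest%%.*}"

  if [ "$major" -gt 0 ] || [ "$minor" -ge 10 ]; then
    connector="0-10"
  else
    connector="0-8"
  fi

  echo "org.apache.spark:spark-streaming-kafka-${connector}_${scala_version}:${spark_version}"
}

# Example: the Kafka 0.10.1 cluster from this article, with Scala 2.11 and Spark 2.2.0.
kafka_package 0.10.1 2.11 2.2.0
# prints org.apache.spark:spark-streaming-kafka-0-10_2.11:2.2.0
```

The printed coordinate can then be passed to spark-submit, for example as --packages "$(kafka_package 0.10.1 2.11 2.2.0)", in place of the hard-coded 0-8 package shown in the failing command.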
Next steps
If you didn't see your problem or are unable to solve your issue, visit one of the following channels for more support:
Get answers from Azure experts through Azure Community Support.
Connect with @AzureSupport, the official Microsoft Azure account for improving customer experience by connecting the Azure community to the right resources: answers, support, and experts.
If you need more help, you can submit a support request from the Azure portal. Select Support from the menu bar or open the Help + support hub. For more detailed information, review How to create an Azure support request. Access to Subscription Management and billing support is included with your Microsoft Azure subscription, and Technical Support is provided through one of the Azure Support Plans.