How to Disable or Remove Hive Warehouse Connector from HDInsight 5.1

ECSEKI Tibor 0 Reputation points
2025-03-12T11:51:57.01+00:00

Hello,

We are running Spark on HDInsight 5.1 and are not using Hive Metastore at all. However, our Spark application is failing with the following exception:

User class threw exception:
java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/metrics/impl/FastLongHistogram
    at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.addHBaseDependencyJars(TableMapReduceUtil.java:812)
    at org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.addDependencyJars(TableMapReduceUtil.java:868)

It appears that the FastLongHistogram class was removed from hive-warehouse-connector-assembly-2.1.0.5.1.7.7.jar, whereas it was present in hive-warehouse-connector-assembly-2.1.0.5.1.6.7.jar.

Since we are not using Hive Metastore, we would like to disable or completely remove the Hive Warehouse Connector from our HDInsight cluster. What would be the recommended way to achieve this?

Thanks in advance for your help!


1 answer

  1. Smaran Thoomu 21,600 Reputation points Microsoft External Staff
    2025-03-12T14:04:19.8066667+00:00

    Hi @ECSEKI Tibor

    Since you're not using Hive Metastore and are running into issues with the Hive Warehouse Connector (HWC) on HDInsight 5.1, there are a few ways you can disable or remove it:

    1. You can prevent the Hive Warehouse Connector from loading by setting the following in your Spark configuration (see the combined spark-submit sketch after this list):
         --conf spark.driver.extraClassPath=""
         --conf spark.executor.extraClassPath=""
      
    2. If you have cluster access, navigate to the directory where the JAR is located (typically: /usr/hdp/current/spark2-client/jars/) and rename or remove the hive-warehouse-connector-assembly-*.jar file. Then, restart the Spark services to apply the changes (a sketch of these commands appears after this list).
    3. To tell Spark to ignore Hive integration, set this option when running your job (it is also included in the combined sketch below):
         --conf spark.sql.catalogImplementation=in-memory
      
      This ensures Spark doesn’t attempt to use Hive for metadata storage.
    4. If you're planning to create new HDInsight clusters, you can use a script action during cluster deployment to remove or disable the HWC JAR file automatically (a minimal script sketch follows below).
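
    As a concrete illustration of options 1 and 3 together, here is a minimal spark-submit sketch. The class name and application JAR (com.example.MySparkApp, my-spark-app.jar) are placeholders for your own job:

         spark-submit \
           --class com.example.MySparkApp \
           --conf spark.driver.extraClassPath="" \
           --conf spark.executor.extraClassPath="" \
           --conf spark.sql.catalogImplementation=in-memory \
           my-spark-app.jar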
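
    For option 2, a sketch of the commands, assuming the JAR sits under the usual HDP layout shown above (the exact path and version suffix vary by cluster image), run on each node:

         # Locate the HWC assembly JAR first; the version suffix differs per cluster image.
         ls /usr/hdp/current/spark2-client/jars/hive-warehouse-connector-assembly-*.jar
         # Rename it so Spark no longer loads it, keeping a copy for rollback.
         sudo mv /usr/hdp/current/spark2-client/jars/hive-warehouse-connector-assembly-2.1.0.5.1.7.7.jar \
                 /usr/hdp/current/spark2-client/jars/hive-warehouse-connector-assembly-2.1.0.5.1.7.7.jar.disabled

    Afterwards, restart the Spark services from Ambari so the change takes effect.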
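
    For option 4, a minimal Bash script action sketch that disables any HWC assembly JAR it finds. The path is the same assumption as above, so verify it against your cluster image before wiring it into a deployment:

         #!/bin/bash
         # Script action: disable the Hive Warehouse Connector assembly JAR.
         # Assumes the HDP layout used above; adjust the path for your cluster image.
         set -euo pipefail
         for jar in /usr/hdp/current/spark2-client/jars/hive-warehouse-connector-assembly-*.jar; do
             [ -e "$jar" ] || continue   # glob matched nothing; nothing to disable
             mv "$jar" "${jar}.disabled"
         done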

    Hope this helps! Let me know if you need further clarification or run into any issues.

    Kindly consider upvoting the comment if the information provided is helpful. This can assist other community members in resolving similar issues.

    1 person found this answer helpful.
