Safely manage jar dependencies
Components installed on HDInsight clusters have dependencies on third-party libraries. Usually, a specific version of common modules like Guava is referenced by these built-in components. When you submit an application with its dependencies, it can cause a conflict between different versions of the same module. If the component version that you reference in the classpath first, built-in components may throw exceptions because of version incompatibility. However, if built-in components inject their dependencies to the classpath first, your application may throw errors like NoSuchMethod
.
To avoid version conflict, consider shading your application dependencies.
What does package shading mean?
Shading provides a way to include and rename dependencies. It relocates the classes and rewrites affected bytecode and resources to create a private copy of your dependencies.
How to shade a package?
Use uber-jar
Uber-jar is a single jar file that contains both the application jar and its dependencies. The dependencies in Uber-jar are by-default not shaded. In some cases, this may introduce version conflict if other components or applications reference a different version of those libraries. To avoid this, you can build an Uber-Jar file with some (or all) of the dependencies shaded.
Shade package using Maven
Maven can build applications written both in Java and Scala. Maven-shade-plugin can help you create a shaded uber-jar easily.
The example below shows a file pom.xml
which has been updated to shade a package using maven-shade-plugin. The XML section <relocation>…</relocation>
moves classes from package com.google.guava
into package com.google.shaded.guava
by moving the corresponding JAR file entries and rewriting the affected bytecode.
After changing pom.xml
, you can execute mvn package
to build the shaded uber-jar.
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>3.2.1</version>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
<configuration>
<relocations>
<relocation>
<pattern>com.google.guava</pattern>
<shadedPattern>com.google.shaded.guava</shadedPattern>
</relocation>
</relocations>
</configuration>
</execution>
</executions>
</plugin>
</plugins>
</build>
Shade package using SBT
SBT is also a build tool for Scala and Java. SBT doesn't have a shade plugin like maven-shade-plugin. You can modify build.sbt
file to shade packages.
For example, to shade com.google.guava
, you can add the below command to the build.sbt
file:
assemblyShadeRules in assembly := Seq(
ShadeRule.rename("com.google.guava" -> "com.google.shaded.guava.@1").inAll
)
Then you can run sbt clean
and sbt assembly
to build the shaded jar file.