Important
Serverless Scala and Java jobs are in Beta. You can use JAR tasks to deploy your JAR. If the preview isn't already enabled, see Manage Azure Databricks previews.
A Java archive (JAR) packages Java or Scala code into a single file. This article shows you how to create a JAR with Spark code and deploy it as a Lakeflow Job on serverless compute.
Tip
For automated deployment and continuous integration workflows, use Databricks Asset Bundles to create a project from a template with pre-configured build and deployment settings. See Build a Scala JAR using Databricks Asset Bundles and Bundle that uploads a JAR file to Unity Catalog. This article describes the manual approach, which is useful for one-off deployments or for learning how JARs work with serverless compute.
Requirements
Your local development environment must have the following:
- sbt 1.11.7 or higher (for Scala JARs)
- Maven 3.9.0 or higher (for Java JARs)
- JDK, Scala, and Databricks Connect versions that match your serverless environment (this example uses JDK 17, Scala 2.13.16, and Databricks Connect 17.0.1)
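Before you build, you can double-check that your local toolchain lines up with the serverless environment by printing the JVM and Scala versions in use. The following is a minimal sketch (a standalone helper for illustration only, not part of the project you create below):

```scala
// Prints the local JVM and Scala versions so you can confirm they match the
// serverless environment you target (JDK 17 and Scala 2.13 in this example).
object VersionCheck {
  def main(args: Array[String]): Unit = {
    println(s"JVM:   ${System.getProperty("java.version")}")
    println(s"Scala: ${scala.util.Properties.versionNumberString}")
  }
}
```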
Step 1. Build a JAR
Scala
1. Run the following command to create a new Scala project:

   ```bash
   sbt new scala/scala-seed.g8
   ```

   When prompted, enter a project name, for example, `my-spark-app`.

2. Replace the contents of your `build.sbt` file with the following:

   ```scala
   scalaVersion := "2.13.16"

   libraryDependencies += "com.databricks" %% "databricks-connect" % "17.0.1"
   // other dependencies go here...

   // to run with new jvm options, a fork is required otherwise it uses same options as sbt process
   fork := true
   javaOptions += "--add-opens=java.base/java.nio=ALL-UNNAMED"
   ```

3. Edit or create a `project/assembly.sbt` file, and add this line:

   ```scala
   addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "2.3.1")
   ```

4. Create your main class in `src/main/scala/example/DatabricksExample.scala`:

   ```scala
   package com.examples

   import org.apache.spark.sql.SparkSession

   object SparkJar {
     def main(args: Array[String]): Unit = {
       val spark = SparkSession.builder().getOrCreate()

       // Prints the arguments to the class, which
       // are job parameters when run as a job:
       println(args.mkString(", "))

       // Shows using spark:
       println(spark.version)
       println(spark.range(10).limit(3).collect().mkString(" "))
     }
   }
   ```

5. To build your JAR file, run the following command:

   ```bash
   sbt assembly
   ```
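If the assembly step fails with deduplicate errors (duplicate files contributed by different dependencies), you may need an explicit merge strategy in `build.sbt`. The following is a minimal sketch using the standard sbt-assembly 2.x setting; adjust the rules to match your own dependencies:

```scala
// Sketch of a merge strategy for duplicate files in the fat JAR:
// discard conflicting META-INF metadata from dependencies and keep the
// first copy of any other duplicate file.
ThisBuild / assemblyMergeStrategy := {
  case PathList("META-INF", _*) => MergeStrategy.discard
  case _                        => MergeStrategy.first
}
```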
Java
1. Run the following commands to create a new Maven project structure:

   ```bash
   # Create all directories at once
   mkdir -p my-spark-app/src/main/java/com/examples
   cd my-spark-app
   ```

2. Create a `pom.xml` file in the project root with the following contents:

   ```xml
   <?xml version="1.0" encoding="UTF-8"?>
   <project xmlns="http://maven.apache.org/POM/4.0.0"
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
     <modelVersion>4.0.0</modelVersion>
     <groupId>com.examples</groupId>
     <artifactId>my-spark-app</artifactId>
     <version>1.0-SNAPSHOT</version>

     <properties>
       <maven.compiler.source>17</maven.compiler.source>
       <maven.compiler.target>17</maven.compiler.target>
       <scala.binary.version>2.13</scala.binary.version>
       <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
     </properties>

     <dependencies>
       <dependency>
         <groupId>com.databricks</groupId>
         <artifactId>databricks-connect_${scala.binary.version}</artifactId>
         <version>17.0.1</version>
       </dependency>
     </dependencies>

     <build>
       <plugins>
         <plugin>
           <groupId>org.apache.maven.plugins</groupId>
           <artifactId>maven-shade-plugin</artifactId>
           <version>3.6.1</version>
           <executions>
             <execution>
               <phase>package</phase>
               <goals>
                 <goal>shade</goal>
               </goals>
               <configuration>
                 <transformers>
                   <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                     <mainClass>com.examples.SparkJar</mainClass>
                   </transformer>
                 </transformers>
               </configuration>
             </execution>
           </executions>
         </plugin>
       </plugins>
     </build>
   </project>
   ```

3. Create your main class in `src/main/java/com/examples/SparkJar.java`:

   ```java
   package com.examples;

   import org.apache.spark.sql.SparkSession;

   import java.util.stream.Collectors;

   public class SparkJar {
     public static void main(String[] args) {
       SparkSession spark = SparkSession.builder().getOrCreate();

       // Prints the arguments to the class, which
       // are job parameters when run as a job:
       System.out.println(String.join(", ", args));

       // Shows using spark:
       System.out.println(spark.version());
       System.out.println(
           spark.range(10).limit(3).collectAsList().stream()
               .map(Object::toString)
               .collect(Collectors.joining(" ")));
     }
   }
   ```

4. To build your JAR file, run the following command:

   ```bash
   mvn clean package
   ```

   The compiled JAR is located in the `target/` directory as `my-spark-app-1.0-SNAPSHOT.jar`.
Step 2. Create a job to run the JAR
1. In your workspace, click Jobs & Pipelines in the sidebar.

2. Click Create, then Job.

   The Tasks tab displays with the empty task pane.

   Note

   If the Lakeflow Jobs UI is ON, click the JAR tile to configure the first task. If the JAR tile is not available, click Add another task type and search for JAR.

3. Optionally, replace the name of the job, which defaults to `New Job <date-time>`, with your job name.

4. In Task name, enter a name for the task, for example `JAR_example`.

5. If necessary, select JAR from the Type drop-down menu.

6. For Main class, enter the package and class of your JAR. If you followed the example above, enter `com.examples.SparkJar`.

7. For Compute, select Serverless.

8. Configure the serverless environment:

   - Choose an environment, then click Edit to configure it.
   - Select 4-scala-preview for the Environment version.
   - Add your JAR file by dragging and dropping it into the file selector, or browse to select it from a Unity Catalog volume or workspace location.

9. For Parameters, for this example, enter `["Hello", "World!"]`. (The sketch after these steps shows how these values reach your main class.)

10. Click Create task.
Step 3. Run the job and view the job run details
Click Run now to run the workflow. To view details for the run, click View run in the Triggered run pop-up, or click the link in the Start time column for the run in the job runs view.
When the run completes, the output displays in the Output panel, including the arguments passed to the task.
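For the example class built in Step 1 and the parameters above, the output should look roughly like the following; the middle line is the Spark version reported by the serverless environment, so the exact value varies:

```
Hello, World!
<Spark version reported by the environment>
0 1 2
```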
Next steps
- To learn more about JAR tasks, see JAR task for jobs.
- To learn more about creating a compatible JAR, see Create an Azure Databricks compatible JAR.
- To learn more about creating and running jobs, see Lakeflow Jobs.