Java gateway process exited before sending its port number when setting spark config

Matthieu Marshall 6 Reputation points
2023-01-19T17:03:47.3966667+00:00

Hello

I would appreciate it if someone could point me in the right direction for an error I am seeing. I am trying to set up PySpark on an Azure DevOps build agent, using ABFS to connect to an Azure Blob Storage container.

The error I am getting is below:

tests/unit/test_abfs_read_write.py:11: in <module>
    from rdslm_common import spark
rdslm_common/__init__.py:3: in <module>
    spark = get_spark()
rdslm_common/__main__.py:8: in get_spark
    spark_session = SparkSession.builder.config(
/opt/hostedtoolcache/Python/3.8.15/x64/lib/python3.8/site-packages/pyspark/sql/session.py:269: in getOrCreate
    sc = SparkContext.getOrCreate(sparkConf)
/opt/hostedtoolcache/Python/3.8.15/x64/lib/python3.8/site-packages/pyspark/context.py:483: in getOrCreate
    SparkContext(conf=conf or SparkConf())
/opt/hostedtoolcache/Python/3.8.15/x64/lib/python3.8/site-packages/pyspark/context.py:195: in __init__
    SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
/opt/hostedtoolcache/Python/3.8.15/x64/lib/python3.8/site-packages/pyspark/context.py:417: in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway(conf)
/opt/hostedtoolcache/Python/3.8.15/x64/lib/python3.8/site-packages/pyspark/java_gateway.py:106: in launch_gateway
    raise RuntimeError("Java gateway process exited before sending its port number")
E   RuntimeError: Java gateway process exited before sending its port number

The hidden line on line 8 of rdslm_common/__main__.py is:

SparkSession.builder.config("spark.jars.packages", "org.apache.hadoop:hadoop-azure:3.3.1,com.databricks:spark-xml_2.12:0.15.0").getOrCreate()

Does anyone know what the cause could be?

When I run the same code locally on my machine, it works fine.
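This error is raised when PySpark fails to launch a JVM at all, before any Spark config is applied, so a quick sanity check of the Java setup on the build agent can help narrow it down. A minimal sketch (the function name is illustrative, not part of PySpark) that mirrors how PySpark locates the `java` binary:

```python
import os
import shutil

def find_java():
    """Return the java executable PySpark would use, or None if absent.

    PySpark's launcher prefers $JAVA_HOME/bin/java when JAVA_HOME is set,
    and otherwise falls back to `java` on the PATH.
    """
    java_home = os.environ.get("JAVA_HOME")
    if java_home:
        candidate = os.path.join(java_home, "bin", "java")
        if os.path.exists(candidate):
            return candidate
    return shutil.which("java")

if __name__ == "__main__":
    java = find_java()
    if java is None:
        print("No java executable found - the gateway error is expected")
    else:
        print("Would launch the JVM via:", java)
```

Running this as an early step in the pipeline shows whether the agent image has a JDK visible to PySpark at all.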

Azure Databricks

1 answer

  1. sheng zi 0 Reputation points
    2023-06-13T15:59:13.69+00:00

    You can check whether your PySpark and Java versions are compatible, or whether the environment variables have been configured correctly.
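    The check suggested above can be sketched as a small diagnostic script to run on the build agent; it reports the installed PySpark version, JAVA_HOME, and the output of `java -version` so mismatches show up in the CI log (the function name here is illustrative):

```python
import os
import subprocess
from importlib import metadata

def env_report():
    """Collect the version info relevant to the Java-gateway error."""
    report = {}
    try:
        report["pyspark"] = metadata.version("pyspark")
    except metadata.PackageNotFoundError:
        report["pyspark"] = None
    report["java_home"] = os.environ.get("JAVA_HOME")
    try:
        # `java -version` writes its output to stderr by convention.
        proc = subprocess.run(["java", "-version"],
                              capture_output=True, text=True)
        report["java"] = proc.stderr.strip() or None
    except FileNotFoundError:
        report["java"] = None
    return report

if __name__ == "__main__":
    for key, value in env_report().items():
        print(f"{key}: {value}")
```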

