Hello,
I am unable to run a simple spark.sql() query (e.g. df = spark.sql("SELECT * FROM table1")) in Synapse notebooks. I can load and view the file without using SQL, but as soon as I call spark.sql() I get the error below for every file type, including CSV and Parquet.
I have tried different cluster sizes, restarting the cluster, different Spark versions, and switching the language and code from PySpark to Scala. My workspace has permission to access my data in ADLS Gen2. Apologies if this question has already been answered elsewhere. Below is the error I am receiving.
AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException;
Traceback (most recent call last):
File "/opt/spark/python/lib/pyspark.zip/pyspark/sql/session.py", line 767, in sql
return DataFrame(self._jsparkSession.sql(sqlQuery), self._wrapped)
File "/opt/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
answer, self.gateway_client, self.target_id, self.name)
File "/opt/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 75, in deco
raise AnalysisException(s.split(': ', 1)[1], stackTrace)
pyspark.sql.utils.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException;
Thanks
Posting the solution I was given after contacting support:
This is a bug that sometimes occurs when the workspace is created. After I created a new workspace and ran the same commands, the code worked great.
How are you loading it in PySpark? Via forPath? Did you run "saveAsTable" on creation (or any subsequent table-creation command)?
I changed the code in cell 33:
# Write data to a new managed catalog table.
# Old:
# data.write.format("delta").saveAsTable("ManagedDeltaTable")
# New:
data.write.format("delta").mode("overwrite").saveAsTable("ManagedDeltaTable")