Azure Databricks error org.apache.spark.SparkClassNotFoundException: [DATA_SOURCE_NOT_FOUND]

Jess N 0 Reputation points Microsoft Vendor
2023-05-17T19:11:45.16+00:00

Hi All and thanks for your time:

A Databricks notebook that previously ran without issues (on the same data) now fails with the following error:
org.apache.spark.SparkClassNotFoundException: [DATA_SOURCE_NOT_FOUND] Failed to find data source: sstream

Update: This seems to be a permissions issue. A user with elevated permissions runs the same code without errors.
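
For anyone comparing runs between two users, here is a minimal sketch of pulling the error class and the data source name out of the exception text so they can be logged next to the user who ran the cell. The helper name `parse_data_source_error` is hypothetical, and the pattern assumes only the message format quoted above; other Spark versions may word the message differently.

```python
import re

# Matches text shaped like the message quoted above:
#   ...: [DATA_SOURCE_NOT_FOUND] Failed to find data source: sstream
# Assumption: the error class sits in square brackets and the source
# name follows the colon. Only the format seen in this post is covered.
_PATTERN = re.compile(
    r"\[(?P<error_class>[A-Z0-9_.]+)\]\s+"
    r"Failed to find data source:\s*(?P<source>\S+)"
)


def parse_data_source_error(message):
    """Return (error_class, source_name), or None if the text does not match."""
    match = _PATTERN.search(message)
    if match is None:
        return None
    # Drop a trailing period in case the name ends a sentence.
    source = match.group("source").rstrip(".")
    return match.group("error_class"), source


if __name__ == "__main__":
    msg = (
        "org.apache.spark.SparkClassNotFoundException: "
        "[DATA_SOURCE_NOT_FOUND] Failed to find data source: sstream"
    )
    print(parse_data_source_error(msg))
```

In a notebook, one way to use this would be to wrap the failing read in `try`/`except Exception as e` and pass `str(e)` to the helper. If the same source name is reported for one user but not another on the same cluster, that points toward access or configuration rather than the code.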

Azure Databricks

1 answer

  1. Jess N 0 Reputation points Microsoft Vendor
    2023-05-18T19:16:43.41+00:00

    My role allows "Starts a managed cluster". While we were looking at permissions together, my Databricks session refreshed, and I could then see my coworker's notebooks. I then refreshed the page, opened a test notebook, started the managed cluster our group uses, and ran the notebook. Every cell ran and finished as designed.

    While we did not make any changes, maybe changes occurred that we were not aware of. For example, if our VPN IP had not been added to a Databricks backend whitelist (if one exists), I suspect I'd see the issue we started with. That's just a hypothetical. If it happens again, I will check my IP, then disconnect and reconnect until I get a new one, possibly even changing the work Wi-Fi/network if needed, and then update with the results.

    Regardless, everything works now. So problem solved and case closed :)