SAS authentication does not work with ADLS Gen2 with PySpark

SHYAMALA GOWRI 40 Reputation points
2024-04-03T18:05:37.37+00:00

I tried creating a SAS token and using it in PySpark code (the service principal mode of authentication worked), but it failed with the following error for SAS:

24/04/03 14:05:02 WARN FileStreamSink: Assume no metadata directory. Error while looking for metadata directory in the path: abfss://pyspark@sparkadlsiae.dfs.core.windows.net/people.csv.
Unable to load SAS token provider class: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.azurebfs.sas.FixedSASTokenProvider not found
java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.azurebfs.sas.FixedSASTokenProvider not found.

I launched PySpark as:

pyspark --jars hadoop-azure-3.3.3.jar,hadoop-azure-datalake-3.3.3.jar,hadoop-common-3.3.3.jar

And my connection properties were as follows:

spark.conf.set("fs.azure.account.auth.type.<storage-account>.dfs.core.windows.net", "SAS")
spark.conf.set("fs.azure.sas.token.provider.type.<storage-account>.dfs.core.windows.net", "org.apache.hadoop.fs.azurebfs.sas.FixedSASTokenProvider")
spark.conf.set("fs.azure.sas.fixed.token.<storage-account>.dfs.core.windows.net", "<token>")
spark.read.csv("abfss://<CONTAINER>@<STORAGE ACCOUNT>.dfs.core.windows.net/<PATH>/<FILE>.csv")

Should I be adding more jars to the classpath to get this working?
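For reference, the three `spark.conf` settings above follow a fixed naming pattern keyed on the storage account's DFS endpoint, so they can be generated in one place. A minimal sketch (the `abfs_sas_conf` helper name is ours, and the account name and token below are placeholders, as in the question):

```python
def abfs_sas_conf(account, sas_token):
    """Build the spark.conf settings for fixed-SAS auth against ADLS Gen2."""
    host = f"{account}.dfs.core.windows.net"
    return {
        # Tell the ABFS driver to use SAS auth for this account.
        f"fs.azure.account.auth.type.{host}": "SAS",
        # Provider class that serves the fixed token (needs hadoop-azure
        # with this class on the classpath).
        f"fs.azure.sas.token.provider.type.{host}":
            "org.apache.hadoop.fs.azurebfs.sas.FixedSASTokenProvider",
        # The SAS token itself.
        f"fs.azure.sas.fixed.token.{host}": sas_token,
    }

conf = abfs_sas_conf("sparkadlsiae", "<token>")
for key in conf:
    print(key)
```

In a live session you would then apply the settings with `for k, v in conf.items(): spark.conf.set(k, v)` before calling `spark.read.csv(...)`.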

Azure Data Factory

1 answer

  1. Smaran Thoomu 9,210 Reputation points Microsoft Vendor
    2024-04-23T17:19:52.6933333+00:00

    Hi @SHYAMALA GOWRI

    I'm glad you were able to resolve your issue, and thank you for posting your solution so that others experiencing the same thing can easily reference it! Since the Microsoft Q&A community has a policy that "The question author cannot accept their own answer. They can only accept answers by others", I'll repost your solution in case you'd like to accept the answer.

    Issue: I tried creating a SAS token and using it in PySpark code (the service principal mode of authentication worked), but it failed with the following error for SAS:

    24/04/03 14:05:02 WARN FileStreamSink: Assume no metadata directory. Error while looking for metadata directory in the path: abfss://pyspark@sparkadlsiae.dfs.core.windows.net/people.csv.
    Unable to load SAS token provider class: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.azurebfs.sas.FixedSASTokenProvider not found
    java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.azurebfs.sas.FixedSASTokenProvider not found.

    I launched PySpark as:

    pyspark --jars hadoop-azure-3.3.3.jar,hadoop-azure-datalake-3.3.3.jar,hadoop-common-3.3.3.jar

    And my connection properties were as follows:

    spark.conf.set("fs.azure.account.auth.type.<storage-account>.dfs.core.windows.net", "SAS")
    spark.conf.set("fs.azure.sas.token.provider.type.<storage-account>.dfs.core.windows.net", "org.apache.hadoop.fs.azurebfs.sas.FixedSASTokenProvider")
    spark.conf.set("fs.azure.sas.fixed.token.<storage-account>.dfs.core.windows.net", "<token>")
    spark.read.csv("abfss://<CONTAINER>@<STORAGE ACCOUNT>.dfs.core.windows.net/<PATH>/<FILE>.csv")

    Should I be adding more jars to the classpath to get this working?

    Solution: I was able to get this working after I built a jar containing an implementation of org.apache.hadoop.fs.azurebfs.extensions.SASTokenProvider and added it to the classpath.
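    For anyone hitting the same ClassNotFoundException: `FixedSASTokenProvider` appears to have been added to hadoop-azure only in a release newer than the 3.3.3 jars used here, which is why the class cannot be found. The workaround above is to implement `org.apache.hadoop.fs.azurebfs.extensions.SASTokenProvider` yourself (a Java or Scala class, packaged into a jar and passed via `--jars`) and point `fs.azure.sas.token.provider.type` at it. A minimal sketch of the PySpark side of that wiring, assuming a hypothetical provider class name `com.example.MyFixedSASTokenProvider`:

    ```python
    # "com.example.MyFixedSASTokenProvider" is a hypothetical class name; it
    # must implement org.apache.hadoop.fs.azurebfs.extensions.SASTokenProvider
    # and ship in a jar on the classpath, e.g.:
    #   pyspark --jars hadoop-azure-3.3.3.jar,...,my-sas-provider.jar

    account = "sparkadlsiae"  # storage account from the question
    host = f"{account}.dfs.core.windows.net"
    provider = "com.example.MyFixedSASTokenProvider"  # hypothetical

    settings = {
        f"fs.azure.account.auth.type.{host}": "SAS",
        f"fs.azure.sas.token.provider.type.{host}": provider,
    }
    print(settings[f"fs.azure.sas.token.provider.type.{host}"])
    ```

    In a live session, apply these with `for k, v in settings.items(): spark.conf.set(k, v)`; the custom provider then supplies the SAS token itself, so no `fs.azure.sas.fixed.token` setting is needed.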

    If I missed anything please let me know and I'd be happy to add it to my answer, or feel free to comment below with any additional information.

    I hope this helps!

    If you have any other questions, please let me know. Thank you again for your time and patience throughout this issue.


    Please don’t forget to Accept Answer and Yes for "was this answer helpful" wherever the information provided helps you, this can be beneficial to other community members.
