Azure Synapse Analytics : Pyspark notebook - inserting lines in a synapse dedicated SQL Table stopped working

Question

Hello,
I have been inserting rows into a logging table in a dedicated SQL pool from PySpark with the following code:

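import com.microsoft.spark.sqlanalytics
from com.microsoft.spark.sqlanalytics.Constants import Constants
from pyspark.sql.types import StructType, StructField, StringType, TimestampType

# Schema of the logging table dedicated.log.table_load
schema = StructType(
    [
        StructField("pipeline_id", StringType(), False),
        StructField("entity_name", StringType(), False),
        StructField("step_start_datetime", TimestampType(), False),
        StructField("step_end_datetime", TimestampType(), True),
        StructField("status", StringType(), False),
        StructField("message", StringType(), True),
    ]
)

# A single log row; the six values are assumed to be defined earlier in the notebook
df = spark.createDataFrame(
    [
        (
            pipeline_id,
            entity_name,
            step_start_datetime,
            step_end_datetime,
            status,
            message,
        )
    ],
    schema,
)

# Append the row to the dedicated SQL pool table, staging the data in ADLS
(
    df.write.option(Constants.SERVER, "allo-bi-syn-dev.sql.azuresynapse.net")
    .option(
        Constants.TEMP_FOLDER,
        "abfss://development@allobiadlscmn.dfs.core.windows.net/data/staging_data/",
    )
    .mode("append")
    .synapsesql("dedicated.log.table_load")
)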

This code worked perfectly fine until today.

Now, when I execute the code through a pipeline run, an error is raised, even though nothing has changed in the code or in the access privileges on ADLS.

However, when I run the same snippet manually in my notebook, no error is raised.

The error is the following:

An error occurred while calling o3800.synapsesqlforpython. : com.microsoft.spark.sqlanalytics.SQLAnalyticsConnectorException: COPY statement input file schema discovery failed: Cannot bulk load. The file "https://allobiadlscmn.dfs.core.windows.net/development/data/staging_data/SQLAnalyticsConnectorStaging/dedicated/log/table_load/internal/Append/1684937350891/application_1684937202630_0001/part-00000-010b84ef-6b21-463a-9507-4c75f46652f9-c000.snappy.parquet" does not exist or you don't have file access rights.

To be clear, I have not changed the code, and my Azure admin confirmed that no changes have been pushed.

Was there an Azure update that broke the link between ADLS and Synapse?

Answer

Hello Etienne Candelot,

Welcome to the MS Q&A platform.

As the error message says ("... does not exist or you don't have file access rights"), either the staged Parquet file does not exist or the identity running the COPY statement does not have access to it. This error can occur if the credentials used to read the staged file are incorrect or lack the required permissions.

Even though you mentioned that neither the code nor the access privileges have changed, please check the following:

Verify the file path is correct and that the file exists in the specified container.
Verify the credentials you're using to access the file are correct and have the necessary permissions to access the file.
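
As a rough way to carry out the first check from a notebook cell, the sketch below converts the https URL form used in the error message into the abfss form and lists the staging folder. The helper name `https_to_abfss` is hypothetical, and the listing step assumes the cell runs inside a Synapse Spark session where `mssparkutils` is available; outside Synapse it is skipped. Note that an interactive run uses your own identity while a pipeline run typically uses the workspace managed identity, so a listing that succeeds interactively does not prove the pipeline identity has the same access. That difference is only a possible explanation here, not something confirmed in this thread.

```python
from urllib.parse import urlparse


def https_to_abfss(url: str) -> str:
    """Rewrite https://<account>.dfs.core.windows.net/<container>/<path>
    as abfss://<container>@<account>.dfs.core.windows.net/<path>.
    Hypothetical helper, not part of any Synapse library."""
    parsed = urlparse(url)
    container, _, path = parsed.path.lstrip("/").partition("/")
    return f"abfss://{container}@{parsed.netloc}/{path}"


# Staging root taken from the beginning of the path in the error message.
staging_root = https_to_abfss(
    "https://allobiadlscmn.dfs.core.windows.net/development/data/staging_data/"
)
print(staging_root)

try:
    # Only importable inside a Synapse Spark session (assumption).
    from notebookutils import mssparkutils
except ImportError:
    mssparkutils = None

if mssparkutils is not None:
    # Lists the staging folder with the identity of the current session.
    for entry in mssparkutils.fs.ls(staging_root):
        print(entry.name, entry.size)
```

Running the same cell once interactively and once from the pipeline, then comparing the output, may show whether the two identities see the staging folder differently.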

I hope this helps. Please let us know if you have any further questions.
