Hi, thanks for reaching out to Microsoft Q&A. Judging from the error, either you need the "Storage Blob Data Contributor" role to access the file in ADLS from the Synapse workspace, or the file is missing from the given path. Please check that the path given is correct. Please accept and upvote the answer if you find it correct or if it helped fix your issue.
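To make the path check easier, here is a minimal sketch that splits the file URL from the error message into storage account, container, and path, and also builds the matching abfss:// URI. The helper name `split_adls_url` is my own and not part of any Synapse library, and the URL used below is only the leading part of the one in your error, so please substitute the full path from your own run.

```python
from urllib.parse import urlparse, unquote


def split_adls_url(url: str) -> dict:
    """Split an ADLS Gen2 https URL into account, container, and path.

    Hypothetical helper for illustration only; it does not contact storage.
    """
    parsed = urlparse(url.strip())
    host = parsed.netloc  # e.g. Dummyadls.dfs.core.windows.net
    account = host.split(".")[0]
    # The first path segment is the container (file system); the rest is the path.
    container, _, path = unquote(parsed.path).lstrip("/").partition("/")
    return {
        "account": account,
        "container": container,
        "path": path,
        "abfss": f"abfss://{container}@{host}/{path}",
    }


# Leading part of the URL from the error message (assumption: replace with the full path).
info = split_adls_url(
    "https://Dummyadls.dfs.core.windows.net/DummySynapse/synapse/workspaces/WorkingSynapse"
)
for key, value in info.items():
    print(f"{key}: {value}")
```

In a Synapse notebook, the `abfss` value could then be passed to `mssparkutils.fs.ls(...)` to see whether the temp folder is listable. Keep in mind that a pipeline run may use a different identity than your interactive session, so a listing that works manually does not prove the pipeline has the same access.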
Synapse ERROR: COPY statement input file schema discovery failed: Cannot bulk load
Devender
I am facing an issue: when we load data from a Synapse Spark notebook into a Synapse dedicated SQL pool, we get an error.
The code I am running to load data into the dedicated SQL pool:
df_1.write.mode("append").synapsesql("TEST_DATABASE.dbo.Test_table", Constants.INTERNAL)
When we run this code manually from the Spark notebook it works fine, but when it runs dynamically through a pipeline it throws the error below. I want to know what could cause this when running through a pipeline. We have been facing this issue for the last week; until then, all the pipelines were functioning properly.
Py4JJavaError: An error occurred while calling o4399.synapsesqlforpython.
: com.microsoft.spark.sqlanalytics.SQLAnalyticsConnectorException: COPY statement input file schema discovery failed: Cannot bulk load. The file "https://Dummyadls.dfs.core.windows.net/DummySynapse/synapse/workspaces/WorkingSynapse/sparkpools/AdventuteTableSpark/sparkpoolinstances/2864980c-2195-47d6-bc42-d8b4ff8e3e11/livysessions/2023/05/11/1910/tempdata/TEST_DATABASE/dbo/Test_table/internal/Append/1683796649698/application_1683796164761_0003/part-00000-2a984778-ed60-4fb4-aeed-ab2efb9e1399-c000.snappy.parquet" does not exist or you don't have file access rights.
at com.microsoft.spark.sqlanalytics.SqlAnalyticsConnectorClass$SQLAnalyticsFormatWriter.sqlanalytics(SqlAnalyticsConnectorClass.scala:347)
at com.microsoft.spark.sqlanalytics.SqlAnalyticsConnectorClass$SQLAnalyticsFormatWriter.synapsesql(SqlAnalyticsConnectorClass.scala:191)
at com.microsoft.spark.sqlanalytics.SqlAnalyticsConnectorClass$SQLAnalyticsFormatWriter.synapsesqlforpython(SqlAnalyticsConnectorClass.scala:203)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.lang.Thread.run(Thread.java:750)
Caused by: com.microsoft.sqlserver.jdbc.SQLServerException: COPY statement input file schema discovery failed: Cannot bulk load. The file