Azure ADF Copy activity from Blob to a Databricks Delta Lake table is failing with a "Malformed CSV record" error.

Derik Roby 16 Reputation points
2023-04-25T15:10:25.23+00:00

I created an ADF Copy activity pipeline that moves csv.gz files from Azure Blob Storage to Delta tables in Azure Databricks. All columns are of type string in both the source and the sink. Two files were moved successfully, but the last one keeps failing even though all the files have the same configuration. The pipeline fails with the error message below:

ErrorCode=AzureDatabricksCommandError, Hit an error when running the command in Azure Databricks. Error details:
org.apache.spark.SparkException: Job aborted.
Caused by: org.apache.spark.SparkException: Exception thrown in awaitResult:
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 38.0 failed 4 times, most recent failure: Lost task 0.3 in stage 38.0 (TID 58) (10.139.64.4 executor driver): com.databricks.sql.io.FileReadException: Error while reading file wasbs:REDACTED_LOCAL_PART@dbtpostgresstorage.blob.core.windows.net/7483a7f6-d13e-4cdb-835e-3f198633241d/AzureDatabricksDeltaLakeImportCommand/listings.txt.
Caused by: org.apache.spark.SparkException: Malformed records are detected in record parsing. Parse Mode: FAILFAST. To process malformed records as null result, try setting the option 'mode' as 'PERMISSIVE'.
Caused by: org.apache.spark.sql.catalyst.util.BadRecordException: org.apache.spark.sql.catalyst.csv.MalformedCSVException: Malformed CSV record
Caused by: org.apache.spark.sql.catalyst.csv.MalformedCSVException: Malformed CSV record

Azure Blob Storage | Azure Databricks | Azure Data Factory

1 answer

Sort by: Most helpful
  1. ShaikMaheer-MSFT 38,301 Reputation points Microsoft Employee
    2023-04-27T15:55:04.3233333+00:00

    Hi Derik Roby,

    Thank you for posting your query on the Microsoft Q&A platform.

    From the error message, it seems your file contains some corrupted or malformed data that is not in the expected format.

    Kindly check whether enabling the Skip incompatible rows fault-tolerance option under the copy activity settings lets the pipeline run successfully. If not, open the source file in an editor and look for any row that is not in the correct or expected format (for example, an unescaped quote, an embedded delimiter or line break, or a row with a different number of columns).
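    For a large file, manual inspection is tedious, so here is a minimal sketch in Python for locating suspect rows before re-running the pipeline. It flags records whose field count differs from the header row's; the delimiter, quote character, and sample data are assumptions, so adjust them to match your actual file:

    ```python
    import csv
    import io

    def find_malformed_rows(text, delimiter=",", quotechar='"'):
        """Return (line_number, field_count) pairs for rows whose field
        count differs from the header row's field count."""
        reader = csv.reader(io.StringIO(text), delimiter=delimiter, quotechar=quotechar)
        header = next(reader)
        expected = len(header)
        bad = []
        for row in reader:
            if row and len(row) != expected:
                bad.append((reader.line_num, len(row)))
        return bad

    # Hypothetical sample: the second data record has an extra, unquoted comma.
    sample = "id,name,city\n1,Alice,Seattle\n2,Bob,Red,mond\n"
    print(find_malformed_rows(sample))  # → [(3, 4)]
    ```

    For a csv.gz file, read the text with `gzip.open(path, "rt")` first and pass it in. Note this only catches field-count mismatches; an unbalanced quote spanning multiple lines may need eyeballing around the reported line numbers.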

    Hope this helps. Please let me know how it goes.