Why are my ADF logs being saved as txt instead of csv?

Sam Rosentel [SFPH] 0 Reputation points
2024-08-13T20:40:04.3166667+00:00

I'm getting log files as listed in the copy activity log section from the docs.

After the copy activity runs completely, you can see the path of log files from the output of each Copy activity run. You can find the log files from the path: https://[your-blob-account].blob.core.windows.net/[logFilePath]/copyactivity-logs/[copy-activity-name]/[copy-activity-run-id]/[auto-generated-GUID].txt. The log files generated have the .txt extension and their data is in CSV format.

Compare this to the quote from the Fault Tolerance section of the docs - subheading Copy Tabular Data > Monitor Skipped Rows

If you configure to log the skipped file names, you can find the log file from this path: https://[your-blob-account].blob.core.windows.net/[path-if-configured]/copyactivity-logs/[copy-activity-name]/[copy-activity-run-id]/[auto-generated-GUID].csv. The log files have to be the csv files.

The problem is that I'm getting a .txt file as mentioned in the first doc but my data looks like the example from the second.

Example log file from copy-activity-log (supposed to be .txt parseable as .csv)

Timestamp, Level, OperationName, OperationItem, Message
2020-10-19 08:39:13.6688152,Info,FileRead,"sample1.csv","Start to read file: {""Path"":""sample1.csv"",""ItemType"":""File"",""Size"":104857620,""LastModified"":""2020-10-19T08:22:31Z"",""ETag"":""\""0x8D874081F80C01A\"""",""ContentMD5"":""dGKVP8BVIy6AoTtKnt+aYQ=="",""ObjectName"":null}"

Example log file from fault-tolerance (supposed to be in csv format)

Timestamp, Level, OperationName, OperationItem, Message
2020-02-26 06:22:32.2586581, Warning, TabularRowSkip, """data1"", ""data2"", ""data3""," "Column 'Prop_2' contains an invalid value 'data3'. Cannot convert 'data3' to type 'DateTime'." 
2020-02-26 06:22:33.2586351, Warning, TabularRowSkip, """data4"", ""data5"", ""data6"",", "Violation of PRIMARY KEY constraint 'PK_tblintstrdatetimewithpk'. Cannot insert duplicate key in object 'dbo.tblintstrdatetimewithpk'. The duplicate key value is (data4)."

I've tried creating a dataset of type CSV from the .txt file, but the pipeline can't parse it properly because of the triple quotes.

Azure Data Factory

5 answers

  1. phemanth 14,810 Reputation points Microsoft External Staff
    2024-08-14T14:45:06.15+00:00

    @Sam Rosentel [SFPH]

    Thanks for using the Microsoft Q&A platform and posting your query.

    The logs generated by the Copy activity are saved with a .txt extension but contain data in a CSV format. This can be confusing because the file extension doesn’t match the content format.

    Here are a few points to consider:

    File Extension vs. Content Format: The .txt extension is used by default for Copy activity logs, but the content is structured as CSV. This is why your data looks like CSV even though the file extension is .txt.

    Parsing Issues: If you’re having trouble parsing these .txt files as CSV, you might need to explicitly specify the format when reading the files. For example, if you’re using Azure Synapse Analytics, you can use the OPENROWSET function to read the files as CSV.

    Configuration: Ensure that your logging configuration in ADF is set correctly. The path and file extension for logs can sometimes be influenced by the settings in your pipeline. Double-check the settings to ensure they align with your expectations.

    [Image: how to configure logging for a Copy activity in the Settings tab.]

    Workaround: If you need the files to have a .csv extension, you might consider a post-processing step to rename the files after they are generated. This can be done using an Azure Function or a similar service to automate the renaming process.
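The parsing and renaming points above can be sketched locally in Python. This is only an illustration under stated assumptions: it uses the Copy activity log sample from the question as input, the names `parse_copy_activity_log` and `csv_log_name` are hypothetical helpers (not part of ADF or any Azure SDK), and it does not touch Blob Storage. Rows shaped like the fault-tolerance sample, with triple quotes, may need different handling that this sketch does not cover.

```python
import csv
import io
from pathlib import PurePosixPath

# Header and data row copied from the Copy activity log sample in the question.
SAMPLE_LOG = r'''Timestamp, Level, OperationName, OperationItem, Message
2020-10-19 08:39:13.6688152,Info,FileRead,"sample1.csv","Start to read file: {""Path"":""sample1.csv"",""ItemType"":""File"",""Size"":104857620,""LastModified"":""2020-10-19T08:22:31Z"",""ETag"":""\""0x8D874081F80C01A\"""",""ContentMD5"":""dGKVP8BVIy6AoTtKnt+aYQ=="",""ObjectName"":null}"
'''


def parse_copy_activity_log(text):
    """Parse log text (CSV content, .txt extension) into a list of dicts.

    Hypothetical helper: the file extension is ignored and the content is
    read as CSV, with doubled quotes ("") treated as escaped quotes.
    """
    # skipinitialspace drops the blank after each comma in the header line.
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    header = [name.strip() for name in next(reader)]
    # Blank lines come back as empty lists, so they are filtered out.
    return [dict(zip(header, row)) for row in reader if row]


def csv_log_name(log_name):
    """Return the same blob-style path with a .csv extension (hypothetical)."""
    return str(PurePosixPath(log_name).with_suffix(".csv"))


if __name__ == "__main__":
    for row in parse_copy_activity_log(SAMPLE_LOG):
        print(row["Timestamp"], row["Level"], row["OperationName"], row["OperationItem"])
    # The path below is a made-up placeholder, not a real log location.
    print(csv_log_name("copyactivity-logs/my-copy/run-id/guid.txt"))
```

The point of the sketch is that the extension does not matter to a CSV parser: the same content reads identically whether the file is named .txt or .csv, so the rename step is only needed when a downstream tool selects its parser by extension.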

    Hope this helps. Do let us know if you have any further queries.


