Non-breaking space in source file name causing Copy Data activity to fail in ADF

PM 0 Reputation points
2024-06-13T00:28:36.1533333+00:00

Use case:

Copy files from the SFTP source location to Azure Blob Storage using ADF (Copy Data activity).

Implementation:

I have a dataset pointing to the SFTP location, which serves as the source for a Get Metadata activity. The Get Metadata activity returns the list of files, which is then used in a ForEach step to perform the copy operation.

Issue:

I have 5 files, and one of them has a non-breaking space character in its name.

Example: FILE NAME.csv is the file with the non-breaking space. ADF does not recognize the character and displays it as a black diamond with a question mark (I believe this happens when there is an encoding issue) in the output of the Get Metadata activity.

When I ran the pipeline, it copied the files from source to destination, but the Copy Data activity reported the error below.

Error:

ErrorCode=UserErrorFailedToReadStream,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Failed to read data from Sftp server 'sftp.com', file path 'PROD/FILE NAME.csv', offset '0', length: '127038'.,Source=Microsoft.DataTransfer.ClientLibrary.SftpConnector,''Type=Renci.SshNet.Common.SshException,Message=Failed to open local file,Source=Renci.SshNet,'

Note: When I copy and paste the file name into a web browser such as Chrome, it recognizes the non-breaking space and shows the name as FILE%C2%A0NAME.csv.
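
For readers who want to reproduce the two symptoms above outside ADF, here is a minimal Python sketch. It shows that a non-breaking space (U+00A0) percent-encodes to %C2%A0 in UTF-8, and that a lone 0xA0 byte decoded as UTF-8 turns into the replacement character (the black diamond with a question mark). The Latin-1 step is only an assumption about how such a name could get garbled; the thread does not confirm what encoding the SFTP server or ADF actually uses.

```python
from urllib.parse import quote

# File name containing a non-breaking space (U+00A0) instead of a regular space.
name = "FILE\u00a0NAME.csv"

# Percent-encoding uses UTF-8 by default, so U+00A0 becomes the two bytes C2 A0.
encoded = quote(name)

# A regular space encodes differently, which is how the two can be told apart.
encoded_plain = quote("FILE NAME.csv")

# Assumption for illustration: the name is sent as a single-byte encoding
# (Latin-1 here) but decoded as UTF-8. The lone byte 0xA0 is invalid UTF-8,
# so it is replaced with U+FFFD, the replacement character.
garbled = name.encode("latin-1").decode("utf-8", errors="replace")

print(encoded)         # FILE%C2%A0NAME.csv
print(encoded_plain)   # FILE%20NAME.csv
print(repr(garbled))   # 'FILE\ufffdNAME.csv'
```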

Could someone help with this? Why is the activity step failing even after copying successfully?

Azure Data Factory

1 answer

  1. phemanth 15,755 Reputation points Microsoft External Staff Moderator
    2024-06-13T08:07:08.9466667+00:00

    @MC

    Thanks for using the Microsoft Q&A platform and posting your query.

    The issue you’re experiencing seems to be related to the handling of non-breaking space characters in file names by Azure Data Factory (ADF). When ADF encounters a non-breaking space in a file name, it might not recognize it correctly, which could lead to errors during the copy operation.

    Here are a few suggestions that might help resolve this issue:

    1. File Name Encoding: Ensure that the file names are properly encoded. Non-breaking spaces are often encoded as %C2%A0 in URLs. If possible, try to avoid using non-breaking spaces in file names.
    2. Filter Activity: Use a Filter activity to exclude files with non-breaking space characters in their names. This can prevent the copy activity from attempting to copy these files, thereby avoiding the error.
    3. Fault Tolerance: Consider adding fault tolerance to your copy activity. This allows the copy activity to skip files that it cannot process, such as those with non-breaking space characters in their names.
    4. Check SFTP Server: There might be issues with the SFTP server, such as insufficient storage. Check the server and try optimizing the storage.
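
The idea behind suggestions 1 and 2 can be sketched in Python as a check run outside ADF, for example in a script on the source side before the files are picked up. This is an illustration only, not ADF expression syntax; the function names `split_by_nbsp` and `normalize_name` and the sample file names are hypothetical.

```python
NBSP = "\u00a0"  # non-breaking space, encoded as C2 A0 in UTF-8


def split_by_nbsp(names):
    """Return (clean, flagged): names without and with a non-breaking space."""
    clean = [n for n in names if NBSP not in n]
    flagged = [n for n in names if NBSP in n]
    return clean, flagged


def normalize_name(name):
    """Replace each non-breaking space with a regular space."""
    return name.replace(NBSP, " ")


# Hypothetical file list, similar to what a directory listing might return.
files = ["a.csv", "FILE\u00a0NAME.csv", "b.csv"]
clean, flagged = split_by_nbsp(files)
renamed = [normalize_name(n) for n in flagged]
```

Here `clean` holds the names that are safe to copy as-is, `flagged` holds the ones to skip or rename, and `renamed` shows what the flagged names would look like with a regular space.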

    Remember, it’s always a good practice to follow naming conventions that avoid special characters in file names. This can help prevent issues like the one you’re experiencing.

    Hope this helps. Do let us know if you have any further queries.
