Azure Data Factory - Copy Activity fails intermittently with "Failure happened on 'Source' side. ErrorCode=UserErrorFailedFileOperation"

Potlapally, Harsha 6 Reputation points
2023-03-21T16:31:56.8066667+00:00

Azure Data Factory - Copy Activity fails intermittently with the following error message:
Operation on target SqlServer_to_ADLS failed: Failure happened on 'Source' side. ErrorCode=UserErrorFailedFileOperation,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Upload file failed at path devlake/jbtrn_owner_alloc\jbtrn_owner_alloc.parquet.,Source=Microsoft.DataTransfer.Common,''Type=System.Data.SqlClient.SqlException,Message=A transport-level error has occurred when receiving results from the server. (provider: TCP Provider, error: 0 - The specified network name is no longer available.),Source=.Net SqlClient Data Provider,SqlErrorNumber=64,Class=20,ErrorCode=-2146232060,State=0,Errors=[{Class=20,Number=64,State=0,Message=A transport-level error has occurred when receiving results from the server. (provider: TCP Provider, error: 0 - The specified network name is no longer available.),},],''Type=System.ComponentModel.Win32Exception,Message=The specified network name is no longer available,Source=,'

Source: SqlServer
Target: Azure Data lake Gen 2.

Steps already tried that didn't help:

  1. Increased the retry option to 4
  2. Ran the job at different times.

Things to note:

  1. There are other tables that are able to download from the same source database without any issue.

The part that is not clear is if the error is occurring on the source SQL database or the Azure Data Lake Storage.

Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
11,624 questions
{count} vote

3 answers

Sort by: Most helpful
  1. KranthiPakala-MSFT 46,642 Reputation points Microsoft Employee Moderator
    2023-03-23T00:33:57.2933333+00:00

    Hi @Potlapally, Harsha ,

    Welcome to Microsoft Q&A forum and thanks for reaching out here.

    From the error message and from my past experience on such issues, I am confident that the issue is either on the source SQL Server side or a network issue between your Integration runtime and SQL server. I have noticed such behavior when users try to copy data from On-Prem SQL server using SHIR (some times this error may also occur while copying from azure SQL using Azure IR /SHIR).

    In case if your source is also an On Prem SQL server and using SHIR then would recommend looking at SHIR logs which can give you more information about the problem. In addition, I recommend working with your SQL admin and Network Administrator to identify the bottleneck associated with this issue.

    As you have confirmed in your post that other tables data is copied without any issue, this makes me to look at the particular table that is causing this issue and troubleshoot with your SQL admin to identify if something is specific to that table (like if the table is being used by other resources/applications around the same time/any table locks, etc)

    Possible Root cause

    The error message indicates to a Network error, not a SQL Server timeout. There is apparently some sort of network problem between ADF service and the SQL Server.

    Usually, it will happen when the job (read/write from SQL) takes a very long time.

    Below are few mitigations you may try and see if that helps:

    1. Since this kind of issue always happens in Source, please try to apply 'parallel copy' and 'dynamics range' in Source. This way it can faster the whole read performance and increase the stability. https://docs.microsoft.com/en-us/azure/data-factory/connector-azure-sql-database?tabs=data-factory#parallel-copy-from-sql-database
    2. Or implement retry policy and retry interval (interval should be a little longer than normal successful jobs' run time) in pipeline level, then pipeline won't fail fast and do retry for those issues. https://docs.microsoft.com/en-us/azure/data-factory/concepts-pipelines-activities#activity-policy
    3. Split the data volume/size (for example, use query mode). Use several Copy activity to read and write the whole data.

    I would also recommend having a look at this thread and see if any of those suggestions help: Azure Data Factory - Azure SQL connectivity error : A transport-level error has occurred when receiving results from the server. (provider: TCP Provider, error: 0 - The specified network name is no longer available.) SqlErrorNumber=64

    If none of these suggestions help, then please feel free to file a support ticket for deeper investigation of the network logs. This may need investigation on both sides (ADF & SQL Server).

    In case if you don't have a support plan, please let me know so that I can work with you offline in creating a one time free support ticket.

    Hope this helps.


    Please don’t forget to Accept Answer and Yes for "was this answer helpful" wherever the information provided helps you, this can be beneficial to other community members.

    0 comments No comments

  2. HimanshuSinha-msft 19,486 Reputation points Microsoft Employee Moderator
    2023-03-30T22:09:26.7566667+00:00

    @Potlapally, Harsha : I think you have tried a few thinks and I am assuming you have followed what @KranthiPakala-MSFT called out.

    Can you please let me know if it is the same table which errors out all the time? If yes how this is different than then other tables? I will focus on the size and data types. Also, do you have an idea as after what time you see these failures (when it fails), is the time to failure similar all the time? This could point to the timeout setting on SQL Server side. I know that you have added retry , please also add retry interval also.

    User's image

    Also, I think I will check the perf counter on the server on which SHIR in installed.

    To your ask

    The part that is not clear is if the error is occurring on the source SQL database or the Azure Data Lake Storage.

    I am confident that the failure is on the Source side .

    Hope this helps.


    Please don’t forget to Accept Answer and Yes for "was this answer helpful" wherever the information provided helps you, this can be beneficial to other community members.


  3. HimanshuSinha-msft 19,486 Reputation points Microsoft Employee Moderator
    2023-05-02T00:20:01.2633333+00:00

    @Potlapally, Harsha : My apologizes for the late reply . I think that you are correct for the investigation and i am hopeful that you will have a resolution .

    The 'per counter' , stands for performance counter and you can view the same in Window by steps called out here .

    In your case I see that the connections resets after the initial success and so I think you should read the different counters here and see if you can find something.

    In case if you are still struggling with the issue , I suggest working with MS support team , if you have a support plan you may file a support ticket, else could you please send an email to azcommunity@microsoft.com with the below details, so that we can create a one-time-free support ticket for you to work closely on this matter.

    Subscription ID:Subject : Attn Himanshu

    Please let me know once you have done the same.

    Thanks
    Himanshu

    0 comments No comments

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.