Copy Activity Issue

Rohit Kulkarni 711 Reputation points
2025-01-22T11:11:06.12+00:00

Hello Team,

I am unzipping a folder and copying the files to the respective folder, and I am selecting Copy behavior: Flatten hierarchy on the Sink tab.

But the files are being saved with autogenerated ASCII-character names.

The expected file names are:

Pul_Hospital_Diagnoses.csv

Pul_Procedure_Claims.csv

Ven_Procedure_Claims.csv

Ven_Hospital_Diagnoses.csv

The files should be saved under these names in the respective folder. Please advise.

If I don't select Flatten hierarchy as the Copy behavior on the Sink tab, the folder gets created twice and the files are saved in the respective folder.

The dataset configuration is shown in the attached screenshot.

How can this be combined into a single folder like:

dhcunzipped/Pe_Monthly_Thrombosis_Files/

Please advise.

Regards

Rohit

Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

2 answers

  1. Ganesh Gurram 3,765 Reputation points Microsoft Vendor
    2025-01-22T15:54:02.3833333+00:00

    Hi @Rohit Kulkarni
    Greetings & Welcome to the Microsoft Q&A forum! Thank you for sharing your query.

    When using the "Flatten hierarchy" option in the Copy Activity's Sink tab, the output files may be saved with autogenerated names, which can result in ASCII character names instead of the intended file names. This behavior occurs because the "Flatten hierarchy" option does not preserve the original file names and instead generates new names for the files based on the internal processing.

    Reference: Copy Activity

    To achieve the desired outcome of saving the files with their original names in a single folder, you might consider the following options:

    Preserve hierarchy - If you do not select the "Flatten hierarchy" option, the original folder structure is preserved and the files are saved with their correct names in the respective folders. However, as you've noted, this can lead to the creation of duplicate folders.

    Custom Naming - If you need to keep the "Flatten hierarchy" option, you might need to implement a custom naming convention or use a mapping that specifies how each file should be named during the copy process. This could involve using a data flow to rename the files after they are copied.

    Check Dataset Configuration - Ensure that your dataset is correctly configured to point to the desired output folder and that the file naming conventions are set as needed.

    By adjusting these settings, you should be able to control how the files are named and organized in your destination folder.
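
    For reference, the Copy behavior option corresponds to the copyBehavior property in the Copy Activity's sink store settings. The snippet below is an illustrative sketch only, assuming Binary datasets and an Azure Blob Storage sink; the activity name and dataset references are placeholders to adapt to your pipeline:

    {
      "name": "CopyUnzippedFiles",
      "type": "Copy",
      "description": "Illustrative sketch only - adjust source/sink types and dataset references to your environment",
      "inputs": [ { "referenceName": "SourceBinaryDataset", "type": "DatasetReference" } ],
      "outputs": [ { "referenceName": "SinkBinaryDataset", "type": "DatasetReference" } ],
      "typeProperties": {
        "source": {
          "type": "BinarySource",
          "storeSettings": {
            "type": "AzureBlobStorageReadSettings",
            "recursive": true
          }
        },
        "sink": {
          "type": "BinarySink",
          "storeSettings": {
            "type": "AzureBlobStorageWriteSettings",
            "copyBehavior": "PreserveHierarchy"
          }
        }
      }
    }

    copyBehavior accepts PreserveHierarchy, FlattenHierarchy, or MergeFiles. FlattenHierarchy is what produces the autogenerated file names, while PreserveHierarchy keeps the original file names and folder structure.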

    For more details, refer to this documentation:

    Copy and transform data in Azure Blob Storage by using Azure Data Factory or Azure Synapse Analytics

    Copy data to or from a file system by using Azure Data Factory or Azure Synapse Analytics

    I hope this information helps.

    Kindly consider upvoting the comment if the information provided is helpful. This can assist other community members in resolving similar issues.

  2. Sina Salam 17,176 Reputation points
    2025-01-23T14:18:03.1233333+00:00

    Hello Rohit Kulkarni,

    Welcome to the Microsoft Q&A and thank you for posting your questions here.

    I understand that you're having challenges with the copy activity changing the original file names.

    To avoid unnecessary complexity and additional configuration, and to implement a custom naming convention or a data flow that renames the files, you can take the following steps:

    Firstly, if you do not select the "Flatten hierarchy" option, ensure that your source and sink datasets are correctly configured so that duplicate folders are not created. You can do this by specifying the exact folder path in the sink dataset. I would advise avoiding Flatten hierarchy.
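
    As a rough sketch of that idea (assuming a Binary dataset on Azure Blob Storage and assuming dhcunzipped is the container name; the dataset and linked service names here are placeholders), the sink dataset could point directly at the target folder:

    {
      "name": "SinkBinaryDataset",
      "properties": {
        "type": "Binary",
        "linkedServiceName": {
          "referenceName": "AzureBlobStorageLinkedService",
          "type": "LinkedServiceReference"
        },
        "typeProperties": {
          "location": {
            "type": "AzureBlobStorageLocation",
            "container": "dhcunzipped",
            "folderPath": "Pe_Monthly_Thrombosis_Files"
          }
        }
      }
    }

    The container and folderPath here control where the Copy activity writes its output; adjust the split between container and folder path to match your storage account layout.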

    Secondly, if you need to use the "Flatten hierarchy" option, you can implement a custom naming convention using a data flow in Azure Data Factory. This is how you can achieve it:

    • In Azure Data Factory, create a new data flow.
    • Add a source transformation to read the files from the source folder. In the source options, store the source file name in a column (for example, FileName) so it can be referenced later.
    • Use a derived column transformation to create a new column with the desired file names. You can use expressions to extract and format the file names as needed.
    • Add a sink transformation to write the files to the destination folder. In the sink settings, set the file name option to take the name from the new column so the output files are written with the desired names.

    Finally, make sure that your dataset is configured to point to the correct output folder, and set the file naming conventions in the dataset properties to match the desired file names. Below is an example of how the derived column transformation could be configured:

    {
      "name": "DerivedColumn",
      "type": "DerivedColumn",
      "transformation": {
        "columns": [
          {
            "name": "NewFileName",
            "expression": "iif(instr(FileName, 'Pul_Hospital_Diagnoses') > 0, 'Pul_Hospital_Diagnoses.csv', iif(instr(FileName, 'Pul_Procedure_Claims') > 0, 'Pul_Procedure_Claims.csv', iif(instr(FileName, 'Ven_Procedure_Claims') > 0, 'Ven_Procedure_Claims.csv', 'Ven_Hospital_Diagnoses.csv')))"
          }
        ]
      }
    }
    

    The configuration above checks the original file name and assigns the correct output name based on it.

    I hope this is helpful! Do not hesitate to let me know if you have any other questions.


    Please don't forget to close up the thread here by upvoting and accept it as an answer if it is helpful.
