How to load files with two different delimiters into ADF using Azure data flows

Pravalika Devaradeshi 5 Reputation points
2023-03-28T14:41:21.9366667+00:00

Hi There,

We have a scenario where we need to load history data. The data has been available in our storage account since 2016. The files in the storage account are semicolon-delimited (;) until March 2022, and the later files are tab-delimited. We tried to run the history pipeline with the row delimiter set to Default (\r, \n, or \r\n) and the column delimiter set to Tab (\t).
The pipeline run fails with a user configuration issue at the sink.

Is there any solution that we can use to run the pipeline that processes both the file formats?

Azure Data Lake Storage
An Azure service that provides an enterprise-wide hyper-scale repository for big data analytic workloads and is integrated with Azure Blob Storage.
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

1 answer

  1. AnnuKumari-MSFT 32,816 Reputation points Microsoft Employee
    2023-03-29T09:59:28.4233333+00:00

    Hi @Pravalika Devaradeshi ,

    Welcome to Microsoft Q&A platform and thanks for posting your question here.

    As I understand your query, you are trying to copy data from multiple files covering the past few years. The issue is that some of the files use a semicolon delimiter and others use a tab delimiter, so the pipeline does not work in a generic way. Please let me know if that is not the ask here.

    To handle this scenario, you need to parameterize the column delimiter option of your dataset and pass the value at runtime.

    If your files are stored separately in year-wise folders, filter the files from 2016 through March 2022 and pass ';' as the delimiter at runtime; similarly, for the later tab-delimited files, pass '\t' at runtime.

    You can also use a copy activity instead of a data flow to achieve this requirement, as this use case does not involve much transformation.


    Hope it helps. Kindly accept the answer if it has been of help; otherwise, reply on this thread with any additional queries. Thanks
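
The delimiter selection suggested in the answer can be sketched in plain Python to make the rule explicit. This is only an illustration of the logic, not ADF code: the cutoff date of 1 April 2022 is an assumption based on the question ("until March 2022"), and the names `CUTOFF`, `column_delimiter`, and `batch_by_delimiter` are hypothetical. In a pipeline, the value returned here is what would be passed to the dataset's column delimiter parameter at runtime.

```python
from datetime import date

# Assumption: files dated before 1 April 2022 are semicolon-delimited and
# files from that date onward are tab-delimited. Adjust to the real switch date.
CUTOFF = date(2022, 4, 1)


def column_delimiter(file_date: date) -> str:
    """Return the column delimiter to pass for a file with the given date."""
    return ";" if file_date < CUTOFF else "\t"


def batch_by_delimiter(file_dates):
    """Group file dates into two batches, keyed by the delimiter they need."""
    batches = {";": [], "\t": []}
    for file_date in file_dates:
        batches[column_delimiter(file_date)].append(file_date)
    return batches


if __name__ == "__main__":
    sample = [date(2016, 1, 15), date(2022, 3, 31), date(2022, 4, 1), date(2023, 2, 1)]
    for delimiter, dates in batch_by_delimiter(sample).items():
        # repr() makes the tab character visible as '\t' in the output
        print(repr(delimiter), [d.isoformat() for d in dates])
```

Splitting the history into two batches like this means each pipeline run only ever sees one file format, so the same parameterized dataset can serve both runs.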

