Trim header and values in a text file stored in azure blob

Kaviprakash Selvaraj 20 Reputation points
2023-08-18T14:35:30.16+00:00

Hi,

I have a text file stored in an blob storage that has file name, headers and values. In ADF, we copy the contents from the file into a SQL table using COPY activity. The challenge is as part of the header and values, there is an additional delimiter (PIPE SYMBOL) at last position that needs to be removed before the copy activity.

Here is the sample contents from the file.

FILENAME|EDI|5281|20201226|26**|**

CoulumName1|CoulumName2|CoulumName3|CoulumName4|CoulumName5**|**

Row1Value1|Row1Value2|Row1Value3|Row1Value4|Row1Value5**|**

Row2Value1|Row2Value2|Row2Value3|Row2Value4|Row2Value5**|**

......

.......

What is the best way to remove the pipe symbol at last position ?

Thanks in advance,

Azure Blob Storage
Azure Blob Storage
An Azure service that stores unstructured data in the cloud as blobs.
3,199 questions
Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
11,639 questions
SQL Server | Other
0 comments No comments
{count} votes

Accepted answer
  1. Subashri Vasudevan 11,231 Reputation points
    2023-08-20T03:57:37.2533333+00:00

    Hi I assume that you just have only pipe symbol at the end of each row. If so, there are a couple of ways to ignore the pipe symbol. 1. Use azure function with python to strip off last Delimiter character. 2. Use data flow to read only required columns. This video explains the way we use data flow to achieve this. Please check and let us know if that helps and let us know if the assumption is wrong.

    0 comments No comments

2 additional answers

Sort by: Most helpful
  1. Kaviprakash Selvaraj 20 Reputation points
    2023-08-21T13:21:44.6433333+00:00

    @Subashri Vasdudevan - Thanks for your response.

    I was able to create a dataflow to fix this issue, however I'm unable to write it to sink. The reason being is we are using SQL server in IaaS model that requires self hosted IR to communicate to the database.

    Linked service with Self-hosted Integration runtime is not supported in data flow. Do you know any other alternative workaround ? Using Azure function will resolve this instead ?


  2. Kaviprakash Selvaraj 20 Reputation points
    2023-08-22T02:56:26.21+00:00

    Thanks for your answers. Let me explore the options.


Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.