How to use concat and split in ADF

Kacper Sobolewski 6 Reputation points
2023-01-25T14:36:06.2633333+00:00

Hello

I need help. I need to extract some information from the @triggerBody().folderPath value and put it into the Dataset file path. From what I understand, this parameter returns the full path with the container name and folders, like this:

abfss://test4/test/2022-01/CORSICANA_PACKAGE@storageaccount....

So for this, I need to split this value to get the container name and the folder names separately, and put them into the file path.

[Screenshot 2023-01-25 at 15.05.35]

In the pipeline I'm setting:

@{split(pipeline().parameters.containerPath,'/')[2]}
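For the sample path above, splitting on '/' should produce an array like this (tail truncated as in the example; a sketch of the expected split, not captured output):

["abfss:", "", "test4", "test", "2022-01", ...]

The empty string at index 1 comes from the double slash in abfss://, which is why index [2] gives the container name, test4.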

Zrzut ekranu 2023-01-25 o 15.07.24

Now I would like to set the same thing for folderPath, but whatever I've tried, I get errors.

The thing is, I will have many subfolders, e.g. /test/2022-01/PACKAGE/folder. They will be different each time. There will also probably be more or fewer subfolders.

I've thought about combining concat and split, something like @concat((split(pipeline().parameters.folderPath,'/')[3]),'/',(split(pipeline().parameters.folderPath,'/')[4]),'/'..etc, but first, this expression/syntax is rejected by Azure. Second, I'm not sure it would be correct.
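Note that when an expression is embedded inside other literal text in a field like the file path, it needs the string-interpolation form @{...} rather than a bare @concat(...); the working expression later in this thread uses that form. A minimal sketch with two fixed folder segments:

@{concat(split(pipeline().parameters.folderPath,'/')[1],'/',split(pipeline().parameters.folderPath,'/')[2])}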

Or maybe I'm thinking about this wrong and it should be done a different way. If so, then how?

My pipeline looks like this:

[Screenshot 2023-01-25 at 15.30.17]

Data Flow 2 has 2 sources and a join function. It generates an output file, and then this file is used in a Copy activity.

The pipeline is set to be triggered on a storage event, and it monitors all containers in the Storage Account. If a new package with the desired file is uploaded, the pipeline should start. The pipeline combines a few files, so I need to pass @triggerBody().folderPath so that the Data Flow knows where to take the source files from.
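For reference, a storage event trigger hands values like this to the pipeline through the trigger's parameter mapping, roughly as below (a sketch; it assumes the pipeline parameter folderPath used above is mapped from the trigger body):

"parameters": {
    "folderPath": "@triggerBody().folderPath"
}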

Any ideas how to solve that? Thanks


2 answers

  1. Kacper Sobolewski 6 Reputation points
    2023-01-26T10:41:54.3733333+00:00

    OK, I was able to solve it. The proper expression looks like this:

    @{concat(split(pipeline().parameters.folderPath,'/')[1],'/',split(pipeline().parameters.folderPath,'/')[2],'/',split(pipeline().parameters.folderPath,'/')[3],'/',split(pipeline().parameters.folderPath,'/')[4],'/',split(pipeline().parameters.folderPath,'/')[5])}
    

    However, this solution only works when I have exactly 5 subfolders in the container. If I have fewer or more subfolders, the expression returns an error:

    Operation on target Data flow2 failed: The expression 'concat(split(pipeline().parameters.folderPath,'/')[1],'/',split(pipeline().parameters.folderPath,'/')[2],'/',split(pipeline().parameters.folderPath,'/')[3],'/',split(pipeline().parameters.folderPath,'/')[4],'/',split(pipeline().parameters.folderPath,'/')[5])' cannot be evaluated because array index '4' is outside bounds (0, 3) of array.
    

    So is there a way to write e.g. a for loop or something similar, which will read the array until it ends? Something like split(pipeline().parameters.folderPath,'/')[n].
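    A pattern that avoids indexing each segment, if I understand the expression functions correctly, is to split the path, skip the container element, and join the rest back together, so the folder count no longer matters. A minimal sketch (folderPath is the parameter name used above):

    @{join(skip(split(pipeline().parameters.folderPath,'/'),1),'/')}

    Here skip(...,1) drops the first array element (the container name) and join(...,'/') rebuilds whatever folders remain into a single path, whether there are 2 subfolders or 10.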


  2. MartinJaffer-MSFT 26,236 Reputation points
    2023-01-26T17:37:09.57+00:00

    @Kacper Sobolewski Hello and welcome to Microsoft Q&A

    As I understand it, you want help parsing trigger parameters.

    Just to make things clear, let me give an example.

    I have a storage event trigger. It triggered when I uploaded:

    https://mystorageaccount.blob.core.windows.net/adftutorialblob/CBB_HOUSTN-DC2/triggerevent.txt

    The @triggerBody().folderPath returned adftutorialblob/CBB_HOUSTN-DC2

    The @triggerBody().fileName returned triggerevent.txt

    adftutorialblob is the container name.

    To split the container from the folders, I can split the path on '/' and then use take and skip on the resulting array, joining the pieces back with '/' (take and skip applied directly to the string would operate on characters, not path segments):

    @{join(take(split(triggerBody().folderPath,'/'),1),'/')} to get just the container

    @{join(skip(split(triggerBody().folderPath,'/'),1),'/')} to get everything after the container
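    For the example above, the first expression should return adftutorialblob and the second CBB_HOUSTN-DC2, however many folder levels follow the container (expected output based on the split/join semantics, not a captured run). These values can then be passed into the dataset's container and directory parameters respectively.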

