Ambiguity in loop variable #item when using contains within reduce in Data Flow Expressions.

Sathyajit Loganathan 40 Reputation points
2025-03-12T22:37:55.83+00:00

I am trying to use a contains function within a reduce function. If I want to compare the two loop items against each other, how would I accomplish this?

Would the #item loop variable within contains override the #item loop variable in the reduce function?

Example use-case:

reduce(
	split($disallowedCharacters, ''),
	false(),
	#acc || contains(split(fieldName, ''), #itemFromReduce != #itemFromContain),
	#result
)
Azure Data Factory

Accepted answer
  1. Anonymous
    2025-03-17T09:32:54.8+00:00

    Hello @Sathyajit Loganathan,

    Would the #item loop variable within contains override the #item loop variable in the reduce function?

    Yes, the #item inside contains() overrides the #item of the enclosing reduce(). It is the same situation as using a Filter activity's item() inside a ForEach activity in an ADF pipeline. In a pipeline you can store the ForEach item() in an ADF variable, but in a data flow it might not be possible to store the #item value of the reduce function.

    So, when using reduce(), it is better to avoid nesting other functions that need their own #item values.
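
The shadowing described above can be illustrated outside of Data Flow. The following is only a rough Python analogy, not Data Flow syntax: the helper `contains` and the sample values are made up for illustration, and Python lambdas stand in for the `#item` expressions. When both levels use the same parameter name, the inner one hides the outer one, so the comparison ends up comparing the inner item with itself.

```python
from functools import reduce

# Hypothetical sample data standing in for the two split() results
disallowed = ["#", "$"]          # like split($disallowedCharacters, '')
field_chars = ["a", "#", "b"]    # like split(fieldName, '')


def contains(array, predicate):
    """Rough analogue of contains(array, predicate): true if any element matches."""
    return any(predicate(element) for element in array)


# Both lambdas call their parameter `item`. Inside the inner lambda the outer
# `item` is hidden, so `item != item` compares the inner value with itself.
shadowed = reduce(
    lambda acc, item: acc or contains(field_chars, lambda item: item != item),
    disallowed,
    False,
)

# With distinct names (which Python allows) both values stay reachable.
distinct = reduce(
    lambda acc, outer: acc or contains(field_chars, lambda inner: outer != inner),
    disallowed,
    False,
)

print(shadowed)  # False: every comparison is a value against itself
print(distinct)  # True: "#" differs from "a"
```

In this analogy the shadowed version can never be true, which is why nesting two functions that both rely on `#item` does not give a comparison between the two items.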

    If I were to perform some comparison on the two items, how would I accomplish this?

    In this case, you can use the in() function instead of contains() to achieve your requirement.

    Modify your expression as shown below.

    
    reduce(
        split($disallowedCharacters, ''),
        false(),
        #acc || not(in(split(fieldname, ''), #item)),
        #result
    )
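
To make it easier to see what the expression above evaluates to, here is a small Python model of it. It is a sketch under assumptions: it assumes that splitting on '' yields individual characters, as the thread does, and the function names and sample strings are invented for illustration. The second function shows the variant without not(), for comparison.

```python
from functools import reduce


def any_disallowed_missing(field_name: str, disallowed: str) -> bool:
    """Model of: #acc || not(in(split(fieldName, ''), #item)), starting from false()."""
    field_chars = list(field_name)  # assumed result of split(fieldName, '')
    return reduce(
        lambda acc, item: acc or not (item in field_chars),
        list(disallowed),           # assumed result of split($disallowedCharacters, '')
        False,
    )


def any_disallowed_present(field_name: str, disallowed: str) -> bool:
    """Same fold without not(): true if the field contains any disallowed character."""
    field_chars = list(field_name)
    return reduce(
        lambda acc, item: acc or (item in field_chars),
        list(disallowed),
        False,
    )


print(any_disallowed_missing("a#b", "#$"))   # True: "$" does not occur in "a#b"
print(any_disallowed_present("a#b", "#$"))   # True: "#" occurs in "a#b"
print(any_disallowed_present("abc", "#$"))   # False: neither character occurs
```

In this model, the expression with not() becomes true as soon as one disallowed character is absent from the field. If the goal is to flag fields that contain a disallowed character, the variant without not() may be the one to use.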
    
    

    As a test, I used the string 'Kohli Dhoni Jaddu' as the parameter value and split on a space ` ` instead. The expression gave the desired outcome.

    [Screenshot of the test result]

    Hope this helps.



1 additional answer

  1. Amira Bedhiafi 33,071 Reputation points Volunteer Moderator
    2025-03-13T09:45:47.4166667+00:00

    You need to explicitly differentiate between the loop variables in the outer and inner scopes. Unfortunately, ADF Data Flow Expressions do not support renaming loop variables directly. However, you can work around this limitation by using intermediate variables or restructuring your logic.

    • Use an intermediate variable to store the #item from the reduce function.
    • Use a different variable name for the inner contains function.
    
    reduce(
        split($disallowedCharacters, ''),
        false(),
        #acc || contains(
            split(fieldName, ''),
            #itemFromReduce != #itemFromContain
        ),
        #result
    )
    
    

    If you cannot rename the loop variables, you can use intermediate variables to store the values temporarily:

    
    reduce(
        split($disallowedCharacters, ''),
        false(),
        #acc || (
            #tempItemFromReduce = #item,
            contains(
                split(fieldName, ''),
                #tempItemFromReduce != #item
            )
        ),
        #result
    )
    
    
