foreach range - build string in batches

arkiboys 9,641 Reputation points
2022-04-25T07:02:02.567+00:00

Hello,
This is what I have at present and not sure how to get to what I am after (i.e. 10 items at a time to build the url string, etc. to be configurable)
Lookup reads a .csv which has column1, column2, column3 in it but uses column1 (Which is named as p_Name to pass to build string).
The lookup feeds the foreach activity.
Later in the pipeline, the other two columns are used for other things (To pass to dataflow, etc.)
There may be a-lot of rows with Names in this column1 of the .csv file...
i.e.

p_Name

Name1
Name2
...

Inside the download copy activity (see screenshot below) I concatenate a string like:
http://xyz/foldername/ventures in ('Name1', 'Name2','Name3',..., 'Name143')
Please note that at present, I do get the above string but the problem is that because if there are a-lot of Names in p_name column of the .csv, the string gets too long and the url does not accept those many Names and so I would like to send the url in smaller batches say 10 at a time which can perhaps be configured with a parameter...
The copy activity as you se takes the url with the first 10 names and processes them.
I would like this copy activity to process the names in batches of say 10 until it gets to the 143 names
Thank you

Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
9,600 questions
{count} votes

Accepted answer
  1. MartinJaffer-MSFT 26,031 Reputation points
    2022-04-26T16:27:42.57+00:00

    @arkiboys I have a demo solution to share.

    Rather than tell how to build the solution, I provide some code below to try out.

    The pipeline starts out with an array of words 8 items long. There is also a parameter to choose how long the segments should be. (do you want batches 2 long or 3 long or 5 long).

    The first activities determine how many batches your batch size will produce, and any remainder ( 8 items broken into batch size 5 produce segment of 1-5 and segment of 6-8).

    The ForEach works on the batches and an if condition determines whether to use the remainder or the batch size.

    In the deepest level, there is a set variable (assigning to "xcerpt") which produces the batches. Replace this with wherever they need to go.

    {  
        "name": "divvy up list into smaller lists",  
        "properties": {  
            "activities": [  
                {  
                    "name": "Make ForEach Items",  
                    "type": "SetVariable",  
                    "dependsOn": [  
                        {  
                            "activity": "Make remainder",  
                            "dependencyConditions": [  
                                "Succeeded"  
                            ]  
                        }  
                    ],  
                    "userProperties": [],  
                    "typeProperties": {  
                        "variableName": "eachitems",  
                        "value": {  
                            "value": "@range(0,\nadd( \nif(greater(int(variables('remainder')),0) ,1 ,0 ),\n div(length(variables('raw')),\n int(pipeline().parameters.segmentsize)))\n )",  
                            "type": "Expression"  
                        }  
                    }  
                },  
                {  
                    "name": "break up list",  
                    "type": "ForEach",  
                    "dependsOn": [  
                        {  
                            "activity": "Make ForEach Items",  
                            "dependencyConditions": [  
                                "Succeeded"  
                            ]  
                        }  
                    ],  
                    "userProperties": [],  
                    "typeProperties": {  
                        "items": {  
                            "value": "@variables('eachitems')",  
                            "type": "Expression"  
                        },  
                        "activities": [  
                            {  
                                "name": "If last segment",  
                                "description": "Use remainder or segment size",  
                                "type": "IfCondition",  
                                "dependsOn": [],  
                                "userProperties": [],  
                                "typeProperties": {  
                                    "expression": {  
                                        "value": "@less(item(),\r\ndiv(length(variables('raw')),pipeline().parameters.segmentsize)\r\n)",  
                                        "type": "Expression"  
                                    },  
                                    "ifFalseActivities": [  
                                        {  
                                            "name": "remain",  
                                            "type": "SetVariable",  
                                            "dependsOn": [],  
                                            "userProperties": [],  
                                            "typeProperties": {  
                                                "variableName": "xcerpt",  
                                                "value": {  
                                                    "value": "@take(\nskip(variables('raw'),mul(item(),pipeline().parameters.segmentsize))\n, int(variables('remainder')) )",  
                                                    "type": "Expression"  
                                                }  
                                            }  
                                        }  
                                    ],  
                                    "ifTrueActivities": [  
                                        {  
                                            "name": "isless",  
                                            "type": "SetVariable",  
                                            "dependsOn": [],  
                                            "userProperties": [],  
                                            "typeProperties": {  
                                                "variableName": "xcerpt",  
                                                "value": {  
                                                    "value": "@take(\nskip(variables('raw'),mul(item(),pipeline().parameters.segmentsize))\n, pipeline().parameters.segmentsize)",  
                                                    "type": "Expression"  
                                                }  
                                            }  
                                        }  
                                    ]  
                                }  
                            }  
                        ]  
                    }  
                },  
                {  
                    "name": "Make remainder",  
                    "type": "SetVariable",  
                    "dependsOn": [],  
                    "userProperties": [],  
                    "typeProperties": {  
                        "variableName": "remainder",  
                        "value": {  
                            "value": "@{mod(length(variables('raw')),pipeline().parameters.segmentsize)}",  
                            "type": "Expression"  
                        }  
                    }  
                }  
            ],  
            "parameters": {  
                "segmentsize": {  
                    "type": "int",  
                    "defaultValue": 2  
                }  
            },  
            "variables": {  
                "raw": {  
                    "type": "Array",  
                    "defaultValue": [  
                        "one",  
                        "two",  
                        "three",  
                        "four",  
                        "five",  
                        "six",  
                        "seven",  
                        "eight"  
                    ]  
                },  
                "xcerpt": {  
                    "type": "Array"  
                },  
                "eachitems": {  
                    "type": "Array"  
                },  
                "remainder": {  
                    "type": "String"  
                }  
            },  
            "annotations": []  
        }  
    }  
    
    0 comments No comments

0 additional answers

Sort by: Most helpful