How to handle more than 100,000 items?

Niclas Weh 1 Reputation point
2020-07-07T12:50:22.387+00:00

I have an Azure Function that returns a JSON array. This array can contain anywhere from 50 to 500,000 items.

As I just realized, a foreach loop can only handle up to 100,000 items. Is there a way to split my JSON array at specific points (for example ID = 99,999, ID = 199,999, ...) and handle each "block" with its own foreach?

Azure Logic Apps
An Azure service that automates the access and use of data across clouds without writing code.

1 answer

  1. Samara Soucy - MSFT 5,131 Reputation points
    2020-07-10T02:24:53.453+00:00

    If you can split it in the Function, then that is going to be much easier.
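As a rough illustration of splitting on the Function side, the sketch below groups a list into fixed-size blocks, so the response becomes an array of arrays and each inner array is small enough for one foreach. It is written in Python purely for illustration; the helper name `split_into_chunks` and the sample data are assumptions, not part of the original Function.

```python
from typing import Any, List


def split_into_chunks(items: List[Any], chunk_size: int) -> List[List[Any]]:
    """Group items into consecutive blocks of at most chunk_size elements."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # Step through the list in strides of chunk_size and slice out each block.
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


# Hypothetical payload: 11 items in blocks of 2; the last block holds the remainder.
chunks = split_into_chunks(list(range(1, 12)), 2)
```

With a response shaped like this, the logic app could loop over the outer array and hand each inner array to a foreach or child logic app.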

    If you need to do it inside the logic app, I would create a few variables: a working array to hold the chunk you are currently processing, an integer set to your chunk size (whether that is 100k or something a bit smaller), and an array that temporarily holds the remaining items, since you can't reference a variable in the expression that sets it.

    1. I'd nest my foreach (or use a child logic app) inside an Until loop, which runs until the length of my data array is 0.
    2. Use the take() function to load the current chunk into the working array.
    3. Call the foreach or child app with that chunk as the input.
    4. Use the skip() function to set my temp array to the remaining data, that is, everything except the items I just processed. This is necessary because you can't set a variable using an expression that references that same variable.
    5. Load the temp data back into the main array.
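The steps above can be sketched outside Logic Apps to check the logic. This is a minimal Python approximation, assuming `take` and `skip` behave like list slicing; the function `process_in_chunks` and the `handle_chunk` callback (standing in for the foreach or child logic app) are hypothetical names used only for illustration.

```python
from typing import Any, Callable, List


def take(items: List[Any], count: int) -> List[Any]:
    """Return the first count items, like the take() expression."""
    return items[:count]


def skip(items: List[Any], count: int) -> List[Any]:
    """Return everything after the first count items, like skip()."""
    return items[count:]


def process_in_chunks(
    data: List[Any],
    chunk_size: int,
    handle_chunk: Callable[[List[Any]], None],
) -> int:
    """Process data chunk by chunk and return the number of loop passes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    main_array = list(data)  # copy so the caller's list is left untouched
    passes = 0
    while len(main_array) > 0:  # step 1: loop until the main array is empty
        working_array = take(main_array, chunk_size)  # step 2
        handle_chunk(working_array)                   # step 3
        temp_array = skip(main_array, chunk_size)     # step 4
        main_array = temp_array                       # step 5
        passes += 1
    return passes


# Usage with the same shape as the test data: 11 items, chunk size 2.
seen: List[List[int]] = []
passes = process_in_chunks(list(range(1, 12)), 2, seen.append)
```

The temporary array is not needed in Python, but it is kept here so each line maps to one action in the workflow.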

    This is what it looks like at a high level:
    [Image: screenshot of the workflow in the Logic Apps designer]

    I used an extra array variable in my example to hold the dummy data instead of passing it in the request, but here is the JSON workflow definition for it:

    {  
        "definition": {  
            "$schema": "https://schema.management.azure.com/providers/Microsoft.Logic/schemas/2016-06-01/workflowdefinition.json#",  
            "actions": {  
                "Initialize_chunkSize": {  
                    "inputs": {  
                        "variables": [  
                            {  
                                "name": "chunkSize",  
                                "type": "integer",  
                                "value": 2  
                            }  
                        ]  
                    },  
                    "runAfter": {  
                        "Initialize_tempArray": [  
                            "Succeeded"  
                        ]  
                    },  
                    "type": "InitializeVariable"  
                },  
                "Initialize_tempArray": {  
                    "inputs": {  
                        "variables": [  
                            {  
                                "name": "tempArray",  
                                "type": "array"  
                            }  
                        ]  
                    },  
                    "runAfter": {  
                        "Initialize_workingArray": [  
                            "Succeeded"  
                        ]  
                    },  
                    "type": "InitializeVariable"  
                },  
                "Initialize_testArray": {  
                    "inputs": {  
                        "variables": [  
                            {  
                                "name": "testArray",  
                                "type": "array",  
                                "value": [  
                                    1,  
                                    2,  
                                    3,  
                                    4,  
                                    5,  
                                    6,  
                                    7,  
                                    8,  
                                    9,  
                                    10,  
                                    11  
                                ]  
                            }  
                        ]  
                    },  
                    "runAfter": {},  
                    "type": "InitializeVariable"  
                },  
                "Initialize_workingArray": {  
                    "inputs": {  
                        "variables": [  
                            {  
                                "name": "workingArray",  
                                "type": "array"  
                            }  
                        ]  
                    },  
                    "runAfter": {  
                        "Initialize_testArray": [  
                            "Succeeded"  
                        ]  
                    },  
                    "type": "InitializeVariable"  
                },  
                "Until": {  
                    "actions": {  
                        "For_each": {  
                            "actions": {  
                                "Compose": {  
                                    "inputs": "@items('For_each')",  
                                    "runAfter": {},  
                                    "type": "Compose"  
                                }  
                            },  
                            "foreach": "@variables('workingArray')",  
                            "runAfter": {  
                                "Set_Working_Array": [  
                                    "Succeeded"  
                                ]  
                            },  
                            "type": "Foreach"  
                        },  
                        "Remove_items_from_main_array": {  
                            "inputs": {  
                                "name": "testArray",  
                                "value": "@variables('tempArray')"  
                            },  
                            "runAfter": {  
                                "Set_Temp_Array": [  
                                    "Succeeded"  
                                ]  
                            },  
                            "type": "SetVariable"  
                        },  
                        "Set_Temp_Array": {  
                            "inputs": {  
                                "name": "tempArray",  
                                "value": "@skip(variables('testArray'),variables('chunkSize'))"  
                            },  
                            "runAfter": {  
                                "For_each": [  
                                    "Succeeded"  
                                ]  
                            },  
                            "type": "SetVariable"  
                        },  
                        "Set_Working_Array": {  
                            "inputs": {  
                                "name": "workingArray",  
                                "value": "@take(variables('testArray'), variables('chunkSize'))"  
                            },  
                            "runAfter": {},  
                            "type": "SetVariable"  
                        }  
                    },  
                    "expression": "@equals(length(variables('testArray')), 0)",  
                    "limit": {  
                        "count": 60,  
                        "timeout": "PT1H"  
                    },  
                    "runAfter": {  
                        "Initialize_chunkSize": [  
                            "Succeeded"  
                        ]  
                    },  
                    "type": "Until"  
                }  
            },  
            "contentVersion": "1.0.0.0",  
            "outputs": {},  
            "parameters": {},  
            "triggers": {  
                "manual": {  
                    "inputs": {  
                        "schema": {}  
                    },  
                    "kind": "Http",  
                    "type": "Request"  
                }  
            }  
        },  
        "parameters": {}  
    }  
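One thing to watch in the definition above is the Until loop's `limit`, which is set to a `count` of 60 and a `timeout` of one hour. Assuming `count` caps the number of loop passes, the chunk size has to be large enough for the whole array to be consumed within that many passes. A quick check, sketched in Python with hypothetical helper names:

```python
def passes_needed(total_items: int, chunk_size: int) -> int:
    """Number of chunks (loop passes) needed to consume total_items."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # Integer ceiling division: round up when there is a partial last chunk.
    return (total_items + chunk_size - 1) // chunk_size


def fits_within_limit(total_items: int, chunk_size: int, limit_count: int = 60) -> bool:
    """True if the array can be consumed within limit_count passes.

    limit_count defaults to 60 to match the "count" value in the definition.
    """
    return passes_needed(total_items, chunk_size) <= limit_count


# The 11-item test array with chunk size 2, and the 500,000-item case from
# the question with a chunk size of 100,000.
small_case = passes_needed(11, 2)
large_case = passes_needed(500_000, 100_000)
```

With the test chunk size of 2, a large array would hit the cap long before it is empty, so the chunk size (or the limit) needs adjusting for real data.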
    
