Hi Team, how can I use Azure Batch to process a large JSON? I want to chunk the data and process each chunk.

Shailendra Namdeo 0 Reputation points
2023-04-24T09:04:53.36+00:00

I have a REST endpoint where I receive a large JSON payload as the request body, and I want to process that JSON using Azure Batch.

Azure Batch
An Azure service that provides cloud-scale job scheduling and compute management.

1 answer

  1. KarishmaTiwari-MSFT 18,367 Reputation points Microsoft Employee
    2023-04-26T23:43:57.0433333+00:00

    @Shailendra Namdeo

    Thanks for posting your query on Microsoft Q&A.

    #1. You can create and use JSON template files with the Azure CLI to create Batch pools, jobs, and tasks.
    See Use Azure Batch CLI templates and file transfer.
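
    As a minimal sketch of the template-driven approach from the linked doc: the Batch CLI extensions add `--template` support to `az batch` commands, so a pool can be created from a JSON template file. The resource group, account name, and `pool-template.json` file name below are placeholders for your own values:

    ```shell
    # Install the Batch CLI extensions, which add --template support.
    az extension add --name azure-batch-cli-extensions

    # Sign in to the Batch account (placeholder resource group/account names).
    az batch account login --resource-group myResourceGroup --name mybatchaccount

    # Create a pool from a JSON template file you have authored
    # per the "Use Azure Batch CLI templates" documentation.
    az batch pool create --template pool-template.json
    ```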

    #2. You can also process large-scale datasets by using Data Factory and Batch.
    See Process large-scale datasets by using Data Factory and Batch.

    To process a large JSON file using Azure Batch, the basic workflow looks something like this:

    1. Split the large JSON file into smaller chunks that can be processed independently, either programmatically or with a tool, and save the chunks to an Azure Blob Storage container.
    2. Create a new Azure Batch account. You'll need to specify a unique name, subscription, resource group, and location for your account.
    3. In the Azure Batch account, create a new pool of compute nodes that will process the JSON chunks. You can choose the number of nodes, their size, and the operating system.
    4. Write a script or batch application that will process the JSON chunks. This application should read a JSON chunk from the Azure Blob Storage, process it, and write the results back to storage or another external service.
    5. Upload the application files that your tasks will run to the Applications section of your Azure Batch account.
    6. In the Azure Batch account, create a new job that will run your application. You'll need to specify the pool of compute nodes to use, and any other settings required for your specific workload.
    7. For each JSON chunk in Azure Blob Storage, create a new task in the Batch job. Each task should reference the application package you uploaded earlier, and include any required input data and command-line arguments to process the JSON chunk. As each task completes, it can upload its output to Azure Storage.
    8. Use the Azure portal or Batch SDKs to monitor the progress of your job and tasks. Once all tasks have completed, retrieve the processed results from Azure Blob Storage or your chosen external service.
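
    The splitting in step 1 can be sketched in Python. This is a minimal example, not part of any Azure SDK: the chunk size and file-name pattern are arbitrary, and uploading the resulting files to Blob Storage (e.g. with the azure-storage-blob SDK) is left out:

    ```python
    # Minimal sketch: split a large JSON array into fixed-size chunks
    # that Batch tasks can later process independently.
    import json


    def split_json_array(records, chunk_size=1000):
        """Yield (chunk_index, sublist) pairs for a list of JSON records."""
        for i in range(0, len(records), chunk_size):
            yield i // chunk_size, records[i:i + chunk_size]


    if __name__ == "__main__":
        # In practice, load the large request body here instead.
        data = [{"id": n} for n in range(2500)]
        for index, chunk in split_json_array(data, chunk_size=1000):
            # Each chunk file would then be uploaded to Blob Storage,
            # and one Batch task created per chunk (steps 1 and 7).
            with open(f"chunk-{index:04d}.json", "w") as f:
                json.dump(chunk, f)
    ```

    Each chunk file maps to one task in step 7, so the chunk size effectively controls how much work each compute node picks up at a time.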

    If you have any questions at all, please let us know via "comments" and we would be happy to help you. Commenting is the fastest way of notifying the experts.

    Please don't forget to 'Accept Answer' and hit 'Yes' for "was this answer helpful" wherever the information provided helps you, as this can be beneficial to other community members facing similar issues.
