How to list the contents of a large zip file uploaded to Azure Blob Storage

Marc Hedgley 185 Reputation points
2023-11-21T17:50:47.8233333+00:00

I have a requirement to take zip files uploaded by external clients to Azure Blob Storage and check the file formats of all files contained therein against an agreed whitelist.

A logic app has been created to decompress the file, list the formats, and exclude any that are outside the whitelist.

We have run into an Azure limitation: the logic app returns the attached error because of the number of files contained in the zip file.

I need a way to list the file formats in zip files that could contain thousands of files, or files larger than 750 MB. Does anyone have any advice on how this could be done?

User's image

Azure Blob Storage
An Azure service that stores unstructured data in the cloud as blobs.
Azure Logic Apps
An Azure service that automates the access and use of data across clouds without writing code.

Accepted answer
  1. Adam Zachary 2,936 Reputation points
    2023-11-22T03:34:11.2066667+00:00

    Hi Marc,

    You can use Azure Functions to trigger a process when a zip file is uploaded.

    This process can use streaming to handle the zip contents, which allows for dealing with large files while keeping memory consumption low.

    You would create a function that streams the zip file content, unzips it, and processes each file in turn without exceeding memory limits. Streaming the files directly from Blob Storage, rather than loading them into a MemoryStream (which has a capacity limit), lets you manage large files more effectively. This approach should help you bypass the limitations you are facing with Logic Apps.
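
    If the goal is only to check file formats against the whitelist, the entries can be listed from the zip's central directory without decompressing anything. The following is a minimal sketch in Python using only the standard library, not a tested Azure Function. It assumes the blob can be opened as a seekable binary stream (Python's zipfile needs to seek to the end of the archive); the whitelist contents and the function name are hypothetical.

```python
import io
import zipfile
from pathlib import PurePosixPath

# Hypothetical whitelist; replace with the formats agreed with your clients.
ALLOWED_EXTENSIONS = {".csv", ".txt", ".pdf"}


def find_disallowed_entries(zip_stream, allowed=ALLOWED_EXTENSIONS):
    """Return the names of entries whose extension is not in `allowed`.

    Only the archive's directory is read; no entry is decompressed.
    `zip_stream` must be a seekable binary file-like object.
    """
    disallowed = []
    with zipfile.ZipFile(zip_stream) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # directory entries have no format to check
            extension = PurePosixPath(info.filename).suffix.lower()
            if extension not in allowed:
                disallowed.append(info.filename)
    return disallowed


def build_demo_archive():
    """Build a small in-memory zip that stands in for an uploaded blob."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("data/report.csv", "a,b\n1,2\n")
        archive.writestr("notes.TXT", "hello")
        archive.writestr("tools/run.exe", b"\x00\x01")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(find_disallowed_entries(build_demo_archive()))  # ['tools/run.exe']
```

    Note that this checks the extension in the entry name only. If the whitelist has to be enforced on actual content, each entry would still need to be opened and inspected.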

    Azure Functions offer a scalable way to handle zip files in Azure Blob Storage. You can create a function that triggers when a zip file lands in a specific Blob Container.

    This function can stream the data, unzip the files, and store them as individual files in another container. This method is beneficial because it processes data in a streaming manner, allowing it to handle large files without consuming a lot of memory.
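
    As a rough illustration of the stream-and-store step, the sketch below copies each entry to a destination in fixed-size chunks, so memory use is bounded by the buffer size rather than by the size of the largest file. It is written in Python with the standard library only. `open_destination` is a hypothetical callable standing in for whatever opens a writable stream to the target container, and `CaptureSink` exists only so the demo can run without Azure.

```python
import io
import shutil
import zipfile

CHUNK_SIZE = 1024 * 1024  # 1 MiB copy buffer; an arbitrary choice


def stream_entries(zip_stream, open_destination, chunk_size=CHUNK_SIZE):
    """Copy every file entry to its destination, one chunk at a time.

    `open_destination` (hypothetical) takes an entry name and returns a
    writable binary stream that supports the `with` statement.
    """
    copied = []
    with zipfile.ZipFile(zip_stream) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            with archive.open(info) as src, open_destination(info.filename) as dst:
                # Decompresses and writes at most `chunk_size` bytes at a time.
                shutil.copyfileobj(src, dst, chunk_size)
            copied.append(info.filename)
    return copied


class CaptureSink(io.BytesIO):
    """In-memory stand-in for a destination blob, used only by the demo."""

    def __init__(self, store, name):
        super().__init__()
        self._store, self._name = store, name

    def close(self):
        if not self.closed:
            self._store[self._name] = self.getvalue()
        super().close()


def build_demo_archive():
    """Build a small in-memory zip that stands in for an uploaded blob."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("a.txt", "hello world")
        archive.writestr("nested/b.bin", bytes(range(10)))
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    store = {}
    names = stream_entries(
        build_demo_archive(), lambda name: CaptureSink(store, name), chunk_size=4
    )
    print(names, {name: len(data) for name, data in store.items()})
```

    Entry names come from the uploader, so they should be validated before being used as destination paths.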

    Code samples are available for this approach using classes such as ZipArchive from .NET (System.IO.Compression) and ZipInputStream from the SharpZipLib library. They differ in how they handle streaming and buffer sizes. The choice between them depends on factors such as memory usage, processing time, and whether you are willing to take a third-party dependency.

    If you find the provided information helpful and it resolves your query, please consider accepting the answer. Your feedback is valuable and helps ensure the quality and relevance of the responses.

