What is the fastest way to send files to gen2 storage account?

Kowalczyk Tomasz 0 Reputation points
2024-02-13T13:44:37.52+00:00

Hey guys, I have a question about the fastest possible way to send files from folder to folder, or from storage account to storage account in general. I have around 100k small files, roughly 1 KB each, that need to be sent very quickly, ideally in under a minute. What would be the best approach for this kind of thing? Cheers!

Azure Storage Accounts

1 answer

  1. KarishmaTiwari-MSFT 20,667 Reputation points Microsoft Employee
    2024-02-13T23:04:31.84+00:00

    @Kowalczyk Tomasz Is your question about the fastest way to send files to an Azure Data Lake Storage Gen2 account?
    Can you please clarify where you are sending the data from?

    Please elaborate, since it is not clear whether you are planning to move data from one storage account to another or from a data source to an Azure Data Lake Storage account. Let me know and I can help you further.

    When it comes to sending files to an Azure Data Lake Storage Gen2 account, you can use AzCopy (a reliable choice for bulk data transfers), Azure Storage Explorer (if you prefer a graphical user interface), or Azure Data Factory (if you need to automate data movement). For large-scale transfers, AzCopy is typically the more efficient option.
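    For illustration, here is a minimal AzCopy invocation; the account, filesystem, and SAS token are placeholders, not values from your environment:

    ```
    # Raise AzCopy's request concurrency before the transfer; this matters
    # most when moving many small files, where per-request overhead dominates.
    export AZCOPY_CONCURRENCY_VALUE=256

    # Recursively upload a local folder to an ADLS Gen2 filesystem.
    azcopy copy "./localfolder" "https://<account>.dfs.core.windows.net/<filesystem>?<SAS>" --recursive
    ```

    The same `azcopy copy` form also accepts two service URLs for account-to-account copies, in which case the transfer happens server-side rather than through your machine.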

    Since you’ve already mentioned using ADF with multiple concurrent copy activities: ADF is a powerful tool for orchestrating data workflows, and you can fine-tune your pipeline by optimizing settings such as parallelism, batch size, and data flow transformations.

    If you haven’t explored it yet, consider using ADF’s built-in copy activity with the Data Lake Storage Gen2 connector. Ensure that you configure it appropriately for optimal performance; a sketch of the relevant settings follows.
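    As a rough sketch of where those knobs live, a copy activity exposes `parallelCopies` and `dataIntegrationUnits` under `typeProperties` in the pipeline JSON. The dataset names below are hypothetical and the values are illustrative starting points, not tuned recommendations:

    ```json
    {
      "name": "CopySmallFilesToADLSGen2",
      "type": "Copy",
      "inputs": [ { "referenceName": "SourceBinaryDataset", "type": "DatasetReference" } ],
      "outputs": [ { "referenceName": "SinkBinaryDataset", "type": "DatasetReference" } ],
      "typeProperties": {
        "source": { "type": "BinarySource" },
        "sink": { "type": "BinarySink" },
        "parallelCopies": 32,
        "dataIntegrationUnits": 8
      }
    }
    ```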

    Keep in mind that factors such as network latency, file size, and the number of files affect overall performance. For extremely large datasets, consider breaking them into smaller chunks and uploading them in parallel; a scripted sketch is below. Refer to: https://learn.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-best-practices
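    If you would rather script the transfer, here is a minimal sketch using the `azure-storage-file-datalake` Python SDK with a thread pool to upload in parallel. The account URL, filesystem name, local folder, and worker count are assumptions to adapt:

    ```python
    # pip install azure-storage-file-datalake azure-identity
    from concurrent.futures import ThreadPoolExecutor
    from pathlib import Path

    from azure.identity import DefaultAzureCredential
    from azure.storage.filedatalake import DataLakeServiceClient

    # Placeholder account URL and filesystem name -- replace with your own.
    service = DataLakeServiceClient(
        account_url="https://<account>.dfs.core.windows.net",
        credential=DefaultAzureCredential(),
    )
    filesystem = service.get_file_system_client("myfilesystem")

    def upload_one(path: Path) -> None:
        # One request per file; with ~1 KB files, request overhead dominates.
        filesystem.get_file_client(path.name).upload_data(
            path.read_bytes(), overwrite=True
        )

    files = [p for p in Path("./localfolder").iterdir() if p.is_file()]

    # A thread pool hides per-request latency; tune max_workers to your link.
    with ThreadPoolExecutor(max_workers=64) as pool:
        list(pool.map(upload_one, files))
    ```

    With files this small, the bottleneck is per-request latency rather than bandwidth, so the concurrency level matters far more than block size here.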

