Azure Blob Storage Upload Performance (python async)

Dyt13 206 Reputation points
2022-11-25T17:14:44.007+00:00

Hello,

I'm concurrently uploading 1500 blobs (1 MB max per blob) to a container in an Azure Storage account (StorageV2, general purpose v2).

So far I'm uploading them with the Python package azure-storage-blob, using the pseudo-code below.

import asyncio
from azure.storage.blob.aio import BlobServiceClient as asyncbsc

async def upload_blobs(blobs_args: list):
    # Schedule one upload task per blob; nothing limits the concurrency here
    tasks = [asyncio.create_task(upload_blob_async(blob_arg)) for blob_arg in blobs_args]

    # Wait until every task has completed
    finished, pending = await asyncio.wait(
        tasks, return_when=asyncio.ALL_COMPLETED
    )

    return None

....

async def upload_blob_async(args: dict):
    # Instantiate a new BlobServiceClient using a connection string
    # (this creates one client per blob)
    blob_service_client = asyncbsc.from_connection_string(CONNECTION_STRING_STORAGE)

    async with blob_service_client:
        # Instantiate a new ContainerClient
        # (get_container_client expects the container name)
        container_client = blob_service_client.get_container_client(args["blob_name"])
        # Upload a blob to the container
        await container_client.upload_blob(...)

With no limit on the number of parallel requests, sending 1500 documents has a huge impact on my E2E response time.

What would you recommend in order to lower the E2E latency? Using a semaphore to send the requests in batches of, say, 100? Also, I need to keep the general purpose storage account (instead of a premium account) because I use blob tags, which are not available on premium.

Thanks

264289-e2e.png

Azure Storage Accounts
Globally unique resources that provide access to data management services and serve as the parent namespace for the services.
Azure Blob Storage
An Azure service that stores unstructured data in the cloud as blobs.

1 answer

  1. Sumarigo-MSFT 45,406 Reputation points Microsoft Employee
    2022-11-28T13:16:47.857+00:00

    @Dyt13 Welcome to the Microsoft Q&A forum, and thank you for posting your query here!

    Firstly, apologies for the delayed response!
    Please refer to this article, which explains latency in Blob Storage in detail: https://learn.microsoft.com/en-us/azure/storage/blobs/storage-blobs-latency

    End-to-end (E2E) latency measures the interval from when Azure Storage receives the first packet of the request until Azure Storage receives a client acknowledgment on the last packet of the response.
    The E2E latency metric reports the average end-to-end latency of successful requests made to a storage service or the specified API operation. This value includes the processing time required within Azure Storage to read the request, send the response, and receive acknowledgment of the response.

    264862-image.png

    Regarding "What would you recommend in order to lower the E2E?": refer to the troubleshooting article "How to isolate latency issue for Azure Storage Account".

    Could you try using a single BlobServiceClient for all uploads instead of creating one per blob? Also try the same upload with AzCopy and check the result.
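A minimal sketch of the single-client suggestion, combined with the semaphore idea from the question. It is not a measured fix: the names `upload_all`, `upload_one`, `main`, `container_name` and `max_concurrency` are hypothetical, the limit of 100 is only the figure floated in the question, and `blobs` is assumed to be a list of `(name, data)` pairs. The only SDK calls used are `BlobServiceClient.from_connection_string`, `get_container_client` and `ContainerClient.upload_blob` from `azure.storage.blob.aio`.

```python
import asyncio


async def upload_all(container_client, blobs, max_concurrency=100):
    """Upload every (name, data) pair in `blobs` through one shared client.

    `container_client` is any object exposing an awaitable
    `upload_blob(name, data, overwrite=True)`, such as
    azure.storage.blob.aio.ContainerClient.
    """
    # The semaphore caps how many uploads are in flight at once.
    semaphore = asyncio.Semaphore(max_concurrency)

    async def upload_one(name, data):
        async with semaphore:
            await container_client.upload_blob(name, data, overwrite=True)
            return name

    # gather preserves input order; return_exceptions=True keeps one
    # failed upload from hiding the outcome of the others.
    return await asyncio.gather(
        *(upload_one(name, data) for name, data in blobs),
        return_exceptions=True,
    )


async def main(connection_string, container_name, blobs):
    # Imported here so the helper above stays usable without the SDK installed.
    from azure.storage.blob.aio import BlobServiceClient

    # One client for the whole batch, instead of one per blob.
    service_client = BlobServiceClient.from_connection_string(connection_string)
    async with service_client:
        container_client = service_client.get_container_client(container_name)
        return await upload_all(container_client, blobs, max_concurrency=100)
```

Each entry in the returned list is either the blob name or the exception raised for that upload, so failures can be retried individually. Whether 100 is a good limit depends on the client machine and network, so it is worth trying a few values and comparing the E2E metric.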

    If the issue still persists, please let me know. I would like to work more closely with you on it.
