Slow read speed from fileshare to app service

Hannes Caesar 5 Reputation points
2023-05-08T07:59:16.2066667+00:00

Hello everyone,

I am trying to create a web service whose purpose is to access a subset of time series data, perform a simple calculation (such as a sum), and return the result. The data is stored in a Parquet file, which can be loaded partially by specifying the desired column names.
I implemented the app with Flask and deployed it to an Azure App Service (App Service plan: S2, 2 vCores, 3.5 GB RAM).
Since the amount of data is not that large at the moment (a 3 GB Parquet file), I stored it in an Azure file share that I mounted on the App Service. The storage is a Premium file share with 100 GB of provisioned storage. I have accounted for the memory constraint of the App Service by splitting the data up while reading it in.
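
As a rough illustration of the column-wise, chunked read described above, a minimal sketch might look like the following. It is not the poster's actual code: it assumes the third-party pyarrow package is installed, and the function name, the column names, and the batch size are hypothetical.

```python
def sum_columns_in_batches(path, columns, batch_size=100_000):
    """Sum the given columns of a Parquet file without loading it all at once.

    Only the requested columns are read, and rows arrive in batches of at
    most `batch_size`, which keeps peak memory use bounded.
    """
    # pyarrow is a third-party dependency (assumed to be installed).
    import pyarrow.compute as pc
    import pyarrow.parquet as pq

    totals = {name: 0.0 for name in columns}
    parquet_file = pq.ParquetFile(path)
    for batch in parquet_file.iter_batches(batch_size=batch_size, columns=columns):
        for name, column in zip(batch.schema.names, batch.columns):
            batch_sum = pc.sum(column).as_py()  # None if the batch is all null
            if batch_sum is not None:
                totals[name] += batch_sum
    return totals


# Hypothetical usage; the mount path and column names are placeholders:
# totals = sum_columns_in_batches("/mounts/data/timeseries.parquet", ["sensor_a", "sensor_b"])
```

Because only the selected columns are fetched, the amount of data pulled from the file share scales with the number of requested columns rather than with the full file size.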

My problem is that when I read the data from the file share using Python, the read speed is extremely low. According to the documentation, the egress of the file share should be up to 110 MB/s. However, I can only achieve read speeds of 10-20 MB/s. I also tried downloading files via the Python SDK provided by Azure, with the same result (e.g. reading a 160 MB CSV file into a pandas DataFrame takes about 8 seconds).
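
For reference, 160 MB in 8 seconds works out to about 20 MB/s, but that figure also includes the time pandas spends parsing the CSV. To separate raw I/O from parsing, one option is to time a plain sequential read of the file. The sketch below uses only the standard library; the function name, the block size, and the mount path in the usage comment are assumptions made for illustration.

```python
import time


def measure_read_throughput(path, block_size=4 * 1024 * 1024):
    """Read `path` sequentially and return (bytes_read, seconds, MB_per_second).

    The data is discarded after reading, so the result reflects storage and
    network throughput rather than parsing or decoding cost.
    """
    total = 0
    start = time.perf_counter()
    # buffering=0 avoids Python-level buffering on top of the OS reads.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    mb_per_s = (total / 1_000_000) / elapsed if elapsed > 0 else float("inf")
    return total, elapsed, mb_per_s


# Hypothetical usage; the mount path is a placeholder:
# size, seconds, rate = measure_read_throughput("/mounts/data/sample.csv")
# print(f"{size} bytes in {seconds:.2f} s ({rate:.1f} MB/s)")
```

Running this with a few different block sizes, and on a file that has not been read recently (to avoid cached results), gives a baseline to compare against the documented egress limit.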

I would appreciate any help with this issue. Is the network bandwidth for Azure App Services visible somewhere?
Would another solution (i.e. not using app service / fileshare) be better?

Best regards

Hannes

Azure Files
An Azure service that offers file shares in the cloud.
Azure App Service
Azure App Service is a service used to create and deploy scalable, mission-critical web apps.

1 answer

  1. brtrach-MSFT 15,356 Reputation points Microsoft Employee
    2023-05-10T03:02:58.1733333+00:00

    @Hannes Caesar Troubleshooting performance across multiple services is going to be tricky and ultimately, nobody here is going to be able to provide a clear solution. Opening a technical support ticket would put you in touch with someone who can see exactly what services are in play and use logs to understand the situation better.

    With that being said, there are a few things I would suggest. You'll need to play around with your settings to see if anything yields a better result.

    1. Ensure the file share and the Web App (along with any other services) are all in the same region. This is the most common mistake. Note that West US and West US 2 (or similar pairs) are not the same data centers. They are located at least 500 miles apart, which will introduce serious latency. For example, all services should be in West US only.
    2. Play around with the size of the web app. S2 is good, but try a P1v3, for example, which is the latest architecture available. Running the higher tier for an hour or two will add slightly to your charges, but because you are billed per hour, the cost to scale up, run your test, and scale back down should be very minor. Just don't forget to scale back down.
    3. If neither of the above suggestions helps, you can try using Azure Blob Storage instead of an Azure file share. Azure Blob Storage is optimized for storing and retrieving large amounts of unstructured data. You can use the Azure Blob Storage SDK for Python to read and write data in Azure Blob Storage.

    Please let us know the outcome of the above steps.
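
As a hedged sketch of suggestion 3, the helper below times a blob download to a local file. It assumes the azure-storage-blob package is used: `blob_client` is expected to behave like its `BlobClient`, whose `download_blob` accepts a `max_concurrency` keyword and returns a downloader with a `readinto` method. The helper name, the default concurrency of 4, and the placeholder values in the usage comment are illustrative assumptions, not recommended settings.

```python
import time


def download_blob_to_file(blob_client, dest_path, max_concurrency=4):
    """Download a blob to `dest_path` and return (bytes_written, seconds).

    `blob_client` is assumed to be an azure.storage.blob.BlobClient or any
    object exposing the same `download_blob` interface.
    """
    start = time.perf_counter()
    # max_concurrency lets the SDK fetch several chunks in parallel.
    downloader = blob_client.download_blob(max_concurrency=max_concurrency)
    with open(dest_path, "wb") as f:
        bytes_written = downloader.readinto(f)
    elapsed = time.perf_counter() - start
    return bytes_written, elapsed


# Hypothetical usage; every value in angle brackets is a placeholder:
# from azure.storage.blob import BlobClient
# client = BlobClient.from_connection_string(
#     conn_str="<connection string>",
#     container_name="<container>",
#     blob_name="<blob name>",
# )
# size, seconds = download_blob_to_file(client, "/tmp/data.parquet")
# print(f"{size / 1_000_000 / seconds:.1f} MB/s")
```

Comparing the resulting rate with the one measured on the mounted file share shows whether switching storage services is worth the effort for this workload.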
