Troubleshoot Azure Files performance issues
Note
CentOS referenced in this article is a Linux distribution and will reach End Of Life (EOL). Consider your use and plan accordingly. For more information, see CentOS End Of Life guidance.
This article lists common problems related to Azure file share performance, and provides potential causes and workarounds. To get the most value from this troubleshooting guide, we recommend first reading Understand Azure Files performance.
File share type | SMB | NFS |
---|---|---|
Standard file shares (GPv2), LRS/ZRS | ![]() | ![]() |
Standard file shares (GPv2), GRS/GZRS | ![]() | ![]() |
Premium file shares (FileStorage), LRS/ZRS | ![]() | ![]() |
First, rule out some common reasons why you might be having performance problems.
If your client virtual machine (VM) is running Windows 8.1 or Windows Server 2012 R2, or an older Linux distro or kernel, you might experience performance issues when accessing Azure file shares. Either upgrade your client OS or apply the fixes below.
An I/O depth of greater than 1 isn't supported on older versions of CentOS Linux or RHEL.
- Upgrade to CentOS Linux 8.6+ or RHEL 8.6+.
- Change to Ubuntu.
- For other Linux VMs, upgrade the kernel to 5.0 or later.
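The kernel check in the last bullet can be sketched as a small shell helper. This is only an illustration: the function name `kernel_at_least` is hypothetical, and it compares just the major and minor numbers of a release string such as the output of `uname -r`. It doesn't apply to CentOS Linux or RHEL, where the guidance above is based on the distribution release rather than the kernel number.

```shell
# Succeed if a kernel release string (for example, the output of `uname -r`)
# is at least the given major.minor version.
# kernel_at_least is a hypothetical helper name used for illustration.
kernel_at_least() {
    local release="$1" want_major="$2" want_minor="$3"
    local major rest minor
    major="${release%%.*}"        # text before the first dot
    rest="${release#*.}"          # text after the first dot
    minor="${rest%%.*}"           # text up to the next dot
    minor="${minor%%[!0-9]*}"     # drop any non-numeric suffix
    [ "$major" -gt "$want_major" ] || {
        [ "$major" -eq "$want_major" ] && [ "$minor" -ge "$want_minor" ]
    }
}

# Example use on the client VM:
#   if kernel_at_least "$(uname -r)" 5 0; then
#       echo "Kernel is 5.0 or later"
#   else
#       echo "Kernel is older than 5.0"
#   fi
```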
Requests are throttled when the I/O operations per second (IOPS), ingress, or egress limits for a file share are reached. For example, if the client exceeds baseline IOPS, it will get throttled by the Azure Files service. Throttling can result in the client experiencing poor performance.
To understand the limits for standard and premium file shares, see File share and file scale targets. Depending on your workload, throttling can often be avoided by moving from standard to premium Azure file shares.
To learn more about how throttling at the share level or storage account level can cause high latency, low throughput, and general performance issues, see Share or storage account is being throttled.
To confirm whether your share or storage account is being throttled, you can access and use Azure metrics in the portal. You can also create alerts that will notify you if a share is being throttled or is about to be throttled. See Troubleshoot Azure Files by creating alerts.
Important
For standard storage accounts, throttling occurs at the storage account level. For premium file shares, throttling occurs at the share level.
1. In the Azure portal, go to your storage account.
2. On the left pane, under Monitoring, select Metrics.
3. Select File as the metric namespace for your storage account scope.
4. Select Transactions as the metric.
5. Add a filter for Response type, and then check to see whether any requests have been throttled.
For standard file shares, the following response types are logged if a request is throttled at the client account level:
- ClientAccountRequestThrottlingError
- ClientAccountBandwidthThrottlingError
For premium file shares, the following response types are logged if a request is throttled at the share level:
- SuccessWithShareEgressThrottling
- SuccessWithShareIngressThrottling
- SuccessWithShareIopsThrottling
- ClientShareEgressThrottlingError
- ClientShareIngressThrottlingError
- ClientShareIopsThrottlingError
If a throttled request was authenticated with Kerberos, you might see a prefix indicating the authentication protocol, such as:
- KerberosSuccessWithShareEgressThrottling
- KerberosSuccessWithShareIngressThrottling
To learn more about each response type, see Metric dimensions.
If you're using a premium file share, increase the provisioned file share size to increase the IOPS limit. To learn more, see Understanding provisioning for premium file shares.
If the majority of your requests are metadata-centric (such as createfile, openfile, closefile, queryinfo, or querydirectory), the latency will be worse than that of read/write operations.
To determine whether most of your requests are metadata-centric, start by following steps 1-4 as previously outlined in Cause 1. For step 5, instead of adding a filter for Response type, add a property filter for API name.
Check to see whether the application can be modified to reduce the number of metadata operations.
If you're using premium SMB Azure file shares, use metadata caching.
Separate the file share into multiple file shares within the same storage account.
Add a virtual hard disk (VHD) on the file share and mount the VHD from the client to perform file operations against the data. This approach works for single writer/reader scenarios or scenarios with multiple readers and no writers. Because the file system is owned by the client rather than Azure Files, metadata operations are local. The setup offers performance similar to that of local directly attached storage. However, because the data is in a VHD, it can't be accessed by any means other than the SMB mount, such as the REST API or the Azure portal.
- From the machine which needs to access the Azure file share, mount the file share using the storage account key and map it to an available network drive (for example, Z:).
- Go to Disk Management and select Action > Create VHD.
- Set Location to the network drive that the Azure file share is mapped to, set Virtual hard disk size as needed, and select Fixed size.
- Select OK. Once the VHD creation is complete, it will automatically mount, and a new unallocated disk will appear.
- Right-click the new unknown disk and select Initialize Disk.
- Right-click the unallocated area and create a New Simple Volume.
- You should see a new drive letter appear in Disk Management representing this VHD with read/write access (for example, E:). In File Explorer, you should see the new VHD on the mapped Azure file share's network drive (Z: in this example). To be clear, there should be two drive letters present: the standard Azure file share network mapping on Z:, and the VHD mapping on the E: drive.
- There should be much better performance on heavy metadata operations against files on the VHD mapped drive (E:) versus the Azure file share mapped drive (Z:). If desired, it should be possible to disconnect the mapped network drive (Z:) and still access the mounted VHD drive (E:).
- To mount a VHD on a Windows client, you can also use the Mount-DiskImage PowerShell cmdlet.
- To mount a VHD on Linux, consult the documentation for your Linux distribution.
If the application that you're using is single-threaded, this setup can result in significantly lower IOPS throughput than the maximum possible throughput, depending on your provisioned share size.
- Increase application parallelism by increasing the number of threads.
- Switch to applications where parallelism is possible. For example, for copy operations, you could use AzCopy or RoboCopy from Windows clients or the parallel command from Linux clients.
If you're using SMB Multichannel and the number of channels exceeds four, performance will be poor. To determine whether your connection count exceeds four, use the PowerShell cmdlet Get-SmbClientConfiguration to view the current connection count settings.
Set the Windows per NIC setting for SMB so that the total channels don't exceed four. For example, if you have two NICs, you can set the maximum per NIC to two using the following PowerShell cmdlet: Set-SmbClientConfiguration -ConnectionCountPerRssNetworkInterface 2.
Beginning with Linux kernel version 5.4, the Linux NFS client uses a default read_ahead_kb value of 128 kibibytes (KiB). This small value might reduce the amount of read throughput for large files.
We recommend that you increase the read_ahead_kb kernel parameter value to 15 mebibytes (MiB). To change this value, set the read-ahead size persistently by adding a rule in udev, a Linux kernel device manager. Follow these steps:
In a text editor, create the /etc/udev/rules.d/99-nfs.rules file by entering and saving the following text:
SUBSYSTEM=="bdi" \
, ACTION=="add" \
, PROGRAM="/usr/bin/awk -v bdi=$kernel 'BEGIN{ret=1} {if ($4 == bdi) {ret=0}} END{exit ret}' /proc/fs/nfsfs/volumes" \
, ATTR{read_ahead_kb}="15360"
In a console, apply the udev rule by running the udevadm command as a superuser and reloading the rules files and other databases. To make udev aware of the new file, you only need to run this command once.
sudo udevadm control --reload
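To confirm that the rule took effect after the share is mounted, you can read the current value back from sysfs. The following is a minimal sketch, not an official procedure: the function names and the SYSFS_BDI_DIR override are hypothetical, and /mnt/nfsshare in the usage comment is a placeholder mount path. It assumes the backing device ID is the major:minor pair printed by `mountpoint -d`.

```shell
# Print the read-ahead value (in KiB) for a backing device ID such as "0:52".
# SYSFS_BDI_DIR is a hypothetical override so the function can be tried
# without a real mount; it defaults to the real sysfs location.
get_read_ahead_kb() {
    local bdi_id="$1"
    local bdi_dir="${SYSFS_BDI_DIR:-/sys/class/bdi}"
    cat "${bdi_dir}/${bdi_id}/read_ahead_kb"
}

# Succeed if a value meets the 15 MiB (15360 KiB) recommendation.
meets_read_ahead_recommendation() {
    local current_kb="$1"
    [ "$current_kb" -ge 15360 ]
}

# Example use on the client (the mount path is a placeholder):
#   bdi_id="$(mountpoint -d /mnt/nfsshare)"
#   current="$(get_read_ahead_kb "$bdi_id")"
#   meets_read_ahead_recommendation "$current" || echo "read_ahead_kb is $current"
```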
The client VM could be located in a different region than the file share. High latency can also be caused by the client or the network.
- Run the application from a VM that's located in the same region as the file share.
- For your storage account, review the transaction metrics SuccessE2ELatency and SuccessServerLatency via Azure Monitor in the Azure portal. A large difference between the SuccessE2ELatency and SuccessServerLatency values indicates latency that is likely caused by the network or the client. See Transaction metrics in Azure Files Monitoring data reference.
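As a rough illustration of the comparison, the following sketch computes the gap between the two metrics once you've read their values (in milliseconds) from Azure Monitor. The function names are hypothetical, and the sample numbers in the usage comment are made up for illustration. No threshold is implied; what counts as a large gap depends on your workload.

```shell
# Print the gap between end-to-end and server latency, in milliseconds.
latency_gap_ms() {
    awk -v e2e="$1" -v server="$2" 'BEGIN { printf "%.1f\n", e2e - server }'
}

# Print the percentage of end-to-end latency spent outside the service
# (that is, in the network or on the client).
latency_outside_service_pct() {
    awk -v e2e="$1" -v server="$2" 'BEGIN {
        if (e2e <= 0) { print "0.0"; exit }
        printf "%.1f\n", (e2e - server) / e2e * 100
    }'
}

# Example with made-up values (SuccessE2ELatency=42.5, SuccessServerLatency=3.5):
#   latency_gap_ms 42.5 3.5               # prints 39.0
#   latency_outside_service_pct 42.5 3.5  # prints 91.8
```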
One potential cause is a lack of SMB multi-channel support for standard file shares. Currently, Azure Files supports only single channel for standard file shares, so there's only one connection from the client VM to the server. This single connection is pegged to a single core on the client VM, so the maximum throughput achievable from a VM is bound by a single core.
- For premium file shares, enable SMB Multichannel.
- Using a VM with a more powerful core might help improve throughput.
- Running the client application from multiple VMs will increase throughput.
- Use REST APIs where possible.
- For NFS Azure file shares, nconnect is available. See Improve NFS Azure file share performance with nconnect.
One possible cause of slow performance is disabled caching. Caching can be useful if you are accessing a file repeatedly. Otherwise, it can be an overhead. Check if you're using the cache before disabling it.
To check whether caching is disabled, look for the cache= entry.
cache=none indicates that caching is disabled. Remount the share by using the default mount command or by explicitly adding the cache=strict option to the mount command to ensure that default caching or "strict" caching mode is enabled.
In some scenarios, the serverino mount option can cause the ls command to run stat against every directory entry. This behavior results in performance degradation when you're listing a large directory. You can check the mount options in your /etc/fstab entry:
//azureuser.file.core.windows.net/cifs /cifs cifs vers=2.1,serverino,username=xxx,password=xxx,dir_mode=0777,file_mode=0777
You can also check whether the correct options are being used by running the sudo mount | grep cifs command and checking its output. The following is an example output:
//azureuser.file.core.windows.net/cifs on /cifs type cifs (rw,relatime,vers=2.1,sec=ntlmssp,cache=strict,username=xxx,domain=X,uid=0,noforceuid,gid=0,noforcegid,addr=192.168.10.1,file_mode=0777,dir_mode=0777,persistenthandles,nounix,serverino,mapposix,rsize=1048576,wsize=1048576,actimeo=1)
If the cache=strict or serverino option isn't present, unmount and mount Azure Files again by running the mount command from the documentation. Then, recheck that the /etc/fstab entry has the correct options.
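Checking the mount output by eye is error-prone when the option list is long. The following sketch shows one way to test for specific options in a line of mount output. The function names are hypothetical, and the parsing assumes the options appear in parentheses at the end of the line, as in the example output above.

```shell
# Extract the parenthesized option list from one line of `mount` output.
mount_options_from_line() {
    printf '%s\n' "$1" | sed -n 's/.*(\(.*\))$/\1/p'
}

# Succeed if the comma-separated option string contains the wanted option.
# The match is on whole options, so "serverino" doesn't match "noserverino".
has_mount_option() {
    local options="$1" wanted="$2"
    case ",${options}," in
        *",${wanted},"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example use on a client with a CIFS mount:
#   mount | grep cifs | while IFS= read -r line; do
#       opts="$(mount_options_from_line "$line")"
#       has_mount_option "$opts" cache=strict || echo "cache=strict not set: $line"
#       has_mount_option "$opts" serverino    || echo "serverino not set: $line"
#   done
```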
It's possible you're experiencing throttling, and your requests are being sent to a queue. You can verify this by using Azure Storage metrics in Azure Monitor. You can also create alerts that notify you if a share is being throttled or is about to be throttled. See Troubleshoot Azure Files by creating alerts.
Ensure your app is within the Azure Files scale targets. If you're using standard Azure file shares, consider switching to premium.
This is a known issue with the implementation of the SMB client on Linux.
- Spread the load across multiple VMs.
- On the same VM, use multiple mount points with a nosharesock option, and spread the load across these mount points.
- On Linux, try mounting with a nostrictsync option to avoid forcing an SMB flush on every fsync call. For Azure Files, this option doesn't interfere with data consistency, but it might result in stale file metadata on directory listings (ls -l command). Directly querying file metadata by using the stat command will return the most up-to-date file metadata.
Lack of support for directory leases.
- If possible, avoid repeatedly opening and closing handles on the same directory within a short period of time.
- For Linux VMs, increase the directory entry cache timeout by specifying actimeo=<sec> as a mount option. By default, the timeout is 1 second, so a larger value, such as 30 seconds, might help.
- For CentOS Linux or Red Hat Enterprise Linux (RHEL) VMs, upgrade the system to CentOS Linux 8.2 or RHEL 8.2. For other Linux distros, upgrade the kernel to 5.0 or later.
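If you maintain mount options as a comma-separated string (for example, in an /etc/fstab entry), the actimeo change can be sketched as follows. The function name `with_actimeo` is hypothetical, and the option string in the usage comment is only a sample; review the result before you write it to /etc/fstab or pass it to the mount command.

```shell
# Print the option string with actimeo set to the given number of seconds.
# Any existing actimeo=... entry is removed and the new one is appended.
with_actimeo() {
    local options="$1" seconds="$2"
    local kept
    kept="$(printf '%s\n' "$options" | tr ',' '\n' | grep -v '^actimeo=' | paste -sd, -)"
    if [ -n "$kept" ]; then
        printf '%s,actimeo=%s\n' "$kept" "$seconds"
    else
        printf 'actimeo=%s\n' "$seconds"
    fi
}

# Example with a sample option string:
#   with_actimeo "vers=2.1,serverino,actimeo=1,dir_mode=0777" 30
#   prints: vers=2.1,serverino,dir_mode=0777,actimeo=30
```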
This problem can occur if there isn't enough cache or memory on the client machine for large directories.
The amount of memory available influences the number of inode hash buckets the system has, which affects enumeration performance for large directories. You can increase the number of inode hash buckets to reduce the hash collisions that can occur during large enumeration workloads. For instructions, see Increase the number of hash buckets.
You might see slow performance when you try to transfer files to the Azure Files service. If you don't have a specific minimum I/O size requirement, we recommend that you use 1 MiB as the I/O size for optimal performance.
Use the right copy method:
- Use AzCopy for any transfer between two file shares.
- Using cp or dd with parallel could improve copy speed. The number of threads depends on your use case and workload. The following examples use six:
  - cp example (cp will use the default block size of the file system as the chunk size):
    find * -type f | parallel --will-cite -j 6 cp {} /mntpremium/ &
  - dd example (this command explicitly sets chunk size to 1 MiB):
    find * -type f | parallel --will-cite -j 6 dd if={} of=/mnt/share/{} bs=1M
- Open source third-party tools such as:
  - GNU Parallel.
  - Fpart - Sorts files and packs them into partitions.
  - Fpsync - Uses Fpart and a copy tool to spawn multiple instances to migrate data from src_dir to dst_url.
  - Multi - Multi-threaded cp and md5sum based on GNU coreutils.
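If GNU Parallel isn't installed on the client, a similar parallel copy can be sketched with find and xargs. This is a swapped-in technique, not the method shown in the cp and dd examples above: it relies on the -P option of xargs (available in GNU and BSD xargs), uses null-delimited names so that paths with spaces are handled, and recreates the directory tree before copying. The function name `parallel_copy` is hypothetical, and the paths in the usage comment are placeholders.

```shell
# Copy every file under a source directory to a destination directory,
# running up to $jobs cp processes at a time (default: 6).
parallel_copy() {
    local src="$1" dest="$2" jobs="${3:-6}"
    # Resolve the destination to an absolute path because we cd into src below.
    dest="$(cd "$dest" && pwd)" || return 1
    (
        cd "$src" || exit 1
        # Recreate the directory tree first so every copy has a target directory.
        find . -type d -print0 | (cd "$dest" && xargs -0 mkdir -p)
        # Copy files concurrently; -I{} runs one cp per file.
        find . -type f -print0 | xargs -0 -P "$jobs" -I{} cp {} "$dest/{}"
    )
}

# Example use (placeholder paths):
#   parallel_copy /data/source /mnt/share 6
```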
Setting the file size in advance, instead of making every write an extending write, helps improve copy speed in scenarios where the file size is known. If extending writes need to be avoided, you can set a destination file size with the truncate --size <size> <file> command. After that, the dd if=<source> of=<target> bs=1M conv=notrunc command will copy a source file without having to repeatedly update the size of the target file. For example, you can set the destination file size for every file you want to copy (assume a share is mounted under /mnt/share):
for i in $(find * -type f); do truncate --size $(stat -c%s "$i") "/mnt/share/$i"; done
Then, copy files without extending writes in parallel:
find * -type f | parallel -j6 dd if={} of=/mnt/share/{} bs=1M conv=notrunc
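The two steps can also be combined per file. The following sketch wraps the truncate and dd commands from this section in one function; the name `copy_preallocated` is hypothetical, and it assumes GNU coreutils (stat -c, truncate --size, and dd status=none). The destination directory must already exist.

```shell
# Copy one file without extending writes:
# 1. size the destination to match the source, then
# 2. write the data with conv=notrunc so dd doesn't reset the size.
copy_preallocated() {
    local src="$1" dest="$2"
    local size
    size="$(stat -c%s "$src")" || return 1
    truncate --size "$size" "$dest" || return 1
    dd if="$src" of="$dest" bs=1M conv=notrunc status=none
}

# Example use (placeholder paths, share mounted under /mnt/share):
#   copy_preallocated ./bigfile.bin /mnt/share/bigfile.bin
```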
If the number of DirectoryOpen/DirectoryClose calls is among the top API calls and you don't expect the client to make that many calls, the issue might be caused by the antivirus software that's installed on the Azure client VM.
A fix for this issue is available in the April Platform Update for Windows.
Recent changes to SMB Multichannel config settings without a remount.
- After any changes to Windows SMB client or account SMB Multichannel configuration settings, you have to unmount the share, wait for 60 seconds, and remount the share to trigger SMB Multichannel.
- For Windows client OS, generate I/O load with a high queue depth (for example, QD=8) to trigger SMB Multichannel, such as by copying a file. For server OS, SMB Multichannel is triggered with QD=1, which means as soon as you start any I/O to the share.
Depending on the exact compression method and unzip operation used, decompression operations may perform more slowly on an Azure file share than on your local disk. This is often because unzipping tools perform a number of metadata operations in the process of performing the decompression of a compressed archive. For the best performance, we recommend copying the compressed archive from the Azure file share to your local disk, unzipping there, and then using a copy tool such as Robocopy (or AzCopy) to copy back to the Azure file share. Using a copy tool like Robocopy can compensate for the decreased performance of metadata operations in Azure Files relative to your local disk by using multiple threads to copy data in parallel.
A high number of file change notifications on file shares can result in high latencies. This typically occurs with websites hosted on file shares with a deeply nested directory structure. A typical scenario is an IIS-hosted web application where file change notification is set up for each directory in the default configuration. Each change (ReadDirectoryChangesW) on the share that the client is registered for pushes a change notification from the file service to the client, which takes system resources, and the issue worsens with the number of changes. This can cause share throttling and thus result in higher client-side latency.
To confirm, you can use Azure Metrics in the portal.
- In the Azure portal, go to your storage account.
- In the left menu, under Monitoring, select Metrics.
- Select File as the metric namespace for your storage account scope.
- Select Transactions as the metric.
- Add a filter for ResponseType and check to see if any requests have a response code of SuccessWithThrottling (for SMB or NFS) or ClientThrottlingError (for REST).
If file change notification isn't used, disable file change notification (preferred).
- Disable file change notification by updating FCNMode.
- Update the IIS Worker Process (W3WP) polling interval to 0 by setting HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\W3SVC\Parameters\ConfigPollMilliSeconds in your registry and restart the W3WP process. To learn about this setting, see Common registry keys that are used by many parts of IIS.
Increase the file change notification polling interval to reduce volume.
Update the W3WP worker process polling interval to a higher value (for example, 10 or 30 minutes) based on your requirement. Set HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\W3SVC\Parameters\ConfigPollMilliSeconds in your registry and restart the W3WP process.
If your website's mapped physical directory has a nested directory structure, you can try to limit the scope of file change notifications to reduce the notification volume. By default, IIS uses configuration from Web.config files in the physical directory to which the virtual directory is mapped, as well as in any child directories in that physical directory. If you don't want to use Web.config files in child directories, specify false for the allowSubDirConfig attribute on the virtual directory.
Set the IIS virtual directory allowSubDirConfig setting in Web.Config to false to exclude mapped physical child directories from the scope.
- Troubleshoot Azure Files
- Troubleshoot Azure Files by creating alerts
- Understand Azure Files performance
- Overview of alerts in Microsoft Azure
- Azure Files FAQ
If you have questions or need help, create a support request, or ask Azure community support. You can also submit product feedback to Azure feedback community.