Troubleshoot NFS Azure file shares

Applies to: ✔️ NFS Azure file shares

Note

CentOS referenced in this article is a Linux distribution and will reach End Of Life (EOL). Consider your use and plan accordingly. For more information, see CentOS End Of Life guidance.

This article lists common issues related to NFS Azure file shares and provides potential causes and workarounds.

Important

The content of this article only applies to NFS shares. To troubleshoot SMB issues in Linux, see Troubleshoot Azure Files problems in Linux (SMB). NFS Azure file shares aren't supported for Windows.

Use the Always-On Diagnostics tool

You can use the Always-On Diagnostics (AOD) tool to collect logs on NFSv4 and SMB Linux clients. The daemon runs in the background as a system service and can be configured to detect anomalies in various sources, such as dmesg logs, debug data, error metrics, and latency metrics. It can capture data from tcpdump, nfsstat, mountstats, and other sources, along with the system's CPU and memory usage. The tool is useful for collecting debug information on field issues that are difficult to reproduce.

The Always-On Diagnostics tool is currently compatible with systems running SUSE Linux Enterprise Server 15 (SLES 15) and Red Hat Enterprise Linux 8 (RHEL 8). Follow the installation steps that correspond to your operating system:

Important

Always-On Diagnostics doesn't support NFS volumes with encryption in transit (EiT) enabled. To enable log collection on the impacted NFS share, you must mount the share without EiT.

In RHEL 8, follow these instructions to install the Always-On Diagnostics tool:

  1. Download the repo configuration package.

    curl -sSL -O https://packages.microsoft.com/config/rhel/8/packages-microsoft-prod.rpm
    
  2. Install the repo configuration package.

    sudo rpm -i packages-microsoft-prod.rpm
    
  3. Delete the repo configuration package after you install it, and then update the package index files.

    rm packages-microsoft-prod.rpm
    sudo dnf update
    
  4. Install the package.

    sudo dnf install aod
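
Before you run the installation steps, it can help to confirm that the client matches one of the releases listed above. The following is a minimal sketch, not part of the tool itself: `aod_supported` is a hypothetical helper name, and it assumes the standard `ID` and `VERSION_ID` fields from /etc/os-release (`rhel` and `sles` are the values those distributions report).

```shell
# Return 0 when the distribution ID and version match a release that this
# article lists as compatible with Always-On Diagnostics (RHEL 8 or SLES 15).
# aod_supported is a hypothetical helper name used only for illustration.
aod_supported() {
    id=$1
    version=$2
    case "$id:$version" in
        rhel:8|rhel:8.*) return 0 ;;
        sles:15|sles:15.*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage on a live system (ID and VERSION_ID come from /etc/os-release):
#   . /etc/os-release
#   aod_supported "$ID" "$VERSION_ID" && echo "listed as compatible"
```

The check only compares version strings. It doesn't verify that the package repository is reachable or that the share is mounted without EiT.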
    

chgrp "filename" failed: Invalid argument (22)

Cause 1: idmapping isn't disabled

Because Azure Files disallows alphanumeric UID/GID, you must disable idmapping.

Cause 2: Disabled idmapping gets re-enabled after encountering bad file or directory name

Even if you disable idmapping, the system can automatically re-enable it in some cases. For example, when Azure Files encounters a bad file name, it sends back an error. Upon seeing this error code, an NFS 4.1 Linux client decides to re-enable idmapping, and sends future requests with alphanumeric UID or GID. For a list of unsupported characters on Azure Files, see Naming and referencing shares, directories, files, and metadata. Colon is one of the unsupported characters.
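
Because a colon in a file or directory name is one of the triggers described above, you might scan the source directory for such names before copying data to the share. The following sketch covers only the colon case, not the full list of unsupported characters, and `find_colon_names` is a hypothetical helper name.

```shell
# Print every path under the given directory whose file or directory name
# contains a colon. Only the colon is checked here; other unsupported
# characters would need additional patterns.
find_colon_names() {
    find "$1" -name '*:*' -print
}

# Usage (the path is a placeholder):
#   find_colon_names /path/to/source
```

If the command prints nothing, no names with a colon were found under that directory.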

Workaround

Make sure you disable idmapping and that nothing re-enables it. Then perform the following steps:

  1. Unmount the share.

  2. Disable idmapping by running the following command:

    echo Y | sudo tee /sys/module/nfs/parameters/nfs4_disable_idmapping
    
  3. Mount the share back.

  4. If you're running rsync, run rsync with the --numeric-ids argument from a directory that doesn't have a bad directory or file name.
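
To confirm that nothing has re-enabled idmapping after you complete the steps above, you can read the module parameter back. This is a minimal sketch: `idmapping_disabled` is a hypothetical helper name, and it assumes the parameter reads back as `Y` when idmapping is disabled, which is how the kernel reports Boolean module parameters.

```shell
# Return 0 when the nfs4_disable_idmapping parameter reads "Y".
# The optional argument overrides the parameter path, which is handy for
# testing; by default the live sysfs path from the workaround is used.
idmapping_disabled() {
    param=${1:-/sys/module/nfs/parameters/nfs4_disable_idmapping}
    [ "$(cat "$param" 2>/dev/null)" = "Y" ]
}

# Usage on a live system:
#   idmapping_disabled && echo "idmapping is disabled" || echo "idmapping is enabled"
```

Running the check again after a failed chgrp or rsync can show whether the client switched idmapping back on.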

Unable to create an NFS share

Cause: Unsupported storage account settings

NFS is only available on storage accounts with the following configuration:

  • Tier: Premium
  • Account Kind: FileStorage
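
As a quick pre-check of the configuration above, you can compare a storage account's kind and tier against the two required values. The following is a sketch under stated assumptions: `supports_nfs_shares` is a hypothetical helper name, and the commented Azure CLI calls use placeholder names and assume the `kind` and `sku.tier` properties returned by `az storage account show`.

```shell
# Return 0 when the account kind and tier match the configuration that this
# article lists for NFS shares: kind FileStorage, tier Premium.
supports_nfs_shares() {
    kind=$1
    tier=$2
    [ "$kind" = "FileStorage" ] && [ "$tier" = "Premium" ]
}

# Usage with the Azure CLI (account and resource group are placeholders):
#   kind=$(az storage account show --name <storage-account> \
#            --resource-group <resource-group> --query kind --output tsv)
#   tier=$(az storage account show --name <storage-account> \
#            --resource-group <resource-group> --query sku.tier --output tsv)
#   supports_nfs_shares "$kind" "$tier" || echo "account can't host NFS shares"
```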

Solution

Follow the instructions in Create an NFS file share.

Can't connect to or mount an NFS Azure file share

Cause 1: Request originates from a client in an untrusted network or untrusted IP

Unlike SMB, NFS doesn't support user-based authentication. Authentication for a share depends on your network security rule configuration. To ensure clients only establish secure connections to your NFS share, you must use either a service endpoint or a private endpoint. To access shares from on-premises, you must set up a VPN or Azure ExpressRoute connection in addition to private endpoints. IP addresses added to the storage account firewall allowlist are ignored. To set up access to an NFS share, use one of the following methods:

  • Service endpoint

    • Accessed by the public endpoint.

    • Only available in the same region.

    • You can't use VNet peering for share access.

    • You must add each virtual network or subnet individually to the allowlist.

    • For on-premises access, you can use service endpoints with ExpressRoute, point-to-site, and site-to-site VPNs. However, a private endpoint is recommended because it's more secure.

      The following diagram depicts connectivity using public endpoints:

      Diagram of public endpoint connectivity.

  • Private endpoint

    • Access is more secure than the service endpoint.

    • Access to NFS share via private link is available from within and outside the storage account's Azure region (cross-region, on-premises).

    • Virtual network peering with the virtual network that hosts the private endpoint gives clients in the peered virtual networks access to the NFS share.

    • You can use private endpoints with ExpressRoute, point-to-site VPNs, and site-to-site VPNs.

      Diagram of private endpoint connectivity.
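
To see which of the two paths above a client is using, you can check whether the storage account name resolves to a private address. This is a rough sketch: `is_private_ipv4` is a hypothetical helper name, it assumes that a correctly configured private endpoint resolves to an RFC 1918 address inside your virtual network, and it ignores IPv6.

```shell
# Return 0 when the argument is an IPv4 address in one of the RFC 1918
# private ranges: 10.0.0.0/8, 172.16.0.0/12, or 192.168.0.0/16.
is_private_ipv4() {
    case "$1" in
        10.*|192.168.*) return 0 ;;
        172.1[6-9].*|172.2[0-9].*|172.3[0-1].*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage on a client (the storage account name is a placeholder):
#   ip=$(getent hosts <storageaccountnamehere>.file.core.windows.net | awk '{print $1; exit}')
#   is_private_ipv4 "$ip" && echo "resolves to a private address" \
#                         || echo "resolves to a public address"
```

A public address isn't necessarily an error. It's what you'd expect when the client reaches the share through a service endpoint.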

Cause 2: nfs-utils, nfs-client, or nfs-common package isn't installed

Before running the mount command, install the nfs-utils, nfs-client, or nfs-common package.

To check if the NFS package is installed, run the following command. The same commands in this section apply to CentOS and Oracle Linux.

sudo rpm -qa | grep nfs-utils

Solution

If the package isn't installed, install the package by using your distro-specific command.

OS Version 7.X

sudo yum install nfs-utils

OS Version 8.X or 9.X

sudo dnf install nfs-utils
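
The three package names mentioned in this section correspond to different distribution families. The following sketch maps the `ID` value from /etc/os-release to a package name. `nfs_package_for` is a hypothetical helper name, and the mapping assumes the usual packaging: nfs-utils on RHEL, CentOS, and Oracle Linux, nfs-client on SLES, and nfs-common on Ubuntu and Debian.

```shell
# Print the NFS client package name for a distribution ID as reported in
# /etc/os-release. Returns 1 for IDs this sketch doesn't know about.
nfs_package_for() {
    case "$1" in
        rhel|centos|ol) echo nfs-utils ;;
        sles) echo nfs-client ;;
        ubuntu|debian) echo nfs-common ;;
        *) return 1 ;;
    esac
}

# Usage on a live system:
#   . /etc/os-release
#   nfs_package_for "$ID" || echo "unknown distribution: $ID"
```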

Cause 3: Firewall blocking port 2049

The NFS protocol communicates to its server over port 2049. Make sure that this port is open to the storage account (the NFS server).

Solution

Verify that port 2049 is open on your client by running the following command. If the port isn't open, open it.

sudo nc -zv <storageaccountnamehere>.file.core.windows.net 2049
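
If nc isn't installed on the client, a similar reachability check can be sketched with Bash's built-in /dev/tcp redirection. This is an alternative to the command above, not a replacement for it: `port_open` is a hypothetical helper name, and the sketch assumes that bash and the coreutils `timeout` command are available.

```shell
# Return 0 when a TCP connection to host:port succeeds within 5 seconds.
# The port defaults to 2049, the NFS port discussed in this section.
# Relies on bash's /dev/tcp pseudo-device, so bash is invoked explicitly.
port_open() {
    host=$1
    port=${2:-2049}
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Usage (the storage account name is a placeholder):
#   port_open <storageaccountnamehere>.file.core.windows.net 2049 \
#       && echo "port 2049 reachable" || echo "port 2049 blocked or unreachable"
```

A failure here doesn't tell you where the traffic is blocked. It could be a local firewall, a network security group, or a routing problem.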

Cause 4: Storage account deleted

If you can't mount the file share and the error is connection timed out, the storage account that contains the file share might have been accidentally deleted.

Solution

Recover the storage account. Then, delete and re-create the private endpoint so it's associated with the new storage account resource ID.

Cause 5: You're trying to mount the share using the NFS client mount instead of the AZNFS mount helper, and the Secure transfer required and/or Require encryption in transit for NFS setting is enabled on the storage account.

The Secure transfer required setting enforces encryption in transit for all file shares within the storage account unless the Require encryption in transit for NFS setting is enabled, in which case Secure transfer required only applies to REST/HTTPS traffic. For NFS file shares, using encryption in transit requires mounting the share using the AZNFS Mount Helper, a client utility package that abstracts the complexity of establishing secure tunnels for NFSv4.1 traffic.

Solution

Either disable both the Secure transfer required setting and the Require encryption in transit for NFS setting on the storage account, or use the AZNFS mount helper to mount the share. For more information, see Encryption in transit for NFS Azure file shares.
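
For illustration, the following sketch prints a mount command that uses the AZNFS mount helper instead of the standard NFS client. The `aznfs` file system type and the option string are assumptions based on the pattern shown in the AZNFS documentation, so verify them against Encryption in transit for NFS Azure file shares before you use them. `aznfs_mount_command` is a hypothetical helper name, and it only prints the command rather than running it.

```shell
# Print (don't run) a mount command that uses the AZNFS mount helper.
# Assumption: the helper registers the "aznfs" file system type and accepts
# the options below; check the AZNFS documentation for your version.
aznfs_mount_command() {
    account=$1
    share=$2
    mount_point=$3
    printf 'mount -t aznfs %s.file.core.windows.net:/%s/%s %s -o vers=4,minorversion=1,sec=sys,nconnect=4\n' \
        "$account" "$account" "$share" "$mount_point"
}

# Usage (all three values are placeholders); review the output, then run it with sudo:
#   aznfs_mount_command <storage-account> <share-name> /mount/<storage-account>/<share-name>
```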

ls hangs for large directory enumeration on some kernels

Cause: A bug was introduced in Linux kernel v5.11 and fixed in v5.12.5

Some kernel versions have a bug that causes directory listings to result in an endless READDIR sequence. Small directories where all entries can be shipped in one call don't have this problem. The bug was introduced in Linux kernel v5.11 and fixed in v5.12.5, so any version from v5.11 up to (but not including) v5.12.5 has the bug. RHEL 8.4 is known to ship with an affected kernel.

Workaround: Downgrade or upgrade the kernel

Downgrade or upgrade the kernel to a version outside the affected range to resolve the issue.
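
To check whether a client's kernel falls inside the affected range, you can compare the output of `uname -r` with the two version boundaries. The following is a sketch only: `version_ge` and `kernel_in_affected_range` are hypothetical helper names, the comparison relies on GNU `sort -V`, and distribution kernels often backport changes, so the version number is a hint rather than proof.

```shell
# Return 0 when version $1 is greater than or equal to version $2.
# Uses GNU sort -V (version sort) to order the two dotted version strings.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Return 0 when the kernel release is v5.11 or later but earlier than v5.12.5,
# which is the range this article describes as affected.
kernel_in_affected_range() {
    # Keep only the leading numeric part: "5.11.0-27-generic" -> "5.11.0".
    ver=$(printf '%s\n' "$1" | sed 's/[^0-9.].*$//')
    version_ge "$ver" 5.11 && ! version_ge "$ver" 5.12.5
}

# Usage on a live system:
#   kernel_in_affected_range "$(uname -r)" && echo "kernel is in the affected range"
```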

System commands fail with the "File not found" error

Cause

Linux 32-bit applications that rely on inode numbers might not work as expected with Azure Files due to the formatting of the 64-bit inode numbers generated by the NFS service.

Solution

To resolve this issue, use one of the following methods:

  • Compress the 64-bit inode numbers to 32 bits by using the nfs.enable_ino64=0 kernel boot option.

  • Set the module parameter by adding options nfs enable_ino64=0 to the /etc/modprobe.d/nfs.conf file and rebooting the VM.

You can also persist this kernel boot option in the grub.conf file. For more information, see the documentation for your Linux distribution.
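
The module parameter method can be scripted so that running it more than once doesn't add duplicate lines. This is a minimal sketch: `set_nfs_ino32` is a hypothetical helper name, the configuration file path defaults to the one named above, and writing to that path requires root privileges. A reboot is still needed afterward.

```shell
# Append "options nfs enable_ino64=0" to the modprobe configuration file
# unless an identical line is already present. The optional argument
# overrides the file path; the default is the path named in this article.
set_nfs_ino32() {
    conf=${1:-/etc/modprobe.d/nfs.conf}
    line='options nfs enable_ino64=0'
    if ! grep -qxF "$line" "$conf" 2>/dev/null; then
        printf '%s\n' "$line" >> "$conf"
    fi
}

# Usage (run as root, then reboot the VM):
#   set_nfs_ino32
```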

Unable to change the ownership of files and directories

Cause

The client OS enforces permissions on NFS file shares, not the Azure Files service. If you enable the Root Squash setting on an NFS file share, the root user on the client system becomes an anonymous (non-privileged) user for access control purposes. This restriction means that even if you're logged in as root on the client system, you can't use the chown command to change the ownership of files and directories that you don't own.

Solution

In the Azure portal, go to the file share and select Properties. Change the Root Squash setting to No Root Squash. For more information, see Configure root squash for Azure Files.

When you enable No Root Squash, the root user on the client system has the same privileges as the root user on the server system. You can now use chown to change the ownership of any file or directory in the share, regardless of the current owner. After you make the changes, you can re-enable Root Squash if necessary.
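
Before you re-enable Root Squash, you might confirm that the ownership change reached every file. The following sketch lists anything under a directory that a given numeric UID doesn't own. `not_owned_by` is a hypothetical helper name, and the mount path and UID in the usage comment are placeholders.

```shell
# Print every path under directory $1 that is NOT owned by numeric UID $2.
# Empty output means every file and directory has the expected owner.
not_owned_by() {
    find "$1" ! -uid "$2" -print
}

# Usage after changing ownership on the mounted share:
#   sudo chown -R <uid>:<gid> /mount/<storage-account>/<share-name>
#   not_owned_by /mount/<storage-account>/<share-name> <uid>
```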

Need help?

If you still need help, contact support to get your problem resolved quickly.

Third-party information disclaimer

The third-party products that this article discusses are manufactured by companies that are independent of Microsoft. Microsoft makes no warranty, implied or otherwise, about the performance or reliability of these products.