Continuously Available SMB share on a Workgroup Cluster does not fail over transparently

Mansell 0 Reputation points
2023-03-03T09:32:03.5766667+00:00

I created a Workgroup Cluster with two cluster nodes.
I created a File Server role on it, with an SMB share configured for Continuous Availability.
A client can access files on the SMB share no matter which server is the owner node.

But if I start downloading a large file from the SMB share, and then perform a 'failover operation' on the File Server role, the following occurs:

  • In the transfer progress window, the transfer speed drops to zero, as if the transfer will resume once the failover completes.
  • The File Server role migrates from node A to node B successfully. No error appears in Failover Cluster Manager.
  • The file transfer window times out after ~3 minutes, then shows a 'Network Error' message.
  • If I click 'Try Again' in the 'Network Error' box, the download restarts from the beginning.
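
To put numbers on the stall described above, a chunked copy that logs slow reads can help. The following is only a minimal Python sketch and was not part of the original test: the function name, chunk size, and threshold are hypothetical, and whatever UNC path is passed to it is an assumption about how the share is reached from the client.

```python
import time


def copy_with_stall_log(src, dst, chunk_size=1024 * 1024, stall_threshold=1.0):
    """Copy src to dst in chunks and record slow reads.

    Returns (bytes_copied, stalls), where stalls is a list of
    (offset, seconds) for each read slower than stall_threshold seconds.
    An OSError raised by a failed read is not caught here, so a broken
    handle surfaces to the caller.
    """
    copied = 0
    stalls = []
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            started = time.monotonic()
            chunk = fin.read(chunk_size)
            elapsed = time.monotonic() - started
            if elapsed > stall_threshold:
                # Remember where in the file the slow read happened.
                stalls.append((copied, elapsed))
            if not chunk:
                break  # end of file
            fout.write(chunk)
            copied += len(chunk)
    return copied, stalls
```

Run against the large file on the share while the role is being moved, this would be expected to report either one long read or an OSError if the handle cannot be recovered. The offsets and durations can then be lined up against the event log timestamps.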

I checked the event logs on both the client and the cluster nodes, and found these entries, which seem related:

On the client:

  • SMBWitnessClient/Informational, Event ID 4:
    Witness Client is waiting to receive list of Witness Servers from IP address 10.10.18.64
    (This event appears every 20 seconds. It seems the client could not get the list of Witness Servers from 10.10.18.64, which is the IP address of the File Server cluster role.)
  • SMBClient/Operational, Event ID 30611:
    Failed to reconnect a persistent handle.
    Error: {Access Denied}
    A process has requested access to an object, but has not been granted those access rights.
    FileId: 0x200000200000011:0x0
    CreateGUID: {xxxxxx}
    Path: \cbsrepo\cdc1-smb-ntfs-share\Infernal.mkv
    Reason: The server denied the create request.
    Previous reconnect error: The transport connection is now disconnected.
    Previous reconnect reason: Disconnected because there was a network disconnect indication
    Guidance: A persistent handle allows transparent failover on Windows File Server clusters. This event has many causes and does not always indicate an issue with SMB. Review online documentation for troubleshooting information.
  • SMBWitnessClient/Admin, Event ID 42
    Witness Client failed to connect to the witness server at IP address 10.10.18.64 with error (The remote procedure call failed). This event may be suppressed for this resource for the next 12 hours if the condition persists.
  • SMBClient/Operational, Event ID 30906
    A request on persistent/resilient handle failed because the handle was invalid or it exceeded the timeout.
    Status: The transport connection is now disconnected.
    Type: Read
    Path: \cbsrepo\cdc1-smb-ntfs-share\Infernal.mkv
    Restart count: 5
    Guidance: After retrying a request on a Continuously Available (Persistent) handle or a Resilient handle, the client was unable to reconnect the handle. This event is the result of a handle recovery failure. Review other events for more details.
  • SMBWitnessClient/Admin, Event ID 3
    Witness Client failed to obtain the list of Witness Servers from IP address 10.10.18.64 with error (Access is denied). This event may be suppressed for this resource for the next 12 hours if the condition persists.

On the DESTINATION cluster node:

  • SMBClient/Operational, Event ID 1016:
    Reopen failed.
    Client Name: \10.10.19.147
    Client Address: 10.10.19.147:49813
    User Name: NX3340B\Administrator
    Session ID: 0x40800000000001
    Share Name: cdc1-smb-ntfs-share
    File Name: Infernal.mkv
    Resume Key: {xxxxxx}
    Status: {Access Denied}
    A process has requested access to an object, but has not been granted those access rights. (0xC0000022)
    RKF Status: {Access Denied}
    A process has requested access to an object, but has not been granted those access rights. (0xC0000022)
    Durable: false
    Resilient: false
    Persistent: false
    Reason: RKF resume create
    Guidance: The client attempted to reopen a continuously available handle, but the attempt failed. This typically indicates a problem with the network or underlying file being re-opened.
  • SMBServer/Security, Event ID 1006
    The share denied access to the client.
    Client Name: \10.10.19.147
    Client Address: 10.10.19.147:49813
    User Name: NX3340B\Administrator
    Session ID: 0x40800000000001
    Share Name: \CBSREPO\cdc1-smb-ntfs-share
    Share Path: ??\S:\Shares\cdc1-smb-ntfs-share
    Status: {Access Denied}
    A process has requested access to an object, but has not been granted those access rights. (0xC0000022)
    Mapped Access: 0x1002A8
    Granted Access: 0x0
    Security Descriptor: 0x010004803..........................
    Guidance: You should expect access denied errors when a principal accesses a share without the necessary permissions. Usually, this indicates that the principal does not have direct security permissions or lacks membership in a group that has direct access permissions. To determine and correct the permissions on the specified share, an administrator can use the Security tab in File Explorer Properties dialog, the SMBSHARE Windows PowerShell module, or the NET SHARE command. You can also use the Effective Access tab in File Explorer to help diagnose the issue.
    Applications may generate access denied errors if they attempt to open files in a writable mode first, and then reopen the files in a read-only mode. In this case, no user action is required.

From these logs, the problem seems related to the 'Access Denied' errors on the SMB Witness / Resume Key, but I don't know where these are configured or how to change their permissions.

I also have a similar setup where the Failover Cluster is AD joined. There, the SMB share fails over transparently as expected when the client is also AD joined. If the client is a workgroup computer, transparent failover does not occur. Is this normal behaviour? Are there any tweaks to work around this as well?

Grateful if some experts could help.

Thanks a lot.

Windows Server 2019
Windows Server Clustering

1 answer

  1. Limitless Technology 44,121 Reputation points
    2023-03-06T16:22:34.92+00:00

    Hello there,

    You can quickly check the SMB Transparent Failover requirements here: https://techcommunity.microsoft.com/t5/storage-at-microsoft/smb-transparent-failover-8211-making-file-shares-continuously/ba-p/425693

    When the SMB client initially connects to a file server cluster node, the SMB client notifies the SMB Witness client, which is running on the same computer. The SMB Witness client obtains a list of cluster members from the SMB Witness service running on the file server cluster node. The SMB Witness client picks a different cluster member and issues a registration request to the SMB Witness service on that cluster member.

    If an unplanned failure occurs on the file server cluster node, the SMB Witness service on the other cluster member receives a notification from the cluster service. The SMB Witness service then notifies the SMB Witness client, which in turn notifies the SMB client that the file server cluster node has failed. Upon receiving the SMB Witness notification, the SMB client immediately starts reconnecting to a different file server cluster node, which significantly speeds up recovery from unplanned failures.
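
    The registration and notification flow described above can be sketched as a toy model. This is purely illustrative Python, not the real Service Witness Protocol or any Windows API: the class, its methods, and the node names nodeA and nodeB are all hypothetical, and the model ignores authentication entirely, which is where the errors in the question appear.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class WitnessClient:
    """Toy stand-in for the SMB Witness client (hypothetical, for illustration)."""

    connected_node: str                 # node currently serving the SMB connection
    witness_node: Optional[str] = None  # node the client registered with

    def register(self, cluster_members: List[str]) -> str:
        # Pick a cluster member other than the one serving the SMB connection.
        candidates = [n for n in cluster_members if n != self.connected_node]
        if not candidates:
            raise RuntimeError("no other cluster member available as witness")
        self.witness_node = candidates[0]
        return self.witness_node

    def on_node_failed(self, failed_node: str, cluster_members: List[str]) -> str:
        # A failure of some other node does not affect this connection.
        if failed_node != self.connected_node:
            return self.connected_node
        survivors = [n for n in cluster_members if n != failed_node]
        if not survivors:
            raise RuntimeError("no surviving cluster node to reconnect to")
        # Reconnect right away instead of waiting for a transport timeout.
        self.connected_node = survivors[0]
        return self.connected_node
```

    For example, a client connected to nodeA registers with nodeB, and when nodeA is reported as failed it moves its connection to nodeB. In the question's logs the client never gets past the step modeled by `register`, since the request for the witness list is rejected with Access is denied.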

    Hope this resolves your query.

    If the reply is helpful, please upvote and accept it as an answer.