Azure Kubernetes Cluster: Errno 122 when attempting to concurrently file lock the same file in AzureFiles from >= 100 nodes

asked 2021-11-12T18:47:00.427+00:00
David Zanter 21 Reputation points

I am seeing the following issue on an AKS cluster with >= 100 nodes: when I have every node lock the same file located on a shared Azure Files mount, the 100th node to request the lock is returned errno 122 ("Disk quota exceeded"). (I am doing this so a 100-node computation platform can parse a dataset in parallel.)

  volumes:
  - azureFile:
      readOnly: false
      secretName: azure-secret
      shareName: aksshare 

This always happens on exactly the 100th node to request the concurrent file lock, and I have never seen it with fewer than 100 nodes, so I am assuming there is some hard limit on the number of concurrent locks allowed.

Specifically, I was curious whether anyone had seen this, and whether there is a configuration setting that could be increased to allow more concurrent locks.

To simplify the scenario I wrote a simple C program (shown below) which is able to reproduce the problem.

(Other data points:
* I did try having 100 processes on a single Kubernetes node lock the same file on an Azure Files mount, and that did work.
* I also had 100 separate k8s nodes each lock a hostPath file, and that did work. (I would expect that to work, since those are different files on each host, but it was a sanity check.)
)

Simple repro-program:

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    struct flock fltest = {0};    /* l_whence=SEEK_SET, l_start=0, l_len=0 => whole file */
    fltest.l_type = F_RDLCK;      /* shared read lock */
    int fd = open(argv[1], O_RDONLY, 0);
    printf("opened file: %s, fd:%d errno:%d\n", argv[1], fd, errno);
    int irc = fcntl(fd, F_SETLK, &fltest);
    printf("locked file: %s, %d errno %d\n", argv[1], irc, errno);
    if (irc == 0)
    {
        printf("Waiting 30 seconds while holding lock.\n");
        sleep(30);
    }
    fltest.l_type = F_UNLCK;
    irc = fcntl(fd, F_SETLK, &fltest);
    printf("Lock released irc:%d errno %d\n", irc, errno);
    close(fd);
    return 0;
}

Accepted answer
  1. answered 2021-11-14T20:04:19.757+00:00
    Pradeep Kommaraju 2,521 Reputation points

    Hi @David Zanter

    Thank you for reaching out to the Microsoft Q&A forum.

    It looks like you are hitting the maximum number of concurrent requests for Azure Files.
    I reviewed various other similar cases as well, and unfortunately these limits are hard limits; there is no way around them.

    I hope you find another cloud solution for your use case.

    Thanks & Regards,
    Pradeep

    ----------------------------------------

    Please don't forget to accept the answer if this clarifies your ask.
