Azure Redis Cache connection timeout from AKS workloads

Anonymous
2022-03-08T09:42:04.787+00:00

Dears,
we are facing connection timeouts to Redis Cache (PaaS) from our AKS workloads.
Before moving to the Redis PaaS solution, we were running our own Redis deployment (K8s pods).
We were facing the same issue there: connection timeouts.

One might think that the issue is caused by our application and not by Redis itself.

But the point here is that we have many pods in separate namespaces, with different configurations, and they all face Redis disconnections in the same time frame (approx. 1 hour).

At this point my guess is that the underlying issue comes from the AKS node timesync.

One clue supporting this assumption is that all the pods facing the issue are on the same node, even though we have many replicas on other pool nodes.
Another clue is that while the issue is going on, we see no other problems in the cluster or on the node; all metrics are fine: CPU usage, IO, memory, network bandwidth...

My questions are:
1- Is there any evidence that AKS has timesync issues in the current node OS version for k8s version 1.21.2?
2- How can I investigate on my own whether a timesync problem is occurring while I have the Redis timeouts?

thanks
Marco

Azure Kubernetes Service

3 answers

  1. Anonymous
    2022-04-11T07:22:39.613+00:00

    Just for the sake of completeness:
    we were not able to find the root cause. After ruling out a timesync issue on the AKS nodes, we didn't find any other anomalies that could lead to a Redis fault.

    So we moved to the Redis Cache managed solution offered by Azure and it seems to work properly.

    1 person found this answer helpful.

  2. shiva patpi 13,366 Reputation points Microsoft Employee Moderator
    2022-03-09T05:27:48.307+00:00

  3. Anonymous
    2022-03-09T08:06:11.593+00:00

    Hi @shiva patpi ,

    all the pool nodes report system clock sync: Yes

    [Screenshot: 181363-2022-03-09-08-59-30-window.png]

    This rules out a time sync issue, correct?
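
A minimal sketch of how that check could be scripted, assuming the status shown in the screenshot comes from `timedatectl status` on each node (the thread does not say which tool produced it). The flag only describes the moment the command ran, so to correlate it with the Redis timeouts it would have to be sampled and logged with timestamps while the timeouts are happening. The function names and the `SAMPLE` text are illustrative, not taken from the poster's nodes.

```python
import subprocess


def parse_timedatectl(text: str) -> dict:
    """Turn 'key: value' lines from `timedatectl status` into a dict."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only; values such as times contain colons too.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def clock_is_synchronized(text: str) -> bool:
    """True if the 'System clock synchronized' field reads 'yes' (any case)."""
    value = parse_timedatectl(text).get("System clock synchronized", "")
    return value.lower() == "yes"


def read_node_status() -> str:
    """Run `timedatectl status` on the current host (needs a systemd-based node)."""
    result = subprocess.run(
        ["timedatectl", "status"], capture_output=True, text=True, check=True
    )
    return result.stdout


# Illustrative sample output, NOT captured from the nodes in this thread.
SAMPLE = """\
               Local time: Wed 2022-03-09 08:59:30 UTC
           Universal time: Wed 2022-03-09 08:59:30 UTC
                Time zone: Etc/UTC (UTC, +0000)
System clock synchronized: yes
              NTP service: active
"""

if __name__ == "__main__":
    print(clock_is_synchronized(SAMPLE))
```

On a node, `clock_is_synchronized(read_node_status())` could be called in a loop and its result written to a log together with the current time, so the entries can be compared against the window in which the pods report Redis timeouts.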

