Azure Managed Redis OOM for two hours

Théo Mouchabac 20 Reputation points
2025-06-17T18:33:26.0366667+00:00

My Azure Managed Redis instance went down for at least two hours, and during that time I had no way to restore or restart the service, or to take any other action on it.

Clients could no longer write to the Redis instance and received OOM (out of memory) errors. maxmemory could be neither read nor set from Redis clients.

In the monitoring dashboards of the Azure portal, memory consumption looked fine (well below the instance capacity). Every metric was normal until a sudden spike in reads, writes, and latency. We cannot find anything in the behavior of the clients connected to the service that would explain this spike.

The incident was "automatically" resolved a few hours later, but that was far too long, and we had to create a new instance to keep our own service available.
However, the incident was resolved with no explanation at all. Because I don't want the issue to recur, I need to understand exactly what happened.

Here is a screenshot of the health event: [screenshot: Capture d’écran 2025-06-17 à 20.49.08]

At that time, I could not even contact support to find out what was happening.

Please explain what happened in this specific case so that I can avoid the issue in the future.

Regards,

Azure Cache for Redis
An Azure service that provides access to a secure, dedicated Redis cache, managed by Microsoft.

1 answer

  1. Narendra Pakkirigari 475 Reputation points Microsoft External Staff Moderator
    2025-06-20T09:32:02.4133333+00:00

    Hi Théo Mouchabac,

    The issue occurred because a scaling operation for your Azure Redis cache in the France South region did not complete as expected. The system attempted to scale the cache, but the process got stuck and eventually timed out, leaving the cache in an unresponsive state, almost as if it had run out of memory, even though actual usage was within limits. As a result, the service remained unavailable for about two hours. The Azure Redis team identified the problem, applied the necessary fixes, retried the scaling operation, and successfully brought the cache back online.

    However, since there was no valid Account Admin email configured in Azure, you likely didn’t receive automatic notifications about this incident. Azure does not send default alerts about service outages, incidents, or maintenance unless a valid email is provided or proper alerting is set up. To ensure your team stays informed, especially about region-specific issues, you should manually configure Service Health alerts and link them to action groups with valid email addresses, phone numbers, or other preferred notification methods.
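
    As a rough illustration of that last step, the sketch below builds the kind of request body an activity log alert for Service Health events might use. It is not taken from your subscription: the function name, the placeholder subscription ID, the action group ID, and the region list are all assumptions, and the region field path follows the pattern used in Azure's published alert templates, so please verify it against the current documentation before deploying.

```python
import json


def build_service_health_alert(subscription_id, action_group_id, regions):
    """Build a body for a Microsoft.Insights/activityLogAlerts resource.

    Hypothetical helper: all argument values are placeholders supplied
    by the caller, and nothing here calls Azure.
    """
    # Match every activity log event in the ServiceHealth category.
    conditions = [{"field": "category", "equals": "ServiceHealth"}]

    # Optionally narrow the alert to specific regions (assumed field path).
    if regions:
        conditions.append({
            "field": "properties.impactedServices[*].ImpactedRegions[*].RegionName",
            "containsAny": list(regions),
        })

    return {
        "location": "Global",
        "properties": {
            "enabled": True,
            "description": "Notify the team about Service Health events",
            "scopes": [f"/subscriptions/{subscription_id}"],
            "condition": {"allOf": conditions},
            # The action group holds the emails, phone numbers, or webhooks.
            "actions": {"actionGroups": [{"actionGroupId": action_group_id}]},
        },
    }


if __name__ == "__main__":
    body = build_service_health_alert(
        subscription_id="00000000-0000-0000-0000-000000000000",  # placeholder
        action_group_id="<your-action-group-resource-id>",       # placeholder
        regions=["France South"],
    )
    print(json.dumps(body, indent=2))
```

    The resulting JSON could be deployed as the body of an ARM template resource or adapted to whichever deployment tool you use. The action group it references has to exist beforehand and contain at least one valid receiver.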

    If this answers your query, please click Accept Answer and Yes. If you have any further questions, do let us know.
