Non-Restarting RabbitMQ Pod in AKS

Question

Koprucu, Mert (ADV D EU TR AP&I TIA 1) 0 Reputation points

I am currently deploying RabbitMQ Server on Azure Kubernetes Service (AKS), specifically utilizing Availability Zone 1.

I want to ensure that the running pods, especially the RabbitMQ Server, are not restarted unnecessarily, that is, unless the cause originates from RabbitMQ itself. A few days ago, I changed the Availability Zone setting for AKS from "None" to "Zone1". I assumed that would be enough to fix the issue, but it did not work; I noticed that all pods were restarted.

Could you please provide guidance on best practices or configurations within AKS to achieve this goal?

vipullag-MSFT 26,492 Reputation points Moderator

2024-04-18T04:49:52.19+00:00

Hello Koprucu, Mert (ADV D EU TR AP&I TIA 1), any update on the issue?

Just checking in to see if you got a chance to see the previous response.

Answer 1

Hello Koprucu, Mert (ADV D EU TR AP&I TIA 1)

Welcome to Microsoft Q&A Platform, thanks for posting your query here.

To keep your running pods, especially the RabbitMQ Server, from restarting unless the cause originates from RabbitMQ itself, consider using Pod Disruption Budgets (PDBs). A PDB defines how many replicas of an application must stay available (or how many may be unavailable) during voluntary disruptions such as an update or node upgrade. It does not protect against involuntary disruptions such as a node failure.

For example, if you have five replicas in your deployment, you can define a minimum availability of four so that only one replica can be deleted or rescheduled at a time. As with pod resource limits, the best practice is to define pod disruption budgets on applications that require a minimum number of replicas to always be present.
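
As a minimal sketch of the five-replica example above, a PodDisruptionBudget might look like the following. The resource name, the namespace, and the app: rabbitmq label are assumptions made for illustration; the selector has to match the labels your pods actually carry.

```yaml
# Sketch only: name, namespace, and labels are hypothetical.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: rabbitmq-pdb
  namespace: messaging
spec:
  # With five replicas, at least four must stay available, so only
  # one pod can be evicted at a time during a drain or upgrade.
  minAvailable: 4
  selector:
    matchLabels:
      app: rabbitmq
```

The budget is only consulted for voluntary evictions, for example a node drain during an AKS upgrade. It does not stop restarts caused by crashes, failed probes, or node failures.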

Additionally, you mentioned that you changed the Availability Zone from "None" to "Zone1" for your AKS cluster, but it did not fix the issue. It's important to note that simply enabling Availability Zones does not guarantee high availability or prevent pod restarts. You also need to ensure that your RabbitMQ Server deployment is configured to take advantage of Availability Zones.

To achieve high availability at the node level, you can use Kubernetes Pod Topology Spread Constraints or Pod Anti-Affinity to schedule your RabbitMQ Server pods on nodes spread across Availability Zones. When you run more than one replica, this can help prevent disruptions due to data center or node failures.
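
A rough sketch of both options, written as a fragment of a pod template (for example under spec.template of a StatefulSet), could look like this. The app: rabbitmq label is an assumption for illustration, and the fragment presumes several replicas and a node pool that spans more than one zone.

```yaml
# Sketch only: pod template fragment; the app label is hypothetical.
spec:
  topologySpreadConstraints:
    # Keep the number of matching pods per zone within 1 of each other.
    - maxSkew: 1
      topologyKey: topology.kubernetes.io/zone
      whenUnsatisfiable: DoNotSchedule
      labelSelector:
        matchLabels:
          app: rabbitmq
  affinity:
    podAntiAffinity:
      # Never place two matching pods on the same node.
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              app: rabbitmq
          topologyKey: kubernetes.io/hostname
```

If all nodes sit in a single zone, such as Zone 1 only, the zone constraint has nothing to spread across, and only the per-node anti-affinity has any effect.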

Hope this helps.

Koprucu, Mert (ADV D EU TR AP&I TIA 1) 0 Reputation points

2024-04-24T10:19:42.2233333+00:00

Hi @vipullag-MSFT,

Sorry for the late response. I will look into your suggestions, but I run only one RabbitMQ replica in AKS. As far as I understand at first glance, a PDB is intended for more than one pod. Should I consider RabbitMQ federation to achieve zero downtime?

vipullag-MSFT 26,492 Reputation points Moderator

2024-04-25T15:59:33.68+00:00

Hello Koprucu, Mert (ADV D EU TR AP&I TIA 1)

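You are correct that Pod Disruption Budgets (PDBs) are typically used for deployments with more than one replica. Since you only have one RabbitMQ replica in your AKS cluster, a PDB is not a good fit: with a single replica, a budget that requires one pod to stay available blocks voluntary evictions such as node drains, which can stall node upgrades, rather than keeping the pod running through them.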

RabbitMQ Federation is a plugin that forwards messages from exchanges or queues on one broker (the upstream) to another, independent broker (the downstream). This can help keep messages available if one broker fails.

With RabbitMQ Federation, you can configure your RabbitMQ Server to forward messages to a secondary RabbitMQ broker running in a different Availability Zone or region. This secondary broker can serve clients if the primary fails, provided your clients are able to reconnect to it; federation does not fail clients over automatically. For high availability within a single cluster, running several RabbitMQ nodes with replicated (quorum) queues is the more common approach.
