What node sizes for system node and high traffic webserver node in a small cluster?

Tomasz Rozwadowski 20 Reputation points
2023-10-30T13:28:06.4866667+00:00

I am trying to figure out which node sizes will meet the minimum requirements in my case. There seem to be so many variables affecting the stability of the cluster that it is hard to tell what causes the issues. First, I think I need to make sure that the basic node size configuration is done correctly.

The cluster consists of 7 node pools, each with a single Standard_B2ms node and no autoscaling. I understand that the B series is intended for development and testing, but I am trying to cut costs where possible. What runs on these nodes is mostly nothing extraordinary - pretty simple Node.js webservers communicating with CosmosDB and ADLS. One node is more complicated.

What is the problem?

The system node is using 80% CPU and 40% memory, which does not seem like a good situation. What node size would be a sensible upgrade for the system node?

The more complicated node seems to have memory issues. It hosts a Node.js microservice that acts as a proxy for downloading many small file chunks from ADLS, sometimes processing up to 10k requests per second, so I assume this is not a low-traffic scenario. The real problem is that the memory allocated for the file chunk buffers returned from ADLS does not seem to be released fast enough, but it is really hard to observe what is actually happening in the cluster.
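
Simplified, the proxy does something like this (a sketch only, not the actual code - the SDK usage, route and environment variable names are just for illustration):

// Simplified sketch of the chunk proxy, assuming @azure/storage-file-datalake and Express.
import express from "express";
import { DefaultAzureCredential } from "@azure/identity";
import { DataLakeServiceClient } from "@azure/storage-file-datalake";

const service = new DataLakeServiceClient(
  `https://${process.env.ADLS_ACCOUNT}.dfs.core.windows.net`, // illustrative env var
  new DefaultAzureCredential()
);

const app = express();

// GET /chunk?path=...&offset=...&length=... returns one small chunk of an ADLS file
app.get("/chunk", async (req, res) => {
  const { path, offset, length } = req.query as Record<string, string>;
  const file = service
    .getFileSystemClient(process.env.ADLS_FILESYSTEM!) // illustrative env var
    .getFileClient(path);

  // read() returns a Node.js stream; the chunk is collected into a Buffer before
  // being sent back, which is where the memory seems to pile up under heavy load.
  const result = await file.read(Number(offset), Number(length));
  const parts: Buffer[] = [];
  for await (const part of result.readableStreamBody!) {
    parts.push(part as Buffer);
  }
  res.type("application/octet-stream").send(Buffer.concat(parts));
});

app.listen(3000);

Every request allocates a fresh Buffer for its chunk, and under sustained load those buffers appear to outlive the requests that created them.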

That observation led to enabling Container Insights, Prometheus and Grafana, but it is possible that the monitoring itself has started causing problems.

There is also a recurring error, which I do not understand, that might be connected to the high-traffic node:

node-problem-detector-startup.sh[632]: I1027 01:39:57.732252     632 custom_plugin_monitor.go:280] New status generated: &{Source:kubelet-custom-plugin-monitor Events:[{Severity:info Timestamp:2023-10-27 01:39:57.732212674 +0000 UTC m=+125764.447568175 Reason:KubeletIsUp Message:Node condition KubeletProblem is now: Unknown, reason: KubeletIsUp, message: "Timeout when running plugin \"/etc/node-problem-detector.d/plugin/check_kubelet.s"}] Conditions:[{Type:KubeletProblem Status:Unknown Transition:2023-10-27 01:39:57.732212674 +0000 UTC m=+125764.447568175 Reason:KubeletIsUp Message:Timeout when running plugin "/etc/node-problem-detector.d/plugin/check_kubelet.s}]}
node-problem-detector-startup.sh[632]: E1027 01:39:57.529185     632 plugin.go:186] Error in running plugin timeout "/etc/node-problem-detector.d/plugin/check_runtime.sh"

I would appreciate any hints and directions as it is obvious I do not even know what questions to ask.

Azure Kubernetes Service (AKS)

Accepted answer
  Goncalo Correia 346 Reputation points Microsoft Employee
    2023-10-30T17:17:06.42+00:00

    Hi @Tomasz Rozwadowski
    Thanks for posting your question,

    A few points that I hope help:

    The B series SKUs are indeed not ideal for AKS nodes, as they do not give you consistent behaviour in terms of resource availability (they are burstable), so you can run into issues in your application that are harder to replicate or troubleshoot later.

    For system node pools, the recommendation is at least 4 vCPUs, so Standard_DS4_v2 is a common choice, but it will come down to the size of your cluster. 80% CPU usage is not ideal; it can easily spike to full utilisation and leave processes waiting for CPU time. I would suggest upgrading that node pool at least in vCPU count.

    For the latter scenario, if you are not already using them, I would consider VMs with ephemeral OS disks to make sure IO is not the bottleneck, and then re-evaluate the memory usage on the nodes to see whether a larger SKU is needed. Keep in mind the differences between the RSS and working set memory metrics; check the links below for more information.

    https://learn.microsoft.com/en-us/azure/virtual-machines/ephemeral-os-disks

    https://learn.microsoft.com/en-us/azure/azure-monitor/containers/container-insights-analyze#:~:text=Memory%20working%20set%20shows,to%20memory%20when%20needed.
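
    To relate those metrics to what the Node.js process itself is holding, you could also log process.memoryUsage() from inside the proxy service, along these lines (a minimal sketch; the interval and log format are arbitrary):

    // Log the process's own memory breakdown every 10 seconds so RSS can be
    // compared with the working set reported by Container Insights.
    setInterval(() => {
      const m = process.memoryUsage();
      console.log(JSON.stringify({
        rss: m.rss,                   // resident set size as seen by the OS
        heapUsed: m.heapUsed,         // V8 heap actually in use
        external: m.external,         // memory of C++-backed objects, including Buffers
        arrayBuffers: m.arrayBuffers, // Buffer/ArrayBuffer allocations specifically
      }));
    }, 10_000);

    If external/arrayBuffers stays high between traffic spikes, the file chunk buffers are likely the part of the working set that is not being released quickly.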

    You might also want to consider using affinity and anti-affinity rules to better distribute your resource-intensive workloads, e.g. to make sure that two pods from deployments that use a lot of memory are not scheduled on the same node.

    Lastly, regarding that error log, I believe you are correct. That node-problem-detector-startup.sh output is standard behaviour on AKS nodes when the kubelet process is misbehaving. If you are running into high resource usage, that is the most likely cause, and it should clear up after you adjust your node resources. To help avoid that scenario in the future, you should also set memory/CPU limits on your workloads.

    If you still have issues after following those recommendations, I would suggest opening a support ticket with Microsoft for a deeper investigation.


0 additional answers
