How can I access the core-dump file if a container in AKS crashes?

Simon Schick (ITC Service) 106 Reputation points
2020-05-25T14:27:43.67+00:00

Linux allows me to create a core dump if an application crashes, which contains the data the application had in memory at the time of the crash. This is very valuable for post-mortem debugging (https://en.wikipedia.org/wiki/Core_dump).

While trying to run my containers on an AKS cluster, I discovered that the VMs of an AKS cluster do not set a custom path for saving core dumps; instead, the default core_pattern pipes them to /usr/share/apport/apport %p %s %c. Pipes are executed on the host system (see https://unix.stackexchange.com/questions/565788/core-file-may-or-may-not-escape-docker-container-depending-on-the-core-pattern), so I would like to know what the values on each VM are and how I can pick up these core-dump files.
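For reference, the kernel treats a core_pattern beginning with `|` as a pipe to a helper process that runs on the host. A minimal sketch of that distinction (the pattern string is hard-coded here to the Ubuntu/apport default quoted above; on a live node you would read the real value from /proc/sys/kernel/core_pattern):

```shell
# Hard-coded to the Ubuntu/apport default mentioned above; on a live node
# read the real value with: cat /proc/sys/kernel/core_pattern
pattern='|/usr/share/apport/apport %p %s %c'

# A leading '|' tells the kernel to pipe the dump into a helper process that
# runs on the HOST, in the host's mount namespace - not in the container.
case "$pattern" in
  \|*) handler="${pattern#|}"
       echo "piped to host-side helper: ${handler%% *}" ;;
  *)   echo "written to file path: $pattern" ;;
esac
```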

According to a stackoverflow answer (https://stackoverflow.com/questions/58715733/export-memory-dump-azure-kubernetes) this should be somewhere in the docs of Azure, but I couldn't find any reference.

Please tell me where AKS stores the core-dump files of my containers and how I can access them.

Azure Kubernetes Service (AKS)

Accepted answer
  1. Simon Schick (ITC Service) 106 Reputation points
    2020-06-05T08:27:12.04+00:00

    For my purposes I resolved it by:

    1. Creating a DaemonSet which runs a privileged container that shares the host's PID namespace.
    2. Using nsenter to enter the host's mount namespace.
    3. Executing a command which changes the core_pattern file.
    4. Waiting forever, because a DaemonSet must have its restart policy set to Always.

    This instruction was adapted from: https://medium.com/@…/initialize-your-aks-nodes-with-daemonsets-679fa81fd20e

    Here's the final configuration:

       apiVersion: apps/v1
       kind: DaemonSet
       metadata:
         name: init-node
       spec:
         selector:
           matchLabels:
             job: init-node
         template:
           metadata:
             labels:
               job: init-node
           spec:
             hostPID: true
             restartPolicy: Always
             containers:
             - image: alpine:3.12.0
               name: init-node
               securityContext:
                 privileged: true
               command: ["/bin/sh"]
               args: ["-c", "nsenter -t 1 -m -- su -c \"echo \\\"/core/core.%e.%p.%t\\\" > /proc/sys/kernel/core_pattern\" && sleep infinity"]

    Since pipes are executed in the host's root namespace, I'll see if I can create a script which writes the data stream it receives on stdin to a known location. That way I can have a directory mounted on the host system (ideally an Azure Files share) where this helper writes the core dump. The dump would then be saved outside the container, which is actually the best scenario for me.

    EDIT:
    The last part was the easiest. Just set /proc/sys/kernel/core_pattern to |/bin/dd of=/core/%h.%e.%p.%t and your core dumps will get piped to /core on the host. Now mount an Azure Files share (https://learn.microsoft.com/en-us/azure/virtual-machines/linux/mount-azure-file-storage-on-linux-using-smb) on this location and you're ready to go.
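To make the mechanics concrete, here is a sketch of what the kernel does with such a pipe pattern. The file names and directory are made up for the demo; on a real node you would write the pattern as root, exactly as shown above, with the destination being the mounted share:

```shell
# Simulate the kernel's side of the pipe: it expands the % specifiers
# (%h = hostname, %e = executable, %p = PID, %t = timestamp) and streams the
# dump's bytes into dd's stdin. A temp dir stands in for /core here.
dest=$(mktemp -d)

# Pretend this is the core-dump byte stream coming from the kernel, with the
# specifiers already expanded to hostname.executable.pid.timestamp:
printf 'fake-core-bytes' | /bin/dd "of=$dest/aks-node.myapp.1234.1700000000" 2>/dev/null

cat "$dest/aks-node.myapp.1234.1700000000"   # -> fake-core-bytes
```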

    This kills quite a few birds with one stone:

    1. Your core dumps are not left on the container after it crashes. All sensitive information is stored in a different place which is not accessible if a container is compromised. Be careful where you store them.
    2. If you do it like me, they're not stored on the host either, so you're free to scale your cluster up and down without worrying about losing information.

    DISCLAIMER:
    nsenter enters a different namespace and executes commands in it. Therefore this solution might only work with container runtimes that, like Docker, use Linux namespaces and cgroups to run the containers.


3 additional answers

  1. vipullag-MSFT 16,736 Reputation points Microsoft Employee
    2020-06-04T13:45:30.39+00:00

    @Simon Schick (ITC Service)

    Apologies for the delayed response.

    To see a core dump, the first thing to check is that core dumps are enabled on the AKS worker nodes. To do that, log into the AKS worker nodes and check the core-dump setting at /proc/sys/kernel/core_pattern. It shows where core dumps are configured to be written.

    After checking that, log into the pod to see the directory of the core dump (to check how it maps to the host file system).

    The default setting is to write the core dump to the home directory of the user (which could be totally different on AKS worker nodes). If the /tmp folder of the pod is mapped to the local filesystem of the worker node, it can be configured to write core dumps there, which can then be extracted from the worker node.
    Some reference links: Apport (Ubuntu) and Stack Overflow.

    The default images managed by Azure might have core dumps disabled (considering the core-dump file size). However, there are ways to use a custom image for worker nodes (ref Stack Overflow). Also, you can have a volume which maps a local directory on the node into the pod, so you can capture dumps in the mapped directory inside the pod.
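A minimal sketch of the local-directory mapping described above, using a hostPath volume (all names and paths here are illustrative, not from the thread):

```yaml
# Illustrative pod spec: map a directory on the worker node into the pod so
# dumps written to /core on the host are visible at /core inside the pod.
apiVersion: v1
kind: Pod
metadata:
  name: dump-reader          # name is illustrative
spec:
  containers:
  - name: app
    image: alpine:3.12.0
    command: ["sleep", "infinity"]
    volumeMounts:
    - name: core-dir
      mountPath: /core
  volumes:
  - name: core-dir
    hostPath:
      path: /core            # directory on the node
      type: DirectoryOrCreate
```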

    [Please 'Accept as answer' if it helped, so that it can help others in the community looking for help on similar topics.]


  2. Simon Schick (ITC Service) 106 Reputation points
    2020-06-04T16:52:25.953+00:00

    Thanks for the response.

    According to the value of RLIMIT_CORE, which is set to -1 (unlimited), core dumps are enabled in a Kubernetes pod.
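For anyone reproducing this check: the limit is visible from a shell inside the pod. A quick sketch, assuming the hard limit permits raising the soft limit:

```shell
# RLIMIT_CORE is exposed to the shell as 'ulimit -c'; 'unlimited' corresponds
# to the -1 mentioned above. Raise the soft limit for this shell, then check:
ulimit -c unlimited
ulimit -c   # -> unlimited
```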

    The default value of /proc/sys/kernel/core_pattern on the AKS worker nodes is what Ubuntu ships: |/usr/share/apport/apport %p %s %c, which - because it is a pipe - is evaluated in the global filesystem namespace, according to this thread on the Linux kernel mailing list: https://lists.gt.net/linux/kernel/2322800.

    I've even tried changing the path from a container with privileged access, which should be able to overwrite the setting - but that gave very inconsistent results: sometimes it worked, sometimes it didn't.

    Now - how can I either:

    1. Find out how the machine is configured, i.e. the location where apport saves those files?
    2. Change this path persistently?

  3. Anthony Whalley 1 Reputation point
    2021-09-24T23:42:53.427+00:00

    Hi @Simon Schick (ITC Service)

    I've been working on this problem over at this project.
    https://github.com/IBM/core-dump-handler

    We've done some preliminary testing on AKS and it seems to work OK, but it would be great to get some additional feedback if you have the time.

    Hope this is useful.
