How can I access the core-dump file if a container in AKS crashes?

Simon Schick (ITC Service) 106 Reputation points
2020-05-25T14:27:43.67+00:00

Linux allows me to create a core dump if an application crashes, which contains the data the application had in memory at the time of the crash. This is very valuable for post-mortem debugging (https://en.wikipedia.org/wiki/Core_dump).

While trying to run my containers on an AKS cluster, I discovered that the VMs of an AKS cluster do not set a custom path for saving core dumps; instead, the default core_pattern pipes them to /usr/share/apport/apport %p %s %c. Pipes are executed on the host system (see https://unix.stackexchange.com/questions/565788/core-file-may-or-may-not-escape-docker-container-depending-on-the-core-pattern), so I would like to know what the values on each VM are and how I can pick up these core-dump files.
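For reference, the kernel treats a core_pattern beginning with `|` as a pipe to a helper process that runs on the host. A minimal sketch of that distinction (the pattern string is hard-coded here to the Ubuntu/apport default quoted above; on a live node you would read the real value from /proc/sys/kernel/core_pattern):

```shell
# Hard-coded to the Ubuntu/apport default mentioned above; on a live node
# read the real value with: cat /proc/sys/kernel/core_pattern
pattern='|/usr/share/apport/apport %p %s %c'

# A leading '|' tells the kernel to pipe the dump into a helper process that
# runs on the HOST, in the host's mount namespace - not in the container.
case "$pattern" in
  \|*) handler="${pattern#|}"
       echo "piped to host-side helper: ${handler%% *}" ;;
  *)   echo "written to file path: $pattern" ;;
esac
```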

According to a stackoverflow answer (https://stackoverflow.com/questions/58715733/export-memory-dump-azure-kubernetes) this should be somewhere in the docs of Azure, but I couldn't find any reference.

Please tell me where AKS stores the core-dump files of my containers and how I can access them.

Azure Kubernetes Service (AKS)

Accepted answer
  1. Simon Schick (ITC Service) 106 Reputation points
    2020-06-05T08:27:12.04+00:00

    For my purposes I resolved it by:

    1. Creating a DaemonSet which runs a privileged container that shares the host's PID namespace.
    2. Using nsenter to enter the host's mount namespace.
    3. Executing a command which changes the core_pattern file.
    4. Waiting forever, because a DaemonSet must have its restart policy set to Always.

    This instruction was adapted from: https://medium.com/@…/initialize-your-aks-nodes-with-daemonsets-679fa81fd20e

    Here's the final configuration:

       apiVersion: apps/v1
       kind: DaemonSet
       metadata:
         name: init-node
       spec:
         selector:
           matchLabels:
             job: init-node
         template:
           metadata:
             labels:
               job: init-node
           spec:
             hostPID: true
             restartPolicy: Always
             containers:
             - image: alpine:3.12.0
               name: init-node
               securityContext:
                 privileged: true
               command: ["/bin/sh"]
               args: ["-c", "nsenter -t 1 -m -- su -c \"echo \\\"/core/core.%e.%p.%t\\\" > /proc/sys/kernel/core_pattern\" && sleep infinity"]

    Since pipes are executed in the host's root namespace, I'll see if I can create a script which writes the data stream it receives on stdin to a known location. That way I can have a directory mounted on the host system (ideally an Azure Files share) where this helper writes the core dump. The dump would then be saved outside the container, which is actually the best scenario for me.

    EDIT:
    The last part was the easiest. Just set /proc/sys/kernel/core_pattern to |/bin/dd of=/core/%h.%e.%p.%t and your core dumps will get piped to /core on the host. Now mount an Azure Files share (https://learn.microsoft.com/en-us/azure/virtual-machines/linux/mount-azure-file-storage-on-linux-using-smb) on this location and you're ready to go.
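To make the mechanics concrete, here is a sketch of what the kernel does with such a pipe pattern. The file names and directory are made up for the demo; on a real node you would write the pattern as root, exactly as shown above, with the destination being the mounted share:

```shell
# Simulate the kernel's side of the pipe: it expands the % specifiers
# (%h = hostname, %e = executable, %p = PID, %t = timestamp) and streams the
# dump's bytes into dd's stdin. A temp dir stands in for /core here.
dest=$(mktemp -d)

# Pretend this is the core-dump byte stream coming from the kernel, with the
# specifiers already expanded to hostname.executable.pid.timestamp:
printf 'fake-core-bytes' | /bin/dd "of=$dest/aks-node.myapp.1234.1700000000" 2>/dev/null

cat "$dest/aks-node.myapp.1234.1700000000"   # -> fake-core-bytes
```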

    This kills quite a few birds with one stone:

    1. Your core dumps are not left on the container after it crashes. All sensitive information is stored in a different place which is not accessible if a container is compromised. Be careful where you store them.
    2. If you do it like me, they're not stored on the host either, so you're free to scale your cluster up and down without worrying about losing information.

    DISCLAIMER:
    nsenter enters a different namespace and executes commands in it. Therefore this solution might only work with container runtimes that, like Docker, use Linux namespaces and cgroups to run the containers.


3 additional answers

  1. vipullag-MSFT 16,736 Reputation points Microsoft Employee
    2020-06-04T13:45:30.39+00:00

    @Simon Schick (ITC Service)

    Apologies for the delayed response.

    To see a core dump, the first thing to check is that core dumps are enabled on the AKS worker nodes. To do that, log into the AKS worker nodes and check the core-dump setting at /proc/sys/kernel/core_pattern. It shows where core dumps are configured to be written.

    After checking that, log into the pod to see the directory of the core dump (to check how it maps to the host file system).

    The default setting is to write the core dump to the home directory of the user (which could be totally different on AKS worker nodes). If the /tmp folder of the pod is mapped to the local filesystem of the worker node, it can be configured to write core dumps there, which can then be extracted from the worker node.
    Some reference links: Apport (Ubuntu) and Stack Overflow.

    The default images managed by Azure might have core dumps disabled (considering the core-dump file size). However, there are ways to use a custom image for worker nodes (ref Stack Overflow). Also, you can have a volume which maps a local directory on the node into the pod, so you can capture dumps in the mapped directory inside the pod.
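A minimal sketch of the local-directory mapping described above, using a hostPath volume (all names and paths here are illustrative, not from the thread):

```yaml
# Illustrative pod spec: map a directory on the worker node into the pod so
# dumps written to /core on the host are visible at /core inside the pod.
apiVersion: v1
kind: Pod
metadata:
  name: dump-reader          # name is illustrative
spec:
  containers:
  - name: app
    image: alpine:3.12.0
    command: ["sleep", "infinity"]
    volumeMounts:
    - name: core-dir
      mountPath: /core
  volumes:
  - name: core-dir
    hostPath:
      path: /core            # directory on the node
      type: DirectoryOrCreate
```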

    [Please 'Accept as answer' if it helped, so that it can help others in the community looking for help on similar topics.]


  2. Simon Schick (ITC Service) 106 Reputation points
    2020-06-04T16:52:25.953+00:00

    Thanks for the response.

    According to the value of RLIMIT_CORE, which is set to -1 (unlimited), core dumps are enabled in a Kubernetes pod.
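For anyone reproducing this check: the limit is visible from a shell inside the pod. A quick sketch, assuming the hard limit permits raising the soft limit:

```shell
# RLIMIT_CORE is exposed to the shell as 'ulimit -c'; 'unlimited' corresponds
# to the -1 mentioned above. Raise the soft limit for this shell, then check:
ulimit -c unlimited
ulimit -c   # -> unlimited
```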

    The default value of /proc/sys/kernel/core_pattern on the AKS worker nodes is what Ubuntu ships: |/usr/share/apport/apport %p %s %c, which - because it is a pipe - is evaluated in the global filesystem namespace, according to this thread on the Linux kernel mailing list: https://lists.gt.net/linux/kernel/2322800.

    I've even tried changing the path from a container with privileged access, which should be able to overwrite the setting - but that gave very inconsistent results: sometimes it worked, sometimes it didn't.

    Now - how can I either:

    1. Find out how the machine is configured, i.e. the location where apport saves those files?
    2. Change this path persistently?

  3. Anthony Whalley 1 Reputation point
    2021-09-24T23:42:53.427+00:00

    Hi @Simon Schick (ITC Service)

    I've been working on this problem over at this project.
    https://github.com/IBM/core-dump-handler

    We've done some preliminary testing on AKS and it seems to work OK, but it would be great to get some additional feedback if you have the time.

    Hope this is useful.
