Launch QEMU with strace¶

This guide explains how launch QEMU with a debugging tool in virt-launcher pod. This method can be useful to debug early failures or starting QEMU as a child of the debug tool relying on ptrace. The second point is particularly relevant when a process is operating in a non-privileged environment since otherwise, it would need root access to be able to ptrace the process.

Ephemeral containers are among the emerging techniques to overcome the lack of debugging tool inside the original image. This solution does, however, come with a number of limitations. For example, it is possible to spawn a new container inside the same pod of the application to debug and share the same PID namespace. Though they share the same PID namespace, KubeVirt's usage of unprivileged containers makes it, for example, impossible to ptrace a running container. Therefore, this technique isn't appropriate for our needs.

Due to its security and image size reduction, KubeVirt container images are based on distroless containers. These kinds of images are extremely beneficial for deployments, but they are challenging to troubleshoot because there is no package management, which prevents the installation of additional tools on the flight.

Wrapping the QEMU binary in a script is one practical method for debugging QEMU launched by Libvirt. This script launches the QEMU as a child of this process together with the debugging tool (such as strace or valgrind).

The final part that needs to be added is the configuration for Libvirt to use the wrapped script rather than calling the QEMU program directly.

It is possible to alter the generated XML with the help of KubeVirt sidecars. This allows us to use the wrapping script in place of the built-in emulator.

The primary concept behind this configuration is that all of the additional tools, scripts, and final output files will be stored in a PerstistentVolumeClaim (PVC) that this guide refers to as debug-tools. The virt-launcher pod that we wish to debug will have this PVC attached to it.

PVC:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: debug-tools
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Filesystem
  resources:
    requests:
      storage: 1Gi

In this guide, we'll apply the above concepts to debug QEMU inside virt-launcher using strace without the need of build a custom virt-launcher image.

You can see a full demo of this setup:

How to bring the debug tools and wrapping script into distroless containers¶

This section provides an example of how to provide extra tools into the distroless container that will be supplied as a PVC using a Dockerfile. Although there are several ways to accomplish this, this covers a relatively simple technique. Alternatively, you could run a pod and manually populate the PVC by execing into the pod.

Dockerfile:

FROM quay.io/centos/centos:stream9 as build

ENV DIR /debug-tools
RUN mkdir -p ${DIR}/logs

RUN  yum install  --installroot=${DIR} -y strace && yum clean all

COPY ./wrap_qemu_strace.sh $DIR/wrap_qemu_strace.sh
RUN chmod 0755 ${DIR}/wrap_qemu_strace.sh
RUN chown 107:107 ${DIR}/wrap_qemu_strace.sh
RUN chown 107:107 ${DIR}/logs

The directory debug-tools stores the content that will be later copied inside the debug-tools PVC. We are essentially adding the missing utilities in the custom directory with yum install --installroot=${DIR}}, and the parent image matches with the parent images of virt-launcher.

The wrap_qemu_strace.sh is the wrapping script that will be used to launch QEMU with strace similarly as the example with valgrind.

#!/bin/bash

LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/var/run/debug/usr/lib64 /var/run/debug/usr/bin/strace \
        -o /var/run/debug/logs/strace.out \
        /usr/libexec/qemu-kvm $@

It is important to set the dynamic library path LD_LIBRARY_PATH to the path where the PVC will be mounted in the virt-launcher container.

Then, you will simply need to build the image and your debug setup is ready. The Dockerfle and the script wrap_qemu_strace.sh need to be in the same directory where you run the command.

$ podman build -t debug .

The second step is to populate the PVC. This can be easily achieved using a kubernetes Job like:

apiVersion: batch/v1
kind: Job
metadata:
  name: populate-pvc
spec:
  template:
    spec:
      volumes:
        - name: populate
          persistentVolumeClaim:
            claimName: debug-tools
      containers:
        - name: populate
          image: registry:5000/debug:latest
          command: ["sh", "-c", "cp -r /debug-tools/* /vol"]
          imagePullPolicy: Always
          volumeMounts:
            - mountPath: "/vol"
              name: populate
      restartPolicy: Never
  backoffLimit: 4

The image referenced in the Job is the image we built in the previous step. Once applied this and the job completed, thedebug-tools PVC is ready to be used.

How to start qemu launched by a debugging tool (e.g strace)¶

This part is achieved by using ConfigMaps and a KubeVirt sidecar (more details in the section Using ConfigMap to run custom script).

Configmap:

apiVersion: v1
kind: ConfigMap
metadata:
  name: my-config-map
data:
  my_script.sh: |
    #!/bin/sh
    tempFile=`mktemp --dry-run`
    echo $4 > $tempFile
    sed -i "s|<emulator>/usr/libexec/qemu-kvm</emulator>|<emulator>/var/run/debug/wrap_qemu_strace.sh</emulator>|" $tempFile
    cat $tempFile

The script that replaces the QEMU binary with the wrapping script in the XML is stored in the configmap my-config-map. This script will run as a hook, as explained in full in the documentation for the KubeVirt sidecar.

Once all the objects created, we can finally run the guest to debug.

VMI:

apiVersion: kubevirt.io/v1
kind: VirtualMachineInstance
metadata:
  annotations:
    hooks.kubevirt.io/hookSidecars: '[{"args": ["--version", "v1alpha2"],
    "image":"registry:5000/kubevirt/sidecar-shim:devel",
    "pvc": {"name": "debug-tools","volumePath": "/debug", "sharedComputePath": "/var/run/debug"},
    "configMap": {"name": "my-config-map","key": "my_script.sh", "hookPath": "/usr/bin/onDefineDomain"}}]'
  labels:
    special: vmi-debug-tools
  name: vmi-debug-tools
spec:
  domain:
    devices:
      disks:
      - disk:
          bus: virtio
        name: containerdisk
      - disk:
          bus: virtio
        name: cloudinitdisk
      rng: {}
    resources:
      requests:
        memory: 1024M
  terminationGracePeriodSeconds: 0
  volumes:
  - containerDisk:
      image: registry:5000/kubevirt/fedora-with-test-tooling-container-disk:devel
    name: containerdisk
  - cloudInitNoCloud:
      userData: |-
        #cloud-config
        password: fedora
        chpasswd: { expire: False }
    name: cloudinitdisk

The VMI example is a simply VM instance declaration and the interesting parts are the annotations for the hook: * image refers to the sidecar-shim already built and shipped with KubeVirt * pvc refers to the PVC populated with the debug setup. The name refers to the claim name, the volumePath is the path inside the sidecar container where the volume is mounted while the sharedComputePath is the path of the same volume inside the compute container. * configMap refers to the confimap containing the script to modify the XML for the wrapping script

Once the VM is declared, the hook will modify the emulator section and Libvirt will call the wrapping script instead of QEMU directly.

How to fetch the output¶

The wrapping script configures strace to store the output in the PVC. In this way, it is possible to retrieve the output file in a later time, for example using an additional pod like:

apiVersion: v1
kind: Pod
metadata:
  name: fetch-logs
spec:
  securityContext:
    runAsUser: 107
    fsGroup: 107
  volumes:
    - name: populate
      persistentVolumeClaim:
        claimName: debug-tools
  containers:
    - name: populate
      image: busybox:latest
      command: ["tail", "-f", "/dev/null"]
      volumeMounts:
        - mountPath: "/vol"
          name: populate

It is then possible to copy the file locally with:

$ kubectl cp fetch-logs:/vol/logs/strace.out strace.out