Dedicated CPU resources¶
Certain workloads that require predictable latency and enhanced performance during execution benefit from dedicated CPU resources. KubeVirt, relying on the Kubernetes CPU manager, can pin a guest's vCPUs to the host's pCPUs.
Kubernetes CPU manager¶
The Kubernetes CPU manager is a mechanism that affects the scheduling of
workloads, placing them on a host that can allocate Guaranteed
resources and pinning the Pod's containers to host pCPUs, if the
following requirements are met (a minimal example follows the list):
- the Pod's QoS class is Guaranteed
- resource requests and limits are equal
- all containers in the Pod express CPU and memory requirements
- the requested number of CPUs is an integer
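For reference, a minimal (non-KubeVirt) Pod satisfying these conditions could look like the following sketch; the Pod name and image are placeholders:
apiVersion: v1
kind: Pod
metadata:
  name: pinned-pod                              # placeholder name
spec:
  containers:
    - name: app
      image: registry.example.com/app:latest   # placeholder image
      resources:
        requests:
          cpu: "2"                              # integer CPU count
          memory: 1Gi
        limits:
          cpu: "2"                              # equal to requests, so QoS is Guaranteed
          memory: 1Gi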
Additional information:
- Enabling the CPU manager on Kubernetes
- Enabling the CPU manager on OKD
- Kubernetes blog explaining the feature
Requesting dedicated CPU resources¶
Setting spec.domain.cpu.dedicatedCpuPlacement to true in a VMI spec indicates the desire to allocate dedicated CPU resources to the VMI.
KubeVirt will verify that all the necessary conditions are met for the Kubernetes CPU manager to pin the virt-launcher container to dedicated host CPUs. Once virt-launcher is running, the VMI's vCPUs will be pinned to the pCPUs that have been dedicated to the virt-launcher container.
The desired number of vCPUs for the VMI can be expressed either by setting the guest topology in spec.domain.cpu (sockets, cores, threads) or by setting spec.domain.resources.[requests/limits].cpu to a whole number ([1-9]+) indicating the number of vCPUs requested for the VMI. The number of vCPUs is counted as sockets * cores * threads (for example, 2 sockets * 2 cores * 1 thread yields 4 vCPUs); if spec.domain.cpu is empty, the value is taken from spec.domain.resources.requests.cpu or spec.domain.resources.limits.cpu.
Note: Users should not specify both spec.domain.cpu and spec.domain.resources.[requests/limits].cpu.
Note: spec.domain.resources.requests.cpu must be equal to spec.domain.resources.limits.cpu.
Note: Multiple CPU-bound microbenchmarks show a significant performance advantage when using spec.domain.cpu.sockets instead of spec.domain.cpu.cores.
All inconsistent requirements will be rejected.
apiVersion: kubevirt.io/v1
kind: VirtualMachineInstance
spec:
domain:
cpu:
sockets: 2
cores: 1
threads: 1
dedicatedCpuPlacement: true
resources:
limits:
memory: 2Gi
[...]
OR
apiVersion: kubevirt.io/v1
kind: VirtualMachineInstance
spec:
domain:
cpu:
dedicatedCpuPlacement: true
resources:
limits:
cpu: 2
memory: 2Gi
[...]
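Once the VMI is running, one way to confirm that the virt-launcher container received exclusive CPUs is to inspect the allowed CPU set of its first process. This is only a sketch: the pod name is a placeholder, and the container name compute is an assumption about the current virt-launcher pod layout.
# Placeholder pod name; list pods to find the virt-launcher pod for your VMI
kubectl exec <virt-launcher-pod> -c compute -- cat /proc/1/status | grep Cpus_allowed_list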
Requesting dedicated CPU for QEMU emulator¶
A number of QEMU threads, such as the QEMU main event loop and async I/O
completion handlers, also execute on the same physical CPUs as
the VMI's vCPUs. This may affect the expected latency of a vCPU. In
order to enhance real-time support in KubeVirt and provide improved
latency, KubeVirt will allocate an additional dedicated CPU, exclusively
for the emulator thread, to which it will be pinned. This
effectively "isolates" the emulator thread from the vCPUs of the VMI.
In case ioThreadsPolicy is set to auto, IOThreads will also be
"isolated" and placed on the same physical CPU as the QEMU emulator thread.
This functionality can be enabled by specifying isolateEmulatorThread: true
inside the VMI spec's spec.domain.cpu section. Naturally, this setting has
to be specified in combination with dedicatedCpuPlacement: true.
Example:
apiVersion: kubevirt.io/v1
kind: VirtualMachineInstance
spec:
domain:
cpu:
dedicatedCpuPlacement: true
isolateEmulatorThread: true
resources:
limits:
cpu: 2
memory: 2Gi
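For completeness, a sketch that also sets ioThreadsPolicy: auto (so that IOThreads are placed alongside the emulator thread, as described above) could look as follows. This is an illustrative variant, not an additional required setting:
apiVersion: kubevirt.io/v1
kind: VirtualMachineInstance
spec:
  domain:
    ioThreadsPolicy: auto        # IOThreads share the pCPU dedicated to the emulator thread
    cpu:
      dedicatedCpuPlacement: true
      isolateEmulatorThread: true
    resources:
      limits:
        cpu: 2
        memory: 2Gi
  [...]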
Compute Nodes with SMT Enabled¶
When the following conditions are met:
- The compute node has SMT enabled
- Kubelet's CPU manager policy is set to static with the full-pcpus-only policy option
- The VM is configured to have an even number of CPUs
- dedicatedCpuPlacement and isolateEmulatorThread are enabled
The VM is scheduled, but the virt-launcher pod is rejected by the kubelet with an SMT alignment error event.
In order to address this issue:
- Enable the AlignCPUs feature gate in the KubeVirt CR.
- Add the required annotation to the KubeVirt CR; a sketch follows.
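This is a sketch of the relevant KubeVirt CR fragment. The annotation key shown is taken from the upstream documentation at the time of writing and should be treated as an assumption; verify it against the KubeVirt version in use.
apiVersion: kubevirt.io/v1
kind: KubeVirt
metadata:
  name: kubevirt
  namespace: kubevirt
  annotations:
    # assumed annotation key; check the docs of your KubeVirt release
    alpha.kubevirt.io/EmulatorThreadCompleteToEvenParity: ""
spec:
  configuration:
    developerConfiguration:
      featureGates:
        - AlignCPUs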
KubeVirt will then allocate one or two dedicated CPUs for the emulator threads, so that the total CPU count becomes even.
Identifying nodes with a running CPU manager¶
At this time, Kubernetes does not label nodes that have the CPU manager running on them.
KubeVirt has a mechanism to identify which nodes have the CPU manager
running and to add a cpumanager=true label to them. This label is
removed when KubeVirt identifies that the CPU manager is no longer
running on the node. This automatic identification should be viewed as a
temporary workaround until Kubernetes provides the required
functionality. It must be manually enabled by activating the CPUManager
feature gate in the KubeVirt CR, for example as sketched below.
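A minimal sketch, assuming the KubeVirt CR is named kubevirt and lives in the kubevirt namespace:
apiVersion: kubevirt.io/v1
kind: KubeVirt
metadata:
  name: kubevirt
  namespace: kubevirt
spec:
  configuration:
    developerConfiguration:
      featureGates:
        - CPUManager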
When automatic identification is disabled, a cluster administrator may manually add the above label to all nodes on which the CPU manager is running.
- Nodes' labels are viewable: kubectl describe nodes
- Administrators may manually label a missing node: kubectl label node [node_name] cpumanager=true
Sidecar containers and CPU allocation overhead¶
Note: In order to run sidecar containers, KubeVirt requires the Sidecar feature gate to be enabled in the KubeVirt CR.
According to the Kubernetes CPU manager model, in order for the Pod to
reach the required QoS level, Guaranteed, all containers in the Pod
must express CPU and memory requirements. At this time, KubeVirt often
uses a sidecar container to mount the VMI's registry disk. It also uses a
sidecar container for its hooking mechanism. These additional resources
can be viewed as overhead and should be taken into account when
calculating node capacity.
Note: The current defaults for a sidecar's resources are CPU: 200m and memory: 64M.
Since the CPU resource is not expressed as a whole number, the CPU manager will not attempt to pin the sidecar container to a host CPU.