Linux perf

perf, what’s that?

Linux perf is an incredibly powerful tool that can, amongst other things, be used for:

In general, perf can count and/or record the call stacks of your threads when a certain event occurs. These events can be triggered by:

Getting perf to work

Unfortunately, getting perf to work depends on your environment. Below, please find a selection of environments and how to get perf to work there.

Installing perf

Technically, perf is part of the Linux kernel sources, and ideally you’d want a perf version that exactly matches your Linux kernel version. In many cases, however, a “close-enough” perf version will do too. If in doubt, prefer a perf version that’s slightly older than your kernel over one that’s newer.

You can confirm that your perf installation works using perf stat -- sleep 0.1 (if you’re already root) or sudo perf stat -- sleep 0.1.
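To see how close your perf version is to your kernel, you can compare the two version strings. The following is a minimal sketch, not a definitive check: the helper name major_minor is hypothetical, and treating an equal "major.minor" as a match is just one possible reading of “close enough”.

```shell
# Reduce a version string such as "5.10.25-linuxkit" or "perf version 5.10.25"
# to its "major.minor" part. Prints nothing if no such part is found.
major_minor() {
    printf '%s\n' "$1" | sed -n 's/^[^0-9]*\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

kernel="$(major_minor "$(uname -r)")"
if command -v perf >/dev/null 2>&1; then
    # `perf --version` normally prints something like "perf version 5.10.25".
    tool="$(major_minor "$(perf --version 2>/dev/null)")"
    if [ -z "$tool" ]; then
        echo "perf is on PATH but did not report a version"
    elif [ "$tool" = "$kernel" ]; then
        echo "perf $tool matches kernel $kernel"
    else
        echo "perf $tool differs from kernel $kernel; it may still be close enough"
    fi
else
    echo "perf is not installed (kernel is $kernel)"
fi
```

If the versions differ, the perf stat -- sleep 0.1 check above remains the real test of whether the installation works.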

perf on Ubuntu when you can’t match the kernel version

On Ubuntu (and other distributions that package perf per kernel version) you may see an error after installing linux-tools-generic. The error message will look similar to this:

$ perf stat -- sleep 0.1
WARNING: perf not found for kernel 5.10.25

  You may need to install the following packages for this specific kernel:
    linux-tools-5.10.25-linuxkit
    linux-cloud-tools-5.10.25-linuxkit

  You may also want to install one of the following packages to keep up to date:
    linux-tools-linuxkit
    linux-cloud-tools-linuxkit

The best fix for this is to follow what perf says and to install one of the above packages. If you’re in a Docker container, this may not be possible because you’d need to match the kernel version (which is especially difficult in Docker for Mac because it uses a VM). For example, the suggested linux-tools-5.10.25-linuxkit is not actually available.

As a workaround, you can try one of the following options
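One possible workaround is to bypass the version-checking wrapper and call whichever perf binary is installed directly. The sketch below assumes that Ubuntu's linux-tools packages place versioned binaries at /usr/lib/linux-tools/<version>/perf; that path and the helper name find_any_perf are assumptions, so adjust them to what is actually on your system.

```shell
# Print the path of the first executable perf binary below the given directory.
# The default directory is an assumption about Ubuntu's package layout.
find_any_perf() {
    tools_dir="${1:-/usr/lib/linux-tools}"
    for candidate in "$tools_dir"/*/perf; do
        if [ -x "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

if perf_bin="$(find_any_perf)"; then
    echo "Found a perf binary at $perf_bin"
    echo "To use it in the current shell, run: alias perf=$perf_bin"
else
    echo "No perf binary found under /usr/lib/linux-tools"
fi
```

The alias only lasts for the current shell session. A symlink from a directory that comes early in your PATH would be a more permanent variant of the same idea.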

After this, you should be able to use perf stat -- sleep 0.1 (if you’re already root) or sudo perf stat -- sleep 0.1 successfully.

Bare metal

On a bare metal Linux machine, all you need to do is install perf, which should then work in full fidelity.

In Docker (running on bare-metal Linux)

You will need to launch your container with docker run --privileged (don’t run this in production) and then you should have full access to perf (including the PMU).

To validate that perf works correctly, run for example perf stat -- sleep 0.1. Whether you see <not supported> next to some events depends on whether you have access to the CPU’s performance counters (the PMU). In Docker on bare metal this should work, i.e. no <not supported> entries should show up.
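If you'd like to script this check, you can count the <not supported> lines in the perf stat report. This is a rough sketch: the helper name count_unsupported is hypothetical, and a count of 0 only tells you that no event was flagged, not that perf itself ran successfully.

```shell
# Read a `perf stat` report on stdin and print how many lines are marked
# <not supported>. `grep -c` exits non-zero when the count is 0, hence `|| true`.
count_unsupported() {
    grep -c '<not supported>' || true
}

if command -v perf >/dev/null 2>&1; then
    # perf stat writes its report to stderr, so merge stderr into stdout.
    unsupported="$(perf stat -- sleep 0.1 2>&1 | count_unsupported)"
    echo "events reported as <not supported>: $unsupported"
else
    echo "perf is not installed; nothing to check"
fi
```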

Docker for Mac

Docker for Mac is like Docker on bare metal, but with some extra complexity because the Docker containers actually run inside a Linux VM. This makes matching the kernel version difficult.

If you follow the above installation instructions, you should nevertheless get perf to work, but you won’t have access to the CPU’s performance counters (the PMU), so you’ll see a few events show up as <not supported>:

$ perf stat -- sleep 0.1

 Performance counter stats for 'sleep 0.1':

              0.44 msec task-clock                #    0.004 CPUs utilized
                 1      context-switches          #    0.002 M/sec
                 0      cpu-migrations            #    0.000 K/sec
                57      page-faults               #    0.129 M/sec
   <not supported>      cycles
   <not supported>      instructions
   <not supported>      branches
   <not supported>      branch-misses

       0.102869000 seconds time elapsed

       0.000000000 seconds user
       0.001069000 seconds sys

In a VM

In a virtual machine, you would install perf just like on bare metal. Either perf will work just fine with all its features, or it will look similar to what you get on Docker for Mac.

What you need your hypervisor to support (and allow) is “PMU passthrough” or “PMU virtualisation”. VMware Fusion does support PMU virtualisation, which it calls vPMC (VM settings -> Processors & Memory -> Advanced -> Allow code profiling applications in this VM). If you’re on a Mac, this setting is unfortunately only supported up to and including macOS Catalina (and not on Big Sur).

If you use libvirt to manage your hypervisor and VMs, you can use sudo virsh edit your-domain and replace the <cpu .../> XML tag with

<cpu mode='host-passthrough' check='none'/>

to allow the PMU to be passed through to the guest. For other hypervisors, an internet search will usually reveal how to enable PMU passthrough.
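For the libvirt case, you can check whether a domain already uses the host-passthrough CPU mode by inspecting its XML. The sketch below is only a text match on the <cpu> element, not a full XML parse, and the helper name uses_host_passthrough is hypothetical.

```shell
# Read libvirt domain XML on stdin; succeed if the <cpu> element sets
# mode to host-passthrough (single or double quotes are both accepted).
uses_host_passthrough() {
    grep -q "<cpu[^>]*mode=['\"]host-passthrough['\"]"
}

# Demonstration on the <cpu> element shown above.
printf '%s\n' "<cpu mode='host-passthrough' check='none'/>" \
    | uses_host_passthrough && echo "host-passthrough CPU mode is configured"
```

Against a real domain you could feed it the output of sudo virsh dumpxml your-domain, where your-domain is the same placeholder used above.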