perf top for debugging and checking if your app is hogging your CPU
A lot of products today live in the cloud, and when it comes to their performance, developers typically have the knowledge or tools to ensure it is acceptable. However, as a PM I like looking at this too, because it tells me whether performance is degrading after a feature was added.
You can look at the performance of an application by multiple means. You can use synthetic tools like Google Puppeteer and Lighthouse to see the performance of a web application. But what about the server-side code that sits in the background to process and serve this data to your application?
There are some handy local Linux tools that you can use to get a quick idea of how your application is impacting your CPU.
Your applications/programs spend a lot of time on the CPU, on the order of billions of cycles (a billion cycles is normal), and you want to know what your application is really doing and which process is using or impacting your CPU the most.
perf is a well-known Linux utility that helps you get some of these answers quickly. Keep in mind that perf is a weird tool that presents useful information in several different ways, so give yourself some time to use it and understand the output.
Try running $ sudo perf record python
After running it for some time, quit by hitting Ctrl + C
The results are saved in a file called perf.data
To view the results run $ sudo perf report
This will show you the C functions from the CPython interpreter (not the Python functions) and their % usage of the CPU.
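Running perf record against a bare python drops you into the interactive interpreter, so it can help to point it at a script that actually burns some CPU. Here is a minimal sketch of such a workload. The file name busy.py and the function names are made up for illustration, and any CPU-heavy script of your own would do just as well.

```python
# busy.py (hypothetical file name): a deliberately CPU-heavy script to profile.


def is_prime(n: int) -> bool:
    """Trial division; slow on purpose so the profiler has something to sample."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def count_primes(limit: int) -> int:
    """Count the primes strictly below `limit`."""
    return sum(1 for n in range(limit) if is_prime(n))


if __name__ == "__main__":
    # Raise the limit if the run finishes before perf collects enough samples.
    print(count_primes(200_000))
```

You could then try $ sudo perf record python3 busy.py followed by $ sudo perf report. Expect the report to be dominated by interpreter internals rather than is_prime, for the reason described above.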
perf works on any Linux machine. The exact features will vary depending on your kernel version.
When you notice short CPU spikes on your server, there are two things you can do.
First, running $ top will show you the list of all programs and their % usage of the CPU.
You can then run $ perf top
This is just like top but for functions instead of programs. This will help you determine what function in the program is causing the CPU to spike so much.
perf top doesn’t always help, but it’s easy to try and sometimes the results surprise you.
Also check out Flamegraphs by Brendan Gregg if you want to visualize your CPU performance. Follow the instructions on his GitHub to generate a report.
The graph is built from collections (usually thousands) of stack traces sampled from a single program.
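To make "built from stack traces" a bit more concrete, here is a rough sketch of the counting step behind a flame graph. The stack samples and function names are invented for illustration. The "frames joined by semicolons, then a count" layout resembles the folded format that the FlameGraph scripts work with, but this is not their actual code.

```python
from collections import Counter

# Made-up stack samples, outermost frame first. A real profiler would
# collect thousands of these from the running program.
samples = [
    ["main", "handle_request", "parse_json"],
    ["main", "handle_request", "parse_json"],
    ["main", "handle_request", "query_db"],
    ["main", "render"],
]


def fold_stacks(stacks):
    """Count identical stacks, joining the frames of each stack with ';'."""
    return Counter(";".join(stack) for stack in stacks)


def frame_widths(folded):
    """Number of samples in which each call path appears.

    In a flame graph, the width of a box is proportional to this count.
    """
    widths = Counter()
    for stack, count in folded.items():
        frames = stack.split(";")
        for depth in range(1, len(frames) + 1):
            widths[";".join(frames[:depth])] += count
    return widths


folded = fold_stacks(samples)
for stack, count in sorted(folded.items()):
    print(stack, count)
```

The wider a box is in the final graph, the more samples included that call path, which is why a wide box is worth a look when hunting for what is eating the CPU.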
Cache
Your CPU has a small cache on it called the L1 cache that it can access in about 0.5 nanoseconds. That is roughly 200x faster than accessing RAM. If you are trying to do an operation where every 0.001 seconds matters, CPU cache usage matters. But you don’t want anyone to be abusing it.
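A quick back-of-the-envelope calculation shows why this matters. The sketch below uses only the two figures from the paragraph above, and the workload size of one million memory accesses is an assumption picked purely for illustration.

```python
# Figures from the text: ~0.5 ns per L1 access, RAM roughly 200x slower.
L1_ACCESS_NS = 0.5
RAM_SLOWDOWN = 200
ram_access_ns = L1_ACCESS_NS * RAM_SLOWDOWN  # about 100 ns per RAM access

ACCESSES = 1_000_000  # assumed workload size, purely illustrative


def total_ms(per_access_ns: float, accesses: int) -> float:
    """Total time in milliseconds for `accesses` memory accesses."""
    return per_access_ns * accesses / 1e6


l1_ms = total_ms(L1_ACCESS_NS, ACCESSES)
ram_ms = total_ms(ram_access_ns, ACCESSES)
print(f"all L1 hits:      {l1_ms} ms")
print(f"all RAM accesses: {ram_ms} ms")
```

With these numbers, a million accesses take about 0.5 ms when they all hit L1 and about 100 ms when they all go to RAM, so a poor hit rate can easily eat a 0.001 second budget.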
Run $ perf stat
Let it run for a few seconds and then quit by hitting Ctrl + C
This will show you whether programs are using the cache, and how much.
You can also run $ perf stat ls which simply runs the ls command and prints a report at the end.
You can also pass -e to get specific stats.
Your CPU keeps all kinds of counters about what it’s doing. $ perf stat asks it to count things (like L1 cache misses) and reports the results.
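If you want to keep an eye on these counters over time, it helps to get them into a script. The sketch below parses the machine-readable output that perf stat can produce with its -x option. The sample text and its numbers are invented, and the field order (value, unit, event name first) is an assumption based on the perf-stat man page, so check it against the perf version you have.

```python
# Illustrative only: roughly what
#   perf stat -x, -e cache-references,cache-misses,L1-dcache-load-misses ls
# might write to stderr. The numbers are invented.
SAMPLE_OUTPUT = """\
1250000,,cache-references,1020000,100.00,,
48000,,cache-misses,1020000,100.00,,
<not supported>,,L1-dcache-load-misses,0,100.00,,
"""


def parse_count(value):
    """Turn a counter field into a number, or None if perf could not count it."""
    for convert in (int, float):
        try:
            return convert(value)
        except ValueError:
            continue
    return None  # e.g. "<not supported>" or "<not counted>"


def parse_perf_stat_csv(text, sep=","):
    """Return {event_name: count} from `perf stat -x` style output."""
    counts = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(sep)
        if len(fields) < 3:
            continue
        # Assumed layout: value, unit, event name, then fields we ignore here.
        value, _unit, event = fields[0], fields[1], fields[2]
        counts[event] = parse_count(value)
    return counts


counts = parse_perf_stat_csv(SAMPLE_OUTPUT)
miss_ratio = counts["cache-misses"] / counts["cache-references"]
print(counts)
print(f"cache miss ratio: {miss_ratio:.2%}")
```

Which events you can pass to -e depends on your CPU and kernel. Running $ perf list shows what is available on your machine.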