Random Observations (1) - top(1) application CPU usage definition
Posted by Inferno Nettverk A/S, Norway on Sat Jan 4 02:38:22 MET 2014
This blog posting contains some random observations made during software development, regarding problems that were not easily solved by searching the Internet for solutions explaining what was going on. Perhaps these entries will help others in a similar situation by making the information more available.

Question: What does the 'top' application CPU percentage value include?
The 'top' program is a useful tool that shows what is currently happening on a system, including CPU usage, memory consumption, etc., for example like this:
    CPU0 states:  1.7% user,  0.0% nice,  0.6% system,  0.1% interrupt, 97.6% idle
    CPU1 states:  3.3% user,  0.0% nice,  1.8% system,  0.0% interrupt, 94.9% idle
    Memory: Real: 434M/721M act/tot Free: 251M Cache: 175M Swap: 0K/3318M

      PID USERNAME PRI NICE  SIZE   RES STATE     WAIT      TIME    CPU COMMAND
     2016 user       2    0  345M  327M sleep/1   poll      4:32 14.70% firefox
One point to note is that the CPU0/CPU1 lines break the total CPU usage down into user, nice, system, interrupt and idle time, while only a single value is shown for each application. So the question becomes: does the application CPU percentage value show only the CPU time spent in the application itself, or does it also include system and interrupt CPU time?
Trying to answer this question proved to be more difficult than expected. If this was described in the top(1) manual page, it was not found by a quick look through it. Turning to search engines, a large number of pages describing the output of top(1) were found, but none that answered the question at hand.
This does not mean that the information is nowhere to be found, but enough time had been spent searching without success that consulting the application source code seemed the simplest way to get a more definite answer.
A look at the top(1) implementation in OpenBSD turned up the following line, in the function format_next_process():
http://www.openbsd.org/cgi-bin/cvsweb/src/usr.bin/top/machine.c?rev=1.63
    cputime = (pp->p_uticks + pp->p_sticks + pp->p_iticks) / stathz;

With these defined as:

    u_int64_t p_uticks; /* U_QUAD_T: Statclock hits in user mode. */
    u_int64_t p_sticks; /* U_QUAD_T: Statclock hits in system mode. */
    u_int64_t p_iticks; /* U_QUAD_T: Statclock hits processing intr. */
From this it looks like the application CPU usage value on OpenBSD is calculated based on the user, system and interrupt time values.
For Linux, finding the source code used by top was somewhat more difficult, but the following appeared to be a likely candidate:
https://gitorious.org/procps/procps/blobs/edba932a7e9b950dd91bc486e107788e977a5186/top/top.c
The function procs_hlp() contained the following comments and code:
    /* calculate time in this process; the sum of user time (utime) and
       system time (stime) -- but PLEASE dont waste time and effort on
       calcs and saves that go unused, like the old top! */
    PHist_new[Frame_maxtask].pid  = this->tid;
    PHist_new[Frame_maxtask].tics = tics = (this->utime + this->stime);

Where these values are defined as follows:

    utime, // stat user-mode CPU time accumulated by process
    stime, // stat kernel-mode CPU time accumulated by process
In other words, both top versions appear to include the time spent in kernel and in user space for the application.
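As a rough illustration of how a summed tick count can turn into a percentage, the sketch below compares two snapshots of utime + stime taken some time apart. This is not the procps code: struct cpu_sample, cpu_percent() and the ticks_per_second parameter are assumptions made for this example (on Linux the tick rate would typically come from sysconf(_SC_CLK_TCK)).

```c
#include <assert.h>

/* Hypothetical snapshot of the two counters for one process. */
struct cpu_sample {
    unsigned long long utime; /* user-mode ticks accumulated so far */
    unsigned long long stime; /* kernel-mode ticks accumulated so far */
};

/*
 * Percentage of one CPU used between two snapshots. User and kernel
 * ticks are summed first, so time spent in system calls counts toward
 * the result exactly like time spent in user space.
 */
double cpu_percent(const struct cpu_sample *prev,
                   const struct cpu_sample *cur,
                   double elapsed_seconds, long ticks_per_second)
{
    assert(elapsed_seconds > 0.0 && ticks_per_second > 0);

    unsigned long long before = prev->utime + prev->stime;
    unsigned long long after = cur->utime + cur->stime;
    /* Guard against counters that appear to go backwards. */
    unsigned long long delta = after >= before ? after - before : 0;

    return 100.0 * (double)delta /
           ((double)ticks_per_second * elapsed_seconds);
}
```

A process that gains 10 user ticks and 290 kernel ticks over 3 seconds at 100 ticks per second comes out at 100%, even though almost none of that time was spent in user mode.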
For a final confirmation, a test using the following command, which should primarily result in system time, was done on both an OpenBSD and a Linux machine:
dd if=/dev/zero of=/dev/null bs=65536
OpenBSD:

    CPU0 states:  2.0% user,  0.0% nice, 46.3% system,  0.0% interrupt, 51.7% idle
    CPU1 states:  4.0% user,  0.0% nice, 48.4% system,  0.0% interrupt, 47.6% idle
    Memory: Real: 406M/737M act/tot Free: 235M Cache: 191M Swap: 0K/3318M

      PID USERNAME PRI NICE  SIZE   RES STATE     WAIT      TIME    CPU COMMAND
    22805 user      64    0  376K  232K onproc/0  -         0:37 83.35% dd
Linux:

    Cpu(s):  0.2%us, 12.8%sy,  0.0%ni, 87.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Mem:  16330524k total,  7869052k used,  8461472k free,   464240k buffers
    Swap: 18563064k total,        0k used, 18563064k free,  6716724k cached

      PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
    29937 user      20   0  102m  724  564 R 99.7  0.0   0:13.40 dd
Both show a high CPU usage value for the application and a low user CPU time on the machine.
In other words, the application CPU time appears to be the sum of both user and system CPU time, which seems logical.
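The same experiment can be approximated from inside a program. The sketch below is an assumption-laden stand-in for the dd test, not part of either top implementation: copy_zero_to_null() and cpu_split() are hypothetical helpers, the first mimicking dd with a 65536-byte block size and the second reading the calling process's user and system CPU time through getrusage().

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/resource.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Copy `blocks` blocks of 65536 bytes from /dev/zero to /dev/null.
 * Returns the number of bytes copied, or -1 on error.
 */
long long copy_zero_to_null(long blocks)
{
    static char buf[65536];
    long long total = 0;
    int in = open("/dev/zero", O_RDONLY);
    int out = open("/dev/null", O_WRONLY);

    if (in < 0 || out < 0) {
        if (in >= 0)
            close(in);
        if (out >= 0)
            close(out);
        return -1;
    }
    for (long i = 0; i < blocks; i++) {
        ssize_t n = read(in, buf, sizeof buf);
        if (n <= 0 || write(out, buf, (size_t)n) != n) {
            total = -1;
            break;
        }
        total += n;
    }
    close(in);
    close(out);
    return total;
}

/* Store the user and system CPU seconds used so far by this process. */
int cpu_split(double *user_s, double *sys_s)
{
    struct rusage ru;

    assert(user_s != NULL && sys_s != NULL);
    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    *user_s = (double)ru.ru_utime.tv_sec + ru.ru_utime.tv_usec / 1e6;
    *sys_s = (double)ru.ru_stime.tv_sec + ru.ru_stime.tv_usec / 1e6;
    return 0;
}
```

Calling cpu_split() before and after a large copy_zero_to_null() run and comparing the two differences should show most of the added time landing in the system figure, which is the portion a user-only percentage would leave out.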
In hindsight, simply doing the final test using dd would likely have been quicker than going through all the steps above, but that's life I guess...