Noticed something odd today: the CPU idle figure reported by “mpstat” differs from the one “sar” gives, and from what you would expect after reading “prstat”.
Check the following example:
a small script to generate CPU load
root@solaris> cat test
#!/bin/bash
for cpu in 1 ; do
( while true; do true; done ) &
done
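Since the script is started once per processor, a variant that takes the number of loops as an argument and cleans up after itself may be handy. This is only a sketch: burn_cpus and its two arguments (loop count, run time in seconds) are made up for illustration and are not part of the original script.

```shell
#!/bin/bash
# Start COUNT busy loops, let them run for SECONDS, then stop them again.
# burn_cpus is a hypothetical helper name, not taken from the original post.
burn_cpus() {
  count=$1
  seconds=$2
  pids=""
  i=0
  while [ "$i" -lt "$count" ]; do
    ( while true; do true; done ) &     # one busy loop per requested CPU
    pids="$pids $!"                     # remember its PID for the cleanup
    i=$((i + 1))
  done
  sleep "$seconds"
  kill $pids                            # unquoted on purpose: one PID per word
  wait $pids 2>/dev/null || true        # reap the loops, ignore their exit status
  echo "stopped $count busy loops"
}

burn_cpus 2 1
```

Run this way, the load stops by itself, so there are no stray loops left to hunt down with prstat afterwards.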
an output of mpstat prior to starting the load
root@solaris> mpstat
CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
0 724 66 581 720 602 832 40 139 38 0 1783 11 15 0 74
1 670 60 503 703 564 974 47 138 34 0 1598 9 14 0 77
we’ve got two processors, so we run the script twice to make the load more noticeable
root@solaris> ./test
root@solaris> ./test
sar shows 88% user and 12% system time, so 0% is idle
root@solaris> sar 1 1
SunOS solaris 5.10 Generic_125100-02 sun4u 04/26/2007
12:34:29 %usr %sys %wio %idle
12:34:30 88 12 0 0
mpstat says we’ve got around 74 to 77 % idle cpu time… how can that be?
root@solaris> mpstat
CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
0 724 66 581 720 602 832 40 139 38 0 1783 11 15 0 74
1 670 60 503 703 564 974 47 138 34 0 1598 9 14 0 77
prstat shows that we’re using 62% at the moment
root@solaris> prstat
PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
24715 root 2848K 1680K run 55 0 0:00:23 31% test/1
24709 root 2848K 1680K run 50 0 0:00:25 31% test/1
…
Total: 257 processes, 744 lwps, load averages: 1.40, 0.82, 0.70
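The 62% is simply the two test processes added together. A small sketch of that sum, assuming CPU is the second-to-last column and PROCESS/NLWP the last one, as in the output above; sum_test_cpu is a made-up helper name:

```shell
#!/bin/bash
# Add up the CPU column for every prstat row whose process is named "test".
# Assumption: CPU is the second-to-last field and PROCESS/NLWP the last one.
sum_test_cpu() {
  awk '$NF ~ /^test\// {
         sub(/%$/, "", $(NF-1))          # strip the trailing percent sign
         total += $(NF-1)
       }
       END { printf "%d%%\n", total }'
}

sum_test_cpu <<'EOF'
PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
24715 root 2848K 1680K run 55 0 0:00:23 31% test/1
24709 root 2848K 1680K run 50 0 0:00:25 31% test/1
EOF
```

For the two rows above this prints 62%.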
sar now shows 80% user and 20% system time, so again 0% idle
root@solaris> sar 1 1
SunOS solaris 5.10 Generic_125100-02 sun4u 04/26/2007
12:34:52 %usr %sys %wio %idle
12:34:53 80 20 0 0
When checking the man page of “mpstat”, we see the following explanation:
idl – percent idle time
So we’d expect a value closer to 0% than to 100%…
But further down there is the following note:
The sum of CPU utilization might vary slightly from 100 due
to rounding errors in the production of a percentage figure.

The total time used for CPU processing is the sum of usr and
sys output values, reported for user and system operations.
The idl value reports the time that the CPU is idle for any
reason other than pending disk I/O operations.
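So idl only excludes the time spent waiting on pending disk I/O, which on its own does not explain 74 to 77% idle on a fully loaded box. The more likely explanation is that mpstat run without an interval reports averages since boot, which is also why its output is identical before and after starting the load. Useful to know!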
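The rounding note from the man page can be checked against the mpstat rows shown earlier. A minimal sketch, assuming the default column layout where usr, sys, wt and idl are the last four fields; sum_cpu_fields is a made-up helper name:

```shell
#!/bin/bash
# Sum the usr, sys, wt and idl columns of each mpstat data row.
# Assumption: those are the last four fields, as in the output above.
sum_cpu_fields() {
  awk '$1 ~ /^[0-9]+$/ {
         # data rows start with a CPU id, so the header line is skipped
         print $1, $(NF-3) + $(NF-2) + $(NF-1) + $NF
       }'
}

sum_cpu_fields <<'EOF'
CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
0 724 66 581 720 602 832 40 139 38 0 1783 11 15 0 74
1 670 60 503 703 564 974 47 138 34 0 1598 9 14 0 77
EOF
```

Both rows add up to exactly 100, so rounding does not account for the high idl values.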
The man page for sar says you shouldn’t run it with the 2nd argument less than 5 – otherwise you’re just measuring the CPU usage for the sar startup sequence. 😉
It also says you shouldn’t set the first argument to less than 5, but you can take that with a pinch of salt if you like – your figures will just be a bit less accurate.
True,
Man page
New test:
prstat
NPROC USERNAME SWAP RSS MEMORY TIME CPU
93 root 2675M 1647M 20% 8:59:53 97%
…
Total: 227 processes, 1236 lwps, load averages: 30.94, 15.15, 6.47
sar
bash-3.00# sar 5 5
…
08:31:28 %usr %sys %wio %idle
08:31:33 99 1 0 0
08:31:38 100 0 0 0
08:31:43 99 1 0 0
08:31:48 97 3 0 0
08:31:53 97 3 0 0
mpstat
bash-3.00# mpstat -a
SET minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl sze
0 1677 0 4590 807 237 1039 5 114 166 0 2829 1 1 0 98 24
bash-3.00# mpstat -a 5 5
SET minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl sze
0 1677 0 4590 807 237 1039 5 114 166 0 2829 2 1 0 98 24
SET minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl sze
0 1295 0 5824 976 261 1031 600 144 55 0 2738 99 1 0 0 24
SET minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl sze
0 1672 0 5055 954 238 955 602 153 170 0 2552 99 1 0 0 24
SET minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl sze
0 89 0 1435 914 230 895 572 135 19 0 987 100 0 0 0 24
SET minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl sze
0 1449 0 5287 928 227 919 592 145 51 0 2201 99 1 0 0 24
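In the output above only the first report shows 98% idle; the ones after it show 0%. On Solaris the first mpstat report summarizes activity since boot, so it can be dropped before looking at the numbers. A sketch of such a filter, assuming every report starts with a header line beginning with CPU or SET; skip_first_report is a made-up helper name:

```shell
#!/bin/bash
# Print mpstat output without the first report, which covers the time since boot.
# Assumption: every report starts with a header line beginning with CPU or SET.
skip_first_report() {
  awk '$1 == "CPU" || $1 == "SET" { reports++ }   # count the header lines
       reports >= 2'                              # print from the 2nd report on
}

skip_first_report <<'EOF'
SET minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl sze
0 1677 0 4590 807 237 1039 5 114 166 0 2829 2 1 0 98 24
SET minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl sze
0 1295 0 5824 976 261 1031 600 144 55 0 2738 99 1 0 0 24
SET minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl sze
0 1672 0 5055 954 238 955 602 153 170 0 2552 99 1 0 0 24
SET minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl sze
0 89 0 1435 914 230 895 572 135 19 0 987 100 0 0 0 24
SET minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl sze
0 1449 0 5287 928 227 919 592 145 51 0 2201 99 1 0 0 24
EOF
```

Only the four interval reports remain, all with 0 in the idl column, which matches what sar shows. On a live system the same filter would sit at the end of a pipe, for example mpstat -a 5 5 | skip_first_report.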
So you’ve got a good point.