CPU Analysis - Locating Dominant Processes
Linux
To determine which process is hogging CPU, use 'top':
top - 23:50:16 up 3:25, 1 user, load average: 0.00, 0.00, 0.00
Tasks: 88 total, 1 running, 87 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0% us, 0.0% sy, 0.0% ni, 100.0% id, 0.0% wa, 0.0% hi, 0.0% si
Mem: 2055112k total, 227684k used, 1827428k free, 53556k buffers
Swap: 2096472k total, 0k used, 2096472k free, 100884k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1 root 16 0 4876 596 500 S 0.0 0.0 0:00.78 init
2 root RT 0 0 0 0 S 0.0 0.0 0:00.00 migration/0
3 root 34 19 0 0 0 S 0.0 0.0 0:00.00 ksoftirqd/0
4 root RT 0 0 0 0 S 0.0 0.0 0:00.00 migration/1
5 root 34 19 0 0 0 S 0.0 0.0 0:00.00 ksoftirqd/1
6 root RT 0 0 0 0 S 0.0 0.0 0:00.00 migration/2
7 root 34 19 0 0 0 S 0.0 0.0 0:00.00 ksoftirqd/2
8 root RT 0 0 0 0 S 0.0 0.0 0:00.00 migration/3
9 root 34 19 0 0 0 S 0.0 0.0 0:00.00 ksoftirqd/3
10 root 10 -5 0 0 0 S 0.0 0.0 0:00.00 events/0
11 root 10 -5 0 0 0 S 0.0 0.0 0:00.00 events/1
12 root 10 -5 0 0 0 S 0.0 0.0 0:00.00 events/2
'ps aux --sort pcpu' gives a one-shot snapshot at the command prompt. It sorts
by CPU usage in ascending order, so the heaviest consumer is on the last line:
# ps aux --sort pcpu
....
db2fenc1 5849 0.0 0.1 319016 9140 ? S Aug10 0:10 db2fmp ,1,0,0,0,
0,0,0,1e014,2,0,1,69fe0,0x12ac0000,0x12ac0000,eaa4
db2inst1 7462 0.0 0.4 369432 33408 ? S Aug11 4:34 db2agent (idle)
db2inst1 30723 0.0 0.4 370592 37540 ? S Sep04 14:04 db2agent (idle)
root 5640 0.0 0.0 1500 428 tty1 S Sep26 0:00 /sbin/mingetty tty1
root 25189 0.0 0.0 1796 504 ? S Oct12 0:00 /usr/sbin/vsftpd /etc/vsftpd/vsftpd.conf
root 15515 0.0 0.0 6872 2188 ? S 15:36 0:00 sshd: root@pts/1
root 15518 0.0 0.0 4252 1388 pts/1 S 15:36 0:00 -bash
root 15765 0.0 0.0 2724 776 pts/1 R 16:03 0:00 ps aux --sort pcpu
root 1554 43.0 0.0 194024 3744 ? R Aug08 73330:27 /usr/local/staf/bin/STAFProc
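Because the listing above is sorted ascending, the busiest process is easy to miss at the bottom of a long list. The following is a minimal sketch that puts the top consumers first. It assumes 'ps aux'-style input in which the third column is %CPU; the function name top_by_cpu is hypothetical and not part of any tool.

```shell
# Hypothetical helper: read `ps aux` output on stdin and print the header
# line followed by the N processes with the highest %CPU (column 3),
# busiest first. N defaults to 5.
# The body runs in a subshell so its variables do not leak into the caller.
top_by_cpu() (
    n="${1:-5}"
    input=$(cat)
    # Header line first.
    printf '%s\n' "$input" | head -n 1
    # Remaining lines, sorted numerically on column 3, highest first.
    printf '%s\n' "$input" | tail -n +2 | LC_ALL=C sort -k3,3nr | head -n "$n"
)
```

Usage: 'ps aux | top_by_cpu 3'. Since the sorting is done outside of 'ps', the same pipeline should also work where 'ps' has no '--sort' option.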
If the process is healthy (running without errors), your system may simply be
at peak load, and additional load is likely to overwhelm it, so it is time to
increase CPU capacity. If you believe the workload is not heavy, then your
application may need to be tuned. See "Code Tuning" for the next step.
Windows
Windows Task Manager can be used to monitor your system's CPU usage. To start
it, choose Start -> Run, type 'taskmgr', and press <Enter>. Open the
'Processes' tab and click the 'CPU' column header to sort by CPU usage, so
that the process using the most CPU is at the top.
To automate capturing CPU usage to a file with the 'perfmon' tool that comes
with Windows 2003, see the "perf automation" section in the left navigation.
AIX
'ps aux' shows CPU usage with the heaviest consumers listed first. Pipe the
output through 'more' to see the top ones.
# ps aux |more
USER PID %CPU %MEM SZ RSS TTY STAT STIME TIME COMMAND
root 53274 12.4 0.0 40 40 - A Nov 28 8678:15 wait
root 45078 12.4 0.0 40 40 - A Nov 28 8677:24 wait
root 69666 12.4 0.0 40 40 - A Nov 28 8677:08 wait
root 61470 12.4 0.0 40 40 - A Nov 28 8669:01 wait
root 65568 12.4 0.0 40 40 - A Nov 28 8659:51 wait
root 8196 12.4 0.0 40 40 - A Nov 28 8656:59 wait
root 49176 12.4 0.0 40 40 - A Nov 28 8654:17 wait
root 57372 12.4 0.0 40 40 - A Nov 28 8633:34 wait
root 90280 0.1 0.0 492 500 - A Nov 28 55:12 /usr/sbin/syncd
root 1716352 0.0 8.0 1154040 1154072 - A Dec 02 3:47 /usr/WebSphere/
db2inst1 1143036 0.0 0.0 1044 460 - A Dec 02 2:11 db2disp 0
root 0 0.0 0.0 64 64 - A Nov 28 4:55 swapper
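In the listing above, the 'wait' entries are the AIX kernel's idle processes, so the CPU they accumulate is idle time rather than a workload. The following is a minimal sketch for hiding them so that real consumers stand out. It assumes the command name is the last field on those lines; skip_idle is a hypothetical helper name.

```shell
# Hypothetical helper: drop AIX idle ("wait") kernel processes from
# `ps aux` output read on stdin, always keeping the header line.
skip_idle() {
    awk 'NR == 1 || $NF != "wait"'
}
```

Usage: 'ps aux | skip_idle | more'. For the sample output above, the first process shown would then be /usr/sbin/syncd.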
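A better CPU profiling tool is 'tprof', available in recent AIX versions. It
profiles CPU usage over a given interval and reports the top CPU-consuming
programs and their associated libraries. The profile covers the lifetime of
the command passed with '-x' (here 'sleep 60', i.e. 60 seconds), and the
report is written to a file named after that command (sleep.prof). While your
application is running under load, do the following: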
# tprof -x sleep 60
Sun Dec 4 15:55:01 2005
System: AIX 5.3 Node: pw101 Machine: 002FBF7D4C00
Starting Command sleep 60
stopping trace collection.
Generating sleep.prof
# more sleep.prof
Configuration information
=========================
System: AIX 5.3 Node: pw101 Machine: 002FBF7D4C00
Tprof command was:
tprof -x sleep 60
Trace command was:
/usr/bin/trace -ad -L 1000000 -T 500000 -j 000,001,002,003,38F,005,006,134,139,5A2,465,00A,234 -o -
Total Samples = 48008
Total Elapsed Time = 60.01s
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
Process Freq Total Kernel User Shared Other
======= ==== ===== ====== ==== ====== =====
wait 8 99.69 99.69 0.00 0.00 0.00
/usr/bin/tprof 1 0.14 0.00 0.00 0.13 0.00
/home/db2inst1/sqllib/bin/db2fm
6 0.04 0.02 0.00 0.02 0.00
/etc/syncd 2 0.03 0.03 0.00 0.00 0.00
/home/dasusr1/das/bin/db2fm 6 0.01 0.01 0.00 0.00 0.00
db2hmon 1 0.01 0.01 0.00 0.00 0.00
db2agent 1 0.01 0.01 0.00 0.00 0.00
/home/db2inst1/sqllib/adm/db2set
6 0.01 0.01 0.00 0.00 0.00
/usr/bin/sh 6 0.01 0.01 0.00 0.00 0.00
r/WebSphere/AppServer/java/bin//java 3 0.01 0.01 0.00 0.00 0.00
db2disp 1 0.00 0.00 0.00 0.00 0.00
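In the report above, almost all samples land in the 'wait' row, which is idle time, so this machine was essentially idle during the profile. The following is a minimal sketch for reading that off the report. It assumes the process summary has the seven-column layout shown above, with the 'wait' row's third column holding its Total percentage; busy_percent is a hypothetical helper name.

```shell
# Hypothetical helper: read a tprof report on stdin and print the share of
# CPU samples that were NOT idle, i.e. 100 minus the Total column of the
# first seven-field "wait" row in the process summary.
busy_percent() {
    awk '$1 == "wait" && NF == 7 { printf "%.2f\n", 100 - $3; exit }'
}
```

Usage: 'busy_percent < sleep.prof'. For the report above this prints 0.31, meaning less than half a percent of the samples were spent doing actual work.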
If the process is healthy (running without errors), your system may simply be
at peak load, and additional load is likely to overwhelm it, so it is time to
increase CPU capacity. You can also change the priority at which the busy
application runs by using the 'nice' and 'renice' commands, although this is
not done very often. Alternatively, if you believe the workload is not heavy
but CPU usage is high, then your application may need to be tuned. See
"Code Tuning" for the next step.
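As a sketch of the 'renice' approach mentioned above: raising a running process's nice value lowers its scheduling priority. The wrapper name deprioritize and the default value of 10 are assumptions chosen for illustration. Note also that implementations differ on whether '-n' takes an increment or an absolute nice value; for a process currently at nice 0 the result is the same either way.

```shell
# Hypothetical wrapper: lower the scheduling priority of a running process
# by raising its nice value.
#   $1 = PID of the process
#   $2 = nice value to pass to renice (default 10, an arbitrary choice)
deprioritize() {
    renice -n "${2:-10}" -p "$1"
}
```

Usage: 'deprioritize <PID>', where <PID> is taken from the 'ps' output. Lowering the priority of your own processes normally needs no special privileges; raising it again typically requires root.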
Solaris
Information coming soon. (If you have useful information on this topic,
please send it to info@performancewiki.com. Thanks.)