Is DCPI the right tool? -- A DCPI "one pager" / PJD / 1 October 2001

It's important to use the right tool for the right job. This one pager
will help determine if DCPI is the right tool to solve your performance
problem, or if another tool is more appropriate.

DCPI is an excellent tool for identifying and diagnosing CPU-related
performance problems. DCPI can help answer the following kinds of
questions:

  * What program component consumed the most processor cycles? How many?
  * What program component executed the most instructions? How many?
  * How many instructions were retired per cycle and how can that ratio
    be improved?
  * How many instructions were aborted (unsuccessfully executed and
    wasted) by a program component due to a performance issue?
  * Is the ratio of successfully completed instructions (retired
    instructions) to aborted instructions unacceptably low?
  * What are the factors inhibiting performance? Mispredicted branches?
    Delays waiting for memory to return data values?
  * How long does it take a frequently executed load instruction to
    read a memory data value?

Here, we use the word "component" to refer (broadly) to executable
images, procedures, and even individual program instructions or
statements.

The kinds of problems that are best resolved by DCPI are CPU-centric.
Other kinds of performance problems are better addressed by other tools:

  * Have a thread-related problem? Use Visual Threads.
  * Have an I/O- or file system-related problem? Use iostat, vmstat,
    collect, etc.
  * Is the problem network-related? Use netstat.
  * Have a resource scheduling problem? Use iostat, vmstat, collect.
  * Need a call graph? Use Hiprof.

Programmers often have an intuition or hypothesis about the likely
source of a performance issue. For example, if the program does very
little I/O, then the program may be CPU bound. Or, if the program's data
layout has poor cache memory behavior, it may be taking a long time to
read information from memory.
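
The two ratio questions in the list of DCPI questions (instructions
retired per cycle, and retired versus aborted instructions) come down to
simple arithmetic once event counts are available. The following is a
minimal sketch of that arithmetic only. The function names and the
counts are made up for illustration; the code does not read DCPI output
or use any DCPI interface.

```python
def retired_per_cycle(retired, cycles):
    """Instructions retired per processor cycle."""
    if cycles <= 0:
        raise ValueError("cycle count must be positive")
    return retired / cycles


def retired_to_aborted(retired, aborted):
    """Ratio of retired (useful) instructions to aborted (wasted) ones."""
    if aborted == 0:
        # Nothing was aborted, so the ratio is unbounded.
        return float("inf")
    return retired / aborted


# Made-up event counts for one program component (illustration only).
cycles = 1000000
retired = 800000
aborted = 200000

print("retired per cycle: ", retired_per_cycle(retired, cycles))
print("retired to aborted:", retired_to_aborted(retired, aborted))
```

A low value from either function is only a hint that a component
deserves a closer look; what counts as "unacceptably low" depends on the
processor and the program.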
Tru64 UNIX provides a number of utility programs to perform high level
"triage" on performance problems:

    ps      - Display current process status
    time    - Time the execution of a command
    top     - Provide continuous reports on the state of the system
    uptime  - Show how long a system has been running
    vmstat  - Display virtual memory statistics
    w       - Print a summary of current system activity

These utility programs provide a graphical user interface:

    collgui   - A graphical front-end for collect
    dxsysinfo - Monitor system info such as CPU activity, memory,
                swap space...
    pmgr      - Display system statistics (Performance Manager)

The goal is to develop and test a working hypothesis about a performance
issue and then resolve it. The overall assessment process may be as
simple as using the time command to display a summary of resource
utilization by a program:

    % time dilate heart.ima ball out.ima
    3.28u 0.06s 0:04 76% 0+9k 68+64io 2pf+0w

Here, the program "dilate" uses 3.28 seconds of user CPU time to 0.06
seconds of system CPU time. Total CPU time is 3.34 seconds and elapsed
time is four seconds. This evidence suggests that dilate is a
compute-intensive program.

Another approach is to monitor the behavior of a program while it runs
using dxsysinfo, vmstat or collect. The collect command:

    collect -i 1 -s c

displays CPU statistics (-s c) at one second intervals (-i 1). If CPU
utilization remains high while the program executes, it is compute bound
and is a good candidate for further analysis with DCPI.
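
The reading of the time summary described above can be automated when
many runs need to be triaged. The following is a minimal sketch, not
part of DCPI or Tru64 UNIX. It assumes the csh-style summary format
shown in the dilate example. The function names and the two thresholds
are hypothetical and should be tuned to the workload.

```python
import re

# Matches the leading fields of a csh-style time summary, e.g.
#   3.28u 0.06s 0:04 76% 0+9k 68+64io 2pf+0w
TIME_RE = re.compile(
    r"(?P<user>\d+(?:\.\d+)?)u\s+"
    r"(?P<sys>\d+(?:\.\d+)?)s\s+"
    r"(?P<elapsed>\d+(?::\d+)*(?:\.\d+)?)\s+"
    r"(?P<pct>\d+(?:\.\d+)?)%"
)


def parse_time_summary(line):
    """Extract user, system, total CPU and elapsed seconds from a summary."""
    m = TIME_RE.search(line)
    if m is None:
        raise ValueError("not a csh-style time summary: %r" % line)
    user = float(m.group("user"))
    system = float(m.group("sys"))
    # Elapsed time is printed as m:ss or h:mm:ss; fold it into seconds.
    elapsed = 0.0
    for part in m.group("elapsed").split(":"):
        elapsed = elapsed * 60 + float(part)
    return {
        "user": user,
        "system": system,
        "cpu": user + system,
        "elapsed": elapsed,
        "cpu_percent": float(m.group("pct")),
    }


def looks_cpu_bound(summary, min_percent=70.0, min_user_share=0.9):
    """Rough triage: high CPU utilization, mostly spent in user mode.

    Both thresholds are illustrative assumptions, not DCPI guidance.
    """
    if summary["cpu"] == 0:
        return False
    user_share = summary["user"] / summary["cpu"]
    return (summary["cpu_percent"] >= min_percent
            and user_share >= min_user_share)


sample = "3.28u 0.06s 0:04 76% 0+9k 68+64io 2pf+0w"
summary = parse_time_summary(sample)
print(summary)
print("candidate for DCPI:", looks_cpu_bound(summary))
```

For the dilate summary, the sketch reports 3.34 seconds of total CPU
time against four seconds of elapsed time, and flags the run as a
candidate under the assumed thresholds.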