hiprof(1)                                                            hiprof(1)
NAME
hiprof - CPU-time and page-fault call-graph profiler for performance
analysis
SYNOPSIS
hiprof [-flat] [-pthread | -threads] [hiprof-option...] [gprof-option...]
program [argument...]
hiprof { -cycles | -faults } [hiprof-option...] [gprof-option...]
program [argument...]
See the start of the OPTIONS section for details of hiprof options that
may be essential for the correct execution of the program.
The atom -tool hiprof interface is still available, for compatibility
with earlier releases. However, it is now undocumented, and it will be
retired in a future release.
DESCRIPTION
See prof_intro(1) for an introduction to the application performance
tuning tools provided with Tru64 UNIX.
The hiprof command creates an instrumented version of a program
(program.hiprof) that produces call-graph and flat profiles of one of a
range of performance statistics:
    The CPU time spent in each procedure (or optionally, each source
      line or instruction), measured by sampling the program counter
      about every millisecond (the default)
    The CPU time spent in each procedure and procedure call, measured as
      machine cycles, including the effects of any memory-access delays
      (with the -cycles option)
    The number of page faults that occur during each procedure and
      procedure call (with the -faults option)
See the limitations of each performance statistic in the RESTRICTIONS
section below.
If you specify program arguments (argument...) or -run, the instrumented
program is also executed.
If you specify -display or any of the gprof-options, the hiprof command
runs the instrumented program and then displays the profile by running
the gprof tool (with any specified gprof-options).
If you omit the program name, a usage message is printed.
The following example shows how to instrument, run, and display the
profile for a multithreaded program:
    cc *.c -pthread -L. -g1 -O2 -o program -lapp1 -lapp2
    hiprof -pthread -L. -all program data/*
The -all option requests that all shared libraries be profiled, but
threads-related system libraries cannot be safely instrumented to count
procedure calls that are needed to print a call graph. By default,
these libraries are still sampled to provide flat CPU-time profiles.
The -cycles and -faults options cannot be used with threaded programs,
but the displayed time or page-fault count for a procedure includes the
time or count for any procedures that it calls but that were not
selected for instrumentation--for example, any procedures in libraries
not selected by the -all or -incobj options. This means that time is
not lost from these profiles by excluding shared libraries.
OPERANDS
program
    File name of a fully linked call-shared or nonshared executable to be
    profiled. This program should be compiled with the -g or -gn option
    (n>=1) to obtain more complete profiling information. If the default
    symbol table level (-g0) is used, line number information, static
    procedure names, and file names are unavailable. Inlined procedure
    calls are also unavailable. Programs that are stripped or are
    optimized by spike or cc -om are not supported.
argument...
    All arguments following the program name are considered to be
    arguments needed by the instrumented program to execute the
    procedures, lines, and instructions of interest. Multiple arguments
    can be specified. They imply -run if any are specified, and they can
    be replaced by -run if none are needed.
OPTIONS
Options can be abbreviated to three characters. The gprof-options,
which are provided as alternatives to the -display option, can be
abbreviated to one character.
For options that specify a procedure name (proc), C++ procedures can
omit the argument type list, though this will match all overloaded
procedures with that name. To select a specific procedure, specify the
full symbol name (as printed by the nm command). Symbol names containing
spaces, asterisks, and so on must be quoted.
Essential Options
Some or all of these options may be needed to prevent the instrumented
program from malfunctioning:
Specify -pthread if the program or any of its libraries calls
pthread_create(3) (for example, if it was compiled with either the
-pthread option or the -threads compatibility option). This will make
the collection of profile data thread-safe.
The -fork option is maintained for compatibility with earlier releases.
By default, hiprof now profiles subprocesses that do not call exec(2),
and produces separate profiling data files for the forked subprocesses,
including the process ID in their file names as if -pids was specified.
By default, the hiprof code running in the program's process allocates
memory for its own use at address 38000000000. If the program needs to
use memory between 38000000000 and 3ff00000000, specify the address
that the hiprof code should use.
Specify -sigdump to force the instrumented program to write the current
profile data to its file(s) on receipt of the named signal. By default,
the program writes the profiling data file(s) only when the process
terminates, but some processes never terminate normally, so this option
lets you generate the file(s) on demand. After a file is written, the
instruction counts of the profile are all set to zero, so by sending two
signals, any interval of a test run can be profiled, with the second
signal's file(s) overwriting the first. For example, to use the default
kill pid command to signal the program, specify -sigdump TERM. Choose a
signal that the program does not use for another purpose.
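The two-signal technique for profiling an interval can be sketched as a small shell helper. This is only an illustration under stated assumptions: profile_interval is a hypothetical helper (not part of hiprof), myprog is a hypothetical program name, and USR1 is assumed to be a signal that the program does not otherwise use. The usage part runs only on a system where hiprof is installed.

```shell
# Send the dump signal twice so that the second dump covers only the
# interval between the two signals (the first dump zeroes the counts).
# Arguments: pid of the running instrumented program, signal name,
# and interval length in seconds.
profile_interval() {
    kill -s "$2" "$1" || return 1   # start of interval: dump and reset
    sleep "$3"
    kill -s "$2" "$1"               # end of interval: dump again
}

# Usage sketch; skipped where hiprof is not installed.
# myprog is hypothetical; USR1 is assumed to be unused by the program.
if command -v hiprof >/dev/null 2>&1; then
    hiprof -sigdump USR1 myprog     # instrument only (creates myprog.hiprof)
    ./myprog.hiprof &               # run the instrumented program
    profile_interval "$!" USR1 10   # profile a 10-second interval
fi
```

Because the second signal's file(s) overwrite the first, copy the first dump elsewhere before the second signal if both are wanted.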
Profiling Statistics Options
-flat
    Generates a flat profile; that is, it avoids the intrusiveness of
    collecting the default call-graph information. If the -display option
    is specified, it defaults to gprof -procedures. Do not use the -flat
    option with the -cycles or -faults options.
-cycles
    Profiles CPU time by counting the machine cycles used in each
    procedure call. Use this option only for nonthreaded programs.
-faults
    Profiles page faults that occur during each procedure instead of the
    default time spent in each procedure. Use this option only for
    nonthreaded programs.
File Generating Options
Does not print informational and progress messages on the standard
error stream.
Prints the command lines used to instrument the program and to execute
the instrumented program. Prints the names of any procedures that were
not instrumented.
Names the instrumented program file instead of the default
program.hiprof.
-dirname path
    Specifies the directory to which the instrumented program writes the
    profiling data file(s) for each test run. The default is the current
    directory.
-pids
    Adds the process ID of the instrumented program's test run to the
    name of the profiling data file produced (that is,
    program.pid.hiout). By default, the file is named program.hiout.
-threads
    When profiling a threaded program, specify -threads to produce a
    separate profile for each pthread in the program. The files are named
    program[.pid].sequence.hiout, where sequence is the thread sequence
    number assigned by pthread_create(3). The -threads option implies the
    -pthread option. If -sigdump is needed, -pthread is recommended
    instead of -threads, to avoid possible synchronization problems.
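The file-naming rules above can be summarized in a short shell function. This is a sketch for illustration only: hiout_name is a hypothetical helper (not part of hiprof), and myprog, 1234, and 2 are made-up values.

```shell
# Build the expected profiling data file name for one test run.
# Arguments: program name, process ID (empty unless -pids is in effect),
# thread sequence number (empty unless -threads was specified).
hiout_name() {
    hiout_base=$1
    if [ -n "${2:-}" ]; then hiout_base="$hiout_base.$2"; fi
    if [ -n "${3:-}" ]; then hiout_base="$hiout_base.$3"; fi
    printf '%s.hiout\n' "$hiout_base"
}

hiout_name myprog            # default:            myprog.hiout
hiout_name myprog 1234       # with -pids:         myprog.1234.hiout
hiout_name myprog "" 2       # with -threads:      myprog.2.hiout
hiout_name myprog 1234 2     # with both options:  myprog.1234.2.hiout
```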
Shared-Library Profiling Options
-all
    Profiles all of the shared libraries in addition to the program's
    executable.
If -all was specified, does not profile the shared library lib. Can be
repeated to exclude multiple libraries.
-incobj lib
    Profiles the shared library lib. Can be repeated to include multiple
    libraries.
-Ldir
    Searches for shared libraries in the specified directory before
    searching the default directories. Can be repeated to make a search
    path. Use the same options that were used when linking the program
    with ld.
Does not instrument the procedure proc. This option can be used to
exclude procedures that are uninteresting or that interfere with the
instrumentation (such as nonstandard assembly code).
Execution Control Options
Prints the tool's version number.
-run
    Executes the instrumented program, even if no arguments are
    specified. By default, the program is only instrumented (for later
    execution).
-display
    Executes the instrumented program, and runs gprof with default
    options on the resulting file(s).
gprof-option
    Executes the instrumented program, and runs gprof on the resulting
    file(s). The following gprof options are supported:
        Profiles each instruction within selected procedures.
        Does not report on called procedures.
        Excludes procedure proc and its descendants from the profile, but
          totals all procedures.
        Includes only procedure proc and its descendants in the profile,
          but totals all procedures.
        Profiles procedures as an indexed call graph (default).
        Profiles source lines, listing the most heavily used first.
        Profiles source lines, in order within selected procedures.
        Merges all input files into file.
        Prints each procedure's starting line number.
        Profiles procedures, listing the most heavily used first
          (default).
        Profiles the whole executable and any shared libraries.
        Reports procedures that were never called.
NOTES
If hiprof finds any previously instrumented shared libraries in the
working directory, it will reuse them if they meet current
requirements, to reduce re-instrumentation costs.
Temporary instrumentation files are created in /tmp. Set the TMPDIR
environment variable to a different directory to create the files
elsewhere, for example, in a disk partition with more space.
RESTRICTIONS
The default sampled profile only estimates the CPU time spent in each
procedure call; profiles made with the -cycles and -faults options
measure it.
When timing a program's procedures by measuring machine cycles (with
the -cycles option), the 32-bit cycle-counting hardware will wrap if no
procedure call or return is executed by the program every few seconds
-- for example, because of a long-running loop. If the counter wraps,
the profile will be incorrect. Using the -all or -incobj options to
profile all nonsystem libraries and procedures can help avoid this
restriction.
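To see why the limit is "every few seconds", the wrap time of a 32-bit counter can be worked out with shell arithmetic. This is a rough sketch under two assumptions that the text does not state: that the counter advances once per clock cycle, and that the shell supports 64-bit integer arithmetic. wrap_ms is a hypothetical helper, and the 500-MHz rate is borrowed from the performance estimates in this section.

```shell
# Milliseconds until a 32-bit cycle counter wraps at a given clock rate.
# Argument: clock rate in MHz. Integer arithmetic; the result is truncated.
wrap_ms() {
    # 2^32 cycles / (MHz * 1000 cycles per millisecond)
    echo $(( 4294967296 / ($1 * 1000) ))
}

wrap_ms 500    # prints 8589, that is, about 8.6 seconds at 500 MHz
```

Under these assumptions, a faster clock shortens the interval proportionally, so a loop that runs for several seconds without a call or return is already at risk.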
The -cycles option generates an inaccurate profile if the instrumented
program is run on a system whose processors have different cycle
speeds. This inaccuracy can be avoided by using hiprof's default
sampling profiler or the cc -p/-pg profilers instead, or by running the
application on a subset of the processors:
    Select a single processor using the runon command.
    Check the processor speeds using the psrinfo -v command and run the
      application in a processor set comprising only processors that run
      at the same speed (see processor_sets(4)).
Approximate performance estimates are as follows but will vary according
to the application and the machine's CPU count, type, and clock
rate. The hiprof instrumentation takes ~2s per Mb of program file on a
500-MHz EV6 (21264) Alpha system, using ~10 Mb of memory plus another
~10 Mb per Mb of the largest file. The instrumented files are ~20%
larger than the originals, plus ~1 Mb of hiprof code. They run ~4 times
slower. By default, each profile data file is at least the size of the
instrumented code (and uses this much memory), but these files are very
small for the -cycles and -faults options.
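The approximate figures above can be turned into a quick back-of-the-envelope calculation. This is only a sketch: hiprof_estimate is a hypothetical helper, it assumes a single nonshared executable (so the program file is also the largest file), and the results inherit all the variability noted above.

```shell
# Rough instrumentation estimates for one program file.
# Argument: program file size in Mb (whole number).
hiprof_estimate() {
    # ~2 s per Mb of program file
    echo "instrumentation time: ~$(( 2 * $1 )) s"
    # ~10 Mb plus ~10 Mb per Mb of the largest file
    echo "instrumentation memory: ~$(( 10 + 10 * $1 )) Mb"
    # ~20% larger than the original, plus ~1 Mb of hiprof code
    echo "instrumented size: ~$(( $1 * 12 / 10 + 1 )) Mb"
}

hiprof_estimate 5
```

For a 5-Mb executable this prints roughly 10 s, 60 Mb, and 7 Mb respectively.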
If a procedure contains interprocedural branches or interprocedural
jumps, that procedure will not be instrumented with the -cycles or
-faults option, and no information will be reported about that
procedure. Use the -v option to see which procedures were not instrumented.
Compilers can optimize return statements or non-returning function
calls to interprocedural branches. To avoid this, recompile with the
-O0 or -no_inline option.
FILES
program.hiprof
    Instrumented version of program produced by hiprof
program.hiout
    Profile data file produced by program.hiprof
Instrumented shared libraries produced by hiprof
Temporary file created and deleted in the current and -dirname path
directories.
SEE ALSO
Introduction: prof_intro(1)
atom(1), cc(1), dxprof(1), fork(2), gprof(1), kill(1), ld(1), pixie(1),
processor_sets(4), psrinfo(1), pthread(3), runon(1), uprofile(1).
(dxprof is available as an option.)
Programmer's Guide