Displays call graph profile data.
/usr/ccs/bin/gprof [ -b ] [ -c [ filename ] ] [ -e Name ] [ -E Name ] [ -f Name ] [ -g filename ] [ -i filename ] [ -p filename ] [ -F Name ] [ -L PathName ] [ -s ] [ -x [ filename ] ] [ -z ] [ a.out [ gmon.out ... ] ]
The gprof command produces an execution profile of C, FORTRAN, or COBOL programs. The effect of called routines is incorporated into the profile of each caller. The gprof command is useful in identifying how a program consumes CPU resource. To find out which functions (routines) in the program are using the CPU, you can profile the program with the gprof command.
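The idea that a caller's profile includes the effect of the routines it calls can be sketched in a few lines of C. This is only an illustration, not the gprof implementation: the struct and function names are hypothetical, the call graph is assumed to have no recursion or cycles, and the sketch assumes that a callee's time is divided among its callers in proportion to their call counts.

```c
#include <stddef.h>

/* One routine in a small, acyclic call graph (hypothetical layout). */
struct routine {
    const char *name;
    double self_seconds;   /* time spent in the routine itself */
    long calls;            /* total number of times it was called */
};

/* One caller -> callee arc with the number of calls along it. */
struct arc {
    int caller;            /* index into the routine table */
    int callee;            /* index into the routine table */
    long count;
};

/*
 * Time charged to routine r: its own time plus, for each arc leaving r,
 * the callee's total time scaled by the share of the callee's calls that
 * came from r. Assumes the graph has no cycles.
 */
double total_seconds(const struct routine *rt, const struct arc *arcs,
                     size_t narcs, int r)
{
    double total = rt[r].self_seconds;

    for (size_t i = 0; i < narcs; i++) {
        const struct routine *callee = &rt[arcs[i].callee];

        if (arcs[i].caller == r && callee->calls > 0) {
            double share = (double)arcs[i].count / (double)callee->calls;
            total += share * total_seconds(rt, arcs, narcs, arcs[i].callee);
        }
    }
    return total;
}
```

For example, if routine b runs for 2.0 seconds over 4 calls, and 3 of those calls come from routine a, then a is charged three quarters of b's time in addition to its own.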
The profile data is taken from the call graph profile file (gmon.out by default) created by programs compiled with the cc command using the -pg option. The -pg option also links in versions of library routines compiled for profiling. The gprof command reads the symbol table in the named object file (a.out by default) and correlates it with the call graph profile file. If more than one profile file is specified, the gprof command output shows the sum of the profile information in the given profile files.
The GPROF environment variable takes the following form:
GPROF = profile:<profile-type>,scale:<scaling-factor>,file:<file-type>,filename:<filename>
where:
The gprof command produces three items:
The gprof command can also be used to analyze the execution profile of a program on a remote machine. To do this, run the gprof command with the -c option on the call graph profile file (gmon.out by default) to generate a file (gprof.remote by default), which can then be processed on a remote machine. If a call graph profile file other than gmon.out is to be used, specify the call graph profile file name(s) after -c Filename and the executable name. Filename must be specified if the GPROF environment variable's file attribute is set to multi; in that case multiple gmon.out files are created, one for each PID when the executing program forks. On the remote machine, use the -x option to process the gprof.remote file (or the file named with -c) and generate profile reports.
Profiling with the fork and exec Subroutines
Profiling using the gprof command is problematic if your program runs the fork or exec subroutine on multiple, concurrent processes. Profiling is an attribute of the environment of each process, so if you are profiling a process that forks a new process, the child is also profiled. However, both processes write a gmon.out file in the directory from which you run the parent process, so one file overwrites the other. The tprof command is recommended for multiple-process profiling. In AIX 5.3, you can set file:multi to avoid destroying the gmon.out file of the parent process. With file:multi, the gmon.out files are generated using the AIX 5.3 naming convention, so the child process's gmon.out file does not have the same name as the parent's, which avoids overwrites.
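The naming convention itself is not spelled out here. Judging only from the file names shown in the examples later on this page (such as gmon-dhry-2468.out), a per-process file name appears to combine a prefix, the program name, and the process ID. The following C sketch builds a name of that shape; the function name is hypothetical, and the pattern is an inference from those examples rather than a documented rule.

```c
#include <stdio.h>

/*
 * Build a name of the form <prefix>-<program>-<pid>.out, the shape seen
 * in this page's file:multi examples (an assumed pattern, not a
 * documented one). Returns the value returned by snprintf.
 */
int multi_gmon_name(char *buf, size_t len, const char *prefix,
                    const char *program, long pid)
{
    return snprintf(buf, len, "%s-%s-%ld.out", prefix, program, pid);
}
```

Because the process ID is part of the name, a parent and its forked child would produce different file names under this pattern.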
For versions previous to AIX 5.3: If you must use the gprof command, one way around this problem is to call the chdir subroutine to change the current directory of the child process. Then, when the child process exits, its gmon.out file is written to the new directory. The following example demonstrates this method:
cd /u/test # current directory containing forker.c program
pg forker.c
#include <stdio.h>
#include <unistd.h>

void sub1(int pid);
void sub2(void);

int main(void)
{
    int i, pid;
    static char path[] = "/u/test2";

    pid = fork();                  /* fork a child process */
    if (pid == 0) {                /* Ok, this is the child process */
        chdir(path);               /* change directory so the parent's
                                      gmon.out isn't clobbered! */
        for (i = 0; i < 30000; i++)
            sub2();                /* 30000 calls to sub2 in child profile */
    } else {                       /* Parent process... leave gmon.out
                                      in current directory */
        for (i = 0; i < 1000; i++)
            sub1(pid);             /* 1000 calls to sub1 in parent profile */
    }
    return 0;
}

/* silly little function #1, called by parent 1000 times */
void sub1(int pid)
{
    printf("I'm the parent, child pid is %i.\n", pid);
}

/* silly little function #2, called by child 30,000 times */
void sub2(void)
{
    printf("I'm the child.\n");
}
cc -pg forker.c -o forker # compile the program
mkdir /u/test2 # create a directory for child
# to write gmon.out in
forker >/dev/null # Throw away forker's many,
# useless output lines
gprof forker >parent.out # Parent process's gmon.out is
# in current directory
gprof forker ../test2/gmon.out >child.out
# Child's gmon.out is in test2
# directory
At this point, if you compare the two gprof command output listings in directory test, parent.out, and child.out, you see that the sub1 subroutine is called 1,000 times in the parent and 0 times in the child, while the sub2 subroutine is called 30,000 times in the child and 0 times in the parent.
Processes that run the exec subroutine do not inherit profiling. However, the program started by the exec subroutine is profiled if it was compiled with the -pg option. As with the preceding forker.c example, if both the parent and the program started by the exec subroutine are profiled, one overwrites the other's gmon.out file unless you use the chdir subroutine in one of them.
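The chdir approach for the exec case can be sketched as a small helper that changes directory in the child before running the new program. This is a minimal illustration, not part of the gprof documentation: the function name exec_in_dir is hypothetical, and it assumes the target directory already exists and that the program can be found through PATH (a relative program path would be resolved after the directory change).

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork, move the child into dir, and exec argv[0] there, so that a program
 * compiled with -pg writes its gmon.out in dir instead of over the parent's.
 * Returns the child's exit status, or -1 on error or abnormal termination.
 */
int exec_in_dir(const char *dir, char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;                 /* fork failed */

    if (pid == 0) {                /* child */
        if (chdir(dir) != 0)
            _exit(127);            /* could not enter the directory */
        execvp(argv[0], argv);
        _exit(127);                /* exec failed */
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The parent keeps its own current directory, so its gmon.out and the one written by the program started with exec end up in different places.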
Profiling without Source Code
If you do not have source for your program, you can profile using the gprof command without recompiling. You must, however, be able to relink your program modules with the appropriate compiler command (for example, cc for C). If you do not recompile, you do not get call frequency counts, although the flat profile is still useful without them. As an added benefit, your program runs almost as fast as it usually does. The following explains how to profile:
cc -c dhry.c # Create dhry.o without call counting code.
cc -pg dhry.o -L/lib -L/usr/lib -o dhryfast
# Re-link (and avoid -pg libraries).
dhryfast # Create gmon.out without call counts.
gprof >dhryfast.out # You get an error message about no call counts
# -- ignore it.
A result of running without call counts is that some quickly executing functions (which you know had to be called) do not appear in the listing at all. Although nonintuitive, this result is normal for the gprof command. The gprof command lists only functions that were either called at least once, or which registered at least one clock tick. Even though they ran, quickly executing functions often receive no clock ticks. Since call-counting was suspended, these small functions are not listed at all. (You can get call counts for the runtime routines by omitting the -L options on the cc -pg command line.)
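The listing rule described above can be written as a one-line predicate. The following C sketch is only an illustration of that rule, with hypothetical struct and function names; it is not how the gprof command is implemented.

```c
#include <stddef.h>

/* Per-routine data as it might look after a profiling run (hypothetical). */
struct sample {
    const char *name;
    long calls;    /* call count; 0 for every routine without call counting */
    long ticks;    /* clock ticks registered while the routine was running */
};

/* A routine is listed if it was called at least once or got at least one tick. */
int is_listed(const struct sample *s)
{
    return s->calls > 0 || s->ticks > 0;
}

/* Count how many of the n routines would appear in the listing. */
size_t count_listed(const struct sample *v, size_t n)
{
    size_t listed = 0;

    for (size_t i = 0; i < n; i++)
        if (is_listed(&v[i]))
            listed++;
    return listed;
}
```

With call counting suspended, every calls field is 0, so a routine that ran too briefly to register a tick fails both conditions and drops out of the listing.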
Using Less Real Memory
Profiling with the gprof command can cause programs to page excessively since the -pg option dedicates pinned real-memory buffer space equal to one-half the size of your program's text. Excessive paging does not affect the data generated by profiling, since profiled programs do not generate ticks when waiting on I/O, only when using the CPU. If the time delay caused by excessive paging is unacceptable, we recommend using the tprof command.
Item | Description |
---|---|
-b | Suppresses the printing of a description of each field in the profile. |
-c Filename | Creates a file that contains the information needed for remote processing of profiling information. Do not use the -c flag in combination with other flags. |
-E Name | Suppresses the printing of the graph profile entry for routine Name and its descendants, similar to the -e flag, but excludes the time spent by routine Name and its descendants from the total and percentage time computations. (-E MonitorCount -E MonitorCleanup is the default.) |
-e Name | Suppresses the printing of the graph profile entry for routine Name and all its descendants (unless they have other ancestors that are not suppressed). More than one -e flag can be given. Only one routine can be specified with each -e flag. |
-F Name | Prints the graph profile entry of the routine Name and its descendants similar to the -f flag, but uses only the times of the printed routines in total time and percentage computations. More than one -F flag can be given. Only one routine can be specified with each -F flag. The -F flag overrides the -E flag. |
-f Name | Prints the graph profile entry of the specified routine Name and its descendants. More than one -f flag can be given. Only one routine can be specified with each -f flag. |
-g Filename | Writes call graph information to the specified output filename. It also suppresses the profile information unless the -p flag is used. |
-i Filename | Writes the routine index table to the specified output filename. If this flag is not used, the index table goes either at the end of the standard output, or at the bottom of the filename(s) specified with the -p and -g flags. |
-L PathName | Uses an alternate pathname for locating shared objects. |
-p Filename | Writes flat profile information to the specified output filename. It also suppresses the call graph information unless the -g flag is used. |
-s | Produces the gmon.sum profile file, which represents the sum of the profile information in all the specified profile files. This summary profile file may be given to subsequent executions of the gprof command (using the -s flag) to accumulate profile data across several runs of an a.out file. |
-x Filename | Retrieves information from Filename (a file created with the -c option) to generate profile reports. If Filename is not specified, the gprof command searches for the default gprof.remote file. |
-z | Displays routines that have zero usage (as indicated by call counts and accumulated time). |
gprof
gprof -L/home/score/lib runfile runfile.gmon
This example uses the given runfile.gmon file for sample data and the runfile file for local symbols, and checks the /u/score/lib file for loadable objects.
cc -pg dhry.c -o dhry # Re-compile to produce gprof output.
dhry # Execute program to generate ./gmon.out file.
gprof >gprof.out # Name the report whatever you like
vi gprof.out # Read flat profile first.
export GPROF=profile:thread
dhry # Execute program to generate ./gmon.out file which has thread level granularity
export GPROF=file:multi,filename:mygom
dhry # Execute program to generate ./gmon-dhry-2468.out
export GPROF=profile:thread,file:multithread,scale:10,filename:tgmon
dhry # Execute program to generate ./tgmon-dhry-2468-Pthread215.out
gprof -p fprofile.out ./dhry ./gmon-dhry-2468.out
gprof -g callgraph.out ./dhry ./gmon-dhry-2468.out
cc -pg thread.c -o thread -lpthread
export GPROF=profile:thread,filename:mygmon
thread # Execute program to generate mygmon.out file.
gprof -c my.remote thread mygmon.out
gprof -x my.remote
Throughout this description of the gprof command, most of the examples use the C program dhry.c. However, the discussion and examples apply equally to FORTRAN or COBOL modules by substituting the appropriate compiler name in place of the C compiler, cc, and the word subroutine for the word function. For example, the following commands show how to profile a FORTRAN program named matrix.f:
xlf -pg matrix.f -o matrix # FORTRAN compile of matrix.f program
matrix # Execute with gprof profiling,
# generating gmon.out file
gprof > matrix.out # Generate profile reports in
# matrix.out from gmon.out
vi matrix.out # Read flat profile first.
Item | Description |
---|---|
a.out | Name list and text space |
gmon.out | Dynamic call graph and profile |
gmon.sum | Summarized dynamic call graph and profile |
gprof.remote | File for remote profiling |
/usr/ucb/gprof | Contains the gprof command. |
/usr/ccs/bin/gprof | Contains the gprof command |