.TH mprof 1
.SH NAME
mprof \- display dynamic memory allocation data
.SH SYNTAX
.B mprof
[ options ] [ a.out [ mprof.data ] ]
.nf
.PP
.B void set_mprof_autosave(count)
.B int count;
.PP
.B void mprof_stop()
.PP
.B void mprof_restart(filename)
.B char *filename;
.fi
.SH DESCRIPTION
The
.B mprof
command produces four tables that summarize the memory allocation
behavior of C programs, similar in style to the
.B gprof
command.  The arguments to  
.B mprof
are the executable image
.I (a.out
default)
and the profile data file
.I (mprof.data
default).  The
.I mprof.data
file is produced by running a program that has a special version of
.I malloc
linked into it.  This version, found in the library
.IR libc_mp.a ,
must be linked in at the end of the command that creates the
executable image.  For example:
.sp
.nf
	cc -g -o test main.o sub1.o sub2.o libc_mp.a
.fi
.sp
.PP
Users' programs can contain additional calls to customize the user
interface to
.B mprof.
The function
.I set_mprof_autosave
allows users to save the profile data periodically.  The
.I count
parameter specifies the number of allocations between saves.
A value of 10,000 or 100,000 is typical for the
.I count
parameter for long-running programs.  A value of 0 (the default)
causes the profile data to be written only when the program exits.
The function
.I mprof_stop
causes memory profiling to be discontinued and the profile data to be
written to the output file.
The function
.I mprof_restart
restarts profiling.  The
.I filename
parameter to
.I mprof_restart
specifies the name of the file to write the profile data to.
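.PP
As a rough sketch of how these three calls might be combined, the
fragment below profiles two phases of a program into separate files.
The helper names, the file name
.I mprof.phase2
and the
.I HAVE_LIBC_MP
macro are assumptions made for illustration; only the three entry
points come from this manual page.
.sp
.nf
```c
#include <stdlib.h>

/* Entry points listed under SYNTAX, normally supplied by libc_mp.a. */
void set_mprof_autosave(int count);
void mprof_stop(void);
void mprof_restart(char *filename);

#ifndef HAVE_LIBC_MP
/* Hypothetical no-op stand-ins so the sketch builds without the library;
 * compile with -DHAVE_LIBC_MP and link libc_mp.a to use the real ones. */
void set_mprof_autosave(int count) { (void)count; }
void mprof_stop(void) { }
void mprof_restart(char *filename) { (void)filename; }
#endif

/* Hypothetical file name for the second phase's profile data. */
static char phase2_file[] = "mprof.phase2";

/* Allocate and release n small blocks; returns 0 on success. */
static int churn(int n)
{
    for (int i = 0; i < n; i++) {
        void *p = malloc(64);
        if (p == NULL)
            return -1;
        free(p);
    }
    return 0;
}

/* Profile two phases separately; returns 0 on success. */
static int run_phases(void)
{
    set_mprof_autosave(10000);   /* save after every 10,000 allocations */
    if (churn(100) != 0)         /* phase 1: written to the default file */
        return -1;
    mprof_stop();                /* write the data and stop profiling */
    mprof_restart(phase2_file);  /* resume, writing to the named file */
    return churn(100);           /* phase 2: written to phase2_file */
}
```
.fi
.PP
Calling run_phases() from main would leave the first phase in the
default data file and the second in the file named in the restart call.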
.PP
The output of
.B mprof
consists of four tables, the fields of which are described in detail
below.  The first table breaks down the memory allocation of the
program by the number of bytes requested.  For each byte size the
number of allocations and frees is listed along with the program
structure types that correspond to that byte size.
.PP
The second table lists partial call chains over which memory was
allocated and never freed (call chains resulting in memory leaks).
The table shows how much memory was allocated by each chain and how
much each chain contributed to the total memory leakage.
.PP
The third table lists the functions in which
allocation occurred directly (i.e., called
.I malloc),
indicates how much memory was allocated, shows how much of that was
not later freed, breaks down allocation roughly by size, and shows how
many times each function was called.
.PP
The fourth table contains the
subgraph of the program's dynamic call graph in which allocation
occurred.  This table allows programmers to identify what functions
were indirectly responsible for memory allocation.
.PP
The following options are available:
.TP
.B \-verbose
Every bin in which memory was allocated is printed; the call chain for
every memory leak is shown.
.TP
.B \-normal
Only bins that contributed a reasonable fraction to the total
allocation are printed; call chains for leaks contributing more than
0.5% to the total are shown.  This is the default verbosity setting.
.TP
.B \-terse
Only bins that contributed a significant fraction to the total
allocation are printed.  Call chains contributing more than 1% to the
total leakage are shown.
.TP
.B \-leaktable
Print out the memory leak table without printing out call site offsets.
This is the default.
.TP
.B \-noleaktable
Do not print out the memory leak table.
.TP
.B \-offsets
Print out the memory leak table and distinguish different call sites
within a function by indicating the offset in the function as part of
the path.  This is useful to identify a particular call site in a
function with many call sites that allocate memory.
.SH FIELDS IN THE OUTPUT
.LP
Percentages in the tables are often presented in two-column fields.
In such a field, a blank
indicates 0%, a dot indicates less than 1%, and two stars
indicate 100%.
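.LP
The convention can be sketched as a small formatting routine.  This is
an illustration only, not the code used by
.B mprof;
the function name, the placement of the dot in the second column, and
truncation rather than rounding of intermediate values are assumptions.
.sp
.nf
```c
#include <stdio.h>

/* Render a percentage into a two-character field (plus terminator).
 * Blank means 0%, a dot means below 1%, two stars mean 100%. */
static void format_pct(double pct, char out[3])
{
    if (pct <= 0.0)
        snprintf(out, 3, "  ");           /* 0%: blank field */
    else if (pct < 1.0)
        snprintf(out, 3, " .");           /* under 1%: a dot (column assumed) */
    else if (pct >= 100.0)
        snprintf(out, 3, "**");           /* 100%: two stars */
    else
        snprintf(out, 3, "%2d", (int)pct); /* 1..99, truncated (assumed) */
}
```
.fi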
.LP
When data is broken down by size categories, the categories mean the
following:
.nf
	s = small      		x <= 32 bytes
	m = medium     		32 < x <= 256 bytes
	l = large      		256 < x <= 2048 bytes
	x = extra large		x > 2048 bytes
.fi
.LP
where x is the exact size of the object being allocated by a call to
malloc.  When data is broken into categories, percentages are always
given in a two-column format.
Throughout this document, we refer to such a listing as
a ``breakdown''.
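.LP
The category boundaries listed above can be written as a short
classification function.  The function name is hypothetical; the
thresholds are the ones given in the list.
.sp
.nf
```c
#include <stddef.h>

/* Map an allocation size in bytes to its breakdown category letter. */
static char size_category(size_t x)
{
    if (x <= 32)
        return 's';   /* small:       x <= 32 */
    if (x <= 256)
        return 'm';   /* medium:      32 < x <= 256 */
    if (x <= 2048)
        return 'l';   /* large:       256 < x <= 2048 */
    return 'x';       /* extra large: x > 2048 */
}
```
.fi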
.SH "TABLE 1: ALLOCATION BINS"
.LP
The memory allocation is broken down by the sizes of objects requested
and freed.
.IP size 14
The size in bytes of the object allocated or freed.
.IP allocs 14
The number of calls to malloc requesting allocation of this size.
.IP "bytes (%)" 14
The total number of bytes allocated to objects of this size.  The
percent indicates the percent of the total bytes allocated.
.IP frees 14
The number of times objects of this size were freed.
.IP "kept (%)" 14
The number of bytes of objects of this size that were never freed.
The percent indicates what fraction of unfreed bytes were allocated to
objects of this size.  
.IP types 14
A list of the program names of structure types or typedefs that define
objects of this size.
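.LP
The relationship between the columns of one bin can be sketched as
follows.  The structure and function names are hypothetical, and the
sketch assumes that every free in a bin matches an earlier allocation
in the same bin.
.sp
.nf
```c
/* One row of the allocation bin table (hypothetical layout). */
struct bin {
    unsigned long size;    /* object size in bytes */
    unsigned long allocs;  /* calls to malloc for this size */
    unsigned long frees;   /* objects of this size that were freed */
};

/* Total bytes allocated to objects of this size. */
static unsigned long bin_bytes(const struct bin *b)
{
    return b->size * b->allocs;
}

/* Bytes of this size never freed; assumes frees <= allocs. */
static unsigned long bin_kept(const struct bin *b)
{
    return b->size * (b->allocs - b->frees);
}

/* Percentage of whole represented by part; 0 when whole is 0. */
static double percent_of(unsigned long part, unsigned long whole)
{
    return whole == 0 ? 0.0 : 100.0 * (double)part / (double)whole;
}
```
.fi
.LP
For a bin of 16-byte objects with 10 allocations and 7 frees, this
gives 160 bytes allocated and 48 bytes kept.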
.SH "TABLE 2: MEMORY LEAKS"
.LP
The memory leak table lists the partial call chains which allocated
memory that was never freed.  At most five functions in the call chain are
listed.
.IP "kept bytes (%)" 14
The number of bytes allocated on this partial call chain
and not subsequently freed.
The table is sorted by decreasing values in this field.
The percent indicates the percent of total bytes not freed.
.IP allocs 14
The number of allocations that occurred on this partial call chain.
.IP "bytes (%)" 14
The number of bytes allocated on this partial call chain.  The percent
indicates the percent of the total bytes allocated and never freed.  
.IP frees 14
The number of frees that occurred on this partial call chain.  If no
objects were freed, this and the following field are omitted.
.IP "bytes (%)" 14
The number of bytes freed on this partial call chain.  This field is
omitted if no bytes were freed.
.IP path 14
The partial call chain.  Call chains starting with "..." indicate that
more callers were present but were omitted from the listing.  Call
chains consist of function names (and possibly call site offsets)
separated by ">".  Call site offsets are indicated by a +n
following the function name, where n is the distance in bytes of the
call site from the start of the function.  Call site offsets are
printed when the \-offsets option is given.
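.PP
The path notation can be illustrated with a small routine that joins a
call chain into a string.  This is a sketch under several assumptions:
the names are hypothetical, the chain is taken to run from outermost
caller to the allocating function, the five functions nearest the
allocation are the ones kept, offsets are printed in decimal, and no
spaces surround the ">" separator.
.sp
.nf
```c
#include <stdio.h>
#include <string.h>

#define MAX_SHOWN 5   /* at most five functions are listed */

/* One call site: function name and byte offset of the call within it. */
struct frame {
    const char *name;
    unsigned offset;
};

/* Append s to out without exceeding cap bytes in total. */
static void append(char *out, size_t cap, const char *s)
{
    size_t used = strlen(out);
    if (used + 1 < cap)
        strncat(out, s, cap - used - 1);
}

/* Format frames f[0..n-1] into out (cap must be at least 1).
 * If offsets is nonzero, each name is followed by +n. */
static void format_path(const struct frame *f, int n, int offsets,
                        char *out, size_t cap)
{
    char piece[128];
    int first = n > MAX_SHOWN ? n - MAX_SHOWN : 0;

    out[0] = 0;
    if (first > 0)
        append(out, cap, "...");          /* earlier callers were dropped */
    for (int i = first; i < n; i++) {
        if (i > first || first > 0)
            append(out, cap, ">");
        if (offsets)
            snprintf(piece, sizeof piece, "%s+%u", f[i].name, f[i].offset);
        else
            snprintf(piece, sizeof piece, "%s", f[i].name);
        append(out, cap, piece);
    }
}
```
.fi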
.SH "TABLE 3: DIRECT ALLOCATION"
.LP
The <TOTAL> row of the direct allocation listing contains a summary of
all the functions where such a summary makes sense.
.IP "% mem" 14
Percentage of the total memory allocated that was allocated by this
function.  
.IP bytes 14
The total number of bytes allocated by this function.
.IP "% mem(size)" 14
Size breakdown of the memory allocated by this function as a
percentage of the total memory allocated by the program.  For example,
if the values for function MAIN are s=5, m=20, l=4, x=0, then direct
calls to MALLOC from MAIN account for 5+20+4+0 = 29% of the total
memory allocated by the program.  Moreover, 20% of the total memory
allocated by the program was of medium sized objects (between 33 and
256 bytes) by the function MAIN.  The <TOTAL> row represents the
breakdown by size of all the memory allocated by the program.
.IP "bytes kept" 14
The number of bytes allocated by this function that were never freed
(by calls to FREE).
.IP "% all kept" 14
The size breakdown of objects never freed by this function as a
percentage of all objects never freed.  For example, if <% all kept>
values for function MAIN are s=2, m=10, l=<blank>, x=<blank>, then 10%
of the total bytes not freed were allocated by MAIN and were allocated
in medium-sized chunks.  The <TOTAL> row represents the size breakdown
of all the memory allocated but never freed.
.IP "calls" 14
The number of times this function was called to allocate an object.
.IP "name" 14
The name of the function.
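.PP
The arithmetic behind the <% mem(size)> example can be sketched as
follows.  The structure and function names are hypothetical; the byte
counts in the note after the code are made up to reproduce the values
s=5, m=20, l=4, x=0 used in the example for MAIN.
.sp
.nf
```c
/* Percentages of the program total in each size category. */
struct breakdown {
    double s, m, l, x;
};

/* bytes[0..3] hold one function's bytes in the s, m, l and x
 * categories; total is the number of bytes allocated by the program. */
static struct breakdown pct_of_total(const unsigned long bytes[4],
                                     unsigned long total)
{
    struct breakdown b = {0.0, 0.0, 0.0, 0.0};

    if (total == 0)
        return b;                 /* nothing allocated: all fields zero */
    b.s = 100.0 * (double)bytes[0] / (double)total;
    b.m = 100.0 * (double)bytes[1] / (double)total;
    b.l = 100.0 * (double)bytes[2] / (double)total;
    b.x = 100.0 * (double)bytes[3] / (double)total;
    return b;
}

/* The function's overall share is the sum of its four categories. */
static double breakdown_sum(struct breakdown b)
{
    return b.s + b.m + b.l + b.x;
}
```
.fi
.PP
With 50, 200, 40 and 0 bytes in the four categories and a program
total of 1000 bytes, the breakdown is s=5, m=20, l=4, x=0 and the sum
is 29%.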
.SH "TABLE 4: ALLOCATION CALL GRAPH"
.LP
A star (*) indicates that this field is omitted for ancestors or
descendents in the same cycle as the function.
.LP
Cycles are listed twice.
The first appearance shows all the functions
that are members of the cycle and the amount of memory allocated
locally in each function, including the breakdown of the local
allocation by size and the breakdown by size as a fraction of the
total cycle.
The second appearance shows what the call
graph would look like if all the functions in the cycle were merged
into a single function.
.IP "index" 14
A unique index used to aid searching for functions in the call graph listing.
.IP "self + desc" 14
The percent of the total allocated memory that was allocated by this
function and its descendents.  
.IP "self (%)" 14
The number of bytes allocated by the function itself.  The percentage
indicates the fraction of the bytes allocated by the function and its
descendents that were allocated in the function itself.
.IP "size-func" 14
The size breakdown of objects allocated in the function itself (not
including its descendents).
.IP "called" 14
The number of times this function was called while allocating memory.
.IP "recur" 14
The number of recursive function calls while allocating memory.
.IP "name" 14
The function name including possible cycle membership and index.
.SH "ANCESTOR LISTINGS"
.LP
If the word ``all'' appears in the <self + desc> column, then this row
represents a summary of all the ancestors and presents the total
number of bytes requested by all ancestors in the <bytes> column, and
the breakdown of these bytes by size in the <self-ances> breakdown
columns.  If there is only one ancestor, then this summary is omitted.
.IP "*self (%)" 14
The number of bytes allocated by the function and its descendents that
were allocated on behalf of this parent.  The percentage indicates
what fraction of the total bytes allocated by the function and its
descendents were allocated on behalf of this parent.
.IP "*size-ances" 14
The size breakdown of the bytes allocated by the function and its
descendents on behalf of this parent.
.IP "*frac-ances" 14
The size breakdown of the objects allocated in the function and its
descendents on behalf of this parent as a percentage of all objects
allocated by the function and its descendents.  For example if parent
P1 of function F has <frac-ances> values s=<blank>, m=<blank>, l=30,
x=<blank>, then 30% of all objects allocated by F and its descendents
are large objects allocated on behalf of parent P1.
.IP "called" 14
The number of times this parent called this function while
requesting memory.
.IP "*total" 14
The number of calls this parent made requesting memory from any function.
.IP "ancestors" 14
The name of the parent including possible cycle membership and index.
.SH "DESCENDENT LISTINGS"
.LP
If the word ``all'' appears in the <self + desc> column, then this row
represents a summary of all the descendents and presents the total
number of bytes allocated by all descendents in the <bytes> column,
and the breakdown of these bytes by size in the <self-desc> breakdown
columns.  If there is only one descendent, then this summary is
omitted.
.IP "*self (%)" 14
The number of bytes allocated in this descendent that were allocated
at the request of the function.  The percentage indicates what
fraction of the total bytes allocated in descendents of the function
were allocated in this descendent.
.IP "*size-desc" 14
The size breakdown of the bytes allocated by this descendent on behalf
of the function.
.IP "*frac-desc" 14
The size breakdown of the objects allocated in this descendent on
behalf of the function as a percentage of all objects allocated by all
descendents on behalf of this function.  For example if descendent C1
of function F has <frac-desc> values s=35, m=<blank>, l=<blank>, x=<blank>,
then 35% of all objects allocated by children of F on its behalf were
allocated in child C1 and were small objects. 
.IP "called" 14
The number of times the function called this descendent while
requesting memory.
.IP "*total" 14
The number of times this descendent was called during a memory request.
.IP "descendents" 14
The name of the child including possible cycle membership and index.
.SH FILES
.nf
a.out         	contains symbol table information.
mprof.data    	memory allocation call graph information.
libc_mp.a	special version of malloc which profiles allocation.
                (eventually to be put in /lib/local/mprof/libc_mp.a)
.fi
.SH "SEE ALSO"
cc(1), gprof(1)
.br
.I A Memory Allocation Profiler for C
.I and Lisp Programs,
Benjamin Zorn and Paul Hilfinger, Summer 1988 USENIX Conference.
.SH AUTHOR
Written by Benjamin Zorn, zorn@ernie.berkeley.edu, as part of Ph.D.
research sponsored by the SPUR research project.
.SH BUGS
The code that determines the names and sizes of user types is poorly
written and depends on the program being compiled with the -g option.
In some cases (mostly very simple cases) the user type names are
not correctly determined.
.PP
If the user application calls
.I valloc
or
.I memalign
and later tries to free that memory,
.B mprof
will cause a segmentation fault.