.so /usr/lib/tmac/tmac.s
.MF
.TL 
Symbol Table Format for Sdb
.ft R
.br
.ps -2
Case: 39394  File 39394
.ps +2
.ft R
.AU
H.P. Katseff
.AI
.HO
.NH
Introduction
.PP
A symbolic debugger, sdb,
has been implemented for the UNIX/32V operating system.
This document describes modifications made to 
the C compiler to generate additional
information about the compiled program and to the
assembler and loader to process the information.
It also describes the information recognized by the assembler, the loader
and sdb that is intended for use by compilers for other languages
such as F77.
.NH
The C Compiler
.PP
The C compiler was modified to generate additional symbol table
information describing a compiled program.
Two new types of symbol table entries are made.
One describes the variables, giving
their class (local, register, parameter, global, etc.),
their declared type in the program
and their address or offset.
An additional entry is made for structures giving their size.
The other type of entry provides a mapping between the source program
and the object program.
There is an entry for each source line, procedure and source file
giving their addresses in the object file.
All line numbers are relative to the beginning of
the source file.
.PP
All entries are generated with the new assembler pseudo-operation
`.stab'.
It always takes 12 arguments of which the first eight usually
represent the name of the symbol as declared in the C program.
An underscore is
.ul
not
prepended to the name as in some other symbol table entries.
A typical entry would be
.DS C
\&.stab	'e,'r,'r,'f,'l,'g,0,0,046,0,05,_errflg
.DE
For expository convenience,
names in .stab entries will be listed as one word instead of
eight separate characters.
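.PP
As an illustration of the twelve-argument form, the following is a minimal
sketch of how a compiler might write one such line.
The function name format_stab and its parameters are hypothetical and are
not taken from the C compiler; a space is used after the pseudo-operation
where the example above uses a tab.
.DS
```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from the compiler itself.
 * Builds one .stab line in buf: eight name arguments (the characters of
 * the name, padded with 0), then type, other and desc in octal, then the
 * value expression.  Returns the length written, or -1 if buf is too small.
 */
int format_stab(char *buf, size_t size, const char *name,
		unsigned type, unsigned other, unsigned desc,
		const char *value)
{
	size_t len, pos;
	int i, n;

	assert(buf != 0 && name != 0 && value != 0);
	len = strlen(name);
	n = snprintf(buf, size, ".stab ");
	if (n < 0 || (size_t)n >= size)
		return -1;
	pos = (size_t)n;
	for (i = 0; i < 8; i++) {
		if ((size_t)i < len)
			n = snprintf(buf + pos, size - pos, "'%c,", name[i]);
		else
			n = snprintf(buf + pos, size - pos, "0,");
		if (n < 0 || (size_t)n >= size - pos)
			return -1;
		pos += (size_t)n;
	}
	/* %#o prints a leading 0 for nonzero values, matching 046 and 05 */
	n = snprintf(buf + pos, size - pos, "%#o,%#o,%#o,%s",
		     type, other, desc, value);
	if (n < 0 || (size_t)n >= size - pos)
		return -1;
	return (int)(pos + (size_t)n);
}
```
.DE
.LP
Called with the name errflg, type 046, other 0, desc 05 and value _errflg,
the sketch reproduces the arguments of the typical entry shown above.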
.NH 2
External symbols defined with .comm
.PP
The following entry is made for each external symbol which is
defined with a .comm pseudo-op.
.DS C
\&.stab	name,040,0,type,0
.DE
The type is a 16-bit value describing the variable's declared type.
This field is described in section 2.15.
The debugger determines the variable's address
from the entry made with the .comm.
It assumes that the name for this entry is _name.
.NH 2
Symbols defined within .data areas
.PP
The following entry is made for each symbol which is defined 
as a label in a data area.
.DS C
\&.stab	name,046,0,type,address
.DE
The type is the variable's declared type.
The address is given symbolically as the label.
.NH 2
Symbols defined with .lcomm
.PP
The following entry is made for each symbol which is defined with 
a .lcomm pseudo-op.
.DS C
\&.stab	name,048,0,type,address
.DE
The type is the variable's declared type.
The address is given symbolically as the label.
The specification of an octal constant with an 8 occurs for historical
reasons.
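.PP
The value such a constant denotes can be sketched as follows.
This assumes, without confirmation from this document, that the assembler
follows the convention of early C compilers, in which the digits 8 and 9
are accepted in an octal constant and each digit is accumulated as
eight times the running value plus the digit.
The function name lenient_octal is hypothetical.
.DS
```c
#include <assert.h>

/*
 * Hypothetical sketch of a lenient octal scanner (an assumption about
 * how the constant is read, not code from the assembler).
 * Every decimal digit, including 8 and 9, is folded in as
 * value * 8 + digit.  Scanning stops at the first non-digit.
 */
unsigned lenient_octal(const char *s)
{
	unsigned v = 0;

	assert(s != 0);
	while (*s >= '0' && *s <= '9')
		v = v * 8 + (unsigned)(*s++ - '0');
	return v;
}
```
.DE
.LP
Under this assumption 048 denotes the same value as 050.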
.NH 2
Register symbols
.PP
The following entry is made for each variable whose value
is in a register.
.DS C
\&.stab	name,0100,0,type,register
.DE
The type is the variable's declared type. 
The register is the register number assigned to the variable.
.NH 2
Local non-register symbols
.PP
The following entry is made for each local, non-register variable.
.DS C
\&.stab	name,0200,0,type,offset
.DE
The type is the variable's declared type.
The offset is a positive number giving the variable's offset in bytes from the
frame pointer.
.NH 2
Parameter symbols
.PP
The following entry is made for each procedure parameter.
.DS C
\&.stab	name,0240,0,type,offset
.DE
The type is the variable's declared type.
The offset is a positive number indicating its offset in bytes from
the stack pointer.
.NH 2
Structure elements
.PP
The following entry is made for each structure element.
.DS C
\&.stab	name,0140,0,type,offset
.DE
The type is the element's declared type.
The offset is its offset within the structure in bytes.
.NH 2
Structure symbols
.PP
An additional entry is made for structures giving their size in bytes.
It immediately follows their defining .stab entry.
It is of the form
.DS C
\&.stab	name,0376,0,0,length
.DE
.NH 2
Common blocks
.PP
The following sequence of entries is used to describe elements of
Fortran equivalence and common blocks.
The first is of the form
.DS C
\&.stab	0,0342,0,0,0
.DE
The entries for each element of the block should then appear
as if they were structure elements.
Finally, one of the following
two entries is used depending on the type of common
or equivalence block.
If the block is defined as a .globl symbol, use the entry
.DS C
\&.stab	name,0344,0,0,0
.DE
where name is the name of the block defined in the .globl statement.
If the block is defined in some other way, use
.DS C
\&.stab	0,0348,0,0,address
.DE
.NH 2
Brackets
.PP
Since C is a block-structured language,
it is necessary to know the extent of each block containing symbol definitions.
An entry is made for each right and left bracket which encloses
a block with definitions.
The following entries are for left and right brackets respectively.
.DS C
\&.stab	0,0300,0,nesting level,address
\&.stab	0,0340,0,nesting level,address
.DE
The nesting level is the static nesting level of the block.
It is currently ignored by the debugger.
For a left bracket, the address is that of the first byte of code for the block;
for a right bracket, it is that of the first byte following the block.
.NH 2
Procedures
.PP
The following entry is made for each procedure.
.DS C
\&.stab	name,044,0,linenumber,address
.DE
The linenumber is the number of the first line of the procedure in the
source file.
The address is the address of the first byte of the procedure.
.NH 2
Lines
.PP
The following entry is made for each line in the source program.
.DS C
\&.stab	0,0104,0,linenumber,address
.DE
The linenumber is the number of the line within the source file.
The address is the address of the first byte of code for the line.
For each block of the program,
the linenumber entries for that block should follow the entries
for the variables of that block.
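.PP
One way a debugger could use these entries to map an address back to a
source line is sketched below.
This is an illustration only and is not claimed to be the method used by sdb;
the structure line_entry and the function line_for_address are hypothetical.
.DS
```c
#include <assert.h>

struct line_entry {
	int	 linenumber;	/* fourth field of a 0104 entry */
	unsigned address;	/* fifth field of a 0104 entry */
};

/*
 * Hypothetical lookup, not sdb's actual algorithm: returns the line
 * whose entry has the greatest address not exceeding pc, or 0 if no
 * entry qualifies.  Line numbers are assumed to start at 1, so 0 can
 * serve as the "not found" result.
 */
int line_for_address(const struct line_entry *tab, int n, unsigned pc)
{
	int i, line = 0;
	unsigned best = 0;

	assert(tab != 0 || n == 0);
	for (i = 0; i < n; i++) {
		if (tab[i].address <= pc &&
		    (line == 0 || tab[i].address >= best)) {
			best = tab[i].address;
			line = tab[i].linenumber;
		}
	}
	return line;
}
```
.DE
.LP
A linear scan is used so that the sketch does not depend on the entries
being sorted by address.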
.NH 2
Source files
.PP
The following entries are made for each source file.
.DS C
\&.stab	name1,0144,0,0,address
\&.stab	name2,0144,0,0,address
\&...
\&.stab	namen,0144,0,0,address
.DE
Each entry contains 8 successive bytes of the name of the source file.
The name is terminated by a null byte.
All bytes following this one should also be null.
The address is the address of the first byte of code for the first
procedure of the file.
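.PP
The division of a file name into successive entries can be sketched as follows.
The sketch assumes that a terminating null byte must always be present,
so that a name of exactly eight characters needs a second, all-null entry;
the function name source_name_chunk is hypothetical.
.DS
```c
#include <assert.h>
#include <string.h>

#define STAB_NAMELEN 8	/* bytes of name carried by one .stab entry */

/*
 * Hypothetical helper: copies piece k (counting from 0) of path into
 * chunk, padding with null bytes.  Returns the number of entries
 * needed for the whole name, including room for the terminating null.
 */
int source_name_chunk(const char *path, int k, char chunk[STAB_NAMELEN])
{
	size_t len, start, n;
	int count;

	assert(path != 0 && chunk != 0);
	len = strlen(path);
	count = (int)(len / STAB_NAMELEN) + 1;
	assert(k >= 0 && k < count);

	start = (size_t)k * STAB_NAMELEN;
	memset(chunk, 0, STAB_NAMELEN);
	if (start < len) {
		n = len - start;
		if (n > STAB_NAMELEN)
			n = STAB_NAMELEN;
		memcpy(chunk, path + start, n);
	}
	return count;
}
```
.DE
.LP
For a file named sdbsymtab.c the sketch yields two entries, the first
holding sdbsymta and the second holding b.c followed by five null bytes.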
.NH 2
Included source files
.PP
The following entries are made for each included source file which
generates code.
.DS C
\&.stab	name1,0204,0,0,address
\&.stab	name2,0204,0,0,address
\&...
\&.stab	namen,0204,0,0,address
.DE
This sequence of entries should appear each time the file is included.
A similar entry giving the name of the original file should be
made at the end of the include.
The format of the name is identical to that for files.
This feature is heavily used by programs generated by yacc and lex.
.NH 2
Format of types
.PP
The type is a 16-bit quantity describing the declared type of a variable.
We use the same scheme as in S.C. Johnson's Portable C Compiler
[Johnson, 1978].
The type is divided into the following fields:
.DS
	struct {
		short	basic:4,
			d1:2,
			d2:2,
			d3:2,
			d4:2,
			d5:2,
			d6:2;
	};
.DE
There are four derived types:
.DS
	0  none
	1  pointer
	2  function
	3  array
.DE
They are indicated in the two bit fields d1, d2, d3, d4, d5 and d6.
The four bit field basic indicates the basic type as follows:
.DS
	 0  undefined
	 1  function argument
	 2  character
	 3  short
	 4  int
	 5  long
	 6  float
	 7  double
	 8  structure
	 9  union
	10  enumerated type
	11  member of enumerated type
	12  unsigned character
	13  unsigned short
	14  unsigned
	15  unsigned long
.DE
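.PP
A type value can be decoded along the following lines.
The sketch assumes that basic occupies the four low-order bits and d1
the next two, and that d1 is the outermost derived type, as in the
Portable C Compiler; neither point is stated explicitly above.
The names describe_type and append are hypothetical.
.DS
```c
#include <assert.h>
#include <string.h>

/* Basic type names, in the order of the table in the text. */
static const char *basic_name[16] = {
	"undefined", "function argument", "character", "short",
	"int", "long", "float", "double",
	"structure", "union", "enumerated type",
	"member of enumerated type", "unsigned character",
	"unsigned short", "unsigned", "unsigned long"
};

/* Appends s to buf without overrunning size bytes. */
static void append(char *buf, size_t size, const char *s)
{
	size_t used = strlen(buf);

	if (used + 1 < size)
		strncat(buf, s, size - used - 1);
}

/*
 * Hypothetical helper: writes an English description of type into buf.
 * Assumes basic is bits 0-3, d1 is bits 4-5, and so on, with d1 the
 * outermost derived type; a derived field of 0 (none) ends the chain.
 */
char *describe_type(unsigned type, char *buf, size_t size)
{
	static const char *derived[4] = {
		"", "pointer to ", "function returning ", "array of "
	};
	int shift;

	assert(buf != 0 && size > 0);
	buf[0] = 0;
	for (shift = 4; shift < 16; shift += 2) {
		unsigned d = (type >> shift) & 03;

		if (d == 0)
			break;
		append(buf, size, derived[d]);
	}
	append(buf, size, basic_name[type & 017]);
	return buf;
}
```
.DE
.LP
Under these assumptions the value 05 is described as long and the
value 024 as pointer to int.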
.NH 1
The assembler and loader
.PP
Each .stab pseudo-operation generates one entry in the symbol table.
The entry is of the form:
.DS
	struct {
		char	name[8];
		char	type;
		char	other;
		short	desc;
		unsigned value;
	};
.DE
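.PP
The correspondence between the arguments of a .stab line and the fields
of an entry can be sketched as follows.
The structure stab_entry mirrors the one above but uses explicit field
widths, which is an assumption (16-bit short and 32-bit unsigned, as on the VAX);
the function make_entry is hypothetical and is not the assembler's code.
.DS
```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Same fields as the entry in the text, with explicit widths (assumed). */
struct stab_entry {
	char		name[8];
	unsigned char	type;
	unsigned char	other;
	int16_t		desc;
	uint32_t	value;
};

/*
 * Hypothetical helper: fills *e from the arguments of one .stab line.
 * Names longer than eight characters are truncated; shorter names are
 * padded with null bytes.
 */
void make_entry(struct stab_entry *e, const char *name,
		unsigned type, unsigned other, int desc, uint32_t value)
{
	size_t len;

	assert(e != 0 && name != 0);
	len = strlen(name);
	if (len > sizeof e->name)
		len = sizeof e->name;
	memset(e, 0, sizeof *e);
	memcpy(e->name, name, len);
	e->type = (unsigned char)type;
	e->other = (unsigned char)other;
	e->desc = (int16_t)desc;
	e->value = value;
}
```
.DE
.LP
The first eight arguments of the pseudo-operation fill name, and the
remaining four fill type, other, desc and value in that order.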
.PP
The loader uses the four least significant bits of the type field
to determine how to relocate the .stab entry.
The following are currently used.
.DS
	0  none
	4  text
	6  data
.DE
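.PP
The selection of a relocation class from the type field can be sketched
as follows.
The enumeration reloc and the function stab_reloc are hypothetical names;
values of the low four bits other than the three listed above are
reported as RELOC_OTHER because this document does not say how they are treated.
.DS
```c
#include <assert.h>

enum reloc {
	RELOC_OTHER = -1,	/* not one of the classes listed in the text */
	RELOC_NONE  = 0,
	RELOC_TEXT  = 4,
	RELOC_DATA  = 6
};

/*
 * Hypothetical helper: classifies a .stab entry by the four least
 * significant bits of its one-byte type field.
 */
enum reloc stab_reloc(unsigned type)
{
	assert(type <= 0377);
	switch (type & 017) {
	case 0:
		return RELOC_NONE;
	case 4:
		return RELOC_TEXT;
	case 6:
		return RELOC_DATA;
	default:
		return RELOC_OTHER;
	}
}
```
.DE
.LP
For example, a procedure entry of type 044 is classified as text, a data
area symbol of type 046 as data, and a local symbol of type 0200 as none.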
.PP
It is necessary for the assembler and loader to preserve the order
of symbol table entries produced by .stab pseudo-ops.
.SH
Reference
.LP
Johnson, S.C., "A Portable Compiler: Theory and Practice",
.I
Proc. 5th ACM Symp. on Principles of Programming Languages,
.R
January 1978.
.bp
.SH 
Appendix
.PP
The following definitions are extracted from the file /usr/include/a.out.h.
.sp 1
.nf
.na
struct	nlist { /* symbol table entry */
	char	n_name[8];	/* symbol name */
	char	n_type;		/* type flag */
	char	n_other;
	short	n_desc;
	unsigned n_value;	/* value */
};

	/* values for type flag */
#define	N_UNDF	0		/* undefined */
#define	N_ABS	02		/* absolute */
#define	N_TEXT	04		/* text */
#define	N_DATA	06		/* data */
#define	N_BSS	08
#define	N_TYPE	037
#define	N_FN	037		/* file name symbol */

#define	N_GSYM	0040		/* global sym: name,,type,0 */
#define	N_FUN	0044		/* function: name,,linenumber,address */
#define	N_STSYM 0046		/* static symbol: name,,type,address */
#define	N_LCSYM 0048		/* .lcomm symbol: name,,type,address */
#define	N_RSYM	0100		/* register sym: name,,type,register */
#define	N_SLINE	0104		/* src line: ,,linenumber,address */
#define	N_SSYM	0140		/* structure elt: name,,type,struct_offset */
#define	N_SO	0144		/* source file name: name,,,address */
#define	N_LSYM	0200		/* local sym: name,,type,offset */
#define	N_SOL	0204		/* #line source filename: name,,,address */
#define	N_PSYM	0240		/* parameter: name,,type,offset */
#define	N_LBRAC	0300		/* left bracket: ,,nesting level,address */
#define	N_RBRAC	0340		/* right bracket: ,,nesting level,address */
#define N_BCOMM 0342		/* begin common: name,,, */
#define N_ECOMM 0344		/* end common: name,,, */
#define N_ECOML 0348		/* end common (local name): ,,,address */
#define	N_LENG	0376		/* second stab entry with length information */

#define	N_EXT	01		/* external bit, or'ed in */

#define	FORMAT	"%08x"

#define	STABTYPES	0340
.fi
.ad
.SG HO-1353-HPK-sdb
.sp 2
Copy to
.br
R.W. Lucky
.br
C.S. Roberts