.TL
Assembler Reference Manual
.AU
John F. Reiser
.AI
.HO
.AU
Robert R. Henry\s-2\u*\d\s+2
.FS
\&\s-2\u*\d\s+2 Preparation of this paper supported in part
by the National Science Foundation under grant MCS # 78-07291.
.FE
.AI
Electronics Research Laboratory
University of California
Berkeley, CA 94720
.ND November 5, 1979
.NH
Introduction
.PP
This document describes the usage and input syntax of the \s8UNIX VAX\s10-11 assembler \fIas\fP.
\fIAs\fP is designed for assembling the code produced by the C compiler;
certain concessions have been made to handle code written directly by people,
but in general little sympathy has been extended.
This document is intended only for the writer of a compiler or a maintainer of the assembler.
.NH
Usage
.PP
\fIas\fP is used as follows:
.in +5
as [ \fB\-LVWJR\fR ] [ \fB\-d\fIn\fR ] [ \fB\-DTC\fR ] [ \fB\-t \fIdirectory\fR ] [ \fB\-o \fIoutput\fR ] [ \fIname\d\s-2\&1\s+2\u\fP ] [ \fIname\d\s-2\&2\s+2\u ... \fP ]
.br
.in -5
.PP
The \fB\-L\fP flag instructs the assembler to save labels beginning with an ``L''
in the symbol table portion of the output file.
Labels are not saved by default, as the default action of the link editor \fIld\fP is to discard them anyway.
.PP
The \fB\-V\fP flag tells the assembler to place its interpass temporary file into virtual memory.
In normal circumstances, the system manager will decide where the temporary file should lie.
Our experiments with very large temporary files show that placing the temporary file into virtual memory
will save about 13% of the assembly time, where the size of the temporary file is about 350K bytes.
Most assembler sources will not be this long.
.PP
The \fB\-W\fP flag turns off all error reporting.
.PP
The \fB\-J\fP flag forces \s-2UNIX\s+2 style pseudo\-branch instructions
with destinations further away than a byte displacement
to be turned into jump instructions with 4 byte offsets.
The \fB\-J\fP flag buys you nothing if \fB\-d2\fP is set.
(See \(sc 9.4)
.PP
The \fB\-R\fP flag effectively turns \fB.data\fP\fI n\fP segment changing directives
into \fB.text\fP\fI n\fP directives.
This obviates the need to run editor scripts on assembler source to ``read\-only'' fix initialized data segments.
Uninitialized data (via \fB.lcomm\fP and \fB.comm\fP directives)
is still assembled into the data or bss segments.
.PP
The \fB\-d\fP flag specifies the number of bytes which the assembler should allow for a displacement
when the value of the displacement expression is undefined in the first pass.
The possible values of \fIn\fP are 1, 2, or 4;
the assembler uses 4 bytes if \fB\-d\fP is not specified.
See \(sc 9.2.
.PP
Provided the \fB\-V\fP flag is not set, the \fB\-t\fP flag causes the assembler
to place its single temporary file in the \fIdirectory\fP instead of in \fI/tmp\fP.
.PP
The \fB\-o\fP flag causes the output to be placed on the named file.
The output of the assembler is by default placed on the file \fIa.out\fR in the current directory.
.PP
The input to the assembler is normally taken from the standard input.
If file arguments occur, then the input is taken sequentially from the files
\fIname\d\s-2\&1\s+2\u\fP, \fIname\d\s-2\&2\s+2\u\fP...
This is not to say that the files are assembled separately;
\fIname\d\s-2\&2\s+2\u\fP is effectively concatenated to \fIname\d\s-2\&1\s+2\u\fP,
so multiple definitions cannot occur amongst the input sources.
.PP
The \fB\-D\fP flag enables debugging information,
provided that the assembler has been compiled to have debugging information available.
.PP
The \fB\-T\fP flag causes a trace of each token read by \fIas\fP to be printed.
This is long and boring, but useful when debugging the assembler.
.NH
Lexical conventions
.PP
Assembler tokens include identifiers (alternatively, ``symbols'' or ``names''), constants, and operators.
.NH
Identifiers
.PP
An identifier consists of a sequence of alphanumeric characters
(including period ``\|\fB.\fR\|'', underscore ``\(ul'' and dollar ``\|$\|'')
of which the first may not be numeric.
If the assembler has been compiled to support flexible length symbols,
identifiers may be (practically) arbitrarily long with all characters significant;
otherwise, only the first NCPS (a symbol defined in \fI/usr/include/a.out.h\fP, and normally 8)
characters are significant.
.NH 2
Constants
.NH 3
Simple constants
.PP
All integer constants are 64 bits wide and interpreted as two's complement numbers.
64 bit wide integer constants (quads) are only partially supported by the \s-2VAX\s+2 hardware,
and are supported only to provide immediate constants to \s-2VAX\s+2 instructions with quad operands.
Floating-point constants are 64 bits wide.
The digits are ``0123456789abcdefABCDEF'' with the obvious values.
.PP
An octal constant consists of a sequence of digits with a leading zero.
.PP
A decimal constant consists of a sequence of digits without a leading zero.
.PP
A hexadecimal constant consists of the characters ``0x'' (or ``0X'') followed by a sequence of digits.
.PP
A single-character constant consists of a single quote ``\|\(fm\|''
followed by an \s8ASCII\s10 character, including \s8ASCII\s10 newline.
The constant's value is the code for the given character.
.PP
A floating-point constant consists of the characters ``0f'', ``0d'', ``0F'', or ``0D''
followed by a sequence of characters which \fIatof\fP will recognize as a floating-point number;
either ``e'', ``E'', ``d'' or ``D'' may be used to designate the exponent field.
.NH 3
String Constants
.PP
A string constant is defined using the same syntax and semantics as ``C''
beginning and ending with a ``"'' (double quote).
The \s8DEC\s10 assembler conventions for flexible string quoting are not implemented.
All ``C'' backslash conventions are observed;
the backslash conventions peculiar to the \s-2PDP\-11\s+2 assembler are not observed.
Strings are known by their value and their length;
the assembler does not implicitly end strings with a null byte.
.NH 2
Operators
.PP
There are several single-character operators; see \(sc 7.
.NH 2
Blanks
.PP
Blank and tab characters may be interspersed freely between tokens,
but may not be used within tokens (except character constants).
A blank or tab is required to separate adjacent identifiers or constants not otherwise separated.
.NH 2
Comments
.NH 3
Decadent Comments
.PP
The character ``\|#\|'' introduces a comment,
which extends through the end of the line on which it appears.
Comments starting in column 1, of the format ``\|# \fIexpression string\fP\|'',
are interpreted as an indication that the assembler is now assembling
file \fIstring\fP at line \fIexpression\fP.
Thus, one can use the C preprocessor on an assembly language source file,
and use the \fI#include\fP and \fI#define\fP preprocessor directives.
(Note that there may not be an assembler comment starting in column 1
if the assembler source is given to the C preprocessor,
as it will be interpreted by the preprocessor in a way not intended.)
Comments are otherwise ignored by the assembler.
.NH 3
C Style Comments
.PP
The assembler will recognize C style comments,
introduced with the prologue \fB/*\fP and ending with the epilogue \fB*/\fP.
C style comments may extend across multiple lines,
and are the preferred comment style to use if one chooses to use the C preprocessor.
.NH 1
Segments and Location Counters
.PP
Assembled code and data fall into three segments:
the text segment, the data segment, and the bss segment.
The operating system makes some assumptions about the content of these segments;
the assembler does not.
Within the text and data segments there are a number of sub-segments,
distinguished by number (``text 0'', ``text 1'', .\|.\|.
``data 0'', ``data 1'', .\|.\|.\|).
Currently there are four subsegments each in text and data.
The subsegments are for programming convenience only.
Before writing the output file, the assembler zero-pads each text subsegment
to a multiple of four bytes and then concatenates the subsegments in order to form the text segment;
an analogous operation is done for the data segment.
Requesting that the loader define symbols and storage regions
is the only action allowed by the assembler with respect to the bss segment.
Assembly begins in ``text 0''.
.PP
Associated with each (sub)segment is an implicit location counter
which begins at zero and is incremented by 1 for each byte assembled into the (sub)segment.
There is no way to explicitly reference a location counter.
Note that the location counters of subsegments other than ``text 0'' and ``data 0''
behave peculiarly due to the concatenation used to form the text and data segments.
.NH 1
Statements
.PP
A source program is composed of a sequence of \fIstatements\fP.
Statements are separated either by new-lines or by semicolons.
There are two kinds of statements: null statements and keyword statements.
Either kind of statement may be preceded by one or more labels.
.NH 2
Labels
.NH 3
Name (Global) Labels
.PP
A global label consists of a name followed by a colon ``\|:\|''.
The effect of a name label is to assign the current value and type of the location counter to the name.
An error is indicated in pass 1 if the name is already defined;
an error is indicated in pass 2 if the value assigned changes the definition of the label.
.PP
A global label is referenced by its name.
.PP
Global labels beginning with an ``\|L\|'' are discarded unless the \fB\-L\fP option is in effect.
.NH 3
Numeric (Local) Labels
.PP
A numeric label consists of a digit \fI0\fP to \fI9\fP followed by a colon (``\|:\|'').
Such a label serves to define temporary symbols of the form ``\fIn\fPb'' and ``\fIn\fPf'',
where \fIn\fP is the digit of the label.
As in the case of name labels, a numeric label assigns
the current value and type of the location counter to the temporary symbol.
However, several numeric labels with the same digit may be used within the same assembly.
References to symbols of the form ``\fIn\fPb'' refer to the first numeric label ``\fIn\|:\fP''
\fIb\fP\|ackwards from the reference;
``\fIn\fPf'' symbols refer to the first numeric label ``\fIn\|:\fP''
\fIf\fP\|orwards from the reference.
Such numeric labels tend to conserve the inventive powers of the programmer.
.NH 2
Null statements
.PP
A null statement is an empty statement (which may, however, have labels).
A null statement is ignored by the assembler.
Common examples of null statements are empty lines or lines containing only a label.
.NH 2
Keyword statements
.PP
A keyword statement begins with one of the many predefined keywords of the assembler;
the syntax of the remainder depends on the keyword.
All instruction opcodes are keywords.
The remaining keywords are assembler pseudo-operations, also called directives.
The pseudo-operations are listed below with the syntax they require.
.NH 1
Expressions
.PP
An expression is a sequence of symbols representing a value.
Its constituents are identifiers, constants, operators, and parentheses.
Each expression has a type.
.PP
All operators in expressions are fundamentally binary in nature.
Arithmetic is two's complement and has 32 bits of precision.
There are four levels of precedence, listed here from lowest precedence level to highest:
.IP (binary) 16
\|+\|, -\|
.IP (binary) 16
\||\|, \|&\|, \|^\|, \|!\|
.IP (binary) 16
\|*\|, \|/\|, \|%\|, \|!\|
.IP (unary) 16
\|-\|, \|!\|
.PP
All operators of the same precedence are evaluated strictly left to right,
except for evaluation order enforced by parentheses.
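.PP
As an illustration only, the precedence levels listed above can be modelled by the following Python sketch.
It is not taken from the assembler source: the name \fIevaluate\fP is hypothetical,
only non-negative decimal constants are accepted, and the ``!'' and shift operators are left out.
Read as written, the table makes ``&'' bind more tightly than ``+'', which is the reverse of C.

```python
import re

MASK = 0xFFFFFFFF  # the manual states 32 bits of precision

# Binary precedence levels, lowest first, following the table above.
# "!" and the shift operators are deliberately omitted from this sketch.
LEVELS = [
    {"+": lambda a, b: a + b, "-": lambda a, b: a - b},
    {"|": lambda a, b: a | b, "&": lambda a, b: a & b, "^": lambda a, b: a ^ b},
    {"*": lambda a, b: a * b, "/": lambda a, b: a // b, "%": lambda a, b: a % b},
]


def evaluate(text):
    """Evaluate a small absolute expression (hypothetical helper)."""
    tokens = re.findall(r"\d+|[-+|&^*/%()]", text)
    pos = 0

    def parse_level(level):
        nonlocal pos
        if level == len(LEVELS):
            return parse_primary()
        value = parse_level(level + 1)
        # Operators of equal precedence are applied strictly left to right.
        while pos < len(tokens) and tokens[pos] in LEVELS[level]:
            op = LEVELS[level][tokens[pos]]
            pos += 1
            value = op(value, parse_level(level + 1)) & MASK
        return value

    def parse_primary():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token == "(":
            value = parse_level(0)
            pos += 1  # skip the closing parenthesis
            return value
        if token == "-":  # unary minus, highest precedence
            return -parse_primary() & MASK
        return int(token) & MASK

    value = parse_level(0)
    if pos != len(tokens):
        raise ValueError("unexpected token in expression")
    return value


print(evaluate("2 + 1 & 1"))    # 3, because "&" is applied before "+"
print(evaluate("(2 + 1) & 1"))  # 1
```

.PP
Values are treated as unsigned 32-bit quantities in this sketch,
so division and modulo of negative operands are not modelled.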
.NH 2
Expression operators
.PP
The operators are:
.IP + 16
addition
.IP \- 16
subtraction
.IP * 16
multiplication
.IP / 16
division
.IP % 16
modulo
.IP & 16
bitwise and
.IP \(bv 16
bitwise or
.IP ^ 16
bitwise exclusive or
.IP "> (or >>)" 16
logical right shift
.IP "< (or <<)" 16
logical left shift
.hc
.IP ! 8
\fIa\fR\|!\|\fIb\fR is \fIa \fBor \fR(\|\fBnot \fIb\fR\|);
i.e., the \fBor\fR of the first operand and the one's complement of the second;
most common use is as a unary operator.
.PP
Expressions may be grouped by use of parentheses ``\|(\|\|)\|''.
.NH 2
Types
.PP
The assembler deals with a number of types of expressions.
Most types are attached to keywords and used to select the routine which treats that keyword.
The types likely to be met explicitly are:
.IP undefined 8
.br
Upon first encounter, each symbol is undefined.
It may become undefined if it is assigned an undefined expression.
It is an error to attempt to assemble an undefined expression in pass 2;
in pass 1, it is not (except that certain keywords require operands which are not undefined).
.IP "undefined external" 8
.br
A symbol which is declared \fB.globl\fR but not defined in the current assembly is an undefined external.
If such a symbol is declared, the link editor \fIld\fR must be used to load the assembler's output
with another routine that defines the undefined reference.
.IP absolute 8
.br
An absolute symbol is defined ultimately from a constant.
Its value is unaffected by any possible future applications of the link-editor to the output file.
.IP text 8
.br
The value of a text symbol is measured with respect to the beginning of the text segment of the program.
If the assembler output is link-edited, its text symbols may change in value
since the program need not be the first in the link editor's output.
Most text symbols are defined by appearing as labels.
At the start of an assembly, the value of ``\|\fB.\fP\|'' is text 0.
.IP data 8
.br
The value of a data symbol is measured with respect to the origin of the data segment of a program.
Like text symbols, the value of a data symbol may change during a subsequent link-editor run
since previously loaded programs may have data segments.
After the first \fB.data\fR statement, the value of ``\|\fB.\fP\|'' is data 0.
.IP bss 8
.br
The value of a bss symbol is measured from the beginning of the bss segment of a program.
Like text and data symbols, the value of a bss symbol may change during a subsequent link-editor run,
since previously loaded programs may have bss segments.
.IP "external absolute, text, data, or bss" 8
.br
Symbols declared \fB.globl\fR but defined within an assembly as absolute, text, data, or bss symbols
may be used exactly as if they were not declared \fB.globl\fR;
however, their value and type are available to the link editor
so that the program may be loaded with others that reference these symbols.
.IP register 8
.br
The symbols
.DS
\fBr0 r1 r2 r3 r4 r5 r6 r7 r8 r9 r10 r11 r12 r13 r14 r15\fP
\fBap fp sp pc\fP
.DE
are predefined as register symbols.
In addition, the % operator converts an absolute value to type register.
.IP "other types" 8
.br
Each keyword known to the assembler has a type
which is used to select the routine which processes the associated keyword statement.
The behavior of such symbols when not used as keywords is the same as if they were absolute.
.NH 2
Type propagation in expressions
.PP
When operands are combined by expression operators,
the result has a type which depends on the types of the operands and on the operator.
The rules involved are complex to state but were intended to be sensible and predictable.
For purposes of expression evaluation the important types are
.DS
undefined
absolute
text
data
bss
undefined external
other
.DE
The combination rules are then:
If one of the operands is undefined, the result is undefined.
If both operands are absolute, the result is absolute.
If an absolute is combined with one of the ``other types'' mentioned above,
the result has the other type.
When an ``other type'' is combined with an explicitly discussed type other than absolute,
it acts like an absolute.
.PP
Further rules applying to particular operators are:
.IP +
If one operand is text-, data-, or bss-segment relocatable, or is an undefined external,
the result has the postulated type and the other operand must be absolute.
.IP \-
If the first operand is a relocatable text-, data-, or bss-segment symbol,
the second operand may be absolute (in which case the result has the type of the first operand);
or the second operand may have the same type as the first (in which case the result is absolute).
If the first operand is external undefined, the second must be absolute.
All other combinations are illegal.
.PP
.IP others
.br
It is illegal to apply these operators to any but absolute symbols.
.NH 1
Pseudo-operations (Directives)
.PP
The keywords listed below introduce directives that influence the later operations of the assembler.
The metanotation
.DS
[ stuff ] .\|.\|.
.DE
means that 0 or more instances of the given stuff may appear.
The metanotation
.DS
( stuff )\|*\|\|\fIn\fP\|
.DE
means that exactly \fIn\fP occurrences of stuff must occur.
.PP
Boldface tokens are literals, italic words are substitutable.
.PP
The pseudo\-operations listed below are grouped into functional categories, and not alphabetically.
.NH 2
Interface to a Previous Pass
.in +5m
.NH 3
\&.ABORT
.PP
As soon as the assembler sees this directive, it ignores all further input
(but it does read to the end of file), and aborts the assembly.
No files are created.
It is anticipated that this would be used in a pipe interconnected version of a compiler,
where the first major syntax error would cause the compiler to issue this directive,
saving unnecessary work in assembling code that would have to be discarded anyway.
.NH 3
\&.file \fIstring\fP
.PP
This directive causes the assembler to think it is in file \fIstring\fP
so error messages reflect the proper source file.
.NH 3
\&.line \fIexpression\fP
.PP
This directive causes the assembler to think it is on line \fIexpression\fP
so error messages reflect the proper source line.
.PP
The only effect of assembling multiple files specified in the command string
is to insert the \fIfile\fP and \fIline\fP directives, with the appropriate values,
at the beginning of the source from each file.
.NH 3
Preprocessor Interface
.DS
\fI# expression string\fP
\fI# expression\fP
.DE
.PP
This is the only instance where a comment is meaningful to the assembler.
The ``\|#\|''
.ul 1
must
be in the first column.
This meta comment causes the assembler to believe it is on line \fIexpression\fP.
The second argument, if included, causes the assembler to believe it is in file \fIstring\fP,
otherwise the current file name does not change.
.in -5m
.NH 2
Location Counter Control
.in +5m
.NH 3
\&\fB.align\fP \fIexpression\fP
.PP
The location counter is adjusted (by assembling bytes containing zeroes, if necessary)
so that the \fIexpression\fP lowest bits become zero.
Thus ``.align 2'' makes the location counter evenly divisible by 4.
The expression must be defined, absolute, nonnegative, and less than 16.
(Note that the subsegment concatenation convention and the current loader conventions
may not preserve attempts at aligning to more than 2 low-order zero bits.)
.NH 3
Subsegment switching
.DS
\fB.data\fP [ \fIexpression\fP ]
\fB.text\fP [ \fIexpression\fP ]
.DE
.PP
These two pseudo-operations cause the assembler to begin assembling
into the indicated text or data subsegment.
If specified, the expression must be defined and absolute;
an omitted expression is treated as zero.
The effect of a \fB.data\fP directive is treated as a \fB.text\fP directive
if the \fB\-R\fP assembly flag is set.
Assembly starts in the text 0 subsegment.
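.PP
The adjustment made by \fB.align\fP, described earlier in this section,
amounts to rounding the location counter up to the next multiple of a power of two.
The Python sketch below shows the arithmetic only;
the function names are hypothetical and nothing here is taken from the assembler source.

```python
def align_padding(location, low_bits):
    """Number of zero bytes needed so the low `low_bits` bits become zero."""
    if not 0 <= low_bits < 16:
        raise ValueError("the .align operand must be nonnegative and less than 16")
    mask = (1 << low_bits) - 1
    return (-location) & mask


def align(location, low_bits):
    """Location counter after the alignment padding has been assembled."""
    return location + align_padding(location, low_bits)


# ".align 2" at location 5 assembles three zero bytes and continues at 8.
print(align_padding(5, 2), align(5, 2))
```

.PP
A location counter that is already aligned needs no padding, so \fIalign_padding(8, 2)\fP is zero.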
.NH 3
\&\fB.org\fP \fIexpression\fP
.PP
The location counter is set equal to the value of the expression.
The expression must be defined.
The value of the expression must be greater than the current value of the location counter.
.NH 3
\&\fB.space\fP \fIexpression\fP
.PP
\&\fIexpression\fP bytes of zeroes are assembled.
.in -5m
.NH 2
Initialized Data
.in +5m
.NH 3
Expression Initialized Data
.DS
\fB.byte \fIexpression \fR[ \fB, \fIexpression \fR] .\|.\|.
\fB.word \fIexpression \fR[ \fB, \fIexpression \fR] .\|.\|.
\fB.int \fIexpression \fR[ \fB, \fIexpression \fR] .\|.\|.
\fB.long \fIexpression \fR[ \fB, \fIexpression \fR] .\|.\|.
\fB.quad \fIexpression \fR[ \fB, \fIexpression \fR] .\|.\|.
\fB.float \fIexpression \fR[ \fB, \fIexpression \fR] .\|.\|.
\fB.double \fIexpression \fR[ \fB, \fIexpression \fR] .\|.\|.
.DE
.PP
The \fIexpression\fP\|s in the comma-separated list are truncated to the indicated size
(byte=8 bits, word=16, int=32, long=32, quad=64, float=32, double=64)
and assembled in successive locations.
The expressions must be absolute.
The value assembled in bits 32-63 for \fB.double\fP is zero if the expression is not of type double.
.PP
Except for \fB.quad\fP, \fB.float\fP and \fB.double\fP,
each expression may optionally be of the form
.DS
\fIexpression\d\s-2\&1\s+2\u\fP \fB:\fP \fIexpression\d\s-2\&2\s+2\u\fP.
.DE
In this case the value of \fIexpression\d\s-2\&2\s+2\u\fP
is truncated to \fIexpression\d\s-2\&1\s+2\u\fP bits
and assembled in the next \fIexpr\d\s-2\&1\s+2\u\fP-bit field
which fits in the natural data size being assembled.
Bits which are skipped because a field does not fit are made zero.
Thus ``\fB.byte\fP 123'' is equivalent to ``\fB.byte\fP 8:123''
and ``\fB.byte\fP 3:1,2:1,5:1'' assembles two bytes, containing the values 9 and 1.
.br
\fBNB:\fP Since no \s-2VAX\s+2 compilers currently use bit fields,
these bit field constructs are liable to disappear in the future.
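.PP
The bit field packing rule above can be checked with a short Python sketch.
It assumes fields are laid down starting from the low-order bit of each unit,
which is the reading that reproduces the values 9 and 1 quoted in the text;
the function name is hypothetical and the code is not taken from the assembler source.

```python
def pack_bit_fields(fields, unit_bits=8):
    """Pack (width, value) pairs into units of `unit_bits`, low-order bits first.

    A field that does not fit in the space left in the current unit starts a
    new unit; the skipped bits stay zero.
    """
    units = []
    current = 0  # value of the unit being filled
    used = 0     # number of bits already occupied in the current unit
    for width, value in fields:
        if not 0 < width <= unit_bits:
            raise ValueError("field width must fit in one unit")
        if used + width > unit_bits:
            units.append(current)
            current, used = 0, 0
        # Truncate the value to the field width, then place it above used bits.
        current |= (value & ((1 << width) - 1)) << used
        used += width
    if used:
        units.append(current)
    return units


print(pack_bit_fields([(3, 1), (2, 1), (5, 1)]))  # [9, 1]
print(pack_bit_fields([(8, 123)]))                # [123]
```

.PP
In the first call the 5-bit field does not fit in the three bits left in the first byte,
so it starts a second byte and the three skipped bits remain zero.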
.NH 3
String Initialized Data
.DS
\fB.ascii\fP \fIstring\fP [ \fB,\fP \fIstring\fP ]
\fB.asciz\fP \fIstring\fP [ \fB,\fP \fIstring\fP ]
.DE
.PP
Each \fIstring\fP in the list is assembled into successive locations,
with the first letter in the string being placed into the first location, etc.
The \fB.ascii\fP directive will not null pad the string;
the \fB.asciz\fP directive will null pad the string.
(Recall that strings are known by their length, and need not be terminated with a null,
and that the C conventions for escaping are understood.)
The \fB.ascii\fP directive is identical to:
.DS
\&\fB.byte\fP \fIstring\d\s-2\&0\s+2\u\fP\fB,\fP \fIstring\d\s-2\&1\s+2\u\fP\fB,\fP ...
.DE
.NH 3
Zero Filled Data
.DS
\fB.space\fP \fIexpression\fP
.DE
.PP
(See \(sc 8.2.4)
\&\fIexpression\fP bytes of zeroes are assembled.
\&\fIexpression\fP must be absolute.
.NH 3
Arbitrarily Filled Data
.DS
\fB.fill\fP \fIrep_expr\fP\fB, \fP \fIsize_expr\fP\fB, \fP \fIvalue_expr\fP\fR
.DE
.PP
All three expressions must be absolute.
\fIvalue_expr\fP, treated as an expression of size \fIsize_expr\fP bytes,
is assembled and replicated \fIrep_expr\fP times.
The effect is to advance the current location counter \fIrep_expr\fP \(** \fIsize_expr\fP bytes.
\fIsize_expr\fP must be between 1 and 8.
.in -5m
.NH 2
Symbol Definition
.in +5m
.NH 3
General
.in +5m
.NH 4
\&\fB.comm\fI name \fB, \fIexpression\fR
.PP
Provided the \fIname\fR is not defined elsewhere,
its type is made ``undefined external'', and its value is \fIexpression\fR.
In fact the \fIname\fR behaves in the current assembly just like an undefined external.
However, the link editor \fIld\fR has been special-cased
so that all external symbols which are not otherwise defined, and which have a non-zero value,
are defined to lie in the bss segment,
and enough space is left after the symbol to hold \fIexpression\fR bytes.
.NH 4
\&\fB.lcomm\fI name \fB, \fIexpression\fR
.PP
\fIexpression\fP bytes will be allocated in the bss segment
and \fIname\fP assigned the location of the first byte,
but the \fIname\fP is not declared as global and hence will be unknown to the link editor.
.NH 4
\&\fB.globl\fP \fIname\fP
.PP
This statement makes the \fIname\fR external.
If it is otherwise defined (by \fB.set\fP or by appearance as a label)
it acts within the assembly exactly as if the \fB.globl\fR statement were not given;
however, the link editor may be used to combine this routine
with other routines that refer to this symbol.
.PP
Conversely, if the given symbol is not defined within the current assembly,
the link editor can combine the output of this assembly with that of others which define the symbol.
The assembler makes all otherwise undefined symbols external.
.NH 4
\&\fB.set\fP \fIname\fP \fB,\fP \fIexpression\fP
.PP
The (\fIname\fP, \fIexpression\fP) pair is entered into the symbol table.
Multiple \fB.set\fP statements with the same name are legal;
the most recent value replaces all previous values.
.in -5m
.NH 3
Debugger Support
.in +5m
.NH 4
\&\fB.lsym\fP \fIname\fP \fB,\fP \fIexpression\fP
.PP
A unique and otherwise unreferenceable instance of the (\fIname\fP, \fIexpression\fP) pair
is created in the symbol table.
The Fortran 77 compiler uses this mechanism to pass local symbol definitions
to the link editor and debugger.
.NH 4
Special Symbol Table entries
.DS
\&\fB.stab\fP (\fIexpr\d\s-2i\s+2\u \fB,\fR)\|*NCPS\| \fIexpr\d\s-2\&1\s+2\u\fB,\fP expr\d\s-2\&2\s+2\u\fB,\fP expr\d\s-2\&3\s+2\u\fB,\fP expr\d\s-2\&4\s+2\u\fR
.in +5m
\fR(normal \fBs\fPymbol \fBtab\fPle entry)\fR
.in -5m
\&\fB.stabs\fP \fIstring, expr\d\s-2\&1\s+2\u, expr\d\s-2\&2\s+2\u, expr\d\s-2\&3\s+2\u, expr\d\s-2\&4\s+2\u\fR
.in +5m
\fR(\fBstab s\fPtring)\fR
.in -5m
\&\fB.stabn\fP \fIexpr\d\s-2\&1\s+2\u\fB,\fP expr\d\s-2\&2\s+2\u\fB,\fP expr\d\s-2\&3\s+2\u\fB,\fP expr\d\s-2\&4\s+2\u\fR
.in +5m
\fR(\fBstab n\fPone)\fR
.in -5m
\&\fB.stabd\fP \fIexpr\d\s-2\&1\s+2\u\fB,\fP expr\d\s-2\&2\s+2\u\fB,\fP expr\d\s-2\&3\s+2\u\fR
.in +5m
\fR(\fBstab d\fPot)\fR
.in -5m
.DE
.PP
The \fIstab\fP directives place symbols in the symbol table for the symbolic debugger,
\fIsdb\fP\s-2\u*\d\s+2.
.FS
.in +5
.ti -5
\s-2\u*\d\s+2Katseff, H.P. \fISdb: A Symbol Debugger\fP. Bell Laboratories, Holmdel, NJ. April 12, 1979.
.br
.ti -5
\&Katseff, H.P. \fISymbol Table Format for Sdb\fP. File 39394, Bell Laboratories, Holmdel, NJ. March 14, 1979.
.in -5
.FE
In the \fB.stab\fP directive, the first NCPS expressions are used for the symbol name, which may be zero.
The \fB.stab\fP directive makes no sense if the assembler recognizes arbitrary length symbols;
if so, the assembler complains.
The \fIstring\fP in the \fB.stabs\fP directive more generally serves the same purpose as the NCPS expressions.
If the symbol name is zero, the \&\fB.stabn\fP directive may be used instead.
.PP
The other expressions are stored in the name list structure in the symbol table
and preserved by the loader for reference by \fIsdb\fP\fR;
the values of the expressions are peculiar to formats required by \fIsdb\fP\fR.
.in +5m
.ti -5
\&\fIexpr\d\s-2\&1\s+2\u\fP is used as a symbol table tag (nlist field \fIn_type\fP).
.br
.ti -5
\&\fIexpr\d\s-2\&2\s+2\u\fP seems to always be zero (nlist field \fIn_other\fP).
.br
.ti -5
\&\fIexpr\d\s-2\&3\s+2\u\fP is used for either the source line number,
or for a nesting level (nlist field \fIn_desc\fP).
.br
.ti -5
\fIexpr\d\s-2\&4\s+2\u\fR is used as tag specific information (nlist field \fIn_value\fP).
In the case of the \fB.stabd\fP directive, this expression is nonexistent,
and is taken to be the value of the location counter at the following instruction.
Since there is no associated name for a \fB.stabd\fP directive,
it can only be used in circumstances where the name is zero.
The effect of a \fB.stabd\fP directive can be achieved by one of the other
\&\fB.stab\fPx directives in the following manner:
.in -5m
.DS
\& \fB.stabs\fP \fIstring\fB,\fP expr\d\s-2\&1\s+2\u\fB,\fP expr\d\s-2\&2\s+2\u\fB,\fP expr\d\s-2\&3\s+2\u\fB,\fP \fP LL\fIn\fP
LL\fIn\fP\fB:\fP
.DE
The \fB.stabd\fP directive is preferred,
because it does not clog the symbol table with labels used only for the stab symbol entries.
.in -5m
.in -5m
.NH 1
Machine instructions
.PP
The syntax of machine instruction statements accepted by \fIas\fP
is generally similar to the syntax of \s8DEC MACRO\s10-32.
There are differences, however.
.NH 2
Character set
.PP
\fIas\fP uses the character `$' instead of `#', and the character `*' instead of `@'.
Opcodes and register names are spelled with lower-case rather than upper-case letters.
.NH 2
Lengths
.PP
Under certain circumstances, the following constructs are (optionally) recognized by
\&\fIas\fP to indicate the number of bytes to allocate for unresolved expressions
used to specify displacement or indirect displacement addressing modes:
.DS
\&\fBB^\fP or \fBB\`\fP to indicate byte lengths (1 byte)
\&\fBW^\fP or \fBW\`\fP to indicate word lengths (2 bytes)
\&\fBL^\fP or \fBL\`\fP to indicate long word lengths (4 bytes)
.DE
One can also use lower case \fBb\fP, \fBw\fP or \fBl\fP instead of the upper case letters.
There must be no space between the size specifier letter and the \fB^\fP or \&\fB\`\fP.
The constructs \fBS^\fP and \fBG^\fP are not recognized by \fIas\fP
as they are by the \s-2DEC\s+2 assembler.
It is preferred to use the "\`" displacement specifier,
so that the ``^'' is not misinterpreted as the \fBxor\fP operator.
.PP
Literal values (including floating-point literals used where the hardware expects a floating-point operand)
are assembled as short literals if possible,
hence not needing the \fBS^\fP \s-2DEC\s+2 directive.
If the value of the displacement is known exactly in the first pass,
\fIas\fP determines the length automatically, assembling it in the shortest possible way,
ignoring (if present) the length expression.
If the value of the displacement is not known in the first pass,
\&\fIas\fP will use the value of the displacement given by the optional length specifier,
or will use the value specified by the \fB\-d\fP argument, or will default to 4 bytes.
.NH 2
CASE instructions
.PP
\fIas\fP considers the instructions \fBcaseb\fP, \fBcasel\fP, \fBcasew\fP
to have three operands (namely: selector, base, limit).
The displacements must be explicitly assembled using one or more \fB.word\fP statements.
.NH 2
Extended branch instructions
.PP
These opcodes (formed in general by substituting a ``j'' for the initial ``b'' of the standard opcodes)
take as branch destinations the name of a label in the current subsegment.
If the destination is close enough then the corresponding ``b'' instruction is assembled.
Otherwise the assembler chooses a sequence of one or more instructions
which together have the same effect as if the ``b'' instruction had a larger span.
In general, \fIas\fP chooses the inverse branch followed by a \fBbrw\fP,
but a \fBbrw\fP is sometimes pooled among several ``j'' instructions with the same destination.
If the \fB\-J\fP assembler option is given, a \fBjmp\fP instruction is used
instead of a \fBbrw\fP instruction for \fBALL\fP (!!) ``j'' instructions with distant destinations.
This makes assembly of large (>32K bytes) assembly programs (inefficiently) possible.
The current assembler does not try to use clever combinations of
\fBbrb\fP, \fBbrw\fP and \fBjmp\fP instructions.
The \fBjmp\fP instructions use PC relative addressing,
with the length of the offset given by the ``\fB\-d\fP'' assembler option.
.KS
.DS
.ft B
.ta 1.0i 2.0i 3.0i
jeql	jeqlu	jneq	jnequ
jgeq	jgequ	jgtr	jgtru
jleq	jlequ	jlss	jlssu
jbcc	jbsc	jbcs	jbss
jlbc	jlbs	jcc	jcs
jvc	jvs	jbc	jbs
jbr
.DE
.KE
\fBjbr\fR turns into \fBbrb\fR if its target is close enough; else a \fBbrw\fP is used.
.NH 1
Diagnostics
.PP
Diagnostics are intended to be self explanatory and appear on the standard output.
.NH 1
Limits
.DS
.ta 2.0i
Arbitrary	Files to assemble
Arbitrary	Significant characters per name
127	Characters per input line
127	Characters per string
Arbitrary	Symbols
4	Text segments
4	Data segments
.DE