.pl 72
.RP
.tr ~~
.tc
.ND
.sp 5
.TL
M A C
.sp 2
Multiple Assembly-language
Compiler
.sp 3
.AU
Ross Nealon
.AI
University of Wollongong.
.AB
MAC is a generalised cross-assembler,
table driven, with a finite-state parsing
algorithm built in. MAC accepts a description
of the target machine's architecture, a
list of the symbolic opcodes and their values,
and a list of bitwise instruction formats.
MAC generates output in three forms -
listings, code dumps, and loader format object files.
MAC comes equipped with a table-formatting
program, to format r-files from symbolic
descriptions.
.sp
.ti +3
This document comprises the first part
of the description of the MAC cross-assembler.
The second part describes the table formatter
for producing a target machine description for MAC,
and the third part describes the system from an implementor's
point of view.
.AE
.NH 1
GENERAL DESCRIPTION
.PP
MAC is a two-pass finite state assembler
that accepts symbolic assembly-language
statements, and performs a one-to-one
translation, compiling each instruction into
its equivalent binary machine code.
.PP
MAC will read source statements until
end-of-file, or until an 'end' directive is
encountered.
Any error in the first pass will cause termination
of the assembly at the end of the first pass.
.PP
Pass one scans each source line, checks its syntax,
and re-codes it into a file of intermediate coded source (intercode),
one record per source line.
It also builds the symbol table,
a list of labels and their values.
.PP
Pass two re-reads the intercode records and uses the
entries in the symbol table to generate the binary
machine code, and produce usable output in the
form of listings, code dumps, and loader format
object files.
These formats are described in a later section of this manual.
.PP
The programmer wishing to use MAC must supply MAC with
the name of a pre-formatted file, containing the
description of the target, the parser table and so on.
These pre-formatted files (r-files) reside in the
library '/usr/lib/mac'.
It is only necessary to provide the name of a file
in the library,
as MAC first searches the current directory,
then the library for the file named.
.PP
MAC generates listings by re-reading
the source code, so that an assembly of source
from the standard input cannot generate a listing.
.bp
.NH 1
SYNOPSIS
.sp
.PP
mac [-opts [...]] r-file [source] [object]
.sp 2
.po +8
.IP "opts:" 12
Options appear one per argument.
.sp
-l:  Generate source/code listing
.br
-d:  Generate code dump
.br
-a:  Generate loader object file
.br
-s:  Print symbol table
.br
-f:  Don't use form-feeds
.br
-h:  Data output in Hexadecimal (default)
.br
-o:  In octal
.br
-b:  In binary
.br
-u:  Do not unlink temp file
.br
-e:  Suppress ALL error messages
.sp
.IP "r-file:" 12
Name of a file containing a set of
pre-formatted tables for MAC.
.sp
.IP "source:" 12
The name of a file containing the source
to be assembled.  If not present, MAC will
read from the standard input.
.sp
.IP "object:" 12
If present, MAC assumes '-a' option, and
generates a loader format object file
with that name.
If '-a' is on and this argument is not
present, MAC uses the name "a.out".
.sp 2
.IP "examples:" 12
mac -l -s m6800 vdu.s vdu.o
.br
mac z80 t.s >dump
.br
mac -a -n 8080 copy.s
.br
.po -8
.bp
.NH 1
SOURCE LINE
.sp
.PP
[label]  [opcode [args]]  [;comment]
.sp 2
.po +8
.IP "label:" 12
A previously undefined label can tag
the start of each source line. The label
is then defined to have the value of the current
location counter.
The label must start in column one,
otherwise it is treated as an opcode.
.sp
.IP "opcode:" 12
A special operation code symbolic or a
pseudo opcode to indicate what code to
generate. If this field is not present,
the args field is not allowed.
.sp
.IP "args:" 12
This field is made up of expressions,
literals, special characters and delimiters,
with no intervening spaces or tabs. This
field adds to the information describing
the code to be generated.
This field is often called the
argument "picture".
.sp
.IP "comment:" 12
All characters after a ';' (not in a
character constant) are treated as a comment
and are ignored.  A comment
terminates scan of the source line.
.sp 2
.PP
Fields on the source line may be separated
by any number
of blanks and/or tabs.
.br
.po -8
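.PP
As a brief illustration, the lines below use only pseudo opcodes,
which are defined for every r-file;
the label name and the values are hypothetical.
.DS
buf     ds      16      ;label, opcode, args and comment
        ds      2       ;no label, so the opcode is indented
;a line may consist of a comment alone
.DE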
.bp
.NH 1
LITERALS
.PP
A literal is a label that has been defined (in
the r-file) as reserved. This label cannot be
used in expressions, as it has no value.
It cannot be set to a value.
Literals are useful only to recognise argument
pictures on instructions.
Refer to the description of the r-file
that you will be using for a list of defined
literals.
.sp 2
.NH 1
LABELS
.PP
A label is from one to eight lowercase alpha/numeric
characters. The first character must be from the set
{a-z @ _ .}
and may optionally be followed by one to seven
alpha/numerics {a-z @ _ . 0-9}.
.sp
.IP "examples:" 12
.sp
ll1     .sin     .mul
.br
@reg1   __acc    _._
.br
ret     loop1    ent27
.br
.sp 2
.NH 1
OPERATORS
.PP
MAC recognises the following operators:-
.sp
.po +12
.DS
+     addition  (binary or unary)
-     subtraction  (binary or unary)
*     multiplication
/     division
%     modulus
|     bit-wise logical or
&     bit-wise logical and
\~     exclusive or  (binary), complement  (unary)
>     right-shift
<     left-shift
.DE
.po -12
.sp 2
.PP
The unary operators plus (+), minus (-), and
complement
(\~)
may prefix any term
in an expression.  Note that only one unary
operator per term is allowed.
.bp
.NH 1
EXPRESSIONS
.PP
An expression is an unparenthesized
sequence of terms and operators.
A term is a label, a constant, or the
location counter symbol (!).
An expression must consist of at least
one term, optionally followed by
one or more operator-term pairs.
Any term may be optionally prefixed
by a unary operator.
.sp 2
.PP
EXAMPLES:
.sp 2
.DS
ll+1
entr-adc+isp/4
entr-adc+isp>2
-mask|e_bit
.sp 2
NOTE:
        Operators are applied from left to right,
        so
        2+3*4   is evaluated as (2+3)*4,
                not 2+(3*4).
.DE
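.PP
The following sketch shows how a few expressions would be
evaluated, assuming the left-to-right rule suggested by
the note above and assuming that a unary operator applies
to its own term only.
The label names are hypothetical.
.DS
t1      equ     2+3*4           ;(2+3)*4 = 20
t2      equ     16>2+1          ;(16>2)+1 = 5
t3      equ     -4+10           ;(-4)+10 = 6
t4      equ     0xf0|0x0f&0x3c  ;(0xf0|0x0f)&0x3c = 0x3c
.DE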
.bp
.NH 1
CONSTANTS
.PP
Numeric constants can be described as C-type
constants.  A constant beginning with {1-9} is
interpreted as decimal, beginning with 0 is
interpreted as octal,
beginning with 0x as hexadecimal, and
0b as binary.
.sp
.DS
    examples:-
.sp
        123     55      91234        (decimal)
        0       0666    077777       (octal)
        0xf618  0xff    0x34ac       (hex)
        0b101   0b111011011011       (binary)
.DE
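.PP
For instance, each of the following lines would give
its (hypothetical) label the same value, ten:
.DS
ten_d   equ     10          ;decimal
ten_o   equ     012         ;octal
ten_x   equ     0xa         ;hexadecimal
ten_b   equ     0b1010      ;binary
.DE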
.sp
.PP
Negative constants are obtained by combining a positive
constant with the unary negation operator (-).
.sp
.PP
Character constants are enclosed in single quotes,
and are treated as small integer constants,
with their values being made up of a concatenation
of their respective ASCII values.
The C-language escape conventions apply,
\\n => newline, \\r => carriage return,
\\f => form-feed, \\b => backspace,
\\t => tab, and \\0 => ASCII NUL.
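.PP
As a small sketch, single character constants can be
equated to labels (the label names here are hypothetical):
.DS
ch_a    equ     'a'         ;ASCII value 0x61
ch_nl   equ     '\\n'       ;newline, ASCII value 012
.DE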
.sp
.PP
Strings are enclosed in double quotes, and have no
real numeric value. They cannot be
used in expressions. The characters of the string
are assembled one per consecutive byte in memory.
Strings are only useful as title information
(See later - pseudo opcode 'title') and
for the special definition of constants
(See later - pseudo opcode 'dc').
.PP
Strings are limited to thirty characters in length,
but the listing has room to display only ten.
Any characters beyond the tenth
are not listed.
This does not affect the code generated.
.bp
.NH 1
PSEUDO OPCODES
.sp 2
.PP
A special type of opcode symbolic is the pseudo
opcode. These are always defined to the assembler,
and provide the user with the means to control
the location counters, their values, code generation,
constant and storage definition and listings.
.sp 2
.po +8
.IP "eject:" 12
If a listing is being generated,
skip to the top of a new page and
output title and header information.
.sp
[label]     eject
.sp
.IP "title:" 12
Sets the title information, and then performs the function of 'eject'.
.sp
[label]     title   "[string]"
.br
.sp
.IP "end:" 12
End of source code indicator. This will
terminate pass one, check the symbol table
for errors (undefined labels etc.) and reset
various states within the assembler in
preparation for pass two.
Any source code present after an 'end' directive
will be ignored.
.sp
[label]   end
.sp
.IP "seg:" 12
Seg selects a particular location counter to use.
MAC comes equipped with eight distinct location
counters, any of which may be selected.
Each location counter will generate
a distinct segment of code.
Location counter 0 is the default.
Segments may be interleaved within the source
code; MAC will assemble each segment separately.
Each segment exists independently of the others.
.sp
[label]   seg     expression
.sp
.IP "equ:" 12
Equ equates the label tag to the value of the argument expression.
Every label used in the expression must already be defined
when this instruction is processed.
.sp
label     equ     expression
.bp
.IP "org:" 12
Org (abbreviation for origin) is used to set the current
location counter's value.
.sp
[label]   org     expression
.sp
.IP "align:" 12
Align advances the current location counter
to the next exact multiple of the argument
expression. The counter is left unchanged
if it is already such a multiple.
.sp
[label]   align   expression
.sp
.IP "global:" 12
Global declares label2 to be a global label.
This is only useful
when an object file is being generated,
as any global label is dumped with the
code in a symbol table at the end of the object file.
.sp
[label1]  global  label2
.sp
.IP "ds:" 12
Ds reserves the number of null bytes of storage
specified by the argument expression.
.sp
[label]   ds      expression
.sp
.IP "dc:" 12
Define constant allows the definition
of a constant in memory with a value equal
to the argument expression.
The format of the constant in memory
is dependent upon the format described in
the r-file. Refer to the write-up
on the r-file that you will be using.
If the argument is a string, then each
character of the string is assembled
(its ASCII value) into consecutive bytes
of memory.
.sp
[label]   dc      expression
.br
[label]   dc      "string"
.bp
.IP "struc:" 12
Struc allows the user to create "structures"
that are really labels equated to offsets
from the start of the structure.
The general form of struc is a label, and
the storage in bytes for the label.
The label will be set to the value of the
current offset, and the offset counter
then incremented by the storage length.
This is analogous to a C
struct declaration.
If the label is omitted, then the
structure offset counter is still incremented,
but no label is defined.
.sp
[label]   struc   expression
.sp
.IP "ends:" 12
Ends equates the label tag (if given)
to the value of the structure offset
counter and then resets the counter to zero.
The next struc pseudo op will therefore
start a new structure definition.
If the label tag is present, it will be
given a value corresponding to the size of the
structure in bytes.
.sp
[label]   ends
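.sp
As a sketch (the label names are hypothetical),
a five-byte structure could be laid out as follows.
The ends line gives pt_size the size of the structure
and resets the offset counter to zero.
.sp
pt_x      struc   2     ;pt_x = 0
.br
pt_y      struc   2     ;pt_y = 2
.br
pt_flag   struc   1     ;pt_flag = 4
.br
pt_size   ends          ;pt_size = 5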
.sp 2
.po -8
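.PP
The following sketch combines several of the pseudo opcodes above.
The label names and values are hypothetical, and the
location counter values in the comments assume a target
whose location counter advances by one per byte.
.DS
bufsiz  equ     64          ;bufsiz = 64
        seg     1           ;select location counter 1
        org     0x200       ;counter 1 is now 0x200
tmp     ds      5           ;tmp = 0x200, counter = 0x205
        align   4           ;counter = 0x208
buf     ds      bufsiz      ;buf = 0x208, counter = 0x248
        seg     0           ;back to the default counter
        end
.DE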
.PP
Up to four other special define constant pseudo
opcodes can be declared in the r-file.
These may exist to define constants of unusual
length (e.g.- double word) or unusual format
(e.g.- two bytes, with the bytes swapped).
The user should consult
the description of the r-file in use, to find the
exact nature and name of each of the dc's.
Generally - they are of the form dc?, where
? is any legal alpha/numeric.
.bp
.NH 1
ERRORS
.sp 2
.PP
MAC generates three types of error messages:
non-fatal warnings, severe errors that will
cause incorrect code generation, and fatal
internal errors.
.PP
No action is taken on warnings; severe errors
cause termination of the current pass (one or two);
and fatal errors cause immediate termination of
the assembly.
The error messages appear before the
line in error.
.sp 2
.NH 2
WARNINGS
.IP "assemble overflow" 5
.br
An expression has been assembled which
is numerically too large to
fit into the assigned space in the instruction.
Check the r-file write-up as to the
exact format of the instruction in error.
.IP "Listing from std. input impossible!!" 5
.br
Since MAC re-reads the source code to generate
a listing, re-reading from the standard input
is impossible, hence no listing can be generated.
The -l option is turned off.
.IP "no end stmt" 5
.br
End-of-file has been encountered
before any end directive was seen.
.sp 2
.NH 2
SEVERE
.IP "bad argument" 5
.br
Part of an expression is not a label,
a constant or the location counter symbol '!'.
(Possible control character.)
.IP "dc not allowed" 5
.br
The special dc pseudo op (with no
identifying character) has been used
when not declared in the r-file.
Check the r-file write-up.
.IP "delimiter unexpected" 5
.br
Some delimiter has been encountered
unexpectedly in an expression.
.IP "div by zero" 5
.br
An attempt to divide by zero has been
trapped.  Check validity of the
expression(s) on the instruction argument(s).
.IP "expression required" 5
.br
An expression is required as part of the
instruction's arguments.
.bp
.IP "illegal instruction" 5
.br
An attempt has been made to assemble an
instruction that has no legal opcode value.
This means that the argument picture
used on this instruction is illegal:
this instruction cannot have this
format of arguments.
.IP "label required" 5
.br
A label is required for the global
pseudo op.
The label must appear in the argument picture,
and must not be part of an expression.
.IP "label tag required" 5
.br
A label tag starting in column one
is required here.
.IP "label undefined" 5
.br
An attempt has been made to use a label
in an expression, but it has not yet been
defined and has no value.
.IP "missing argument" 5
.br
An expression is expected in the argument field,
but none was found.
.IP "mod by zero" 5
.br
An attempt has been made to find the value of an
expression modulo zero.
.IP "multi def. label" 5
.br
The label tag on this line has been
previously defined.
.IP "negative ds" 5
.br
The result of the expression argument
to the ds pseudo instruction is negative.
MAC cannot define a negative number of
storage bytes.
.IP "negative org" 5
.br
The location counters cannot be set to
a negative value.
.IP "no such location counter" 5
.br
The argument expression to the seg
pseudo op is not in the allowable range.
(0 to 7 currently).
.IP "op not found" 5
.br
The opcode symbolic on this line is not a legal
symbolic for this r-file. Check the r-file
write-up.
.IP "syntax error" 5
.br
The argument picture for this instruction
is syntactically incorrect.
(E.g.- two adjacent operators, no delimiters
separating expressions, etc.)
.IP "title not a string" 5
.br
The argument to the title pseudo opcode
must be a string. Null strings ("")
turn the title off.
.bp
.IP "Undefined labels" 5
.br
Self explanatory.
.IP "wrong # of args" 5
.br
The instruction being assembled requires more
or fewer expressions in the argument picture
than have been given. Consult the r-file write-up.
.sp 2
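.PP
By way of illustration, each of the following lines,
taken on its own, would be expected to draw the severe error
named in its comment (the label names are hypothetical):
.DS
siz     equ     10/0        ;div by zero
rem     equ     10%0        ;mod by zero
        ds      0-4         ;negative ds
        org     0-1         ;negative org
        seg     8           ;no such location counter
        title   12          ;title not a string
.DE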
.NH 2
FATAL
.IP "buffer overflow" 5
.br
A source line contains more characters
than MAC's line buffer can hold.
.IP "Can't create a.out" 5
.br
The file a.out exists and is protected, or cannot be
created in this directory.
No object is generated.
.IP "Can't create object file" 5
.br
As for a.out.
.IP "Can't create temp" 5
.br
The temporary file for intermediate source
code cannot be created in this directory,
or one exists and is protected.
.IP "Can't open r-file" 5
.br
MAC cannot locate the named r-file.
Check that it exists in your directory,
or in /usr/lib/mac.
.IP "Can't re-open source" 5
.br
Something has happened to the source file
since it was last read. A listing cannot be
generated.
.IP "Can't re-open temp" 5
.br
This is similar to re-open source,
but causes MAC to halt.
.IP "Can't find source file" 5
.br
Input source file cannot be opened.
.IP "corrupted format descriptor <item>" 5
.br
This means that the r-file in use
has been corrupted. Report this error
as soon as practical.
.IP "Errors in pass 1." 5
.br
Self explanatory. Pass two is inhibited.
.IP "internal error mode <scan-mode>" 5
.br
MAC's internal table lookup routines
have been called in error.
Report this as soon as practical.
.bp
.IP "label <name> is undefined" 5
.br
The named label has been
referenced in the source program,
but never defined.
.IP "no core for assembly" 5
.br
The program being assembled is
so large that it cannot be assembled in the
host machine's memory.
(Split the program into smaller pieces,
and assemble each independently.)
.IP "Pass 1 non-existant action" 5
.br
MAC's internal pass one parser table
or MAC itself is bad. Report this as soon as is practical.
.IP "Symbol table overflow." 5
.br
Too many labels are being defined.
MAC cannot get enough of the host machine's
memory to define them all.
.IP "Usage: <name> opcode-file [source] [object]" 5
.br
Incorrect parameters on the call.
Usually r-file missing.
.bp
.NH 1
LISTING FORMAT
.sp
.PP
loc.-counter   code   line-#   source
.sp
.PP
The location counter field displays
the value of the location counter before
the instruction on that line is assembled.
.PP
The code field is the actual assembled code
from the source instruction on the same line.
When using 'dc' to assemble strings, any
more than ten characters per string will
cause MAC to truncate the listing of the string
to ten characters, and not list the remaining
characters of the string. This does not affect
the code being generated in an object file.
For listing purposes, it is best to define long strings as
several short strings.
.PP
The line number is the source line number
and is useful for locating lines in
error.
.PP
The source field is the actual source code as
seen by MAC.
.sp 3
.NH 1
DUMP FORMAT
.sp
.PP
Code is dumped to the standard output
with the segment number, start address and
segment length.
The code is then dumped in the selected format
(hexadecimal by default, octal with -o, or binary with -b).
.sp 3
.NH 1
OBJECT FILE FORMAT
.sp 1
.PP
The format of object files is generally unimportant,
as several linkage editors or loaders exist for
the various machines that MAC can currently assemble
code for.
The user is therefore advised to consult the
manual concerning the particular
loader that he or she will use.
.bp