4.2BSD/usr/lib/lisp/manual/ch12.r
CHAPTER 12
Liszt - the lisp compiler
12.1. General strategy of the compiler
The purpose of the lisp compiler, Liszt, is to
create an object module which when brought into the
lisp system using _f_a_s_l will have the same effect as
bringing in the corresponding lisp coded source module
with _l_o_a_d with one important exception, functions will
be defined as sequences of machine language instruc-
tions, instead of lisp S-expressions. Liszt is not a
function compiler, it is a _f_i_l_e compiler. Such a file
can contain more than function definitions; it can
contain other lisp S-expressions which are evaluated
at load time. These other S-expressions will also be
stored in the object module produced by Liszt and will
be evaluated at fasl time.
As is almost universally true of Lisp compilers,
the main pass of Liszt is written in Lisp. A subse-
quent pass is the assembler, for which we use the
standard UNIX assembler.
12.2. Running the compiler
The compiler is normally run in this manner:
% liszt foo
will compile the file foo.l or foo (the preferred way
to indicate a lisp source file is to end the file name
with `.l'). The result of the compilation will be
placed in the file foo.o if no fatal errors were
detected. All messages which Liszt generates go to
the standard output. Normally each function name is
printed before it is compiled (the -q option
suppresses this).
12.3. Special forms
Liszt makes one pass over the source file. It
processes each form in this way:
9
9Liszt - the lisp compiler 12-1
Liszt - the lisp compiler 12-2
12.3.1. macro expansion
If the form is a macro invocation (i.e it is a
list whose car is a symbol whose function binding
is a macro), then that macro invocation is
expanded. This is repeated until the top level
form is not a macro invocation. When Liszt begins,
there are already some macros defined, in fact some
functions (such as defun) are actually macros. The
user may define his own macros as well. For a
macro to be used it must be defined in the Lisp
system in which Liszt runs.
12.3.2. classification
After all macro expansion is done, the form is
classified according to its _c_a_r (if the form is not
a list, then it is classified as an _o_t_h_e_r).
12.3.2.1. eval-when
The form of eval-when is (_e_v_a_l-
_w_h_e_n (_t_i_m_e_1 _t_i_m_e_2 ...) _f_o_r_m_1 _f_o_r_m_2 ...) where
the time_i are one of _e_v_a_l, _c_o_m_p_i_l_e, or _l_o_a_d.
The compiler examines the form_i in sequence and
the action taken depends on what is in the time
list. If _c_o_m_p_i_l_e is in the list then the com-
piler will invoke _e_v_a_l on each form_i as it exam-
ines it. If _l_o_a_d is in the list then the com-
pile will recursively call itself to compile
each form_i as it examines it. Note that if _c_o_m_-
_p_i_l_e and _l_o_a_d are in the time list, then the
compiler will both evaluate and compile each
form. This is useful if you need a function to
be defined in the compiler at both compile time
(perhaps to aid macro expansion) and at run time
(after the file is _f_a_s_led in).
12.3.2.2. declare
Declare is used to provide information
about functions and variables to the compiler.
It is (almost) equivalent to (_e_v_a_l-
_w_h_e_n (_c_o_m_p_i_l_e) ...). You may declare functions
to be one of three types: lambda (*expr),
nlambda (*fexpr), lexpr (*lexpr). The names in
parenthesis are the Maclisp names and are
Printed: July 25, 1983
Liszt - the lisp compiler 12-3
accepted by the compiler as well (and not just
when the compiler is in Maclisp mode). Func-
tions are assumed to be lambdas until they are
declared otherwise or are defined differently.
The compiler treats calls to lambdas and lexprs
equivalently, so you needn't worry about declar-
ing lexprs either. It is important to declare
nlambdas or define them before calling them.
Another attribute you can declare for a function
is localf which makes the function `local'. A
local function's name is known only to the func-
tions defined within the file itself. The
advantage of a local function is that is can be
entered and exited very quickly and it can have
the same name as a function in another file and
there will be no name conflict.
Variables may be declared special or unspe-
cial. When a special variable is lambda bound
(either in a lambda, prog or do expression), its
old value is stored away on a stack for the
duration of the lambda, prog or do expression.
This takes time and is often not necessary.
Therefore the default classification for vari-
ables is unspecial. Space for unspecial vari-
ables is dynamically allocated on a stack. An
unspecial variable can only be accessed from
within the function where it is created by its
presence in a lambda, prog or do expression
variable list. It is possible to declare that
all variables are special as will be shown
below.
You may declare any number of things in
each declare statement. A sample declaration is
(_d_e_c_l_a_r_e
(_l_a_m_b_d_a _f_u_n_c_1 _f_u_n_c_2)
(*_f_e_x_p_r _f_u_n_c_3)
(*_l_e_x_p_r _f_u_n_c_4)
(_l_o_c_a_l_f _f_u_n_c_5)
(_s_p_e_c_i_a_l _v_a_r_1 _v_a_r_2 _v_a_r_3)
(_u_n_s_p_e_c_i_a_l _v_a_r_4))
You may also declare all variables to be
special with (_d_e_c_l_a_r_e (_s_p_e_c_i_a_l_s _t)). You may
declare that macro definitions should be com-
piled as well as evaluated at compile time by
(_d_e_c_l_a_r_e (_m_a_c_r_o_s _t)). In fact, as was mentioned
above, declare is much like (_e_v_a_l-
_w_h_e_n (_c_o_m_p_i_l_e) ...). Thus if the compiler sees
(_d_e_c_l_a_r_e (_f_o_o _b_a_r)) and foo is defined, then it
will evaluate (_f_o_o _b_a_r). If foo is not defined
then an undefined declare attribute warning will
Printed: July 25, 1983
Liszt - the lisp compiler 12-4
be issued.
12.3.2.3. (progn 'compile form1 form2 ... formn)
When the compiler sees this it simply com-
piles form1 through formn as if they too were
seen at top level. One use for this is to allow
a macro at top-level to expand into more than
one function definition for the compiler to com-
pile.
12.3.2.4. include/includef
_I_n_c_l_u_d_e and _i_n_c_l_u_d_e_f cause another file to
be read and compiled by the compiler. The
result is the same as if the included file were
textually inserted into the original file. The
only difference between _i_n_c_l_u_d_e and _i_n_c_l_u_d_e_f is
that include doesn't evaluate its argument and
includef does. Nested includes are allowed.
12.3.2.5. def
A def form is used to define a function.
The macros _d_e_f_u_n and _d_e_f_m_a_c_r_o expand to a def
form. If the function being defined is a
lambda, nlambda or lexpr then the compiler con-
verts the lisp definition to a sequence of
machine language instructions. If the function
being defined is a macro, then the compiler will
evaluate the definition, thus defining the macro
withing the running Lisp compiler. Furthermore,
if the variable _m_a_c_r_o_s is set to a non nil
value, then the macro definition will also be
translated to machine language and thus will be
defined when the object file is fasled in. The
variable _m_a_c_r_o_s is set to t by
(_d_e_c_l_a_r_e (_m_a_c_r_o_s _t)).
When a function or macro definition is com-
piled, macro expansion is done whenever possi-
ble. If the compiler can determine that a form
would be evaluated if this function were inter-
preted then it will macro expand it. It will
not macro expand arguments to a nlambda unless
the characteristics of the nlambda is known (as
is the case with _c_o_n_d). The map functions (
Printed: July 25, 1983
Liszt - the lisp compiler 12-5
_m_a_p, _m_a_p_c, _m_a_p_c_a_r, and so on) are expanded to a
_d_o statement. This allows the first argument to
the map function to be a lambda expression which
references local variables of the function being
defined.
12.3.2.6. other forms
All other forms are simply stored in the
object file and are evaluated when the file is
_f_a_s_led in.
12.4. Using the compiler
The previous section describes exactly what the
compiler does with its input. Generally you won't
have to worry about all that detail as files which
work interpreted will work compiled. Following is a
list of steps you should follow to insure that a file
will compile correctly.
[1] Make sure all macro definitions precede their use
in functions or other macro definitions. If you
want the macros to be around when you _f_a_s_l in the
object file you should include this statement at
the beginning of the file: (_d_e_c_l_a_r_e (_m_a_c_r_o_s _t))
[2] Make sure all nlambdas are defined or declared
before they are used. If the compiler comes
across a call to a function which has not been
defined in the current file, which does not
currently have a function binding, and whose type
has not been declared then it will assume that
the function needs its arguments evaluated (i.e.
it is a lambda or lexpr) and will generate code
accordingly. This means that you do not have to
declare nlambda functions like _s_t_a_t_u_s since they
have an nlambda function binding.
[3] Locate all variables which are used for communi-
cating values between functions. These variables
must be declared special at the beginning of a
file. In most cases there won't be many special
declarations but if you fail to declare a vari-
able special that should be, the compiled code
could fail in mysterious ways. Let's look at a
common problem, assume that a file contains just
these three lines:
9
9 Printed: July 25, 1983
Liszt - the lisp compiler 12-6
(_d_e_f _a_a_a (_l_a_m_b_d_a (_g_l_o_b _l_o_c) (_b_b_b _l_o_c)))
(_d_e_f _b_b_b (_l_a_m_b_d_a (_m_y_l_o_c) (_a_d_d _g_l_o_b _m_y_l_o_c)))
(_d_e_f _c_c_c (_l_a_m_b_d_a (_g_l_o_b _l_o_c) (_b_b_b _l_o_c)))
We can see that if we load in these two defini-
tions then (aaa 3 4) is the same as (add 3 4) and
will give us 7. Suppose we compile the file con-
taining these definitions. When Liszt compiles
aaa, it will assume that both glob and loc are
local variables and will allocate space on the
temporary stack for their values when aaa is
called. Thus the values of the local variables
glob and loc will not affect the values of the
symbols glob and loc in the Lisp system. Now
Liszt moves on to function bbb. Myloc is assumed
to be local. When it sees the add statement, it
find a reference to a variable called glob. This
variable is not a local variable to this function
and therefore glob must refer to the value of the
symbol glob. Liszt will automatically declare
glob to be special and it will print a warning to
that effect. Thus subsequent uses of glob will
always refer to the symbol glob. Next Liszt com-
piles ccc and treats glob as a special and loc as
a local. When the object file is _f_a_s_l'ed in, and
(ccc 3 4) is evaluated, the symbol glob will be
lambda bound to 3 bbb will be called and will
return 7. However (aaa 3 4) will fail since when
bbb is called, glob will be unbound. What should
be done here is to put (_d_e_c_l_a_r_e (_s_p_e_c_i_a_l _g_l_o_b) at
the beginning of the file.
[4] Make sure that all calls to _a_r_g are within the
lexpr whose arguments they reference. If _f_o_o is
a compiled lexpr and it calls _b_a_r then _b_a_r cannot
use _a_r_g to get at _f_o_o's arguments. If both _f_o_o
and _b_a_r are interpreted this will work however.
The macro _l_i_s_t_i_f_y can be used to put all of some
of a lexprs arguments in a list which then can be
passed to other functions.
12.5. Compiler options
The compiler recognizes a number of options which
are described below. The options are typed anywhere
on the command line preceded by a minus sign. The
entire command line is scanned and all options
recorded before any action is taken. Thus
% liszt -mx foo
% liszt -m -x foo
Printed: July 25, 1983
Liszt - the lisp compiler 12-7
% liszt foo -mx
are all equivalent. Before scanning the command line
for options, liszt looks for in the environment for
the variable LISZT, and if found scans its value as if
it was a string of options. The meaning of the
options are:
C The assembler language output of the compiler is
commented. This is useful when debugging the
compiler and is not normally done since it slows
down compilation.
I The next command line argument is taken as a
filename, and loaded prior to compilation.
e Evaluate the next argument on the command line
before starting compilation. For example
% liszt -e '(setq foobar "foo string")' foo
will evaluate the above s-expression. Note that
the shell requires that the arguments be sur-
rounded by single quotes.
i Compile this program in interlisp compatibility
mode. This is not implemented yet.
m Compile this program in Maclisp mode. The reader
syntax will be changed to the Maclisp syntax and
a file of macro definitions will be loaded in
(usually named /usr/lib/lisp/machacks). This
switch brings us sufficiently close to Maclisp to
allow us to compile Macsyma, a large Maclisp pro-
gram. However Maclisp is a moving target and we
can't guarantee that this switch will allow you
to compile any given program.
o Select a different object or assembler language
file name. For example
% liszt foo -o xxx.o
will compile foo and into xxx.o instead of the
default foo.o, and
% liszt bar -S -o xxx.s
will compile to assembler language into xxx.s
instead of bar.s.
p place profiling code at the beginning of each
non-local function. If the lisp system is also
created with profiling in it, this allows func-
tion calling frequency to be determined (see
_p_r_o_f(_1))
q Run in quiet mode. The names of functions being
compiled and various "Note"'s are not printed.
9
9 Printed: July 25, 1983
Liszt - the lisp compiler 12-8
Q print compilation statistics and warn of strange
constructs. This is the inverse of the q switch
and is the default.
r place bootstrap code at the beginning of the
object file, which when the object file is exe-
cuted will cause a lisp system to be invoked and
the object file _f_a_s_led in. This is known as
`autorun' and is described below.
S Create an assembler language file only.
% liszt -S foo
will create the file assembler language file
foo.s and will not attempt to assemble it. If
this option is not specified, the assembler
language file will be put in the temporary disk
area under a automatically generated name based
on the lisp compiler's process id. Then if there
are no compilation errors, the assembler will be
invoked to assemble the file.
T Print the assembler language output on the stan-
dard output file. This is useful when debugging
the compiler.
u Run in UCI-Lisp mode. The character syntax is
changed to that of UCI-Lisp and a UCI-Lisp compa-
tibility package of macros is read in.
w Suppress warning messages.
x Create an cross reference file.
% liszt -x foo
not only compiles foo into foo.o but also gen-
erates the file foo.x . The file foo.x is lisp
readable and lists for each function all func-
tions which that function could call. The pro-
gram lxref reads one or more of these ".x" files
and produces a human readable cross reference
listing.
12.6. autorun
The object file which liszt writes does not con-
tain all the functions necessary to run the lisp pro-
gram which was compiled. In order to use the object
file, a lisp system must be started and the object
file _f_a_s_led in. When the -r switch is given to liszt,
the object file created will contain a small piece of
bootstrap code at the beginning, and the object file
will be made executable. Now, when the name of the
Printed: July 25, 1983
Liszt - the lisp compiler 12-9
object file is given to the UNIX command interpreter
(shell) to run, the bootstrap code at the beginning of
the object file will cause a lisp system to be started
and the first action the lisp system will take is to
_f_a_s_l in the object file which started it. In effect
the object file has created an environment in which it
can run.
Autorun is an alternative to _d_u_m_p_l_i_s_p. The
advantage of autorun is that the object file which
starts the whole process is typically small, whereas
the minimum _d_u_m_p_l_i_s_ped file is very large (one half
megabyte). The disadvantage of autorun is that the
file must be _f_a_s_led into a lisp each time it is used
whereas the file which _d_u_m_p_l_i_s_p creates can be run as
is. liszt itself is a _d_u_m_p_l_i_s_ped file since it is
used so often and is large enough that too much time
would be wasted _f_a_s_ling it in each time it was used.
The lisp cross reference program, lxref, uses _a_u_t_o_r_u_n
since it is a small and rarely used program.
In order to have the program _f_a_s_led in begin exe-
cution (rather than starting a lisp top level), the
value of the symbol user-top-level should be set to
the name of the function to get control. An example
of this is shown next.
9
9 Printed: July 25, 1983
Liszt - the lisp compiler 12-10
____________________________________________________
_w_e _w_a_n_t _t_o _r_e_p_l_a_c_e _t_h_e _u_n_i_x _d_a_t_e _p_r_o_g_r_a_m _w_i_t_h _o_n_e _w_r_i_t_t_e_n _i_n _l_i_s_p.
% cat lispdate.l
(defun mydate nil
(patom "The date is ")
(patom (status ctime))
(terpr)
(exit 0))
(setq user-top-level 'mydate)
% liszt -r lispdate
Compilation begins with Lisp Compiler 5.2
source: lispdate.l, result: lispdate.o
mydate
%Note: lispdate.l: Compilation complete
%Note: lispdate.l: Time: Real: 0:3, CPU: 0:0.28, GC: 0:0.00 for 0 gcs
%Note: lispdate.l: Assembly begins
%Note: lispdate.l: Assembly completed successfully
3.0u 2.0s 0:17 29%
_W_e _c_h_a_n_g_e _t_h_e _n_a_m_e _t_o _r_e_m_o_v_e _t_h_e "._o", (_t_h_i_s _i_s_n'_t _n_e_c_e_s_s_a_r_y)
% mv lispdate.o lispdate
_N_o_w _w_e _t_e_s_t _i_t _o_u_t
% lispdate
The date is Sat Aug 1 16:58:33 1981
%
____________________________________________________
12.7. pure literals
Normally the quoted lisp objects (literals) which
appear in functions are treated as constants. Consider
this function:
(_d_e_f _f_o_o
(_l_a_m_b_d_a _n_i_l (_c_o_n_d ((_n_o_t (_e_q '_a (_c_a_r (_s_e_t_q _x '(_a
_b)))))
(_p_r_i_n_t '_i_m_p_o_s_s_i_b_l_e!!))
(_t (_r_p_l_a_c_a _x '_d)))))
At first glance it seems that the first cond clause
will never be true, since the _c_a_r of (_a _b) should
always be _a. However if you run this function twice,
it will print 'impossible!!' the second time. This is
Printed: July 25, 1983
Liszt - the lisp compiler 12-11
because the following clause modifies the 'constant'
list (_a _b) with the _r_p_l_a_c_a function. Such modifica-
tion of literal lisp objects can cause programs to
behave strangely as the above example shows, but more
importantly it can cause garbage collection problems
if done to compiled code. When a file is _f_a_s_led in,
if the symbol $purcopylits is non nil, the literal
lisp data is put in 'pure' space, that is it put in
space which needn't be looked at by the garabage col-
lector. This reduces the work the garbage collector
must do but it is dangerous since if the literals are
modified to point to non pure objects, the marker may
not mark the non pure objects. If the symbol $pur-
copylits is nil then the literal lisp data is put in
impure space and the compiled code will act like the
interpreted code when literal data is modified. The
default value for $purcopylits is t.
12.8. transfer tables
A transfer table is setup by _f_a_s_l when the object
file is loaded in. There is one entry in the transfer
table for each function which is called in that object
file. The entry for a call to the function _f_o_o has
two parts whose contents are:
[1] function address - This will initially point to
the internal function _q_l_i_n_k_e_r. It may some time
in the future point to the function _f_o_o if cer-
tain conditions are satisfied (more on this
below).
[2] function name - This is a pointer to the symbol
_f_o_o. This will be used by _q_l_i_n_k_e_r.
When a call is made to the function _f_o_o the call will
actually be made to the address in the transfer table
entry and will end up in the _q_l_i_n_k_e_r function.
_Q_l_i_n_k_e_r will determine that _f_o_o was the function being
called by locating the function name entry in the
transfer table[]. If the function being called is not
compiled then _q_l_i_n_k_e_r just calls _f_u_n_c_a_l_l to perform
____________________
9 []_Q_l_i_n_k_e_r does this by tracing back the call stack until
it finds the _c_a_l_l_s machine instruction which called it. The
address field of the _c_a_l_l_s contains the address of the
transfer table entry.
9 Printed: July 25, 1983
Liszt - the lisp compiler 12-12
the function call. If _f_o_o is compiled and if
(_s_t_a_t_u_s _t_r_a_n_s_l_i_n_k) is non nil, then _q_l_i_n_k_e_r will
modify the function address part of the transfer table
to point directly to the function _f_o_o. Finally
_q_l_i_n_k_e_r will call _f_o_o directly . The next time a call
is made to _f_o_o the call will go directly to _f_o_o and
not through _q_l_i_n_k_e_r. This will result in a substan-
tial speedup in compiled code to compiled code
transfers. A disadvantage is that no debugging infor-
mation is left on the stack, so _s_h_o_w_s_t_a_c_k and _b_a_k_t_r_a_c_e
are useless. Another disadvantage is that if you
redefine a compiled function either through loading in
a new version or interactively defining it, then the
old version may still be called from compiled code if
the fast linking described above has already been
done. The solution to these problems is to use
(_s_s_t_a_t_u_s _t_r_a_n_s_l_i_n_k _v_a_l_u_e). If value is
_n_i_l All transfer tables will be cleared, i.e. all
function addresses will be set to point to
_q_l_i_n_k_e_r. This means that the next time a func-
tion is called _q_l_i_n_k_e_r will be called and will
look at the current definition. Also, no fast
links will be set up since (_s_t_a_t_u_s _t_r_a_n_s_l_i_n_k)
will be nil. The end result is that _s_h_o_w_s_t_a_c_k
and _b_a_k_t_r_a_c_e will work and the function defini-
tion at the time of call will always be used.
_o_n This causes the lisp system to go through all
transfer tables and set up fast links wherever
possible. This is normally used after you have
_f_a_s_led in all of your files. Furthermore since
(_s_t_a_t_u_s _t_r_a_n_s_l_i_n_k) is not nil, _q_l_i_n_k_e_r will make
new fast links if the situation arises (which
isn't likely unless you _f_a_s_l in another file).
_t This or any other value not previously mentioned
will just make (_s_t_a_t_u_s _t_r_a_n_s_l_i_n_k) be non nil, and
as a result fast links will be made by _q_l_i_n_k_e_r
if the called function is compiled.
12.9. Fixnum functions
The compiler will generate inline arithmetic code
for fixnum only functions. Such functions include +,
-, *, /, \, 1+ and 1-. The code generated will be
much faster than using _a_d_d, _d_i_f_f_e_r_e_n_c_e, etc. However
it will only work if the arguments to and results of
the functions are fixnums. No type checking is done.
9
9 Printed: July 25, 1983