4.1cBSD/usr/src/ucb/lisp/lisplib/manual/ch12.r








                        CHAPTER  12


                 Liszt - the lisp compiler







12.1.  General strategy of the compiler

     The purpose of the lisp compiler, Liszt, is  to  create
an  object  module  which  when brought into the lisp system
using _f_a_s_l will have the same  effect  as  bringing  in  the
corresponding  lisp  coded  source module with _l_o_a_d with one
important exception, functions will be defined as  sequences
of   machine  language  instructions,  instead  of  lisp  S-
expressions.  Liszt is not a function compiler, it is a _f_i_l_e
compiler.   Such  a  file    can  contain more than function
definitions; it can contain other lisp  S-expressions  which
are  evaluated at load time.  These other S-expressions will
also be stored in the object module produced  by  Liszt  and
will be evaluated at fasl time.

     As is almost universally true of  Lisp  compilers,  the
main pass of Liszt is written in Lisp.  A subsequent pass is
the assembler, for which we use the standard UNIX assembler.




12.2.  Running the compiler

     The compiler is normally run in this manner:
% liszt foo
will compile the file foo.l or foo  (the  preferred  way  to
indicate  a  lisp  source  file is to end the file name with
`.l').  The result of the compilation will be placed in  the
file  foo.o  if no fatal errors were detected.  All messages
which Liszt generates go to the standard  output.   Normally
each  function name is printed before it is compiled (the -q
option suppresses this).




12.3.  Special forms

     Liszt makes one pass over the source file. It processes
each form in this way:
9

9Liszt - the lisp compiler                               12-1







Liszt - the lisp compiler                               12-2


12.3.1.  macro expansion

     If the form is a macro invocation (i.e  it  is  a  list
whose  car  is  a symbol whose function binding is a macro),
then that macro invocation is expanded.   This  is  repeated
until  the  top  level form is not a macro invocation.  When
Liszt begins, there are already some macros defined, in fact
some  functions  (such  as  defun) are actually macros.  The
user may define his own macros as well.  For a macro  to  be
used  it  must  be defined in the Lisp system in which Liszt
runs.




12.3.2.  classification

     After all macro expansion is done, the form is  classi-
fied  according  to its _c_a_r (if the form is not a list, then
it is classified as an _o_t_h_e_r).




12.3.2.1.  eval-when

     The     form      of      eval-when      is      (_e_v_a_l-
_w_h_e_n (_t_i_m_e_1 _t_i_m_e_2 ...) _f_o_r_m_1 _f_o_r_m_2 ...)  where the time_i are
one of _e_v_a_l, _c_o_m_p_i_l_e, or _l_o_a_d.  The  compiler  examines  the
form_i in sequence and the action taken depends on what is in
the time list.  If _c_o_m_p_i_l_e is in the list then the  compiler
will  invoke  _e_v_a_l on each form_i as it examines it.  If _l_o_a_d
is in the list then the compile will recursively call itself
to  compile each form_i as it examines it.  Note that if _c_o_m_-
_p_i_l_e and _l_o_a_d are in the time list, then the  compiler  will
both  evaluate and compile each form.  This is useful if you
need a function to be defined in the compiler at  both  com-
pile  time  (perhaps to aid macro expansion) and at run time
(after the file is _f_a_s_led in).




12.3.2.2.  declare

     Declare is used to provide information about  functions
and  variables to the compiler. It is (almost) equivalent to
(_e_v_a_l-_w_h_e_n (_c_o_m_p_i_l_e) ...).  You may declare functions to  be
one  of three types: lambda (*expr), nlambda (*fexpr), lexpr
(*lexpr).  The names in parenthesis are  the  Maclisp  names
and  are accepted by the compiler as well (and not just when
the compiler is in Maclisp mode).  Functions are assumed  to
be  lambdas until they are declared otherwise or are defined


                                     Printed: March 23, 1982







Liszt - the lisp compiler                               12-3


differently. The compiler treats calls to lambdas and lexprs
equivalently,  so  you  needn't worry about declaring lexprs
either. It is important to declare nlambdas or  define  them
before  calling them.  Another attribute you can declare for
a function is localf which makes the  function  `local'.   A
local function's name is known only to the functions defined
within the file itself.  The advantage of a  local  function
is that is can be entered and exited very quickly and it can
have the same name as a function in another file  and  there
will be no name conflict.

     Variables may be declared special or unspecial.  When a
special  variable  is lambda bound (either in a lambda, prog
or do expression), its old value is stored away on  a  stack
for the duration of the lambda, prog or do expression.  This
takes time  and  is  often  not  necessary.   Therefore  the
default  classification  for  variables is unspecial.  Space
for unspecial variables is dynamically allocated on a stack.
An  unspecial  variable can only be accessed from within the
function where it is created by its presence  in  a  lambda,
prog  or  do  expression  variable  list.  It is possible to
declare that all variables are  special  as  will  be  shown
below.

     You may declare any number of things  in  each  declare
statement.  A sample declaration is
(_d_e_c_l_a_r_e
     (_l_a_m_b_d_a _f_u_n_c_1 _f_u_n_c_2)
     (*_f_e_x_p_r _f_u_n_c_3)
     (*_l_e_x_p_r _f_u_n_c_4)
     (_l_o_c_a_l_f _f_u_n_c_5)
     (_s_p_e_c_i_a_l _v_a_r_1 _v_a_r_2 _v_a_r_3)
     (_u_n_s_p_e_c_i_a_l _v_a_r_4))

     You may also declare all variables to be  special  with
(_d_e_c_l_a_r_e (_s_p_e_c_i_a_l_s _t)).   You may declare that macro defini-
tions should be compiled as well  as  evaluated  at  compile
time  by  (_d_e_c_l_a_r_e (_m_a_c_r_o_s _t)).   In  fact, as was mentioned
above, declare is much like (_e_v_a_l-_w_h_e_n (_c_o_m_p_i_l_e) ...).  Thus
if the compiler sees (_d_e_c_l_a_r_e (_f_o_o _b_a_r)) and foo is defined,
then it will evaluate (_f_o_o _b_a_r).  If foo is not defined then
an undefined declare attribute warning will be issued.




12.3.2.3.  (progn 'compile form1 form2 ... formn)

     When the compiler sees this it  simply  compiles  form1
through  formn  as  if they too were seen at top level.  One
use for this is to allow a macro at top-level to expand into
more  than  one function definition for the compiler to com-
pile.


                                     Printed: March 23, 1982







Liszt - the lisp compiler                               12-4


12.3.2.4.  include/includef

     _I_n_c_l_u_d_e and _i_n_c_l_u_d_e_f cause another file to be read  and
compiled  by the compiler.  The result is the same as if the
included file were  textually  inserted  into  the  original
file.   The  only difference between _i_n_c_l_u_d_e and _i_n_c_l_u_d_e_f is
that include doesn't  evaluate  its  argument  and  includef
does.  Nested includes are allowed.




12.3.2.5.  def

     A def form is used to define a  function.   The  macros
_d_e_f_u_n  and  _d_e_f_m_a_c_r_o  expand to a def form.  If the function
being defined is a lambda, nlambda or lexpr  then  the  com-
piler  converts the lisp definition to a sequence of machine
language instructions.  If the function being defined  is  a
macro,  then the compiler will evaluate the definition, thus
defining  the  macro  withing  the  running  Lisp  compiler.
Furthermore,  if  the  variable  _m_a_c_r_o_s  is set to a non nil
value, then the macro definition will also be translated  to
machine  language  and  thus will be defined when the object
file is fasled in.  The variable  _m_a_c_r_o_s  is  set  to  t  by
(_d_e_c_l_a_r_e (_m_a_c_r_o_s _t)).

     When a function or macro definition is compiled,  macro
expansion  is  done  whenever possible.  If the compiler can
determine that a form would be evaluated  if  this  function
were  interpreted then it will macro expand it.  It will not
macro expand arguments to a nlambda unless the  characteris-
tics  of  the  nlambda  is known (as is the case with _c_o_n_d).
The map functions (  _m_a_p,  _m_a_p_c,  _m_a_p_c_a_r,  and  so  on)  are
expanded  to a _d_o statement.  This allows the first argument
to the map function to be a lambda expression  which  refer-
ences local variables of the function being defined.




12.3.2.6.  other forms

     All other forms are simply stored in  the  object  file
and are evaluated when the file is _f_a_s_led in.




12.4.  Using the compiler

     The previous section describes exactly  what  the  com-
piler  does  with  its  input.   Generally you won't have to


                                     Printed: March 23, 1982







Liszt - the lisp compiler                               12-5


worry about all that detail as files which work  interpreted
will work compiled.  Following is a list of steps you should
follow to insure that a file will compile correctly.

[1]  Make sure all macro definitions precede  their  use  in
     functions  or other macro definitions.  If you want the
     macros to be around when you _f_a_s_l in  the  object  file
     you  should  include this statement at the beginning of
     the file: (_d_e_c_l_a_r_e (_m_a_c_r_o_s _t))

[2]  Make sure all nlambdas are defined or  declared  before
     they  are used.  If the compiler comes across a call to
     a function which has not been defined  in  the  current
     file, which does not currently have a function binding,
     and whose type  has not  been  declared  then  it  will
     assume that the function needs  its arguments evaluated
     (i.e. it is a lambda or lexpr) and will  generate  code
     accordingly.   This  means  that  you  do  not  have to
     declare nlambda functions like _s_t_a_t_u_s since  they  have
     an nlambda function binding.

[3]  Locate all variables which are used  for  communicating
     values  between  functions.   These  variables  must be
     declared special at the beginning of a file.   In  most
     cases  there  won't be many special declarations but if
     you fail to declare a variable special that should  be,
     the compiled code could fail in mysterious ways.  Let's
     look at a common problem, assume that a  file  contains
     just these three lines:


     (_d_e_f _a_a_a (_l_a_m_b_d_a (_g_l_o_b _l_o_c) (_b_b_b _l_o_c)))
     (_d_e_f _b_b_b (_l_a_m_b_d_a (_m_y_l_o_c) (_a_d_d _g_l_o_b _m_y_l_o_c)))
     (_d_e_f _c_c_c (_l_a_m_b_d_a (_g_l_o_b _l_o_c) (_b_b_b _l_o_c)))


     We can see that if we load  in  these  two  definitions
     then  (aaa  3 4) is the same as (add 3 4) and will give
     us 7.  Suppose we compile  the  file  containing  these
     definitions.   When  Liszt compiles aaa, it will assume
     that both glob and loc are  local  variables  and  will
     allocate  space on the temporary stack for their values
     when aaa is called.  Thus the values of the local vari-
     ables  glob  and  loc will not affect the values of the
     symbols glob and loc in the  Lisp  system.   Now  Liszt
     moves  on  to  function  bbb.   Myloc  is assumed to be
     local.  When it sees  the  add  statement,  it  find  a
     reference  to a variable called glob.  This variable is
     not a local variable to  this  function  and  therefore
     glob must refer to the value of the symbol glob.  Liszt
     will automatically declare glob to be  special  and  it
     will  print  a warning to that effect.  Thus subsequent
     uses of glob will always  refer  to  the  symbol  glob.


                                     Printed: March 23, 1982







Liszt - the lisp compiler                               12-6


     Next  Liszt  compiles  ccc and treats glob as a special
     and loc as a local.  When the object  file  is  _f_a_s_l'ed
     in, and (ccc 3 4) is evaluated, the symbol glob will be
     lambda bound to 3 bbb will be called and will return 7.
     However  (aaa  3 4) will fail since when bbb is called,
     glob will be unbound.  What should be done here  is  to
     put  (_d_e_c_l_a_r_e (_s_p_e_c_i_a_l _g_l_o_b)  at  the  beginning of the
     file.

[4]  Make sure that all calls to _a_r_g are  within  the  lexpr
     whose  arguments  they reference.  If _f_o_o is a compiled
     lexpr and it calls _b_a_r then _b_a_r cannot use _a_r_g  to  get
     at  _f_o_o's  arguments.   If  both _f_o_o and _b_a_r are inter-
     preted this will work however.  The macro  _l_i_s_t_i_f_y  can
     be  used  to put all of some of a lexprs arguments in a
     list which then can be passed to other functions.




12.5.  Compiler options

     The compiler recognizes a number of options  which  are
described below.  The options are typed anywhere on the com-
mand line preceded by a minus sign.  The entire command line
is  scanned  and  all  options recorded before any action is
taken.  Thus
% liszt -mx foo
% liszt -m -x foo
% liszt foo -mx
are all equivalent. Before scanning  the  command  line  for
options, liszt looks for in the environment for the variable
LISZT, and if found scans its value as if it was a string of
options.  The meaning of the options are:

C    The assembler language output of the compiler  is  com-
     mented.  This is useful when debugging the compiler and
     is not normally done since it slows down compilation.

i    Compile this program in interlisp  compatibility  mode.
     This is not implemented yet.

m    Compile this program in Maclisp mode.  The reader  syn-
     tax will be changed to the Maclisp syntax and a file of
     macro definitions will  be  loaded  in  (usually  named
     /usr/lib/lisp/machacks).   This switch brings us suffi-
     ciently  close  to  Maclisp  to  allow  us  to  compile
     Macsyma, a large Maclisp program.  However Maclisp is a
     moving target and we can't guarantee that  this  switch
     will allow you to compile any given program.

o    Select a different object file name.  This is the  only
     switch  with an argument, which must follow the switch.


                                     Printed: March 23, 1982







Liszt - the lisp compiler                               12-7


     For example
     % liszt foo -o xxx.o
     will compile foo and into xxx.o instead of the  default
     foo.o.

p    place profiling code at the beginning of each non-local
     function.  If the lisp system is also created with pro-
     filing in it, this allows function calling frequency to
     be determined (see _p_r_o_f(_1))

q    Run in quiet mode. The names of  functions  being  com-
     piled and various "Note"'s are not printed.

Q    print compilation statistics and warn of  strange  con-
     structs. This is the inverse of the q switch and is the
     default.

r    place bootstrap code at the  beginning  of  the  object
     file, which when the object file is executed will cause
     a lisp system to be invoked and the object file  _f_a_s_led
     in. This is known as `autorun' and is described below.

S    Create an assembler language file only.
     % liszt -S foo
     will create the file assembler language file foo.s  and
     will not attempt to assemble it.  If this option is not
     specified, the assembler language file will be  put  in
     the temporary disk area under a automatically generated
     name based on the lisp compiler's process id.  Then  if
     there  are no compilation errors, the assembler will be
     invoked to assemble the file.

T    Print the assembler language  output  on  the  standard
     output  file.   This  is useful when debugging the com-
     piler.

u    Run in UCI-Lisp mode.  The character syntax is  changed
     to  that of UCI-Lisp and a UCI-Lisp compatibility pack-
     age of macros is read in.

w    Suppress warning messages.

x    Create an cross reference file.
     % liszt -x foo
     not only compiles foo into foo.o but also generates the
     file  foo.x .   The  file  foo.x   is lisp readable and
     lists for each function all functions which that  func-
     tion  could  call.  The program lxref reads one or more
     of these ".x" files and produces a human readable cross
     reference listing.


9

9                                     Printed: March 23, 1982







Liszt - the lisp compiler                               12-8


12.6.  autorun

     The object  file which liszt writes  does  not  contain
all  the  functions  necessary to run the lisp program which
was compiled.  In order to use the object file, a lisp  sys-
tem must be started and the object file _f_a_s_led in.  When the
-r switch is given to liszt, the object  file  created  will
contain  a  small  piece of bootstrap code at the beginning,
and the object file will be made executable.  Now, when  the
name  of the object file is given to the UNIX command inter-
preter (shell) to run, the bootstrap code at  the  beginning
of  the  object  file will cause a lisp system to be started
and the first action the lisp system will  take is  to  _f_a_s_l
in  the  object file which started it.  In effect the object
file has created an environment in which it can run.

     Autorun is an alternative to _d_u_m_p_l_i_s_p.   The  advantage
of  autorun  is  that the object file which starts the whole
process is typically small, whereas the  minimum  _d_u_m_p_l_i_s_ped
file is very large (one half megabyte).  The disadvantage of
autorun is that the file must be _f_a_s_led  into  a  lisp  each
time  it is used whereas the file which _d_u_m_p_l_i_s_p creates can
be run as is.  liszt itself is a _d_u_m_p_l_i_s_ped file since it is
used  so often and is large enough that too much time  would
be wasted _f_a_s_ling it in each time it  was  used.   The  lisp
cross  reference  program, lxref, uses _a_u_t_o_r_u_n since it is a
small and rarely used program.

     In order to have the program _f_a_s_led in begin  execution
(rather  than  starting  a lisp top level), the value of the
symbol user-top-level should be set to the name of the func-
tion to get control.  An example of this is shown next.




















9

9                                     Printed: March 23, 1982







Liszt - the lisp compiler                               12-9



    ____________________________________________________

    _w_e _w_a_n_t _t_o _r_e_p_l_a_c_e _t_h_e _u_n_i_x _d_a_t_e _p_r_o_g_r_a_m _w_i_t_h _o_n_e _w_r_i_t_t_e_n _i_n _l_i_s_p.

    % cat lispdate.l
    (defun mydate nil
       (patom "The date is ")
       (patom (status ctime))
       (terpr)
       (exit 0))
    (setq user-top-level 'mydate)

    % liszt -r lispdate
    Compilation begins with Lisp Compiler 5.2
    source: lispdate.l, result: lispdate.o
    mydate
    %Note: lispdate.l: Compilation complete
    %Note: lispdate.l:  Time: Real: 0:3, CPU: 0:0.28, GC: 0:0.00 for 0 gcs
    %Note: lispdate.l: Assembly begins
    %Note: lispdate.l: Assembly completed successfully
    3.0u 2.0s 0:17 29%

     _W_e _c_h_a_n_g_e _t_h_e _n_a_m_e _t_o _r_e_m_o_v_e _t_h_e "._o", (_t_h_i_s _i_s_n'_t _n_e_c_e_s_s_a_r_y)
    % mv lispdate.o lispdate

     _N_o_w _w_e _t_e_s_t _i_t _o_u_t
    % lispdate
    The date is Sat Aug  1 16:58:33 1981
    %
    ____________________________________________________







12.7.  pure literals

     Normally  the  quoted  lisp  objects  (literals)  which
appear  in functions are treated as constants. Consider this
function:
(_d_e_f _f_o_o
   (_l_a_m_b_d_a _n_i_l (_c_o_n_d ((_n_o_t (_e_q '_a (_c_a_r (_s_e_t_q  _x  '(_a  _b)))))
                              (_p_r_i_n_t         '_i_m_p_o_s_s_i_b_l_e!!))
                     (_t (_r_p_l_a_c_a _x '_d)))))
At first glance it seems that the  first  cond  clause  will
never  be  true,  since the _c_a_r of (_a _b) should always be _a.
However if you  run  this  function  twice,  it  will  print
'impossible!!' the second time.  This is because the follow-
ing clause modifies  the  'constant'  list  (_a _b)  with  the
_r_p_l_a_c_a  function.  Such modification of literal lisp objects


                                     Printed: March 23, 1982







Liszt - the lisp compiler                              12-10


can cause programs to behave strangely as the above  example
shows,  but more importantly it can cause garbage collection
problems if done to compiled code.  When a  file  is  _f_a_s_led
in,  if the symbol $purcopylits is non nil, the literal lisp
data is put in 'pure' space, that is it put in  space  which
needn't  be  looked  at  by  the  garabage  collector.  This
reduces the work the garbage collector must  do  but  it  is
dangerous since if the literals are modified to point to non
pure objects, the marker may not mark the non pure  objects.
If the symbol $purcopylits is nil then the literal lisp data
is put in impure space and the compiled code will  act  like
the  interpreted  code  when  literal data is modified.  The
default value for $purcopylits is t.




12.8.  transfer tables

     A transfer table is setup by _f_a_s_l when the object  file
is  loaded in.  There is one entry in the transfer table for
each function which is called  in  that  object  file.   The
entry  for  a  call  to the function _f_o_o has two parts whose
contents are:

[1]  function address - This will  initially  point  to  the
     internal   function  _q_l_i_n_k_e_r.   It may some time in the
     future point to the function _f_o_o if certain  conditions
     are satisfied (more on this  below).

[2]  function name - This is a pointer to  the  symbol  _f_o_o.
     This will be used by _q_l_i_n_k_e_r.



When a call is made to the function _f_o_o the call will  actu-
ally  be made to the address in the transfer table entry and
will end up in the _q_l_i_n_k_e_r function.  _Q_l_i_n_k_e_r will determine
that _f_o_o was the function being called by locating the func-
tion name entry in the transfer table[].   If  the  function
being called is not compiled then _q_l_i_n_k_e_r just calls _f_u_n_c_a_l_l
to perform the function call.  If _f_o_o  is  compiled  and  if
(_s_t_a_t_u_s _t_r_a_n_s_l_i_n_k)  is non nil, then _q_l_i_n_k_e_r will modify the
function  address  part  of  the  transfer  table  to  point
directly to the function _f_o_o.  Finally _q_l_i_n_k_e_r will call _f_o_o
directly .  The next time a call is made  to  _f_o_o  the  call
____________________
9   []_Q_l_i_n_k_e_r does this by tracing back the call stack  until
it finds the _c_a_l_l_s machine instruction which called it.  The
address field of the  _c_a_l_l_s  contains  the  address  of  the
transfer table entry.



9                                     Printed: March 23, 1982







Liszt - the lisp compiler                              12-11


will go directly to _f_o_o and not through _q_l_i_n_k_e_r.  This  will
result in a substantial speedup in compiled code to compiled
code transfers.  A disadvantage is that no debugging  infor-
mation  is  left on the stack, so _s_h_o_w_s_t_a_c_k and _b_a_k_t_r_a_c_e are
useless.  Another disadvantage is that  if  you  redefine  a
compiled function either through loading in a new version or
interactively defining it, then the old version may still be
called  from  compiled  code  if  the fast linking described
above has already been done.  The solution to these problems
is to use (_s_s_t_a_t_u_s _t_r_a_n_s_l_i_n_k _v_a_l_u_e).  If value is

_n_i_l  All transfer tables will be cleared, i.e. all  function
     addresses  will be set to point to _q_l_i_n_k_e_r.  This means
     that the next time a function is called _q_l_i_n_k_e_r will be
     called  and will look at the current definition.  Also,
     no fast links will be set up  since  (_s_t_a_t_u_s _t_r_a_n_s_l_i_n_k)
     will be nil.  The end result is that _s_h_o_w_s_t_a_c_k and _b_a_k_-
     _t_r_a_c_e will work and the function definition at the time
     of call will always be used.

_o_n   This causes the lisp system to go through all  transfer
     tables  and  set up fast links wherever possible.  This
     is normally used after you have _f_a_s_led in all  of  your
     files. Furthermore since (_s_t_a_t_u_s _t_r_a_n_s_l_i_n_k) is not nil,
     _q_l_i_n_k_e_r will make  new  fast  links  if  the  situation
     arises  (which  isn't likely unless you _f_a_s_l in another
     file).

_t    This or any other value not previously  mentioned  will
     just  make  (_s_t_a_t_u_s _t_r_a_n_s_l_i_n_k)  be  non  nil,  and as a
     result fast links will  be  made   by  _q_l_i_n_k_e_r  if  the
     called function is compiled.




12.9.  Fixnum functions

     The compiler will generate inline arithmetic  code  for
fixnum  only functions.  Such functions include +, -, *,  /,
\, 1+ and 1-.  The code generated will be much  faster  than
using  _a_d_d,  _d_i_f_f_e_r_e_n_c_e,  etc.  However it will only work if
the arguments to and results of the functions  are  fixnums.
No type checking is done.








9

9                                     Printed: March 23, 1982