.so tmac.tr
.DA "August 22, 1984"
.TR 84-14
.Gr
.TL
Personalized Interpreters for Icon
.AU
Ralph E. Griswold
.AU
Robert K. McConeghy
.AU
William H. Mitchell
.AE
.tr *\(**
.NH
Introduction
.PP
Although the Icon programming language has a large repertoire
of functions and operations for string and list manipulation, as
well as for more conventional computations [1], users frequently
need to extend that repertoire. While many extensions can be
written as procedures that build on the existing repertoire, there
are some kinds of extensions for which this approach is unacceptably
inefficient, inconvenient, or simply impractical.
.PP
Icon itself is written primarily in C [2] and its built-in functions
are written as corresponding C functions. Thus the natural way to
extend Icon's computational repertoire is to add new C functions to it.
.PP
The Icon system is organized so that this is comparatively easy to do.
Adding a new function
does not require changes to the Icon translator,
since all functions have a common syntactic form. An entry must be made in
a table that is used by the linker and the run-time system in order to
identify built-in functions and connect references to them to the
code itself.
.PP
The problem arises in incorporating the C code in the Icon run-time
system. Prior to Version 5.9 of Icon, there were two separate but
similar implementations of Icon: a compiler [3] and an interpreter [4].
The primary difference between the two systems is that the linker for
the compiler generates assembly-language code, while the linker
for the interpreter generates code that is ready to be interpreted.
The interpreter
uses a
preconstructed run-time
system, so that the assembly and loading phases of the compiler implementation
are not needed.
.PP
The loading phase in the compiler is quite slow, so that when the
compiler implementation of Icon is used, there is a substantial delay
before getting into execution. This is a significant problem during
program debugging. Furthermore, a compiled Icon program runs only
slightly faster than an interpreted Icon program, largely because
most programs spend only a small percentage
of their time in
code generated for the program itself; most of the time is spent
executing code in the run-time system, which is essentially the
same in the two implementations.
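.PP
The size of this effect can be put in rough numbers.
The following C sketch is not part of Icon; the function name and
the figures used with it are assumptions chosen only to show the arithmetic.
It computes the overall speedup when only the time spent in
program-specific code is reduced and the time spent in the shared
run-time system stays the same.
.Ds
```c
#include <assert.h>

/*
 * Overall speedup when only the fraction `gen` of execution time
 * (0.0 to 1.0) is spent in program-specific code and that code alone
 * becomes `factor` times faster.  The remaining time, spent in the
 * shared run-time system, is unchanged.  All inputs are hypothetical.
 */
double overall_speedup(double gen, double factor)
{
    assert(gen >= 0.0 && gen <= 1.0);
    assert(factor > 0.0);
    return 1.0 / ((1.0 - gen) + gen / factor);
}
```
.De
.LP
For example, if 20 percent of the time were spent in generated code and
compilation made that code four times faster, the overall speedup would be
1/(0.8 + 0.05), or about 1.18.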
.PP
The primary advantage of the compiler is that it is possible to
add new functions during the loading phase. In order to communicate
the names of new functions to the linker, it is necessary to
include ``external'' declarations in the Icon source programs that
use these functions. There is no way to do this in the interpreter
implementation, since the run-time system is preconstructed, rather
than being built when
the source-language program is processed.
.PP
One disadvantage of the external function approach is that every source
program that uses an external function must contain a declaration for
that function. Besides the burden of remembering
these declarations, external functions are, by their nature,
not logically part of Icon proper, which complicates documenting
such functions and distributing them to other users.
.PP
An alternative method of adding new functions to either the
compiler or the interpreter implementation of Icon is to
add the corresponding C functions to the Icon system itself
and to rebuild the entire system. This approach is impractical
for many applications. If the extensions are not of general
interest, it is inappropriate to include them in the public
version of Icon. On the other hand, Icon is a large and complicated system,
and having many private versions may create serious problems of
maintenance and disk usage. Furthermore, rebuilding the Icon system
is slow, cumbersome, and comparatively complicated, which rules it out
in a situation
such as a class in which students implement their own versions
of an extension.
.PP
To remedy these problems, a mechanism for building ``personalized
interpreters'' has been added to Version 5.9 of Icon. This mechanism
allows a user to add C functions and to build a corresponding
interpreter quickly and easily, without needing
a copy of the source code for the entire Icon system.
.PP
To construct a personalized interpreter, the user must perform
a one-time set-up step that copies relevant source files to a
directory specified by the user and builds the nucleus of a run-time system. Once this is
done, the user can add and modify C functions and include them
in the personalized run-time system with little effort.
.PP
Since the linker must know the names of built-in functions,
a personalized linker is constructed. In order to run
Icon programs with the personalized run-time system, a
personalized command processor, which knows the location of
the personalized linker and run-time system, is also provided.
.PP
The modifications that can be made to Icon via a personalized
interpreter essentially are limited to the run-time system: the
addition of new functions, modifications to existing functions
and operations, and modifications and additions to support routines. There
is no provision for changing the syntax of Icon or for incorporating
new operators, keywords, or control structures.
.NH
Building and Using a Personalized Interpreter
.NH 2
Setting Up a Personalized Interpreter System
.PP
To set up a personalized interpreter, a new directory should
be created solely for the use of the interpreter; otherwise
files may be accidentally destroyed by the set-up process.
For the purpose of example, suppose this directory is
named \*Mmyicon\fR. The set-up process consists of
.Ds
mkdir myicon
cd myicon
icon\-pi
.De
Note that \*Micon\-pi\fR must be run in the directory in which the personalized
interpreter is to be built.
The
location of \*Micon\-pi\fR may vary from site to site [5].
.PP
The shell script \*Micon\-pi\fR constructs three subdirectories:
\*Mh\fR, \*Mstd\fR, and \*Mpi\fR. The subdirectory \*Mh\fR
contains header files that are needed in C routines. The subdirectory
\*Mstd\fR contains the portions of the Icon system that are needed
to build a personalized interpreter. The subdirectory \*Mpi\fR
contains a \*MMakefile\fR for building a personalized interpreter
and also is the place where source code for new C functions normally
resides. Thus work on the personalized interpreter is done in
\*Mmyicon/pi\fR.
.PP
The \*MMakefile\fR that is constructed by \*Micon\-pi\fR
contains two definitions to facilitate building personalized
interpreters:
.IP \*MOBJS\fR .5i
a list of object modules that are to be added to or replaced
in the run-time system. \*MOBJS\fR initially is empty.
.IP \*MLIB\fR
a list of library options that are used when the run-time system
is built. \*MLIB\fR initially is empty.
.LP
See the listing of a generic version of this \*MMakefile\fR in
Appendix A.
.NH 2
Building a Personalized Interpreter
.PP
Performing a \fImake\fR in \*Mmyicon/pi\fR creates three files
in \*Mmyicon\fR:
.Ds
.ta 1i
picont	\fRcommand processor\*M
pilink	\fRlinker\*M
piconx	\fRrun-time system\*M
.De
A link to \*Mpicont\fR also is constructed in \*Mmyicon/pi\fR so that
the new personalized interpreter can be tested in the directory in
which it is made.
.PP
The file \*Mpicont\fR normally is built only on the first \fImake\fR. The
file \*Mpilink\fR is built on the first \fImake\fR and is
rebuilt whenever the repertoire of built-in functions is changed.
The file \*Mpiconx\fR is rebuilt whenever the source code in the
run-time system is changed.
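.PP
These rebuild rules follow from the way \fImake\fR compares a target with
the files it depends on (see the dependency lines in Appendix A).
The C sketch below is a simplified model of that comparison, not code from
\fImake\fR or from Icon; the function name is hypothetical and times are
represented as plain integers.
.Ds
```c
#include <assert.h>
#include <stddef.h>

/*
 * Return 1 if a target with modification time `target` is older than
 * any of its `n` dependencies, 0 otherwise.  Times are plain integers
 * here; a real make compares file modification times.
 */
int out_of_date(long target, const long *deps, size_t n)
{
    size_t i;

    assert(deps != NULL || n == 0);
    for (i = 0; i < n; i++)
        if (deps[i] > target)
            return 1;
    return 0;
}
```
.De
.LP
A target that does not exist yet, as on the first \fImake\fR, can be modeled
with a time earlier than every dependency, so that it is always built.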
.PP
The user of the personalized interpreter uses \*Mpicont\fR in
the same way as the standard \*Micont\fR [4].
(Note that the accidental use of \*Micont\fR in place of
\*Mpicont\fR may produce mysterious results.)
In turn, \*Mpicont\fR translates a source program using the
standard Icon translator and links it using \*Mpilink\fR.
The resulting icode file uses \*Mpiconx\fR.
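.PP
The \*MMakefile\fR in Appendix A passes the locations of \*Mpilink\fR and
\*Mpiconx\fR to the C compiler as \*M\-D\fR definitions
(\*MILINK\fR and \*MIconx\fR) when \*Mpicont\fR is built.
How \*Micont.c\fR uses these definitions is not shown in this report.
The following is a minimal sketch of the general technique, assuming a
hypothetical function \*Mphase_program\fR and a made-up installation directory.
.Ds
```c
#include <assert.h>
#include <string.h>

/*
 * Defaults used only when the build does not supply a path with -D.
 * The directory /usr/you/myicon is a made-up example.
 */
#ifndef ILINK
#define ILINK "/usr/you/myicon/pilink"
#endif
#ifndef Iconx
#define Iconx "/usr/you/myicon/piconx"
#endif

/* Which program a command processor would run for each phase. */
const char *phase_program(const char *phase)
{
    assert(phase != NULL);
    if (strcmp(phase, "link") == 0)
        return ILINK;
    if (strcmp(phase, "run") == 0)
        return Iconx;
    return NULL;    /* unknown phase */
}
```
.De
.LP
Because the paths are fixed when the command processor is compiled, each
personalized interpreter needs its own copy of the command processor.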
.PP
The relocation bits and symbol tables in \*Mpicont\fR, \*Mpilink\fR,
and \*Mpiconx\fR can be removed by
.Ds
make Strip
.De
in \*Mmyicon/pi\fR. This reduces the sizes of these files substantially
but may interfere with debugging.
.PP
If a \fImake\fR is performed in \*Mmyicon/pi\fR before any
run-time files are added or modified, the resulting personalized
interpreter is identical to the standard one. Such a \fImake\fR can
be performed to verify that the personalized interpreter system
is performing properly.
.PP
Note that a personalized interpreter inherits the parameters and
configuration of the locally installed version of Icon in \*Mv5g\fR, including
optional language extensions [6].
The file \*Mmyicon/h/config.h\fR contains configuration information.
The definitions in this file should not be changed.
.NH 2
Adding a New Function
.PP
To add a new function to the personalized interpreter, it is first
necessary to provide the C code, adhering to the conventions and
data structures used throughout Icon [2]. Some examples of
C functions taken from the Icon program library [7] are included in Appendix B
of this report. The source code for these functions is contained in
\*Mv5g/pifuncs\fR, where \*Mv5g\fR is the root of the Icon system.
The location of \*Mv5g\fR varies from site to site [5].
The directory
\*Mv5g/functions\fR contains the source code for the standard built-in
functions, which also can be used as models for new ones.
.PP
Suppose that \*Mgetenv\fR from the Icon program library is to be
added to a personalized interpreter. The source code can be obtained by
.Ds
cp v5g/pifuncs/getenv.c myicon/pi
.De
(Note that the actual paths will be different, depending on the
local hierarchy.)
.PP
Three things now need to be done to
incorporate this function in the personalized interpreter:
.IP 1.
Add a line consisting of
.Ds
PDEF(getenv)
.De
to \*Mmyicon/h/pdef.h\fR in proper alphabetical order.
This causes the linker and the run-time system to know about the new function.
.IP 2.
Add \*Mgetenv.o\fR to the definition of \*MOBJS\fR in
\*Mmyicon/pi/Makefile\fR.
This causes \*Mgetenv.c\fR to be compiled and the resulting
object file to be loaded with the run-time system when a \fImake\fR is performed.
.IP 3.
Perform a \fImake\fR in \*Mmyicon/pi\fR. The result is
new versions of \*Mpilink\fR and \*Mpiconx\fR in \*Mmyicon\fR.
.LP
The function \*Mgetenv\fR now can be used like any other built-in
function.
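.PP
The definition of \*MPDEF\fR itself is not given in this report.
One common C idiom for letting a single list of names serve two
programs is to expand the same list several times under different
definitions of one macro.
The sketch below illustrates that idiom only; it uses token pasting from
later C standards, and the list macro, stub functions, and helper
names are all assumptions rather than Icon code.
.Ds
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for a pdef.h-style list, kept in alphabetical order. */
#define PDEF_LIST PDEF(exp) PDEF(getenv) PDEF(seek)

/* Stubs so the sketch links; each returns the length of its own name. */
#define PDEF(p) static int X##p(void) { return (int)(sizeof #p - 1); }
PDEF_LIST
#undef PDEF

/* First expansion: a table of names, as a linker might want. */
#define PDEF(p) #p,
static const char *const names[] = { PDEF_LIST };
#undef PDEF

/* Second expansion: a parallel table of entry points. */
#define PDEF(p) X##p,
static int (*const entries[])(void) = { PDEF_LIST };
#undef PDEF

/* Return the index of a built-in name, or -1 if it is not listed. */
int builtin_index(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof names / sizeof names[0]; i++)
        if (strcmp(names[i], name) == 0)
            return (int)i;
    return -1;
}

/* Call the entry point stored at index i. */
int call_builtin(int i)
{
    assert(i >= 0 && (size_t)i < sizeof entries / sizeof entries[0]);
    return entries[i]();
}
```
.De
.LP
Adding one line to the list extends both tables at once, which is the
property that makes a single shared header convenient.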
.PP
More than one function can be included in a single source file.
See \*Mmath.c\fR in Appendix B. Note that \*Mmath.c\fR uses the
math library. To add this module to the run-time system of
a personalized interpreter, a \*MPDEF\fR entry should be
made for each function in \*Mmath.c\fR, \*Mmath.o\fR should be added to
\*MOBJS\fR, and \*M\-lm\fR should be added to \*MLIB\fR in
the \*MMakefile\fR.
.NH 2
Modifying the Existing Run-Time System
.PP
The use of personalized interpreters is not limited to the addition
of new functions. Any module in the standard run-time system can
be modified as well. The run-time system is divided into five
parts:
.RS
.IP \*Mv5g/functions\fR 1.2i
built-in functions
.IP \*Mv5g/operators\fR
built-in operators
.IP \*Mv5g/rt\fR
run-time support routines
.IP \*Mv5g/lib\fR
routines called by the interpreter
.IP \*Mv5g/iconx\fR
the interpreter and start-up routines
.RE
.LP
For example, storage allocation routines are contained in \*Mv5g/rt/alc.c\fR.
.PP
To modify an existing portion of the Icon run-time system,
copy the source code file from the standard system to \*Mmyicon/pi\fR.
(Source code for a few run-time routines is placed in \*Mmyicon/std\fR
when a personalized interpreter is set up. Check this directory
first and use that file, if appropriate, rather than making
another copy in \*Mmyicon/pi\fR.) When a source-code file in
\*Mmyicon/pi\fR has been modified, place it in the \*MOBJS\fR
list just like a new file and perform a \fImake\fR. Note that
an entire module must be replaced, even if a change is made to
only one routine.
Any module that is replaced must contain all the global variables in
the original module to prevent \fIld(1)\fR from also loading the
original module. There is no way to delete routines from the run-time
system.
.PP
The directory \*Mmyicon/h\fR contains header files that are included
in various source-code files. For example, error message text for
a new run-time error can be provided by adding it to \*Mmyicon/h/err.h\fR.
The file \*Mmyicon/h/rt.h\fR contains declarations and definitions that
are used throughout the run-time system. This is where the declaration
for the structure of a new type of data object would be placed.
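.PP
The layout of \*Merr.h\fR is not shown in this report.
As a sketch only, assuming that error text is kept as pairs of a number
and a message, a lookup could be written as follows.
The structure, the table, the error numbers, and the wording are all
invented for illustration.
.Ds
```c
#include <assert.h>
#include <stddef.h>

/* One entry per run-time error: a number and its message text. */
struct errmsg {
    int number;
    const char *text;
};

/* Hypothetical entries; the numbers and wording are invented. */
static const struct errmsg errtab[] = {
    { 901, "example: widget expected" },
    { 902, "example: widget out of range" },
    { 0, NULL }                 /* end marker */
};

/* Return the text for an error number, or NULL if it has no entry. */
const char *error_text(int number)
{
    const struct errmsg *p;

    assert(number > 0);
    for (p = errtab; p->text != NULL; p++)
        if (p->number == number)
            return p->text;
    return NULL;
}
```
.De
.LP
With a table of this kind, a new run-time error needs only one added
entry placed before the end marker.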
.PP
Care
must be taken when modifying header files not to make changes that
would produce inconsistencies between previously compiled components
of the Icon run-time system and new ones.
.SH
References
.IP 1.
Griswold, Ralph E. and Madge T. Griswold. \fIThe Icon Programming
Language\fR. Prentice-Hall, Inc., Englewood Cliffs, New Jersey. 1983.
.IP 2.
Griswold, Ralph E., Robert K. McConeghy, and William H. Mitchell.
\fIA Tour Through the C Implementation of Icon; Version 5.9\fR.
Technical Report TR 84-11, Department of Computer Science,
The University of Arizona. August 1984.
.IP 3.
Griswold, Ralph E. and William H. Mitchell. \fIICONC(1)\fR,
manual page for \fIUNIX Programmer's Manual\fR, Department of
Computer Science, The University of Arizona. July 1983.
.IP 4.
Griswold, Ralph E. and William H. Mitchell. \fIICONT(1)\fR,
manual page for \fIUNIX Programmer's Manual\fR, Department of
Computer Science, The University of Arizona. August 1984.
.IP 5.
Griswold, Ralph E. and William H. Mitchell.
\fIInstallation and Maintenance Guide for Version 5.9 of Icon\fR.
Technical Report TR 84-13, Department of Computer Science, The
University of Arizona, Tucson, Arizona. August 1984.
.IP 6.
Griswold, Ralph E., Robert K. McConeghy, and William H. Mitchell.
\fIExtensions to Version 5 of the Icon Programming Language\fR.
Technical Report TR 84-10a, Department of Computer Science,
The University of Arizona. August 1984.
.IP 7.
Griswold, Ralph E. \fIThe Icon Program Library\fR. Technical Report
TR 84-12, Department of Computer Science, The University of
Arizona. August 1984.
.am Ds
.ps 8
.vs 9
..
.am De
.ps 10
.vs 12
..
.de Ta
.ta .8i +.8i +.8i +.8i +.8i +.8i +.8i +.8i
..
.Ap "Appendix A \(em Makefile for Personalized Interpreters"
.sp
.PP
The ``generic'' \*MMakefile\fR for personalized interpreters follows.
The values of \*MPATH\fR and \*MDIR\fR are filled in when \*MPimake\fR is run.
.Ds
CFLAGS=
LDFLAGS=
LIB=
iroot=PATH
V5GBIN=$(iroot)/bin
DIR=
.Dd
#
# To add or replace object files, add their names to the OBJS list below.
#  For example, to add foo.o and bar.o, use:
#
#	OBJS=foo.o bar.o         (this is a sample line)
#
# For each object file added to OBJS, add a dependency line to reflect files
#  that are depended on.  For example, if foo.c includes rt.h 
#  which is located in the h directory use
#  	
#	foo.o:	../h/rt.h
#
.Dd
OBJS=
.Dd
PIOBJS=../std/init.o ../std/strprc.o
RTOBJS=$(PIOBJS) $(OBJS)
.Dd
Pi:	../picont ../piconx ../pilink
.Dd
../picont: ../std/icont.c
	rm -f ../picont picont
	cc -o ../picont -DIntBin="\e"$(DIR)\e"" -DIconx="\e"$(DIR)/piconx\e"" \e
		-DIconxEnv="\e"ICONX=$(DIR)/piconx\e"" \e
		-DILINK="\e"$(DIR)/pilink\e"" \e
		-DITRAN="\e"$(V5GBIN)/itran\e"" -DFORK=QFORK \e
		../std/icont.c
	ln ../picont
.Dd
../pilink: ../std/linklib ../std/builtin.o
	cc $(LDFLAGS) -X -o ../pilink ../std/builtin.o ../std/linklib
.Dd
../piconx: ../std/rtlib $(RTOBJS)
	cc $(LDFLAGS) -X -o ../piconx -e start -u start $(RTOBJS) ../std/rtlib $(LIB)
	
../std/init.o:		../h/rt.h ../h/err.h ../h/config.h ../h/pdef.h
	cd ../std;	cc -c init.c
.Dd
../std/builtin.o:	../std/ilink.h ../h/config.h ../h/pdef.h
	cd ../std;	cc -c builtin.c
.Dd
../std/strprc.o:	../h/rt.h ../h/pnames.h ../h/config.h ../h/pdef.h
	cd ../std;	cc -c strprc.c
.Dd
Strip:	../picont ../piconx ../pilink
	strip ../picont ../piconx ../pilink
.De
.Ap "Appendix B \(em C Functions from the Icon Program Library"
.sp
.SH
getenv.c:
.LP
.Ds
/*
#	GETENV(3.icon)
#
#	Get contents of environment variables
#
#	Stephen B. Wampler
#
#	Last modified 8/19/84
#
*/
.Dd
#include "../h/rt.h"
.Dd
/*
 * getenv(s) - return contents of environment variable s
 */
.Dd
Xgetenv(nargs, arg1, arg0)
int nargs;
struct descrip arg1, arg0;
   {
   register char *p;
   register int len;
   char sbuf\^[MAXSTRING];
   extern char *getenv();
   extern char *alcstr();
.Dd
   DeRef(arg1)
.Dd
   if (!QUAL(arg1))			/* check legality of argument */
      runerr(103, &arg1);
   if (STRLEN(arg1) \*(<= 0 || STRLEN(arg1) \*(>= MAXSTRING)
      runerr(401, &arg1);
   qtos(&arg1, sbuf);			/* convert argument to C-style string */
.Dd
   if ((p = getenv(sbuf)) != NULL) {		/* get environment variable */
      len = strlen(p);
      sneed(len);
      STRLEN(arg0) = len;
      STRLOC(arg0) = alcstr(p, len);
      }
   else				/* fail if variable not in environment */
      fail();
   }
.Dd
Procblock(getenv,\*b-1)
.De
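.PP
The length test in \*Mgetenv.c\fR guards the conversion of a string
described by a length and a location into a terminated C string held in a
fixed-size buffer.
The following stand-alone sketch shows that guard in isolation.
It does not use the Icon headers; \*Mcounted_to_cstr\fR and \*MBUFSIZE\fR
are hypothetical stand-ins for \*Mqtos\fR and \*MMAXSTRING\fR.
.Ds
```c
#include <assert.h>
#include <string.h>

#define BUFSIZE 16      /* stand-in for MAXSTRING; the real value differs */

/*
 * Copy `len` characters starting at `loc` into `buf` and terminate
 * them.  Returns 0 on success, or -1 if the string is empty or would
 * not fit, which is the case getenv.c rejects before converting.
 */
int counted_to_cstr(const char *loc, int len, char buf[BUFSIZE])
{
    assert(loc != NULL && buf != NULL);
    if (len <= 0 || len >= BUFSIZE)
        return -1;
    memcpy(buf, loc, (size_t)len);
    buf[len] = 0;       /* add the terminator a C string needs */
    return 0;
}
```
.De
.LP
The comparison uses \*M>=\fR rather than \*M>\fR because one position in
the buffer must be left for the terminator.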
.bp
.SH
math.c:
.LP
.Ds
/*
#	MATH(3.icon)
#
#	Miscellaneous math functions
#
#	Ralph E. Griswold
#
#	Last modified 8/19/84
#
*/
.Dd
#include "../h/rt.h"
#include <errno.h>
.Dd
int errno;
/*
 * exp(x) - e raised to the power x
 */
Xexp(nargs, arg1, arg0)
int nargs;
struct descrip arg1, arg0;
   {
   int t;
   double y;
   union numeric r;
   double exp();
   
   if ((t = cvreal(&arg1, &r)) == NULL) runerr(102, &arg1);
   y = exp(r.real);
   if (errno == ERANGE) runerr(252, NULL);
   mkreal(y,\*b&arg0);
   }
Procblock(exp,\*b1)
.Dd
/*
 * log(x) - natural logarithm of x
 */
Xlog(nargs, arg1, arg0)
int nargs;
struct descrip arg1, arg0;
   {
   int t;
   double y;
   union numeric r;
   double log();
   
   if ((t = cvreal(&arg1, &r)) == NULL) runerr(102, &arg1);
   y = log(r.real);
   if (errno == EDOM) runerr(251, NULL);
   mkreal(y,\*b&arg0);
   }
Procblock(log,\*b1)
.Dd
/*
 * log10(x) - base-10 logarithm of x
 */
Xlog10(nargs, arg1, arg0)
int nargs;
struct descrip arg1, arg0;
   {
   int t;
   double y;
   union numeric r;
   double log10();
   
   if ((t = cvreal(&arg1, &r)) == NULL) runerr(102, &arg1);
   y = log10(r.real);
   if (errno == EDOM) runerr(251, NULL);
   mkreal(y,\*b&arg0);
   }
Procblock(log10,\*b1)
.Dd
/*
 * sqrt(x) - square root of x
 */
Xsqrt(nargs, arg1, arg0)
int nargs;
struct descrip arg1, arg0;
   {
   int t;
   double y;
   union numeric r;
   double sqrt();
   
   if ((t = cvreal(&arg1, &r)) == NULL) runerr(102, &arg1);
   y = sqrt(r.real);
   if (errno == EDOM) runerr(251, NULL);
   mkreal(y,\*b&arg0);
   }
Procblock(sqrt,\*b1)
.De
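.PP
The functions in \*Mmath.c\fR detect failure by testing \*Merrno\fR after a
library call.
That convention is reliable only if \*Merrno\fR is known to be clear
beforehand, since library routines do not reset it on success.
The sketch below shows the clear, call, and test sequence using
\*Mstrtol\fR from the standard C library, chosen so that the example does
not depend on the math library; \*Mchecked_strtol\fR is a hypothetical
name and is not Icon code.
.Ds
```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Convert a decimal string to a long.  Returns 0 and stores the value
 * on success; returns -1 if the value is out of range for a long.
 * errno is cleared first so that a value left over from an earlier
 * call cannot be mistaken for a failure of this one.
 */
int checked_strtol(const char *s, long *out)
{
    long v;

    assert(s != NULL && out != NULL);
    errno = 0;
    v = strtol(s, NULL, 10);
    if (errno == ERANGE)
        return -1;
    *out = v;
    return 0;
}
```
.De
.LP
The same sequence applies to a math routine that reports \*MEDOM\fR or
\*MERANGE\fR.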
.bp
.SH
seek.c:
.LP
.Ds
/*
#	SEEK(3.icon)
#
#	Seek to position in stream
#
#	Stephen B. Wampler
#
#	Last modified 8/19/84
#
*/
.Dd
#include "../h/rt.h"
.Dd
/*
 * seek(file,\*boffset,\*bstart) - seek to offset byte from start in file.
 */
.Dd
Xseek(nargs, arg3, arg2, arg1, arg0)
int nargs;
struct descrip arg3, arg2, arg1, arg0;
   {
   long l1, l2;
   int status;
   FILE *fd;
   long ftell();
.Dd
   DeRef(arg1)
   if (arg1.type != D_FILE)
      runerr(106, &arg1);
.Dd
   defint(&arg2, &l1, 0);
   defshort(&arg3, 0);
.Dd
   fd = BLKLOC(arg1)->file.fd;
.Dd
   if ((BLKLOC(arg1)->file.status == 0) ||
       (fseek(fd, l1, arg3.value.integr) == -1))
      fail();
   mkint(ftell(fd), &arg0);
   }
.Dd
Procblock(seek,\*b3)
.De
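.PP
The core of \*Mseek.c\fR is a call to \*Mfseek\fR followed by \*Mftell\fR.
The following stand-alone sketch shows that pair without the Icon
descriptors; \*Mseek_and_tell\fR is a hypothetical name.
It treats any nonzero result from \*Mfseek\fR as failure.
.Ds
```c
#include <assert.h>
#include <stdio.h>

/*
 * Move to `offset` relative to `whence` (SEEK_SET, SEEK_CUR, or
 * SEEK_END) and return the resulting position, or -1 on failure.
 */
long seek_and_tell(FILE *fp, long offset, int whence)
{
    assert(fp != NULL);
    if (fseek(fp, offset, whence) != 0)
        return -1L;
    return ftell(fp);
}
```
.De
.LP
The default of 0 that \*Mseek.c\fR supplies for its third argument
conventionally corresponds to \*MSEEK_SET\fR, that is, an offset measured
from the start of the file.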
.LP