This document discusses how to assemble or compile various parts of the MINI-UNIX system software.
This may be necessary because a command or library is accidentally deleted or otherwise destroyed;
also, it may be desirable to install a modified version of some command or library routine.
It should be noted that in the system as distributed, there are quite a few commands that depend to some degree on the current configuration of the system;
thus in any new system modifications to some commands are advisable.
Most of the likely modifications relate to the standard disk devices contained in the system.
For example, the
.ul
df
("disk free") command has built into it the names of the standardly present disk storage drives (e.g. "/dev/rk0", "/dev/rk1").
.ul
Df
takes an argument to indicate which disk to examine, but it is convenient if its default argument is adjusted to reflect the ordinarily present devices.
.pg
The companion document "Setting up MINI-UNIX" discusses which commands are likely to require changes.
.pg
The greater part of the source files for commands resides in several subdirectories of the directory /usr/source.
Each directory and subdirectory contains a "run" file which contains "shell" sequences for re-compiling all commands in that directory.
These subdirectories, and a general description of their contents, are
.sp
.in 8
.ti -4
s1``Source files for most commands with names beginning with "a" through "l".
.sp
.ti -4
s2``Source files for most commands with names beginning with "m" through "z".
.sp
.ti -4
s3``Source files for subroutines contained in the standard system library, "/lib/liba.a" (see below).
.sp
.ti -4
s4``Source files for the C library, "/lib/libc.a" (see below).
.sp
.ti -4
s5``Source files for more of the C library.
.sp
.ti -4
s7``Contains the source files for all the text formatters roff, nroff and neqn.
They are separate because they overloaded the s2 directory.
.sp
.ti -4
as````Source files for assembler.
.sp
.ti -4
c`````Source files for C compiler.
.sp
.ti -4
cref``Source files for cross reference program.
.sp
.ti -4
fort``Source files for Fortran Compiler.
.sp
.ti -4
iolib`Source files for Portable C library.
.sp
.ti -4
m6````Source files for Macro Processor.
.sp
.ti -4
mdec``Source files for utility and boot programs.
.sp
.ti -4
rat```Source files for Ratfor.
.sp
.ti -4
salloc`Source files for storage allocation routines.
.sp
.ti -4
sno```Source files for Snobol Interpreter.
.sp
.ti -4
tmg```Source files for TMG compiler-compiler.
.sp
.ti -4
yacc``Source files for YACC compiler-compiler.
.sp
.in 0
To regenerate most commands in the s1 and s2 directories is straightforward.
The appropriate directory will contain one or more source files for the command.
These will all have the suffix ".s" if the command is written in assembler language, or ".c" if it is written in C.
The first part of the name begins with the name of the command.
If there are several source files, the command name will be followed by a character which distinguishes the several files.
It is typically "1", "2", ...; sometimes the last is "x".
For example, the "bas" command has source files (in s1) called "bas0.s", "bas1.s", ..., "bas4.s", "basx.s".
In all cases, the lexicographical order of the distinguishing character is the order in which the source files should be compiled or assembled.
Thus, for example, the way to reassemble a new "bas" is to say (in s1)
.sp
as bas?.s
.sp
Some of the assembly-language commands are completely stand-alone and require no inclusion of routines from system libraries.
Unfortunately there is no
.ul
a priori
way of determining which need library routines.
A simple
.ul
a posteriori
method is to assemble the command as discussed above, then say
.sp
nm -u a.out
.sp
which will list the undefined external symbols.
If any appear, the loader should be called by saying
.sp
ld a.out -l
.sp
However
.ul
all
assembly-language programs require the application of the link editor
.ul
ld
(also loosely called the loader), since the link editor automatically relocates the object code to 060000 for a 12K MINI-UNIX system.
.sp
One important command which needs slightly special treatment is "tp" which has to be loaded with the C library:
.pg
as tp?.s
.br
ld a.out -l -lc
.pg
because it calls the C-language ctime subroutine.
.pg
As it happens, there are no commands written in C (except those described below) which consist of more than one file.
The command "com.c" can therefore be recompiled simply by saying
.sp
cc -O com.c
.sp
Here the "-O" indicates the desire to use the optimizer pass of the C compiler.
.sp
Some of the most important commands are considerably more complicated to regenerate, and these are discussed specifically below.
The contents of libraries are also discussed.
.sp
.ul
AS
.sp
The assembler consists of two executable files: /bin/as and /lib/as2.
The first is the 0-th pass: it reads the source program, converts it to an intermediate form in a temporary file "/tmp/atm0?", and estimates the final locations of symbols.
It also makes two or three other temporary files which contain the ordinary symbol table, a table of temporary symbols (like n_:) and possibly an overflow intermediate file.
The program /lib/as2 acts as an ordinary two-pass assembler with input taken from the files produced by /bin/as.
.pg
The source files for /bin/as are named "/usr/source/s1/as1?.s" (there are 9 of them);
/lib/as2 is produced from the source files "/usr/source/s1/as2?.s"; they likewise are 9 in number.
Considerable care should be exercised in replacing either component of the assembler.
Remember that if the assembler is lost, the only recourse is to replace it from some backup storage;
a broken assembler cannot assemble itself.
.pg
.ul
C
.sp
The C compiler consists of four files:
"/bin/cc", which expands compiler control lines and which calls the phases of the compiler proper, the assembler, and the loader;
"/lib/c0", which is the first phase of the compiler;
"/lib/c1", which is the second phase of the compiler;
and "/lib/c2", which is the optional third phase optimizer.
The loss of the C compiler is as serious as that of the assembler.
.pg
The source for /bin/cc resides in "/usr/source/s1/cc.c".
Its loss alone is not fatal.
Provided that prog.c does not contain any compiler control lines, prog.c can be compiled by
.sp
/lib/c0 prog.c temp0 temp1
.br
/lib/c1 temp0 temp1 temp2
.br
as - temp2
.br
ld /lib/crt0.o a.out -lc -l
.sp
If /bin/cc is lost, it can be recovered in this way, since it contains no compiler control lines.
.pg
The source for the compiler proper is in the directory /usr/c.
The first phase (c0) is generated from the files c00.c, ..., c05.c, which must be compiled by C;
c0t.s, which must be assembled;
and c0h.c, which is a header file which should not be compiled but is a file
.it included
by the C programs of the first phase.
The c0t.s program contains a parameter "fpp" which determines whether C is to be used on a machine which has PDP 11/45 floating-point hardware;
it should be set to 1 if so, 0 if not.
In the standard system fpp is 0.
To make a new /lib/c0, assemble c0t.s, name the output c0t.o, and
.sp
cc c0t.o c0[0-5].c
.sp
Before installing the new c0, it is prudent to save the old one someplace.
.pg
The second phase of C (/lib/c1) is generated from the C source files c10.c, ..., c13.c, the assembly-language program c1t.s, the include-file c1h.c, and a library of object-code tables called tab.a.
To generate a new second phase, assemble c1t.s, call it c1t.o, and
.sp
cc c1t.o c1[0-3].c tab.a
.sp
It is likewise prudent to save c1 before installing a new version.
In fact in general it is wise to save the object files for the C compiler so that if disaster strikes C can be reconstituted without a working version of the compiler.
.pg
In a similar manner, the third phase of the C compiler (/lib/c2) is made up from the files c20.c and c21.c together with c2h.c.
Its loss is not critical since it is completely optional.
.pg
The library of tables mentioned above is generated from the files regtab.s, sptab.s, cctab.s, and efftab.s.
The order is not important.
These ".s" files are not in fact assembler source;
they must be converted by use of the
.it cvopt
program, whose source and object are located in the C directory.
For example:
.sp
.nf
cvopt regtab.s temp
as temp
mv a.out regtab.o
ar r tab.a regtab.o
.fi
.sp
Refer to the
.it run
shell sequence in the C directory for more complete details.
.sp
.ul
FORTRAN
.sp
Probably because it is a very large subsystem written entirely in assembly language, Fortran is quite complicated to regenerate.
On the other hand, Fortran is vital only to its own users;
since neither the compiler nor any important part of the run-time system is written in Fortran, both can be regenerated in case of loss.
.sp
The
.it fc
command itself is essentially equivalent to a long shell command file;
for a single source program
.ul
prog.f,
it amounts to saying
.xy
/usr/fort/fc1 prog.f
as - f.tmp1
ld /lib/fr0.o a.out /lib/filib.a -lf -l
.yx
Thus, /usr/fort/fc1 is the compiler proper;
fc1 leaves its output in the current directory in the file "f.tmp1".
/lib/fr0.o is the runtime startoff.
Filib.a is the library of operators;
Fortran is essentially interpretive, and operations such as "add floating variable to floating variable" are short routines loaded from the filib.a library.
.sp
/lib/libf.a (specified by the "-lf") is an archive file containing the language builtin functions plus a few others.
The standard assembly language library (the "-l", or /lib/liba.a) is referenced by certain of the builtin functions (for routines like
.ul
sin
).
.sp
The source and object of the compiler are stored in subdirectories of the /usr/fort directory, named f1, f2, f3, f4, and fx.
The first four represent putatively separable phases;
the last contains subroutines used by several of the phases.
Each directory contains an archive file with the object programs corresponding to the source programs in that directory;
it is called f?o.a where "?" is the last letter in the directory name.
To reload Fortran from these libraries, see the Shell command file /usr/fort/ld, which should contain
.xy
ld -u pass1 -u pass2 -u pass3 -u pass4 \\
f1/f1o.a f2/f2o.a f3/f3o.a f4/f4o.a fx/fxo.a -l
.yx
Each subdirectory should contain a Shell command file called "as" which assembles a particular file in that subdirectory;
the one for f1 contains, for example,
.xy
as ../fx/fhd.s f1$1.s
mv a.out f1$1.o
ar r f1o.a f1$1.o
rm f1$1.o
.yx
so that the command
.sp
sh as 5
.sp
would assemble f15.s (preceded by the definition file /usr/fort/fx/fhd.s) and place it in the library for that subdirectory.
.sp
Actually we hope that no one will be required to make a new Fortran from the pieces, or fix it themselves.
For those who are curious, we will say that phase 1 analyzes declarations, phase 2 does storage allocation, phase 3 code generation, and phase 4 puts out constants, code from format and data statements, and the actual storage-reserving code for variables.
.sp
.ul
MINI-UNIX
.pg
The source and object programs for MINI-UNIX are kept in
.ul
/usr/sys
and three subdirectories therein.
The main directory contains several files with names ending in ".h";
these are header files which are picked up (via "#include ...") as required by each system module.
The files lib1 and lib2 are libraries (archives) of (almost) all the object programs in the system.
Lib1 is made from the source programs in the subdirectory
.ul
mxsys;
lib2 is made from the programs in subdirectory
.ul
dev.
The latter consists mostly of the device drivers together with a few other things; the former is the rest of the system.
.pg
Subdirectory
.ul
source
contains the source code for all MINI-UNIX user programs which have been modified from the standard UNIX programs.
.pg
The
.ul
mxsys
subdirectory contains the programs which control the device configuration of the system.
.ul
Low.s
specifies the contents of the interrupt vectors;
.ul
conf.c
contains the tables which relate device numbers to handler routines.
A third program, mch.s, contains all the machine-language code in the system.
A fourth program, emul.s, contains the software emulation package to handle the extended instruction set, i.e. those instructions which are not implemented in the PDP-11/20 and PDP-11/10 processor hardware.
.pg
To recreate the system, compile conf.c and move the output to /usr/sys/conf.o.
Assemble low.s and move the output to /usr/sys/low.o.
Then change to /usr/sys, and load the whole system:
.pg
ld -a -x low.o conf.o lib1 lib2
.pg
For convenience, this command line has been placed into /usr/sys/shld.
Consult the "run" file and the companion document "Setting Up MINI-UNIX - Sixth Edition" for further details on creating a new system.
.pg
When the
.ul
ld
is done, the new system is present as
.ul
a.out.
It can also be tested by putting it on tape (tp-I) and using tboot or mboot, or directly using uboot (boot procedures-VIII).
When you have satisfied yourself that it works, it should be copied to /mx so that programs like ps (I) can use it to pick up addresses in the system.
.pg
A word of caution is in order here.
The size of a.out must be less than 055400 bytes for the system to run properly.
If the system is bigger than this, its size can be reduced by removing one or more system buffers (NBUF in param.h) and recompiling all of the system source using the "shs" shell sequence file in the
.ul
mxsys
subdirectory.
If enough space cannot be achieved in this manner, the system size must grow beyond 12K words to the next convenient boundary.
This requires major surgery; therefore think twice before you do it.
To form a new system with a size greater than 12K words, the file "mxsys/param.h" must be edited to change the following three parameters:
.in+5
.nf
UCORE
TOPSYS
SWPSIZ.
.in-5
.fi
Re-compile the complete system using the "run" command as before.
A new root file system must be made and all system command programs must be re-compiled.
Before proceeding, change the value of the TOPSYS parameter in "sys/source/ld.c" to the appropriate value and re-compile the link editor
.ul
ld.
At some point the value of "uorg" in sys/source/db1.s must also be changed and the debugger
.ul
db
re-assembled and link-edited for the new root file system.
The complete re-compilation of all user command programs is likely to take the better part of a day.
.pg
To install a new device driver, compile it and place the object in lib2 if necessary.
(All the device drivers distributed with the system are already there.)
The device's interrupt vector must be entered in low.s.
This involves placing a pointer to a callout routine and the device's priority level in the vector.
As an example, consider installing the interrupt vector for DC11 number 2.
Its receiver interrupts at location 320 and the transmitter at 324, both at priority level 5.
Then low.s has:
.xy
    . = 320^.
    dcin; br5+2
    dcou; br5+2
.yx
First, notice that the entries in low.s must be in order, since the assembler does not permit moving the location counter "." backwards.
The assembler also does not permit assignment of an absolute number to ".",
which is the reason for the ". = 320^."
subterfuge; consult the Assembler Manual for the meaning of the notation.
If a constant smaller than 16(10) is added to the priority level, this number will be available as the first argument of the interrupt routine.
This stratagem is used when several similar devices share the same interrupt routine.
.pg
At the end of low.s, add
.xy
    .globl _dcrint
dcin: jsr r0,call; _dcrint
.sp
    .globl _dcxint
dcou: jsr r0,call; _dcxint
.yx
The
.it call
routine saves registers as required and makes a C-style call on the actual interrupt routine (here _dcrint and _dcxint) named after the jsr instruction.
When the routine returns,
.it call
restores the registers and performs an rti instruction.
.pg
To install a new device thus requires knowing the name of its interrupt routines.
These routines are in general easily found in the driver;
they typically end in the letters "int" or "intr."
Notice that external names in C programs have an underscore "_" prepended to them.
.pg
The second step which must be performed to add a new device is to add it to the configuration table /usr/sys/mxsys/conf.c.
This file contains two subtables, one for block-type devices, and one for character-type devices.
Block devices include disks, DECtape, and magtape.
All other devices are character devices.
A line in each of these tables gives all the information the system needs to know about the device handler;
the ordinal position of the line in the table implies its major device number, starting at 0.
The appropriate editing must be done in conf.c and then it must be re-compiled and the object module moved to /usr/sys/conf.o.
.pg
There are four subentries per line in the block device table, which give its open routine, close routine, strategy routine, and device table.
The open and close routines may be nonexistent, in which case the name "nulldev" is given;
this routine merely returns.
The strategy routine is called to do any I/O, and the device table contains status information for the device.
.pg
For character devices, each line in the table specifies a routine for open, close, read, and write, and one which sets and returns device-specific status (used, for example, for stty and gtty on typewriters).
If there is no open or close routine, "nulldev" may be given;
if there is no read, write, or status routine, "nodev" may be given.
This routine sets an error flag and returns.
.pg
The above discussion is admittedly rather cryptic in the absence of a general description of system I/O interfaces.
.pg
The final step which must be taken to install a device is to make a special file for it.
This is done by mknod (VIII), to which you must specify the device class (block or character), major device number (relative line in the configuration table) and minor device number (which is made available to the driver at appropriate times).