Mini-Unix/usr/doc/unix/p4

.s1
6. The Shell
.es
For most users,
communication with \*sUNIX\*n is carried on with the
aid of a program called the Shell.
The Shell is a command
line interpreter: it reads lines typed by the user and
interprets them as requests to execute
other programs.
In simplest form, a command line consists of the command
name followed by arguments to the command, all separated
by spaces:
.dc
command arg\*t\d1\u\*n arg\*t\d2\u\*n .\|.\|. arg\*t\dn\u\*n
.ec
The Shell splits up the command name and the arguments into
separate strings.
Then a file with name \fIcommand\fR is sought;
\fIcommand\fR may be a path name including the ``/'' character to
specify any file in the system.
If \fIcommand\fR is found, it is brought into
core and executed.
The arguments
collected by the Shell are accessible
to the command.
When the command is finished, the Shell
resumes its own execution, and indicates its readiness
to accept another command by typing a prompt character.
.pg
If file \fIcommand\fR cannot be found,
the Shell prefixes the string  \fI/\|bin/\fR  to \fIcommand\fR and
attempts again to find the file.
Directory \fI/\|bin\fR contains all the commands
intended to be generally used.
.s2
6.1 Standard I/O
.es
The discussion of I/O in \(sc3 above seems to imply that
every file used by a program must be opened or created by the program in
order to get a file descriptor for the file.
Programs executed by the Shell, however, start off with
two open files which have file descriptors
0 and 1.
As such a program begins execution, file 1 is open for writing,
and is best understood as the standard output file.
Except under circumstances indicated below, this file
is the user's typewriter.
Thus programs which wish to write informative or diagnostic
information ordinarily use file descriptor 1.
Conversely, file 0 starts off open for reading, and programs which
wish to read messages typed by the user usually
read this file.
.pg
The Shell is able to change the standard assignments of
these file descriptors from the
user's typewriter printer and keyboard.
If one of the
arguments to a command is prefixed by ``>'', file descriptor
1 will, for the duration of the command, refer to the
file named after the ``>''.
For example,
.dc
ls
.ec
ordinarily lists, on the typewriter, the names of the files in the current
directory.
The command
.dc
ls >there
.ec
creates a file called \fIthere\fR and places the listing there.
Thus the argument ``>there'' means, ``place output on \fIthere\fR.''
On the other hand,
.dc
ed
.ec
ordinarily enters the editor, which takes requests from the
user via his typewriter.
The command
.dc
ed <script
.ec
interprets \fIscript\fR as a file of editor commands;
thus ``<script'' means, ``take input from \fIscript\fR.''
.pg
Although the file name following ``<'' or ``>'' appears
to be an argument to the command, in fact it is interpreted
completely by the Shell and is not passed to the
command at all.
Thus no special coding to handle I/O redirection is needed within each
command; the command need merely use the standard file
descriptors 0 and 1 where appropriate.
.s2
6.2 Filters
.es
An extension of the standard I/O notion is used
to direct output from one command to
the input of another.
A sequence of commands separated by
vertical bars causes the Shell to
execute all the commands simultaneously and to arrange
that the standard output of each command
be delivered to the standard input of
the next command in the sequence.
Thus in the command line
.dc
ls | pr \(mi2 | opr
.ec
.it ls
lists the names of the files in the current directory;
its output is passed to \fIpr\fR,
which
paginates its input with dated headings.
The argument ``\(mi2'' means
double column.
Likewise the output from \fIpr\fR is input to \fIopr\fR.
This command spools its input onto a file for off-line
printing.
.pg
This procedure could have been carried out
more clumsily by
.dc
ls >temp1
.ti 1i
pr \(mi2 <temp1 >temp2
.ti 1i
opr <temp2
.ec
followed by removal of the temporary files.
In the absence of the ability
to redirect output and input,
a still clumsier method would have been to
require the
.it ls
command
to accept user requests to paginate its output,
to print in multi-column format, and to arrange
that its output be delivered off-line.
Actually it would be surprising, and in fact
unwise for efficiency reasons,
to expect authors of
commands such as
.it ls
to provide such a wide variety of output options.
.pg
A program
such as \fIpr\fR
which copies its standard input to its standard output
(with processing)
is called a \fIfilter\fR.
Some filters which we have found useful
perform
character transliteration,
sorting of the input,
and encryption and decryption.
.s2
6.3 Command Separators; Multitasking
.es
Another feature provided by the Shell is relatively straightforward.
Commands need not be on different lines; instead they may be separated
by semicolons.
.dc
ls; ed
.ec
will first list the contents of the current directory, then enter
the editor.
.pg
A related feature is more interesting.
If a command is followed
by ``&'', the Shell will not wait for the command to finish before
prompting again; instead, it is ready immediately
to accept a new command.
For example,
.dc
as source >output &
.ec
causes \fIsource\fR to be assembled, with diagnostic
output going to \fIoutput;\fR no matter how long the
assembly takes, the Shell returns immediately.
When the Shell does not wait for
the completion of a command,
the identification of the
process running that command is printed.
This identification may be used to
wait for the completion of the command or to
terminate it.
The ``&'' may be used
several times in a line:
.dc
as source >output & ls >files &
.ec
does both the assembly and the listing in the background.
In the examples above using ``&'', an output file
other than the typewriter was provided; if this had not been
done, the outputs of the various commands would have been
intermingled.
.pg
The Shell also allows parentheses in the above operations.
For example
.dc
(\|date; ls\|) >x &
.ec
prints the current date and time followed by
a list of the current directory onto the file \fIx.\fR
The Shell also returns immediately for another request.
.s2
6.4 The Shell as a Command; Command Files
.es
The Shell is itself a command, and may be called recursively.
Suppose file \fItryout\fR contains the lines
.dc
as source
.ti 1i
mv a.out testprog
.ti 1i
testprog
.ec
The \fImv\fR command causes the file \fIa.out\fR to be renamed \fItestprog.\fR
\fIA.out\fR is the (binary) output of the assembler, ready to be executed.
Thus if the three lines above were typed on the console,
\fIsource\fR would be assembled, the resulting program renamed \fItestprog\fR,
and \fItestprog\fR executed.
When the lines are in \fItryout\fR, the command
.dc
sh <tryout
.ec
would cause the Shell \fIsh\fR to execute the commands
sequentially.
.pg
The Shell has further capabilities, including the
ability to substitute parameters
and
to construct argument lists from a specified
subset of the file names in a directory.
It is also possible to
execute commands conditionally on character string comparisons
or on existence of given files
and to perform transfers of control
within filed command sequences.
.s2
6.5 Implementation of the Shell
.es
The outline of the operation of the Shell can now be understood.
Most of the time, the Shell
is waiting for the user to type a command.
When the
new-line character ending the line
is typed, the Shell's \fIread\fR call returns.
The Shell analyzes the command line, putting the
arguments in a form appropriate for \fIexecute\fR.
Then \fIfork\fR is called.
The child process, whose code
of course is still that of the Shell, attempts
to perform an \fIexecute\fR with the appropriate arguments.
If successful, this will bring in and start execution of the program whose name
was given.
Meanwhile, the other process resulting from the \fIfork\fR, which is the
parent process, \fIwait\|\fRs for the child process to die.
When this happens, the Shell knows the command is finished, so
it types its prompt and reads the typewriter to obtain another
command.
.pg
Given this framework, the implementation of background processes
is trivial; whenever a command line contains ``&'',
the Shell merely refrains from waiting for the process
which it created
to execute the command.
.pg
Happily, all of this mechanism meshes very nicely with
the notion of standard input and output files.
When a process is created by the \fIfork\fR primitive, it
inherits not only the core image of its parent
but also all the files currently open in its parent,
including those with file descriptors 0 and 1.
The Shell, of course, uses these files to read command
lines and to write its prompts and diagnostics, and in the ordinary case
its children_the command programs_inherit them automatically.
When an argument with ``<'' or ``>'' is given however, the
offspring process, just before it performs \fIexecute,\fR
makes the standard I/O
file descriptor 0 or 1 respectively refer to the named file.
This is easy
because, by agreement,
the smallest unused file descriptor is assigned
when a new file is \fIopen\fR\|ed
(or \fIcreate\fR\|d);
it is only necessary to close file 0 (or 1)
and open the named file.
Because the process in which the command program runs simply terminates
when it is through, the association between a file
specified after ``<'' or ``>'' and file descriptor 0 or 1 is ended
automatically when the process dies.
Therefore
the Shell need not know the actual names of the files
which are its own standard input and output, since it need
never reopen them.
.pg
Filters are straightforward extensions
of standard I/O redirection with pipes used
instead of files.
.pg
In ordinary circumstances, the main loop of the Shell never
terminates.
(The main loop includes that
branch of the return from \fIfork\fR belonging to the
parent process; that is, the branch which does a \fIwait\fR, then
reads another command line.)
The one thing which causes the Shell to terminate is
discovering an end-of-file condition on its input file.
Thus, when the Shell is executed as a command with
a given input file, as in
.dc
sh <comfile
.ec
the commands in \fIcomfile\fR will be executed until
the end of \fIcomfile\fR is reached; then the instance of the Shell
invoked by \fIsh\fR will terminate.
Since this Shell process
is the child of another instance of the Shell, the \fIwait\fR
executed in the latter will return, and another
command may be processed.
.s2
6.6 Initialization
.es
The instances of the Shell to which users type
commands are themselves children of another process.
The last step in the initialization of \*sUNIX\*n is the creation of
a single process and the invocation (via \fIexecute\fR)
of a program called \fIinit\fR.
The role of \fIinit\fR is to create one process
for each typewriter channel which may be dialed up by a user.
The various subinstances of \fIinit\fR open the appropriate typewriters
for input and output.
Since when \fIinit\fR was invoked there were
no files open, in each process the typewriter keyboard will
receive file descriptor 0 and the printer file descriptor 1.
Each process types out a message requesting that the user log in
and waits, reading the typewriter, for a reply.
At the outset, no one is logged in,
so each process simply hangs.
Finally someone types his name or other identification.
The appropriate instance of \fIinit\fR wakes up, receives the log-in
line, and reads a password file.
If the user name is found, and if
he is able to supply the correct password, \fIinit\fR
changes to the user's default current directory, sets
the process's user \*sID\*n to that of the person logging in, and performs
an \fIexecute\fR of the Shell.
At this point the Shell is ready to receive commands
and the logging-in protocol is complete.
.pg
Meanwhile, the mainstream path of \fIinit\fR (the parent of all
the subinstances of itself which will later become Shells)
does a \fIwait\fR.
If one of the child processes terminates, either
because a Shell found an end of file or because a user
typed an incorrect name or password, this path of \fIinit\fR
simply recreates the defunct process, which in turn reopens the appropriate
input and output files and types another login message.
Thus a user may log out simply by typing the end-of-file
sequence in place of a command to the Shell.
.s2
6.7 Other programs as Shell
.es
The Shell as described above is designed to allow users
full access to the facilities of the system, since it will
invoke the execution of any program
with appropriate protection mode.
Sometimes, however, a different interface to the system
is desirable, and this feature is easily arranged.
.pg
Recall that after a user has successfully logged in by supplying
his name and password, \fIinit\fR ordinarily invokes the Shell
to interpret command lines.
The user's entry
in the password file may contain the name
of a program to be invoked after login instead of the Shell.
This program is free to interpret the user's messages
in any way it wishes.
.pg
For example, the password file entries
for users of a secretarial editing system
specify that the
editor \fIed\fR is to be used instead of the Shell.
Thus when editing system users log in, they are inside the editor and
can begin work immediately; also, they can be prevented from
invoking \*sUNIX\*n programs not intended for their use.
In practice, it has proved desirable to allow a temporary
escape from the editor
to execute the formatting program and other utilities.
.pg
Several of the games (e.g., chess, blackjack, 3D tic-tac-toe)
available on \*sUNIX\*n illustrate
a much more severely restricted environment.
For each of these an entry exists
in the password file specifying that the appropriate game-playing
program is to be invoked instead of the Shell.
People who log in as a player
of one of the games find themselves limited to the
game and unable to investigate the presumably more interesting
offerings of \*sUNIX\*n as a whole.