.ds :? UNIX for Beginners
.de PT
.lt \\n(LLu
.pc %
.nr PN \\n%
.if \\n%-1 .if o .tl '\s9\f2\*(:?\fP''\\n(PN\s0'
.if \\n%-1 .if e .tl '\s9\\n(PN''\f2\*(:?\^\fP\s0'
.lt \\n(.lu
..
.tr |\(bv
.de IT
.if n .ul
\%\&\\$3\f2\\$1\fR\&\\$2
..
.de UL
.lg 0
.if n .ul
\%\&\\$3\f3\\$1\fR\&\\$2
.lg
..
.de UC
\\$3\s-1\\$1\s0\\$2
..
.de P1
.DS I 2m
.ss 18
\!.ss 18
.nf
.lg 0
.if n .ls 1
.if n .ta 5 10 15 20 25 30 35 40 45 50 55 60
.if t .ps -\\n(dP
.if t .vs -\\n(dP
.nr P \\n(.s
.nr S \\n(.s+1
.nr s \\n(.s-1
.nr t 5*33u	\" width in 9 point CW
.if t .ta 1u*\\ntu 2u*\\ntu 3u*\\ntu 4u*\\ntu 5u*\\ntu 6u*\\ntu 7u*\\ntu 8u*\\ntu 9u*\\ntu 10u*\\ntu 11u*\\ntu 12u*\\ntu 13u*\\ntu 14u*\\ntu
.ft 3
.tr _\(ul
.tr -\-
.lg 0
..
.de P2
.ps \\n(PS
.vs \\n(VSp
.nr P \\n(PS
.nr S \\n(PS+1
.nr s \\n(PS-1
.ft R
.if n .ls 2
.fi
\!.ss 12
.ss 12
.DE
.tr --
.tr ''
.lg
..
.nr PI 2m
.hy 14
'\"	ND October 2, 1978
'\"	.TR 75
'\"	.RP
.TL
U\s-2NIX\s+2 For Beginners\(emSecond Edition
.AU
Brian W. Kernighan
.AI
.MH
.AB
.nr PS 9
.if t .nr VS 11
.PP
This paper is meant to help
new users get started on
the
.UX
operating system.
It includes:
.IP "\ \(bu"
basics needed for day-to-day use of the system \(em
typing commands, correcting typing mistakes, logging in and out,
mail, inter-terminal communication,
the file system, printing files,
redirecting I/O, pipes, and the shell.
.IP "\ \(bu"
document preparation \(em
a brief discussion of the major formatting programs
and macro packages,
hints on preparing documents,
and capsule descriptions of some supporting software.
.IP "\ \(bu"
.UC UNIX
programming \(em
using the editor, programming the shell, programming in C,
other languages and tools.
.IP "\ \(bu"
An annotated
.UC UNIX
bibliography.
.AE
.if n .ls 2
.if t .2C
.nr PI 2m
.SH
INTRODUCTION
.PP
From the user's point of view,
the
.UC UNIX
operating system
is easy
to learn and use,
and presents few of the usual impediments
to getting the job done.
It is hard, however, for the beginner
to know where to start,
and how to make the best use 
of the facilities available.
The purpose of this introduction
is to help new users
get used to the main ideas of 
the
.UC UNIX
system
and start making effective use of it quickly.
.PP
You should have a couple of other documents with you
for easy reference as you read this one.
The most important is
.ul
The
.ul
.UC UNIX
.IT Programmer's
.IT Manual \|;
it's often easier to tell you to read about something
in the manual
than to repeat its contents here.
The other useful document is
.ul
A Tutorial Introduction to the
.ul
.UC UNIX
.ul
Text Editor,
which will tell you how to use the editor
to get text \(em
programs, data, documents \(em
into the computer.
.PP
A word of warning:
the
.UC UNIX
system
has become quite popular,
and there are several major variants
in widespread use.
Of course, details also change with time.
So although the basic structure of 
.UC UNIX
and how to use it is common to all versions,
there will certainly be a few things
which are different on your system from
what is described here.
We have tried to minimize the problem,
but be aware of it.
In cases of doubt,
this paper describes Version 7 
.UC UNIX .
.PP
This paper has five sections:
.IP "\ \ 1."
Getting Started:
How to log in,
how to type,
what to do about mistakes in typing,
how to log out.
Some of this is dependent on which
system
you log into
(phone numbers, for example)
and what terminal you use,
so this section must necessarily be supplemented
by local information.
.IP "\ \ 2."
Day-to-day Use:
Things you need every day to use
the system
effectively:
generally useful commands;
the file system.
.IP "\ \ 3."
Document Preparation:
Preparing manu\%scripts is one of the most common uses
for
.UC UNIX
systems.
This section contains advice,
but not
extensive instructions on any
of the formatting tools.
.IP "\ \ 4."
Writing Programs:
.UC UNIX
is an excellent system for developing programs.
This section talks about some of the tools,
but again is not a tutorial in any of the programming languages
provided by the system.
.IP "\ \ 5."
A
.UC UNIX
Reading List.
An annotated bibliography of 
documents that new users should be aware of.
.SH
I.  GETTING STARTED
.SH
Logging In
.PP
You must have a 
.UC UNIX
login name, which you can get from
whoever administers your system.
You also need to know the phone number,
unless your system uses permanently connected terminals.
The
.UC UNIX
system
is capable of dealing with a wide variety of terminals:
Terminet 300's; Execuport, TI and similar
portables;
video (CRT) terminals like the HP2640, etc.;
high-priced graphics terminals like the Tektronix 4014;
plotting terminals like those from GSI and DASI;
and even the venerable
Teletype in its various forms.
But note:
.UC UNIX
is strongly oriented towards devices with 
.ul
lower case.
If your terminal produces only upper case (e.g., model 33 Teletype, some video and portable terminals),
life will be so difficult that you should look for another
terminal.
.PP
Be sure to set the switches appropriately on your device.
Switches that might need to be adjusted include the speed,
upper/lower case mode,
full duplex, even parity, and any others
that local wisdom advises.
Establish a connection using whatever
magic is needed for your terminal;
this may involve dialing a telephone call or merely flipping a switch.
In either case,
.UC UNIX
should type
.UL login: '' ``
at you.
If it types garbage, you may be at the wrong speed;
check the switches.
If that fails,
push the ``break'' or ``interrupt'' key a few times, slowly.
If that fails to produce a login message, consult a guru.
.PP
When you get a
.UL login:
message,
type your
login name
.ul
in lower case.
Follow it by a 
.UC RETURN ;
the system will not do anything until you type a
.UC RETURN .
If a password is required,
you will be asked for it,
and (if possible)
printing will be turned off while you type it.
Don't forget
.UC RETURN .
.PP
The culmination of your login efforts is a
``prompt character,''
a single character that indicates that
the system
is ready to accept commands from you.
The prompt character is usually a 
dollar sign
.UL $
or a
percent sign
.UL % .
(You may also get a message of the day just before the
prompt character, or a notification that you have mail.)
.SH
Typing Commands
.PP
Once you've seen the prompt character, you can type commands,
which are
requests that
the system
do something.
Try typing
.P1
date
.P2
followed by 
.UC RETURN .
You should get back something like
.P1
Mon Jan 16 14:17:10 EST 1978
.P2
Don't forget the
.UC RETURN
after the command,
or nothing will happen.
If you think you're being ignored,
type a
.UC RETURN ;
something should happen.
.UC RETURN
won't be mentioned
again,
but don't forget it \(em
it has to be there
at the end of each line.
.PP
Another command you might try is
.UL who ,
which tells you everyone who is currently logged in:
.P1
who
.P2
gives something like
.P1
.ta .5i 1i
mb	tty01	Jan 16    09:11
ski	tty05	Jan 16    09:33
gam	tty11	Jan 16    13:07
.P2
The time is when the user logged in;
``ttyxx'' is the system's idea of what terminal
the user is on.
.PP
If you make a mistake typing the command name,
and refer to a non-existent command,
you will be told.
For example, if you type
.P1
whom
.P2
you will be told 
.P1
whom: not found
.P2
Of course, if you inadvertently type the name of some other command,
it will run,
with more or less mysterious results.
.SH
Strange Terminal Behavior
.PP
Sometimes you can get into a state
where your terminal acts strangely.
For example,
each letter may be typed twice,
or the
.UC RETURN
may not cause a line feed
or a return to the left margin.
You can often fix this by logging out and logging back in.
Or you can read the description of the command
.UL stty
in section 1 of the manual.
To get intelligent treatment of
tab characters
(which are much used in
.UC UNIX )
if your terminal doesn't have tabs,
type the command
.P1
stty \-tabs
.P2
and the system will convert each tab into the right number
of blanks for you.
If your terminal does have computer-settable tabs,
the command
.UL tabs
will set the stops correctly for you.
.SH
Mistakes in Typing
.PP
If you make a typing mistake, and see it before
.UC RETURN
has been typed,
there are two ways to recover.
The sharp-character
.UL #
erases the last character typed;
in fact successive uses of
.UL #
erase characters back to
the beginning of the line (but not beyond).
So if you type badly, you can correct as you go:
.P1
dd#atte##e
.P2
is the same as
.UL date .
.PP
The at-sign
.UL @
erases all of the characters
typed so far
on the current input line,
so if the line is irretrievably fouled up, type an
.UL @
and start the line over.
.PP
What if you must enter a sharp or at-sign
as part of the text?
If you precede either
.UL #
or
.UL @
by a backslash
.UL \e ,
it loses its erase meaning.
So to enter a sharp or at-sign in something, type
.UL \e# 
or
.UL \e@ .
The system will always echo a newline at you after your at-sign,
even if preceded by a backslash.
Don't worry \(em
the at-sign has been recorded.
.PP
To erase a backslash,
you have to type two sharps or two at-signs, as in
.UL \e## .
The backslash is used extensively in
.UC UNIX
to indicate that the following character is in some way special.
.SH
Read-ahead
.PP
.UC UNIX
has full read-ahead,
which means that you can type as fast as you want,
whenever you want,
even when some command is typing at you.
If you type during output,
your input characters will appear intermixed with the output characters,
but they will be stored away
and interpreted in the correct order.
So you can type several commands one after another without
waiting for the first to finish or even begin.
.SH
Stopping a Program
.PP
You can stop most programs by
typing the character
.UC DEL '' ``
(perhaps called ``delete'' or ``rubout'' on your terminal).
The ``interrupt'' or ``break'' key found on most terminals
can also be used.
In a few programs, like the text editor,
.UC DEL
stops whatever the program is doing but leaves you in that program.
Hanging up the phone will stop most programs.
.SH
Logging Out
.PP
The easiest way to log out is to hang up the phone.
You can also type
.P1
login
.P2
and let someone else use the terminal you were on.
It is usually not sufficient just to turn off the terminal.
Most
.UC UNIX
systems
do not use a time-out mechanism, so you'll be
there forever unless you hang up.
.SH
Mail
.PP
When you log in, you may sometimes get the message
.P1
You have mail.
.P2
.UC UNIX
provides a postal system so you can
communicate with
other users of the system.
To read your mail,
type the command
.P1
mail
.P2
Your mail will be printed,
one message at a time,
most recent message first.
After each message,
.UL mail
waits for you to say what to do with it.
The two basic responses are
.UL d ,
which deletes the message,
and
.UC RETURN ,
which does not
(so it will still be there the next time you read your mailbox).
Other responses are described in the manual.
(Earlier versions of
.UL mail
do not process one message at a time,
but are otherwise similar.)
.PP
How do you send mail to someone else?
Suppose it is to go to ``joe'' (assuming ``joe'' is someone's login name).
The easiest way is this:
.P1
mail joe
.ft I
now type in the text of the letter
on as many lines as you like ...
After the last line of the letter
type the character ``control-d'',
that is, hold down ``control'' and type
a letter ``d''.
.P2
And that's it.
The ``control-d'' sequence, often called ``EOF'' for end-of-file, is used throughout 
the system
to mark the end of input from a terminal,
so you might as well get used to it.
.PP
For practice, send mail to yourself.
(This isn't as strange as it might sound \(em
mail to oneself is a handy reminder mechanism.)
.PP
There are other ways to send mail \(em
you can send a previously prepared letter,
and you can mail to a number of people all at once.
For more details see
.UL mail (1).
(The notation
.UL mail (1)
means the command 
.UL mail
in section 1
of the
.ul
.UC UNIX
.ul
.IT Programmer's
.IT Manual .)
.SH
Writing to other users
.PP
At some point, 
out of the blue will come a message
like
.P1
Message from joe tty07...
.P2
accompanied by a startling beep.
It means that Joe wants to talk to you,
but unless you take explicit action you won't be able to talk back.
To respond,
type the command
.P1
write joe
.P2
This establishes a two-way communication path.
Now whatever Joe types on his terminal will appear on yours
and vice versa.
The path is slow, rather like talking to the moon.
(If you are in the middle of something, you have to
get to a state where you can type a command.
Normally, whatever program you are running has to terminate or be terminated.
If you're editing, you can escape temporarily from the editor \(em
read the editor tutorial.)
.PP
A protocol is needed to keep what you type from getting
garbled up with what Joe types. 
Typically it's like this:
.P1
.tr --
.fi
.ft R
Joe types
.UL write
.UL smith
and waits.
.br
Smith types
.UL write
.UL joe
and waits.
.br
Joe now types his message
(as many lines as he likes).
When he's ready for a reply, he
signals it by typing
.UL (o) ,
which
stands for ``over''.
.br
Now Smith types a reply, also
terminated by
.UL (o) .
.br
This cycle repeats until
someone gets tired; he then
signals his intent to quit with
.UL (oo) ,
for ``over
and out''.
.br
To terminate
the conversation, each side must
type a ``control-d'' character alone
on a line. (``Delete'' also works.)
When the other person types his ``control-d'',
you will get the message
.UL EOF
on your terminal.
.P2
.PP
If you write to someone who isn't logged in,
or who doesn't want to be disturbed,
you'll be told.
If the target is logged in but doesn't answer
after a decent interval,
simply type ``control-d''.
.SH
On-line Manual
.PP
The 
.ul
.UC UNIX
.ul
Programmer's Manual
is typically kept on-line.
If you get stuck on something,
and can't find an expert to assist you,
you can print on your terminal some manual section that might help.
This is also useful for getting the most up-to-date
information on a command.
To print a manual section, type
``man command-name''.
Thus to read up on the 
.UL who
command,
type
.P1
man who
.P2
and, of course,
.P1
man man
.P2
tells all about the
.UL man
command.
.SH
Computer Aided Instruction
.PP
Your
.UC UNIX
system may have available
a program called
.UL learn ,
which provides computer aided instruction on
the file system and basic commands,
the editor,
document preparation,
and even C programming.
Try typing the command
.P1
learn
.P2
If 
.UL learn
exists on your system,
it will tell you what to do from there.
.SH
II.  DAY-TO-DAY USE
.SH
Creating Files \(em The Editor
.PP
If you have to type a paper or a letter or a program,
how do you get the information stored in the machine?
Most of these tasks are done with
the
.UC UNIX
``text editor''
.UL ed .
Since
.UL ed
is thoroughly documented in 
.UL ed (1) 
and explained in
.ul
A Tutorial Introduction to the UNIX Text Editor,
we won't spend any time here describing how to use it.
All we want it for right now is to make some
.ul
files.
(A file is just a collection of information stored in the machine,
a simplistic but adequate definition.)
.PP
To create a file 
called
.UL junk
with some text in it, do the following:
.P1
.ta .65i
ed junk	\fR(invokes the text editor)\f3
a	\fR(command to ``ed'', to add text)\f3
.ft I
now type in
whatever text you want ...
.ft 3
\&.	\fR(signals the end of adding text)\f3
.P2
The ``\f3.\fR'' that signals the end of adding text must be
at the beginning of a line by itself.
Don't forget it,
for until it is typed,
no other
.UL ed
commands will be recognized \(em
everything you type will be treated as text to be added.
.PP
At this point you can do various editing operations
on the text you typed in, such as correcting spelling mistakes,
rearranging paragraphs and the like.
Finally, you must write the information you have typed
into a file with the editor command
.UL w :
.P1
w
.P2
.UL ed
will respond with the number of characters it wrote
into the file 
.UL junk .
.PP
Until the
.UL w
command,
nothing is stored permanently,
so if you hang up and go home
the information is lost.\(dg
.FS
\(dg This is not strictly true \(em
if you hang up while editing, the data you were
working on is saved in a file called
.UL ed.hup ,
which you can continue with at your next session.
.FE
But after
.UL w
the information is there permanently;
you can re-access it any time by typing
.P1
ed junk
.P2
Type a
.UL q
command
to quit the editor.
(If you try to quit without writing,
.UL ed
will print a
.UL ?
to remind you.
A second
.UL q
gets you out regardless.)
.PP
Now create a second file called 
.UL temp
in the same manner.
You should now have two files,
.UL junk
and
.UL temp .
.SH
What files are out there?
.PP
The
.UL ls
(for ``list'') command lists the names
(not contents)
of any of the files that
.UC UNIX
knows about.
If you type
.P1
ls
.P2
the response will be
.P1
junk
temp
.P2
which are indeed the two files just created.
The names are sorted into alphabetical order automatically,
but other variations are possible.
For example,
the command
.P1
ls -t
.P2
causes the files to be listed in the order in which they were last changed,
most recent first.
The
.UL \-l
option gives a ``long'' listing:
.P1
ls -l
.P2
will produce something like
.P1
-rw-rw-rw- 1 bwk 41 Jul 22 2:56 junk
-rw-rw-rw- 1 bwk 78 Jul 22 2:57 temp
.P2
The date and time are of the last change to the file.
The 41 and 78 are the number of characters
(which should agree with the numbers you got from
.UL ed ).
.UL bwk
is the owner of the file, that is, the person
who created it.
The
.UL \-rw\-rw\-rw\- 
tells who has permission to read and write the file,
in this case everyone.
.PP
Options can be combined:
.UL ls\ \-lt
gives the same thing as
.UL ls\ \-l ,
but sorted into time order.
You can also name the files you're interested in,
and 
.UL ls
will list the information about them only.
More details can be found in 
.UL ls (1).
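.PP
As a small illustration of naming a file
(the listing repeats the sample values shown above,
so yours will differ),
.P1
ls -l junk
.P2
should produce just the one line
.P1
-rw-rw-rw- 1 bwk 41 Jul 22 2:56 junk
.P2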
.PP
The use of optional arguments that begin with a minus sign,
like
.UL \-t
and
.UL \-lt ,
is a common convention for
.UC UNIX
programs.
In general, if a program accepts such optional arguments,
they precede any filename arguments.
It is also vital that you separate the various arguments with spaces:
.UL ls\-l
is not the same as
.UL ls\ \ \-l .
.SH
Printing Files
.PP
Now that you've got a file of text,
how do you print it so people can look at it?
There are a host of programs that do that,
probably more than are needed.
.PP
One simple thing is to use the editor,
since printing is often done just before making changes anyway.
You can say
.P1
ed junk
1,$p
.P2
.UL ed
will reply with the count of the characters in 
.UL junk
and then print all the lines in the file.
After you learn how to use the editor,
you can be selective about the parts you print.
.PP
There are times when it's not feasible to use the editor for printing.
For example, there is a limit on how big a file
.UL ed
can handle
(several thousand lines).
Secondly, 
it
will only print one file at a time,
and sometimes you want to print several, one after another.
So here are a couple of alternatives.
.PP
First is
.UL cat ,
the simplest of all the printing programs.
.UL cat
simply prints on the terminal the contents of all the files
named in a list.
Thus
.P1
cat junk
.P2
prints one file, and
.P1
cat junk temp
.P2
prints two.
The files are simply concatenated (hence the name
.UL cat '') ``
onto the terminal.
.PP
.UL pr
produces formatted printouts of files.
As with 
.UL cat ,
.UL pr
prints all the files named in a list.
The difference is that it produces 
headings with date, time, page number and file name
at the top of each page,
and
extra lines to skip over the fold in the paper.
Thus,
.P1
pr junk temp
.P2
will print
.UL junk
neatly,
then skip to the top of a new page and print
.UL temp
neatly.
.PP
.UL pr
can also produce multi-column output:
.P1
pr -3 junk 
.P2
prints
.UL junk
in 3-column format.
You can use any reasonable number in place of ``3''
and 
.UL pr
will do its best.
.UL pr
has other capabilities as well;
see
.UL pr (1).
.PP
Note that
.UL pr
is
.ul
not
a formatting program in the sense of shuffling lines around
and justifying margins.
The true formatters are
.UL nroff
and
.UL troff ,
which we will get to in the section on document preparation.
.PP
There are also programs that print files
on a high-speed printer.
Look in your manual under
.UL opr
and
.UL lpr .
Which to use depends on
what equipment is attached to your machine.
.SH
Shuffling Files About
.PP
Now that you have some files in the file system
and some experience in printing them,
you can try bigger things.
For example,
you can move a file from one place to another
(which amounts to giving it a new name),
like this:
.P1
mv junk precious
.P2
This means that what used to be ``junk'' is now ``precious''.
If you do an
.UL ls
command now,
you will get
.P1
precious
temp
.P2
Beware: if you move a file onto a name
that already exists,
the contents of the existing file are lost forever.
.PP
If you want
to make a
.ul
copy
of a file (that is, to have two versions of something),
you can use the 
.UL cp
command:
.P1
cp precious temp1
.P2
makes a duplicate copy of 
.UL precious
in
.UL temp1 .
.PP
Finally, when you get tired of creating and moving
files,
there is a command to remove files from the file system,
called
.UL rm .
.P1
rm temp temp1
.P2
will remove both of the files named.
.PP
You will get a warning message if one of the named files wasn't there,
but otherwise
.UL rm ,
like most
.UC UNIX
commands,
does its work silently.
There is no prompting or chatter,
and error messages are occasionally curt.
This terseness is sometimes disconcerting
to new\%comers,
but experienced users find it desirable.
.SH
What's in a Filename
.PP
So far we have used filenames without ever saying what's
a legal name,
so it's time for a couple of rules.
First, filenames are limited to 14 characters,
which is enough to be descriptive.
Second, although you can use almost any character
in a filename,
common sense says you should stick to ones that are visible,
and that you should probably avoid characters that might be used
with other meanings.
We have already seen, for example,
that in the
.UL ls
command,
.UL ls\ \-t
means to list in time order.
So if you had a file whose name
was
.UL \-t ,
you would have a tough time listing it by name.
Besides the minus sign, there are other characters which
have special meaning.
To avoid pitfalls,
you would do well to 
use only letters, numbers and the period
until you're familiar with the situation.
.PP
On to some more positive suggestions.
Suppose you're typing a large document
like a book.
Logically this divides into many small pieces,
like chapters and perhaps sections.
Physically it must be divided too,
for 
.UL ed
will not handle really big files.
Thus you should type the document as a number of files.
You might have a separate file for each chapter,
called
.P1
chap1
chap2
.ft R
etc...
.P2
Or, if each chapter were broken into several files, you might have
.P1
chap1.1
chap1.2
chap1.3
\&...
chap2.1
chap2.2
\&...
.P2
You can now tell at a glance where a particular file fits into the whole.
.PP
There are advantages to a systematic naming convention which are not obvious
to the novice
.UC UNIX 
user.
What if you wanted to print the whole book?
You could say
.P1
pr chap1.1 chap1.2 chap1.3 ......
.P2
but you would get tired pretty fast, and would probably even make mistakes.
Fortunately, there is a shortcut.
You can say
.P1
pr chap*
.P2
The
.UL *
means ``anything at all,''
so this translates into ``print all files
whose names begin with 
.UL chap '',
listed in alphabetical order.
.PP
This shorthand notation
is not a property of the
.UL pr
command, by the way.
It is system-wide, a service of the program
that interprets commands
(the ``shell,''
.UL sh (1)).
Using that fact, you can see how to list the names of the files in the book:
.P1
ls chap*
.P2
produces
.P1
chap1.1
chap1.2
chap1.3
\&...
.P2
The
.UL *
is not limited to the last position in a filename \(em
it can be anywhere
and can occur several times.
Thus
.P1
rm *junk* *temp*
.P2
removes all files that contain
.UL junk
or
.UL temp
as any part of their name.
As a special case,
.UL *
by itself matches every filename,
so
.P1
pr *
.P2
prints all your files
(alphabetical order),
and
.P1
rm *
.P2
removes
.ul
all files.
(You had better be
.IT  very 
sure that's what you wanted to say!)
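.PP
Because it is the shell, not the command, that expands these patterns,
one cautious habit is to preview a pattern with
.UL echo ,
which merely prints its arguments,
before giving the same pattern to
.UL rm .
As a sketch, suppose your directory holds only the files
.UL junk ,
.UL myjunk.old
and
.UL temp1
(names invented here for illustration).
Then
.P1
echo *junk* *temp*
.P2
prints
.P1
junk myjunk.old temp1
.P2
which is exactly the list of names that
.UL rm
would have been given.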
.PP
The
.UL *
is not 
the only pattern-matching feature available.
Suppose you want to print only chapters 1 through 4 and 9.
Then you can say
.P1
pr chap[12349]*
.P2
The
.UL [...]
means to match any of the characters inside the brackets.
A range of consecutive letters or digits can be abbreviated,
so you can also do this 
with
.P1
pr chap[1-49]*
.P2
Letters can also be used within brackets:
.UL [a\-z]
matches any character in the range
.UL a
through
.UL z .
.PP
The
.UL ?
pattern matches any single character,
so
.P1
ls ?
.P2
lists all files which have single-character names,
and
.P1
ls -l chap?.1
.P2
lists information about the first file of each chapter
.UL chap1.1 \&, (
.UL chap2.1 ,
etc.).
.PP
Of these niceties,
.UL *
is certainly the most useful,
and you should get used to it.
The others are frills, but worth knowing.
.PP
If you should ever have to turn off the special meaning
of
.UL * ,
.UL ? ,
etc.,
enclose the entire argument in single quotes,
as in
.P1
ls \(fm?\(fm
.P2
We'll see some more examples of this shortly.
.SH
What's in a Filename, Continued
.PP
When you first made that file called
.UL junk ,
how did 
the system
know that there wasn't another
.UL junk
somewhere else,
especially since the person in the next office is also
reading this tutorial?
The answer is that generally each user 
has a private
.IT directory ,
which contains only the files that belong to him.
When you log in, you are ``in'' your directory.
Unless you take special action,
when you create a new file,
it is made in the directory that you are currently in;
this is most often your own directory,
and thus the file is unrelated to any other file of the same name
that might exist in someone else's directory.
.PP
The set of all files
is organized into a (usually big) tree,
with your files located several branches into the tree.
It is possible for you to ``walk'' around this tree,
and to find any file in the system, by starting at the root
of the tree and walking along the proper set of branches.
Conversely, you can start where you are and walk toward the root.
.PP
Let's try the latter first.
The basic tool is the command
.UL pwd
(``print working directory''),
which prints the name of the directory you are currently in.
.PP
Although the details will vary according to the system you are on,
if you give the
command
.UL pwd ,
it will print something like
.P1
/usr/your\(hyname
.P2
This says that you are currently in the directory
.UL your-name ,
which is in turn in the directory
.UL /usr ,
which is in turn in the root directory
called by convention just
.UL / .
(Even if it's not called
.UL /usr
on your system,
you will get something analogous.
Make the corresponding changes and read on.)
.PP
If you now type
.P1
ls /usr/your\(hyname
.P2
you should get exactly the same list of file names
as you get from a plain
.UL ls  :
with no arguments,
.UL ls
lists the contents of the current directory;
given the name of a directory,
it lists the contents of that directory.
.PP
Next, try
.P1
ls /usr
.P2
This should print a long series of names,
among which is your own login name
.UL your-name .
On many systems, 
.UL usr
is a directory that contains the directories
of all the normal users of the system,
like you.
.PP
The next step is to try
.P1
ls /
.P2
You should get a response something like this
(although again the details may be different):
.P1
bin
dev
etc
lib
tmp
usr
.P2
This is a collection of the basic directories of files
that
the system
knows about;
we are at the root of the tree.
.PP
Now try
.P1
cat /usr/your\(hyname/junk
.P2
(if
.UL junk
is still around in your directory).
The name
.P1
/usr/your\(hyname/junk
.P2
is called the
.UL pathname
of the file that
you normally think of as ``junk''.
``Pathname'' has an obvious meaning:
it represents the full name of the path you have to follow from the root
through the tree of directories to get to a particular file.
It is a universal rule in
the
.UC UNIX
system
that anywhere you can use an ordinary filename,
you can use a pathname.
.PP
Here is a picture which may make this clearer:
.P1 1
.ft R
.if t .vs 9p
.if t .tr /\(sl
.if t .tr ||
.ss 12
.ce 100
(root)
/ | \e
/  |  \e
/   |   \e
  bin    etc    usr    dev   tmp 
/ | \e   / | \e   / | \e   / | \e   / | \e
/  |  \e
/   |   \e
adam  eve   mary
/        /   \e        \e
             /     \e       junk
junk temp
.ce 0
.br
.tr //
.P2
.LP
Notice that Mary's
.UL junk
is unrelated to Eve's.
.PP
This isn't too exciting if all the files of interest are in your own
directory, but if you work with someone else
or on several projects concurrently,
it becomes handy indeed.
For example, your friends can print your book by saying
.P1
pr /usr/your\(hyname/chap*
.P2
Similarly, you can find out what files your neighbor has
by saying
.P1
ls /usr/neighbor\(hyname
.P2
or make your own copy of one of his files by
.P1
cp /usr/neighbor\(hyname/his\(hyfile yourfile
.P2
.PP
If your neighbor doesn't want you poking around in his files,
or vice versa,
privacy can be arranged.
Each file and directory has read-write-execute permissions for the owner,
a group, and everyone else,
which can be set
to control access.
See
.UL ls (1)
and
.UL chmod (1)
for details.
As a matter of observed fact,
most users most of the time find openness of more
benefit than privacy.
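.PP
As a minimal sketch of arranging such privacy
(see
.UL chmod (1)
for what your system actually accepts),
the command
.P1
chmod 600 junk
.P2
leaves
.UL junk
readable and writable by its owner alone,
so that a subsequent
.UL ls\ \-l\ junk
would show something like
.P1
-rw------- 1 bwk 41 Jul 22 2:56 junk
.P2
Here the mode
.UL 600
and the sample listing are illustrative assumptions,
not output from a real session.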
.PP
As a final experiment with pathnames, try
.P1
ls /bin /usr/bin
.P2
Do some of the names look familiar?
When you run a program, by typing its name after the prompt character,
the system simply looks for a file of that name.
It normally looks first in your directory
(where it typically doesn't find it),
then in
.UL /bin
and finally in
.UL /usr/bin .
There is nothing magic about commands like
.UL cat
or
.UL ls ,
except that they have been collected into a couple of places to be easy to find and administer.
.PP
What if you work regularly with someone else on common information
in his directory?
You could just log in as your friend each time you want to,
but you can also say
``I want to work on his files instead of my own''.
This is done by changing the directory that you are
currently in:
.P1
cd /usr/your\(hyfriend
.P2
(On some systems,
.UL cd
is spelled
.UL chdir .)
Now when you use a filename in something like
.UL cat
or
.UL pr ,
it refers to the file in your friend's directory.
Changing directories doesn't affect any permissions associated
with a file \(em
if you couldn't access a file from your own directory,
changing to another directory won't alter that fact.
Of course,
if you forget what directory you're in, type
.P1
pwd
.P2
to find out.
.PP
It is usually convenient to arrange your own files
so that all the files related to one thing are in a directory separate
from other projects.
For example, when you write your book, you might want to keep all the text
in a directory called
.UL book .
So make one with
.P1
mkdir book
.P2
then go to it with
.P1
cd book
.P2
then start typing chapters.
The book is now found in (presumably)
.P1
/usr/your\(hyname/book
.P2
To remove the directory
.UL book ,
type
.P1
rm book/*
rmdir book
.P2
The first command removes all files from the directory;
the second
removes the empty directory.
.PP
You can go up one level in the tree of files 
by saying
.P1
cd ..
.P2
.UL .. '' ``
is the name of the parent of whatever directory you are currently in.
For completeness,
.UL . '' ``
is an alternate name
for the directory you are in.
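.PP
A short sketch ties these together.
Starting from your own directory,
and assuming the directory
.UL book
from the earlier example still exists
(make it again with
.UL mkdir
if you removed it),
the commands
.P1
cd book
pwd
cd ..
pwd
.P2
should print something like
.P1
/usr/your\(hyname/book
/usr/your\(hyname
.P2
showing that
.UL cd\ ..
has returned you to the directory you started from.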
.SH
Using Files instead of the Terminal
.PP
Most of the commands we have seen so far produce output
on the terminal;
some, like the editor, also take their input from the terminal.
It is universal in
.UC UNIX
systems
that the terminal can be replaced by a file
for either or both of input and output.
As one example,
.P1
ls
.P2
makes a list of files on your terminal.
But if you say
.P1
ls >filelist
.P2
a list of your files will be placed in the file
.UL filelist
(which
will be created if it doesn't already exist,
or overwritten if it does).
The symbol
.UL >
means ``put the output on the following file,
rather than on the terminal.''
Nothing is produced on the terminal.
As another example, you could combine
several files into one by capturing the output of
.UL cat
in a file:
.P1
cat f1 f2 f3 >temp
.P2
.PP
The symbol
.UL >>
operates very much like
.UL >
does,
except that it means
``add to the end of.''
That is,
.P1
cat f1 f2 f3 >>temp
.P2
means to concatenate
.UL f1 ,
.UL f2 
and
.UL f3
to the end of whatever is already in
.UL temp ,
instead of overwriting the existing contents.
As with
.UL > ,
if 
.UL temp
doesn't exist, it will be created for you.
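.PP
As a minimal sketch of the difference,
assuming the
.UL echo
command (which simply prints its arguments)
and a scratch file that we will call
.UL temp :
.P1
echo one >temp
echo two >>temp
cat temp
echo three >temp
cat temp
.P2
The first
.UL cat
prints two lines,
.UL one
and
.UL two ;
the second prints only
.UL three ,
because
.UL >
threw away the old contents.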
.PP
In a similar way, the symbol
.UL <
means to take the input
for a program from the following file,
instead of from the terminal.
Thus, you could make up a script of commonly used editing commands
and put them into a file called
.UL script .
Then you can run the script on a file by saying
.P1
ed file <script
.P2
As another example, you can use
.UL ed
to prepare a letter in file
.UL let ,
then send it to several people with
.P1
mail adam eve mary joe <let
.P2
.SH
Pipes
.PP
One of the novel contributions of
the
.UC UNIX
system
is the idea of a
.ul
pipe.
A pipe is simply a way to connect the output of one program
to the input of another program,
so the two run as a sequence of processes \(em
a pipeline.
.PP
For example,
.P1
pr f g h
.P2
will print the files
.UL f ,
.UL g ,
and
.UL h ,
beginning each on a new page.
Suppose you want
them run together instead.
You could say
.P1
cat f g h >temp
pr <temp
rm temp
.P2
but this is more work than necessary.
Clearly what we want is to take the output of
.UL cat
and
connect it to the input of
.UL pr .
So let us use a pipe:
.P1
cat f g h | pr
.P2
The vertical bar 
.UL |
means to
take the output from
.UL cat ,
which would normally have gone to the terminal,
and put it into
.UL pr
to be neatly formatted.
.PP
There are many other examples of pipes.
For example,
.P1
ls | pr -3
.P2
prints a list of your files in three columns.
The program
.UL wc
counts the number of lines, words and characters in
its input, and as we saw earlier,
.UL who
prints a list of currently logged-on people,
one per line.
Thus
.P1
who | wc
.P2
tells how many people are logged on.
And of course
.P1
ls | wc
.P2
counts your files.
.PP
Any program
that reads from the terminal
can read from a pipe instead;
any program that writes on the terminal can drive
a pipe.
You can have as many elements in a pipeline as you wish.
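.PP
For instance, here is a sketch of a four-stage pipeline.
The file names and their contents are made up for illustration,
and we assume that
.UL echo
(which prints its arguments),
.UL sort
and
.UL uniq
(which put lines in order and discard adjacent duplicates)
are available on your system,
and that
.UL wc
accepts the
.UL \-l
option to count lines only:
.P1
echo pear >f
echo apple >g
echo pear >h
cat f g h | sort | uniq | wc -l
.P2
Since only two different lines appear in the three files,
the count printed is 2.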
.PP
Many
.UC UNIX
programs are written so that they will take their input from one or more files
if file arguments are given;
if no arguments are given they will read from the terminal,
and thus can be used in pipelines.
.UL pr
is one example:
.P1
pr -3 a b c
.P2
prints files
.UL a ,
.UL b
and
.UL c
in order in three columns.
But in
.P1
cat a b c | pr -3
.P2
.UL pr
prints the information coming down the pipeline,
still in
three columns.
.SH
The Shell
.PP
We have already mentioned once or twice the mysterious
``shell,''
which is in fact
.UL sh (1).
The shell is the program that interprets what you type as
commands and arguments.
It also looks after translating
.UL * ,
etc.,
into lists of filenames,
and
.UL < ,
.UL > ,
and
.UL |
into changes of input and output streams.
.PP
The shell has other capabilities too.
For example, you can run two programs with one command line
by separating the commands with a semicolon;
the shell recognizes the semicolon and
breaks the line into two commands.
Thus
.P1
date; who
.P2
does both commands before returning with a prompt character.
.PP
You can also have more than one program running
.ul
simultaneously
if you wish.
For example, if you are doing something time-consuming,
like the editor script
of an earlier section,
and you don't want to wait around for the results before starting something else,
you can say
.P1
ed file <script &
.P2
The ampersand at the end of a command line
says ``start this command running,
then take further commands from the terminal immediately,''
that is,
don't wait for it to complete.
Thus the script will begin,
but you can do something else at the same time.
Of course, to keep the output from interfering
with what you're doing on the terminal,
it would be better to say
.P1
ed file <script >script.out &
.P2
which saves the output lines in a file
called
.UL script.out .
.PP
When you initiate a command with
.UL & ,
the system
replies with a number
called the process number,
which identifies the command in case you later want
to stop it.
If you do, you can say
.P1
kill process\(hynumber
.P2
If you forget the process number,
the command
.UL ps
will tell you about everything you have running.
(If you are desperate,
.UL kill\ 0
will kill all your processes.)
And if you're curious about other people,
.UL ps\ a
will tell you about
.ul
all
programs that are currently running.
.PP
You can say
.P1 1
(command\(hy1; command\(hy2; command\(hy3) &
.P2
to start three commands in the background,
or you can start a background pipeline with
.P1
command\(hy1 | command\(hy2 &
.P2
.PP
Just as you can tell the editor
or some similar program to take its input
from a file instead of from the terminal,
you can tell the shell to read a file
to get commands.
(Why not? The shell, after all, is just a program,
albeit a clever one.)
For instance, suppose you want to set tabs on
your terminal, and find out the date
and who's on the system every time you log in.
Then you can put the three necessary commands
.UL tabs , (
.UL date ,
.UL who )
into a file, let's call it
.UL startup ,
and then run it with
.P1
sh startup
.P2
This says to run the shell with the file
.UL startup
as input.
The effect is as if you had typed 
the contents of
.UL startup
on the terminal.
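.PP
As a sketch, such a file could be built and run like this.
We leave out
.UL tabs ,
whose arguments depend on your terminal,
and assume the
.UL echo
command, which simply prints its arguments:
.P1
echo date >startup
echo who >>startup
sh startup
.P2
The first two lines put one command each into
.UL startup ;
the last runs them in order.
In practice you would more likely create the file with the editor.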
.PP
If this is to be a regular thing,
you can eliminate the 
need to type
.UL sh :
simply type, once only, the command
.P1
chmod +x startup
.P2
and thereafter you need only say
.P1
startup
.P2
to run the sequence of commands.
The
.UL chmod (1)
command marks the file executable;
the shell recognizes this and runs it as a sequence of commands.
.PP
If you want 
.UL startup
to run automatically every time you log in,
create a file in your login directory called
.UL .profile ,
and place in it the line
.UL startup .
When the shell first gains control after you log in,
it looks for the 
.UL .profile
file and does whatever commands it finds in it.
We'll get back to the shell in the section
on programming.
.sp
.SH
III. DOCUMENT PREPARATION
.PP
.UC UNIX
systems are used extensively for document preparation.
There are two major 
formatting
programs,
that is,
programs that produce a text with
justified right margins, automatic page numbering and titling,
automatic hyphenation,
and the like.
.UL nroff
is designed to produce output on terminals and
line-printers.
.UL troff
(pronounced ``tee-roff'')
instead drives a phototypesetter,
which produces very high quality output
on photographic paper.
This paper was formatted with
.UL troff .
.SH
Formatting Packages
.PP
The basic idea of
.UL nroff 
and 
.UL troff
is that the text to be formatted contains within it
``formatting commands'' that indicate in detail
how the formatted text is to look.
For example, there might be commands that specify how long
lines are, whether to use single or double spacing,
and what running titles to use on each page.
.PP
Because
.UL nroff
and
.UL troff
are relatively hard to learn to use effectively,
several
``packages'' of canned formatting requests are available
to let you specify
paragraphs, running titles, footnotes, multi-column output,
and so on, with little effort
and without having to learn
.UL nroff
and
.UL troff .
These packages take a modest effort to learn,
but the rewards for using them are so great
that it is time well spent.
.PP
In this section,
we will provide a hasty look at the ``manuscript'' 
package known as
.UL \-ms .
Formatting requests typically consist of a period and two upper-case letters,
such as
.UL .TL ,
which is used to introduce a title,
or
.UL .PP
to begin a new paragraph.
.PP
A document is typed so it looks something like this:
.P1
\&.TL
title of document
\&.AU
author name
\&.SH
section heading
\&.PP
paragraph ...
\&.PP
another paragraph ...
\&.SH
another section heading
\&.PP
etc.
.P2
The lines that begin with a period are the formatting requests.
For example,
.UL .PP
calls for starting a new paragraph.
The precise meaning of
.UL .PP
depends on what output device is being used
(typesetter or terminal, for instance),
and on what publication the document will appear in.
For example,
.UL \-ms
normally assumes that a paragraph is preceded by a space
(one line in
.UL nroff ,
\(12 line in
.UL troff ),
and the first word is indented.
These rules can be changed if you like,
but they are changed by changing the interpretation
of
.UL .PP ,
not by re-typing the document.
.PP
To actually produce a document in standard format
using
.UL \-ms ,
use the command
.P1
troff -ms files ...
.P2
for the typesetter, and
.P1
nroff -ms files ...
.P2
for a terminal.
The
.UL \-ms
argument tells
.UL troff
and
.UL nroff
to use the manuscript package of formatting requests.
.PP
There are several similar packages;
check with a local expert to determine which ones
are in common use on your machine.
.SH
Supporting Tools
.PP
In addition to the basic formatters,
there is
a host of supporting programs
that help with document preparation.
The list in the next few paragraphs
is far from complete,
so browse through the manual
and check with people around you for other possibilities.
.PP
.UL eqn
and
.UL neqn
let you integrate mathematics
into the text of a document,
in an easy-to-learn language that closely resembles the way
you would speak it aloud.
For example, the
.UL eqn
input
.P1
sum from i=0 to n x sub i ~=~ pi over 2
.P2
produces the output
.EQ
sum from i=0 to n x sub i ~=~ pi over 2
.EN
.PP
The program
.UL tbl
provides an analogous service for preparing tabular material;
it does all the computations necessary to align complicated columns
with elements of varying widths.
.PP
.UL refer
prepares bibliographic citations from a data base,
in whatever style is defined by the formatting package.
It looks after all the details of numbering references in sequence,
filling in page and volume numbers,
getting the author's initials and the journal name right,
and so on.
.PP
.UL spell
and
.UL typo
detect possible spelling mistakes in a document.
.UL spell
works by comparing the words in your document
to a dictionary,
printing those that are not in the dictionary.
It knows enough about English spelling to detect plurals and the like,
so it does a very good job.
.UL typo
looks for words which are ``unusual'',
and prints those.
Spelling mistakes tend to be more unusual,
and thus show up early when the most unusual words
are printed first.
.PP
.UL grep
looks through a set of files for lines
that contain a particular text pattern 
(rather like the editor's context search does,
but on a bunch of files).
For example,
.P1
grep \(fming$\(fm chap*
.P2
will find all lines that end with
the letters
.UL ing
in the files
.UL chap* .
(It is almost always a good practice to put single quotes around
the pattern you're searching for,
in case it contains characters like
.UL *
or
.UL $
that have a special meaning to the shell.)
.UL grep
is often useful for finding out in which of a set of files
the misspelled words detected by
.UL spell
are actually located.
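.PP
Here is a sketch, with made-up file names and a made-up misspelling;
the files are created with
.UL echo
(assumed to print its arguments)
only so that the example is complete:
.P1
echo please recieve this >chap1
echo nothing wrong here >chap2
grep recieve chap*
.P2
When
.UL grep
is given more than one file,
it labels each matching line with the name of the file it came from,
so this prints the offending line from
.UL chap1 .
The pattern here is a plain word
containing no characters that are special to the shell,
so it is safe to leave it unquoted.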
.PP
.UL diff
prints a list of the differences between
two files,
so you can compare
two versions of something automatically
(which certainly beats proofreading by hand).
.PP
.UL wc
counts the words, lines and characters in a set of files.
.UL tr
translates characters into other characters;
for example it will convert upper to lower case and vice versa.
This translates upper into lower:
.P1
tr A-Z a-z <input >output
.P2
.PP
.UL sort
sorts files in a variety of ways;
.UL cref
makes cross-references;
.UL ptx
makes a permuted index
(keyword-in-context listing).
.UL sed
provides many of the editing facilities
of
.UL ed ,
but can apply them to arbitrarily long inputs.
.UL awk
provides the ability to do both pattern matching and numeric computations,
and to conveniently process fields within lines.
These programs are for more advanced users,
and they are not limited to document preparation.
Put them on your list of things to learn about.
.PP
Most of these programs are either independently documented
(like
.UL eqn
and
.UL tbl ),
or are sufficiently simple that the description in
the
.ul 2
.UC UNIX
Programmer's Manual
is adequate explanation.
.SH
Hints for Preparing Documents
.PP
Most documents go through several versions (always more than you expected) before they
are finally finished.
Accordingly, you should do whatever possible to make
the job of changing them easy.
.PP
First, when you do the purely mechanical operations of typing,
type so that subsequent editing will be easy.
Start each sentence on a new line.
Make lines short,
and break lines at natural places,
such as after commas and semicolons,
rather than randomly.
Since most people change documents by rewriting phrases
and adding, deleting and rearranging sentences,
these precautions simplify any editing
you have to do later.
.PP
Keep the individual files of a document down
to modest size,
perhaps ten to fifteen thousand characters.
Larger files edit more slowly,
and of course if you make a dumb mistake
it's better to have clobbered a small file than a big one.
Split into files at natural boundaries in the document,
for the same reasons that you start each sentence
on a new line.
.PP
The second aspect of making change easy
is to not commit yourself to formatting details too early.
One of the advantages of formatting packages like
.UL \-ms
is that they permit you to delay decisions
to the last possible moment.
Indeed,
until a document is printed,
it is not even decided whether it will be typeset
or put on a line printer.
.PP
As a rule of thumb, for all but the most trivial jobs,
you should type a document in terms of a set of requests
like
.UL .PP ,
and then define them appropriately,
either by using one of the canned packages
(the better way)
or by defining your own
.UL nroff
and
.UL troff
commands.
As long as you have entered the text in some systematic way,
it can always be cleaned up and re-formatted
by a judicious combination of
editing commands and request definitions.
.SH
IV.  PROGRAMMING
.PP
We will not attempt to teach any of
the programming languages available,
but a few words of advice are in order.
One of the reasons why the
.UC UNIX
system is a productive programming environment
is that there is already a rich set of tools available,
and facilities like pipes, I/O redirection,
and the capabilities of the shell
often make it possible to do a job
by pasting together programs that already exist
instead of writing from scratch.
.SH
The Shell
.PP
The pipe mechanism lets you fabricate quite complicated operations
out of spare parts that already exist.
For example,
the first draft of the
.UL  spell 
program was (roughly)
.P1
.ta .6i 1.2i
cat ...	\f2collect the files\f3
| tr ...	\f2put each word on a new line\f3
| tr ...	\f2delete punctuation, etc.\f3
| sort	\f2into dictionary order\f3
| uniq	\f2discard duplicates\f3
| comm	\f2print words in text\f3
	\f2  but not in dictionary\f3
.P2
More pieces have been added subsequently,
but this goes a long way
for such a small effort.
.PP
The editor can be made to do things that would normally
require special programs on other systems.
For example, to list the first and last lines of each of a
set of files, such as a book,
you could laboriously type
.P1
ed
e chap1.1
1p
$p
e chap1.2
1p
$p
.ft R
etc.
.P2
But you can do the job much more easily.
One way is to type
.P1
ls chap* >temp
.P2
to get the list of filenames into a file.
Then edit this file to make the necessary
series of editing commands
(using the global commands of
.UL ed ),
and write it into
.UL script .
Now the command
.P1
ed <script
.P2
will produce
the same output as the laborious hand typing.
Alternatively
(and more easily),
you can use the fact that the shell will perform loops,
repeating a set of commands over and over again
for a set of arguments:
.P1
for i in chap*
do
	ed $i <script
done
.P2
This sets the shell variable
.UL i
to each file name in turn,
then does the command.
You can type this command at the terminal,
or put it in a file for later execution.
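.PP
To see how the variable is set,
here is a sketch that uses
.UL echo
(assumed to print its arguments)
in place of the editor;
the names
.UL a ,
.UL b
and
.UL c
are arbitrary:
.P1
for i in a b c
do
	echo file $i
done
.P2
This prints three lines,
.UL file\ a ,
.UL file\ b
and
.UL file\ c .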
.SH
Programming the Shell
.PP
An option often overlooked by newcomers
is that the shell is itself a programming language,
with variables,
control flow
.UL if-else , (
.UL while ,
.UL for ,
.UL case ),
subroutines,
and interrupt handling.
Since
there are
many building-block programs,
you can sometimes avoid writing a new program
merely by piecing together some of the building blocks
with shell command files.
.PP
We will not go into any details here;
examples and rules can be found in
.ul
An Introduction to the
.ul
.UC UNIX
.IT Shell ,
by S. R. Bourne.
.SH
Programming in C
.PP
If you are undertaking anything substantial,
C is the only reasonable choice of programming language:
everything in
the
.UC UNIX
system
is tuned to it.
The
system
itself
is written in C,
as are most of the programs that run on it.
It is also an easy language to use
once you get started.
C is introduced and fully described in
.ul
The C Programming Language
by
B. W. Kernighan and D. M. Ritchie
(Prentice-Hall, 1978).
Several sections of the manual
describe the system interfaces, that is,
how you do I/O
and similar functions.
Read
.ul
UNIX Programming
for more complicated things.
.PP
Most input and output in C is best handled with the 
standard I/O library,
which provides a set of I/O functions
that exist in compatible form on most machines
that have C compilers.
In general, it's wisest to confine the system interactions
in a program to the facilities provided by this library.
.PP
C programs that don't depend too much on special features of 
.UC UNIX
(such as pipes)
can be moved to other computers that have C compilers.
The list of such machines grows daily;
in addition to the original
.UC PDP -11,
it currently includes
at least
Honeywell 6000,
IBM 370,
Interdata 8/32,
Data General Nova and Eclipse,
HP 2100,
Harris /7,
VAX 11/780,
SEL 86,
and
Zilog Z80.
Calls to the standard I/O library will work on all of these machines.
.PP
There are a number of supporting programs that go with C.
.UL lint
checks C programs for potential portability problems,
and detects errors such as mismatched argument types
and uninitialized variables.
.PP
For larger programs
(anything whose source is in more than one file),
.UL make
allows you to specify the dependencies among the source files
and the processing steps needed to make a new version;
it then checks the times that the pieces were last changed
and does the minimal amount of recompiling
to create a consistent updated version.
.PP
The debugger
.UL adb
is useful for digging through the dead bodies
of C programs,
but is rather hard to learn to use effectively.
The most effective debugging tool is still
careful thought, coupled with judiciously placed
print statements.
.PP
The C compiler provides a limited instrumentation service,
so you can find out
where programs spend their time and what parts are worth optimizing.
Compile the routines with the
.UL \-p
option;
after the test run, use
.UL prof
to print an execution profile.
The command
.UL time
will give you the gross run-time statistics
of a program, but they are not very accurate or reproducible.
.SH
Other Languages
.PP
If you 
.ul
have
to use Fortran,
there are two possibilities.
You might consider
Ratfor,
which gives you the decent control structures
and free-form input that characterize C,
yet lets you write code that
is still portable to other environments.
Bear in mind that
.UC UNIX
Fortran
tends to produce large and relatively slow-running
programs.
Furthermore, supporting software like
.UL adb ,
.UL prof ,
etc., is virtually useless with Fortran programs.
There may also be a Fortran 77 compiler on your system.
If so,
this is a viable alternative to 
Ratfor,
and has the non-trivial advantage that it is compatible with C
and related programs.
(The Ratfor processor
and C tools
can be used with Fortran 77 too.)
.PP
If your application requires you to translate
a language into a set of actions or another language,
you are in effect building a compiler,
though probably a small one.
In that case,
you should be using
the
.UL yacc
compiler-compiler, 
which helps you develop a compiler quickly.
The
.UL lex
lexical analyzer generator does the same job
for the simpler languages that can be expressed as regular expressions.
It can be used by itself,
or as a front end to recognize inputs for a
.UL yacc -based
program.
Both
.UL yacc
and
.UL lex
require some sophistication to use,
but the initial effort of learning them
can be repaid many times over in programs
that are easy to change later on.
.PP
Most
.UC UNIX
systems also make available other languages,
such as
Algol 68, APL, Basic, Lisp, Pascal, and Snobol.
Whether these are useful depends largely on the local environment:
if someone cares about the language and has worked on it,
it may be in good shape.
If not, the odds are strong that it
will be more trouble than it's worth.
.SH
V.  UNIX READING LIST
.SH
General:
.LP
K. L. Thompson and D. M. Ritchie,
.IT The
.ul
.UC UNIX
.ul
Programmer's Manual,
Bell Laboratories, 1978.
Lists commands,
system routines and interfaces, file formats,
and some of the maintenance procedures.
You can't live without this,
although you will probably only need to read section 1.
.LP
.ul
Documents for Use with the
.ul
.UC UNIX
.ul
Time-sharing System.
Volume 2 of the Programmer's Manual.
This contains more extensive descriptions
of major commands,
and tutorials and reference manuals.
All of the papers listed below are in it,
as are descriptions of most of the programs
mentioned above.
.LP
D. M. Ritchie and K. L. Thompson,
``The
.UC UNIX
Time-sharing System,''
CACM, July 1974.
An overview of the system,
for people interested in operating systems.
Worth reading by anyone who programs.
Contains a remarkable number of one-sentence observations
on how to do things right.
.LP
The Bell System Technical Journal
(BSTJ)
Special Issue on 
.UC UNIX ,
July/August, 1978,
contains many papers describing recent developments,
and some retrospective material.
.LP
The 2nd International Conference on Software Engineering
(October, 1976)
contains several 
papers describing the use of the
Programmer's Workbench
.UC PWB ) (
version of
.UC UNIX .
.SH
Document Preparation:
.LP
B. W. Kernighan,
``A Tutorial Introduction to the
.UC UNIX
Text Editor''
and
``Advanced Editing on
.UC UNIX ,''
Bell Laboratories, 1978.
Beginners need the introduction;
the advanced material will help you get the most
out of the editor.
.LP
M. E. Lesk,
``Typing Documents on
.UC UNIX ,''
Bell Laboratories, 1978.
Describes the
.UL \-ms
macro package, which isolates the novice
from the vagaries of
.UL nroff
and
.UL troff ,
and takes care of most formatting situations.
If this specific package isn't available on your system,
something similar probably is.
The most likely alternative is the
.UC PWB/UNIX
macro package
.UL \-mm ;
see your local guru if you use
.UC PWB/UNIX .
.LP
B. W. Kernighan and L. L. Cherry,
``A System for Typesetting Mathematics,''
Bell Laboratories CSTR 17.
.LP
M. E. Lesk,
``Tbl \(em A Program to Format Tables,''
Bell Laboratories CSTR 49, 1976.
.LP
J. F. Ossanna, Jr.,
``NROFF/TROFF User's Manual,''
Bell Laboratories CSTR 54, 1976.
.UL troff
is the basic formatter used by
.UL \-ms ,
.UL eqn
and
.UL tbl .
The reference manual is indispensable
if you are going to write or maintain these
or similar programs.
But start with:
.LP
B. W. Kernighan,
``A TROFF Tutorial,''
Bell Laboratories, 1976.
An attempt to unravel the intricacies of
.UL troff .
.SH
Programming:
.LP
B. W. Kernighan and D. M. Ritchie,
.ul
The C Programming Language,
Prentice-Hall, 1978.
Contains a tutorial introduction,
complete discussions of all language features,
and the reference manual.
.LP
B. W. Kernighan and D. M. Ritchie,
.UC UNIX \& ``
Programming,''
Bell Laboratories, 1978.
Describes how to interface with the system from C programs:
I/O calls, signals, processes.
.LP
S. R. Bourne,
``An Introduction to the
.UC UNIX
Shell,''
Bell Laboratories, 1978.
An introduction and reference manual for the Version 7 shell.
Mandatory reading if you intend to make effective use
of the programming power
of this shell.
.LP
S. C. Johnson,
``Yacc \(em Yet Another Compiler-Compiler,''
Bell Laboratories CSTR 32, 1978.
.LP
M. E. Lesk,
``Lex \(em A Lexical Analyzer Generator,''
Bell Laboratories CSTR 39, 1975.
.LP
S. C. Johnson,
``Lint, a C Program Checker,''
Bell Laboratories CSTR 65, 1977.
.LP
S. I. Feldman,
``MAKE \(em A Program for Maintaining Computer Programs,''
Bell Laboratories CSTR 57, 1977.
.LP
J. F. Maranzano and S. R. Bourne,
``A Tutorial Introduction to ADB,''
Bell Laboratories CSTR 62, 1977.
An introduction to a powerful but complex debugging tool.
.LP
S. I. Feldman and P. J. Weinberger,
``A Portable Fortran 77 Compiler,''
Bell Laboratories, 1978.
A full Fortran 77 for
.UC UNIX
systems.
.sp
.I "May 1979"