V7M/doc/beginners/u4

.SH
IV.  PROGRAMMING
.PP
There will be no attempt made to teach any of
the programming languages available
but a few words of advice are in order.
One of the reasons why the
.UC UNIX
system is a productive programming environment
is that there is already a rich set of tools available,
and facilities like pipes, I/O redirection,
and the capabilities of the shell
often make it possible to do a job
by pasting together programs that already exist
instead of writing from scratch.
.SH
The Shell
.PP
The pipe mechanism lets you fabricate quite complicated operations
out of spare parts that already exist.
For example,
the first draft of the
.UL  spell 
program was (roughly)
.P1
.ta .6i 1.2i
cat ...	\f2collect the files\f3
| tr ...	\f2put each word on a new line\f3
| tr ...	\f2delete punctuation, etc.\f3
| sort	\f2into dictionary order\f3
| uniq	\f2discard duplicates\f3
| comm	\f2print words in text\f3
	\f2  but not in dictionary\f3
.P2
More pieces have been added subsequently,
but this goes a long way
for such a small effort.
.PP
The editor can be made to do things that would normally
require special programs on other systems.
For example, to list the first and last lines of each of a
set of files, such as a book,
you could laboriously type
.P1
ed
e chap1.1
1p
$p
e chap1.2
1p
$p
.ft R
etc.
.P2
But you can do the job much more easily.
One way is to type
.P1
ls chap* >temp
.P2
to get the list of filenames into a file.
Then edit this file to make the necessary
series of editing commands
(using the global commands of
.UL ed ),
and write it into
.UL script .
Now the command
.P1
ed <script
.P2
will produce
the same output as the laborious hand typing.
Alternately
(and more easily),
you can use the fact that the shell will perform loops,
repeating a set of commands over and over again
for a set of arguments:
.P1
for i in chap*
do
	ed $i <script
done
.P2
This sets the shell variable
.UL i
to each file name in turn,
then does the command.
You can type this command at the terminal,
or put it in a file for later execution.
.SH
Programming the Shell
.PP
An option often overlooked by newcomers
is that the shell is itself a programming language,
with variables,
control flow
.UL if-else , (
.UL while ,
.UL for ,
.UL case ),
subroutines,
and interrupt handling.
Since
there are
many building-block programs,
you can sometimes avoid writing a new program
merely by piecing together some of the building blocks
with shell command files.
.PP
We will not go into any details here;
examples and rules can be found in
.ul
An Introduction to the
.ul
.UC UNIX
.IT Shell ,
by S. R. Bourne.
.SH
Programming in C
.PP
If you are undertaking anything substantial,
C is the only reasonable choice of programming language:
everything in
the
.UC UNIX
system
is tuned to it.
The
system
itself
is written in C,
as are most of the programs that run on it.
It is also a easy language to use
once you get started.
C is introduced and fully described in
.ul
The C Programming Language
by
B. W. Kernighan and D. M. Ritchie
(Prentice-Hall, 1978).
Several sections of the manual
describe the system interfaces, that is,
how you do I/O
and similar functions.
Read
.ul
UNIX Programming
for more complicated things.
.PP
Most input and output in C is best handled with the 
standard I/O library,
which provides a set of I/O functions
that exist in compatible form on most machines
that have C compilers.
In general, it's wisest to confine the system interactions
in a program to the facilities provided by this library.
.PP
C programs that don't depend too much on special features of 
.UC UNIX
(such as pipes)
can be moved to other computers that have C compilers.
The list of such machines grows daily;
in addition to the original
.UC PDP -11,
it currently includes
at least
Honeywell 6000,
IBM 370,
Interdata 8/32,
Data General Nova and Eclipse,
HP 2100,
Harris /7,
VAX 11/780,
SEL 86,
and
Zilog Z80.
Calls to the standard I/O library will work on all of these machines.
.PP
There are a number of supporting programs that go with C.
.UL lint
checks C programs for potential portability problems,
and detects errors such as mismatched argument types
and uninitialized variables.
.PP
For larger programs
(anything whose source is on more than one file)
.UL make
allows you to specify the dependencies among the source files
and the processing steps needed to make a new version;
it then checks the times that the pieces were last changed
and does the minimal amount of recompiling
to create a consistent updated version.
.PP
The debugger
.UL adb
is useful for digging through the dead bodies
of C programs,
but is rather hard to learn to use effectively.
The most effective debugging tool is still
careful thought, coupled with judiciously placed
print statements.
.PP
The C compiler provides a limited instrumentation service,
so you can find out
where programs spend their time and what parts are worth optimizing.
Compile the routines with the
.UL \-p
option;
after the test run, use
.UL prof
to print an execution profile.
The command
.UL time
will give you the gross run-time statistics
of a program, but they are not super accurate or reproducible.
.SH
Other Languages
.PP
If you 
.ul
have
to use Fortran,
there are two possibilities.
You might consider
Ratfor,
which gives you the decent control structures
and free-form input that characterize C,
yet lets you write code that
is still portable to other environments.
Bear in mind that
.UC UNIX
Fortran
tends to produce large and relatively slow-running
programs.
Furthermore, supporting software like
.UL adb ,
.UL prof ,
etc., are all virtually useless with Fortran programs.
There may also be a Fortran 77 compiler on your system.
If so,
this is a viable alternative to 
Ratfor,
and has the non-trivial advantage that it is compatible with C
and related programs.
(The Ratfor processor
and C tools
can be used with Fortran 77 too.)
.PP
If your application requires you to translate
a language into a set of actions or another language,
you are in effect building a compiler,
though probably a small one.
In that case,
you should be using
the
.UL yacc
compiler-compiler, 
which helps you develop a compiler quickly.
The
.UL lex
lexical analyzer generator does the same job
for the simpler languages that can be expressed as regular expressions.
It can be used by itself,
or as a front end to recognize inputs for a
.UL yacc -based
program.
Both
.UL yacc
and
.UL lex
require some sophistication to use,
but the initial effort of learning them
can be repaid many times over in programs
that are easy to change later on.
.PP
Most
.UC UNIX
systems also make available other languages,
such as
Algol 68, APL, Basic, Lisp, Pascal, and Snobol.
Whether these are useful depends largely on the local environment:
if someone cares about the language and has worked on it,
it may be in good shape.
If not, the odds are strong that it
will be more trouble than it's worth.