.de PT
.lt \\n(LLu
.pc %
.nr PN \\n%
.if \\n%-1 .if o .tl '\s9\f2RATFOR\fP''\\n(PN\s0'
.if \\n%-1 .if e .tl '\s9\\n(PN''\f2RATFOR\^\fP\s0'
.lt \\n(.lu
..
.tr ~
.tr _\(em
'\"	.ND
.if n .ls 2
.de UL
.if t \&\\$3\f3\\$1\fP\\$2\&
.if n \&\\$3\f2\\$1\fP\\$2\&
..
.de IT
.if n .ul
\\$3\f2\\$1\fP\\$2
..
.de UI
\f3\\$1\fI\\$2\fR\\$3
..
.de P1
.if \\n(.$ .DS I \\$1
.if !\\n(.$ .DS I 5
.if n .ls 1
.nf
.if n .ta 5 10 15 20 25 30 35 40 45 50 55 60
.if t .ta .4i .8i 1.2i 1.6i 2i 2.4i 2.8i 3.2i 3.6i 4i
.if t .tr -\(mi|\(bv'\(fm^\(no
.if t .tr _\(ru
.lg 0
.		use first argument as indent if present
..
.de P2
.ft R
.if n .ls 2
.tr --||''^^!!_\(em
.lg
.DE
..
.hw semi-colon
.if t .ds m \(mi
.if n .ds m -
.if t .ds n \(no
.if n .ds n -
.if t .ds S \(sl
.if n .ds S /
.if t .ds d \s+4\&.\&\s-4
.if n .ds d \&.\&
.if t .ds a \z@@
.if n .ds a @
.	2=not last lines; 4= no -xx; 8=no xx-
.ds m \(mi
.tr *\(**
.de UC
\&\\$3\s-2\\$1\s0\\$2\&
..
.hy 14
'\"	.ND "January 1, 1977"
....TR 55
.TL
R\s-2ATFOR\s+2\(emA Preprocessor for a Rational Fortran
.AU "MH 2C-518" 6021
Brian W. Kernighan
.AI
.MH
.OK
structured programming, control flow, programming
.AB
.ps 9
.nr PS 9
.vs 11
.nr VS 11
.PP
Although Fortran is not a pleasant
language to use,
it does have the advantages of universality
and (usually) relative efficiency.
The
Ratfor 
language attempts to conceal
the main deficiencies of Fortran
while retaining its desirable qualities,
by providing
decent control flow statements:
.IP "\ \ \ \(bu"
statement grouping
.IP "\ \ \ \(bu"
.UL if-else
and
.UL switch
for decision-making
.IP "\ \ \ \(bu"
.UL while ,
.UL for ,
.UL do ,
and
.UL repeat-until
for looping
.IP "\ \ \ \(bu"
.UL break
and
.UL next
for controlling loop exits
.LP
and some ``syntactic sugar'':
.IP "\ \ \ \(bu"
free form input (multiple statements/line, automatic continuation)
.IP "\ \ \ \(bu"
unobtrusive comment convention
.IP "\ \ \ \(bu"
translation of >, >=, etc., into .GT., .GE., etc.
.IP "\ \ \ \(bu"
.UL return (expression)
statement for functions
.IP "\ \ \ \(bu"
.UL define
statement for symbolic parameters
.IP "\ \ \ \(bu"
.UL include
statement for including source files
.LP
Ratfor
is implemented as a
preprocessor which translates this language
into Fortran.
.PP
Once the control flow and cosmetic deficiencies of Fortran
are hidden,
the resulting language is remarkably pleasant to use.
Ratfor 
programs are
markedly easier to write, and to read,
and thus easier to debug, maintain and modify
than their Fortran equivalents.
.PP
It is readily possible to write 
Ratfor 
programs which are portable to other environments.
Ratfor
is written in itself
in this way,
so it is also portable;
versions of 
Ratfor 
are now running on at least two dozen different types of computers
at over five hundred locations.
.PP
This paper discusses design criteria
for a Fortran preprocessor,
the 
Ratfor
language
and its implementation,
and user experience.
.AE
.FS
This paper is a revised and expanded version of one published in
.ul
Software\(emPractice and Experience,
October 1975.
The Ratfor described here is the one in use
at Bell Laboratories.
.FE
.CS 12 1 13 0 0 10
.nr PS 9
.nr VS 11
.if t .2C
.if n .ls 2
.NH
INTRODUCTION
.PP
Most programmers will agree that Fortran is
an unpleasant language to program in,
yet there are many occasions when they are forced to use it.
For example, Fortran is often the only language
thoroughly supported on the local computer.
Indeed, it is the closest thing to a universal programming language
currently available:
with care it is possible to write large, truly portable
Fortran programs [1].
Finally, Fortran is often the most ``efficient'' language
available, particularly for programs requiring much computation.
.PP
But Fortran 
.ul
is
unpleasant.
Perhaps the worst deficiency is in
the control flow
statements
_ conditional branches and loops _
which express the logic of the program.
The conditional statements in Fortran are primitive.
The Arithmetic 
.UC IF
forces the user into at least two statement numbers and
two (implied) 
.UC GOTO 's;
it leads to unintelligible code, and is eschewed by good programmers.
The Logical
.UC IF
is better, in that the test part can be stated clearly,
but hopelessly restrictive because the statement
that follows the
.UC IF
can only be one Fortran statement
(with some
.ul
further
restrictions!).
And of course there can be no
.UC ELSE
part to a Fortran
.UC IF :
there is no way to specify an alternative action if the
.UC IF
is not satisfied.
.PP
The Fortran
.UC DO
restricts the user to going forward in an arithmetic progression.
It is fine for ``1 to N in steps of 1 (or 2 or ...)'',
but there is no direct way to go backwards,
or even (in ANSI Fortran [2]) to go from 1 to
.if n N-1.
.if t N\(mi1.
And of course the
.UC DO
is useless if one's problem doesn't map into an arithmetic progression.
.PP
The result of these failings is that Fortran programs
must be written with numerous labels and branches.
The resulting code is
particularly difficult to read and understand,
and thus hard to debug and modify.
.PP
When one is faced with an unpleasant language,
a useful technique is to define
a new language that overcomes the deficiencies,
and to translate it into the unpleasant one
with a preprocessor.
This is the approach taken with 
Ratfor.
(The preprocessor idea is of course not new,
and preprocessors for Fortran are especially popular
today.
A recent listing [3] of preprocessors 
shows more than 50, of which at least half a dozen are widely available.)
.NH
LANGUAGE DESCRIPTION
.SH
Design
.PP
Ratfor
attempts to retain the merits of Fortran
(universality, portability, efficiency)
while hiding the worst Fortran inadequacies.
The language
.ul
is
Fortran except for two aspects.
First,
since control flow is central to any program,
regardless of the specific application,
the primary task of
Ratfor
is to conceal this part of Fortran from the user,
by providing decent control flow structures.
These structures are sufficient and comfortable
for structured programming in the narrow sense of programming without
.UC GOTO 's.
Second, since the preprocessor must examine an entire program
to translate the control structure,
it is possible at the same time to clean up many of the
``cosmetic'' deficiencies of Fortran,
and thus provide a language which is easier
and more pleasant to read and write.
.PP
Beyond these two aspects _ control flow and cosmetics _
Ratfor
does nothing about the host of other weaknesses of Fortran.
Although it would be straightforward to extend 
it
to provide
character strings,
for example,
they are not needed by everyone,
and of course
the preprocessor would be harder to implement.
Throughout, the design principle which has determined
what should be in
Ratfor
and what should not has
been
.ul
Ratfor
.ul
doesn't know any Fortran.
Any language feature which would require that
Ratfor
really understand Fortran has been omitted.
We will return to this point in the section
on implementation.
.PP
Even within the confines of control flow and cosmetics,
we have attempted to be selective
in what features to provide.
The intent has been to provide a small set of the most useful
constructs,
rather than to throw in everything that has ever been thought useful
by someone.
.PP
The rest of this section contains an informal description
of the
Ratfor
language.
The control flow aspects will be
quite familiar to readers used to languages like
Algol, PL/I, Pascal, etc.,
and the cosmetic changes are equally straightforward.
We shall concentrate on 
showing what the language looks like.
.SH
Statement Grouping
.PP
Fortran provides no way to group statements together,
short of making them into a subroutine.
The standard construction
``if a condition is true,
do this group of things,''
for example,
.P1
if (x > 100)
	{ call error("x>100"); err = 1; return }
.P2
cannot be written directly in Fortran.
Instead
a programmer is forced to translate this relatively
clear thought into murky Fortran,
by stating the negative condition
and branching around the group of statements:
.P1
	if (x .le. 100) goto 10
		call error(5hx>100)
		err = 1
		return
10	...
.P2
When the program doesn't work,
or when it must be modified,
this must be translated back into
a clearer form before one can be sure what it does.
.PP
Ratfor
eliminates this error-prone and confusing back-and-forth translation;
the first form 
.ul
is
the way the computation is written in 
Ratfor.
A group of statements can be treated as a unit
by enclosing them in the braces { and }.
This is true throughout the language:
wherever a single 
Ratfor
statement can be used,
there can be several enclosed in braces.
(Braces seem clearer and less obtrusive than
.UL begin
and
.UL end 
or
.UL do
and
.UL end ,
and of course 
.UL do
and
.UL end
already have Fortran meanings.)
.PP
Cosmetics
contribute to the readability of code,
and thus to its understandability.
The character ``>'' is clearer than
.UC ``.GT.'' ,
so
Ratfor
translates it appropriately,
along with several other similar shorthands.
Although many Fortran compilers permit character strings in quotes
(like
.UL """x>100""" ),
quotes are
not allowed in 
.UC ANSI
Fortran,
so 
Ratfor
converts it into the right number of
.UL H 's:
computers count better than people do.
.PP
Ratfor
is a free-form language:
statements may appear anywhere on a line,
and several may appear on one line
if they are separated by semicolons.
The example above could also be written as
.P1
if (x > 100) {
	call error("x>100")
	err = 1
	return
}
.P2
In this case, no semicolon is needed at the end of each line because
Ratfor
assumes there is one statement per line
unless told otherwise.
.PP
Of course,
if the statement that follows the
.UL if
is a single statement
(Ratfor
or otherwise),
no braces are needed:
.P1
if (y <= 0.0 & z <= 0.0)
	write(6, 20) y, z
.P2
No continuation need be indicated 
because the statement is clearly not finished on the first line.
In general
Ratfor
continues lines when it seems obvious that they are not yet done.
(The continuation convention is discussed in detail later.)
.PP
Although a free-form language permits wide latitude in formatting styles,
it is wise to pick one that is readable, then stick to it.
In particular, proper indentation is vital,
to make the logical structure of the program obvious to the reader.
.SH
The ``else'' Clause
.PP
Ratfor
provides an
.UL "else"
statement to handle the construction
``if a condition is true,
do 
this
thing,
.ul
otherwise
do that thing.''
.P1
if (a <= b)
	{ sw = 0; write(6, 1) a, b }
else
	{ sw = 1; write(6, 1) b, a }
.P2
This writes out the smaller of
.UL a
and
.UL b ,
then the larger, and sets
.UL sw
appropriately.
.PP
The Fortran equivalent of this code is circuitous indeed:
.P1
	if (a .gt. b) goto 10
		sw = 0
		write(6, 1) a, b
		goto 20
10	sw = 1
	write(6, 1) b, a
20	...
.P2
This is a mechanical translation;
shorter forms exist, 
as they do for many similar situations.
But all translations suffer from the same problem:
since they are translations,
they are less clear and understandable than code
that is not a translation.
To understand the Fortran version,
one must scan the entire program to make
sure that no other statement branches
to statements 10 or 20
before one knows that indeed this is an 
.UL if-else
construction.
With the
Ratfor
version,
there is no question about how one gets to the parts of the statement.
The
.UL if-else
is a single unit,
which can be read, understood, and ignored if not relevant.
The program says what it means.
.PP
As before, if the statement following an
.UL if
or an
.UL else
is a single statement, no braces are needed:
.P1
if (a <= b)
	sw = 0
else
	sw = 1
.P2
.PP
The syntax of the
.UL if
statement is
.P1
if (\fIlegal Fortran condition\fP)
	\fIRatfor statement\fP
else
	\fIRatfor statement\fP
.P2
where the 
.UL else
part is optional.
The
.ul
legal Fortran condition
is
anything that can legally go into a Fortran Logical
.UC IF .
Ratfor
does not check this clause,
since it does not know enough Fortran
to know what is permitted.
The
.ul
Ratfor
.ul
statement
is any Ratfor or Fortran statement, or any collection of them
in braces.
.SH
Nested if's
.PP
Since the statement that follows an
.UL if
or 
an
.UL else
can be any Ratfor statement, this leads immediately
to the possibility of another
.UL if
or
.UL else .
As a useful example, consider this problem:
the variable
.UL f
is to be set to
\-1 if
.UL x
is less than zero,
to
+1
if
.UL x
is greater than 100,
and to 0 otherwise.
Then in Ratfor, we write
.P1
if (x < 0)
	f = -1
else if (x > 100)
	f = +1
else
	f = 0
.P2
Here the statement after the first
.UL else
is another
.UL if-else .
Logically it is just a single statement,
although it is rather complicated.
.PP
This code says what it means.
Any version written in straight Fortran
will necessarily be indirect
because Fortran does not let you say what you mean.
And as always, clever shortcuts may turn out
to be too clever to understand a year from now.
.PP
Following an
.UL else
with an
.UL if
is one way to write a multi-way branch in Ratfor.
In general the structure
.P1
if (...)
	- - -
else if (...)
	- - -
else if (...)
	- - -
 ...
else
	- - -
.P2
provides a way to specify the choice of exactly one of several alternatives.
(Ratfor also provides a
.UL switch
statement which does the same job
in certain special cases;
in more general situations, we have to make do
with spare parts.)
The tests are laid out in sequence, and each one
is followed by the code associated with it.
Read down the list
of decisions until one is found that is satisfied.
The code associated with this condition is executed,
and then the entire structure is finished.
The trailing
.UL else
part handles the ``default'' case,
where none of the other conditions apply.
If there is no default action, this final
.UL else
part
is omitted:
.P1
if (x < 0)
	x = 0
else if (x > 100)
	x = 100
.P2
.SH
if-else ambiguity
.PP
There is one thing to notice about complicated structures
involving nested
.UL if 's
and
.UL else 's.
Consider
.P1
if (x > 0)
	if (y > 0)
		write(6, 1) x, y
	else
		write(6, 2) y
.P2
There are two
.UL if 's
and
only one
.UL else .
Which
.UL if
does the
.UL else
go with?
.PP
This is a genuine ambiguity in Ratfor, as it is in many other programming
languages.
The ambiguity is resolved in Ratfor
(as elsewhere) by saying that in such cases the
.UL else
goes with the closest previous
.UL else 'ed un-
.UL if .
Thus in this case, the
.UL else
goes with the inner
.UL if ,
as we have indicated by the indentation.
.PP
It is a wise practice to resolve such cases by explicit braces,
just to make your intent clear.
In the case above, we would write
.P1
if (x > 0) {
	if (y > 0)
		write(6, 1) x, y
	else
		write(6, 2) y
}
.P2
which does not change the meaning, but leaves
no doubt in the reader's mind.
If we want the other association, we
.ul
must
write
.P1
if (x > 0) {
	if (y > 0)
		write(6, 1) x, y
}
else
	write(6, 2) y
.P2
.SH
The ``switch'' Statement
.PP
The
.UL switch
statement
provides a clean way to express multi-way branches
which branch on the value of some integer-valued expression.
The syntax is
.P1
\f3switch (\fIexpression\|\f3) {

	case \fIexpr1\f3 :
		\f2statements\f3
	case \fIexpr2, expr3\f3 :
		\f2statements\f3
	...
	default:
		\f2statements\f3
}
.P2
.PP
Each
.UL case
is followed by a
list of comma-separated integer expressions.
The
.ul
expression
inside
.UL switch
is compared against the case expressions
.ul
expr1,
.ul
expr2, 
and so on in turn
until one matches,
at which time the statements following that
.UL case
are executed.
If no cases match
.ul
expression,
and there is a
.UL default 
section,
the statements with it are done;
if there is no
.UL default ,
nothing is done.
In all situations,
as soon as some block of statements is executed,
the entire
.UL switch
is exited immediately.
(Readers familiar with C [4] should beware that this
behavior is not the same as the C
.UL switch .)
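.PP
As an illustrative sketch
(the names
.UL code ,
.UL ok ,
.UL nwarn ,
and
.UL nerr
are invented for the example and come from no real program),
a
.UL switch
that classifies an integer status code might look like this:
.P1
switch (code) {
	case 0:
		ok = 1
	case 1, 2:
		nwarn = nwarn + 1
	default:
		nerr = nerr + 1
}
.P2
If
.UL code
is 1, only the warning count is incremented;
control does not fall through into the
.UL default
section.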
.SH
The ``do'' Statement
.PP
The
.UL do
statement in
Ratfor
is quite similar to the
.UC DO
statement in Fortran,
except that it uses no statement number.
The statement number, after all, serves only to mark the end
of the
.UC DO ,
and this can be done just as easily with braces.
Thus
.P1
	do i = 1, n {
		x(i) = 0.0
		y(i) = 0.0
		z(i) = 0.0
	}
.P2
is the same as
.P1
	do 10 i = 1, n
		x(i) = 0.0
		y(i) = 0.0
		z(i) = 0.0
10	continue
.P2
The syntax is:
.P1
do \fIlegal\(hyFortran\(hyDO\(hytext\fP
	\fIRatfor statement\fP
.P2
The part that follows 
the keyword
.UL do
has to be something that can legally go into a Fortran
.UC DO
statement.
Thus if a local version of Fortran allows
.UC DO
limits to be expressions
(which is not currently permitted in
.UC ANSI
Fortran),
they can be used in a
Ratfor
.UL do .
.PP
The
.ul
Ratfor statement
part will often be enclosed in braces, but
as with the
.UL if ,
a single statement need not have braces around it.
This code sets an array to zero:
.P1
do i = 1, n
	x(i) = 0.0
.P2
Slightly more complicated,
.P1
do i = 1, n
	do j = 1, n
		m(i, j) = 0
.P2
sets the entire array
.UL m
to zero, and
.P1
do i = 1, n
	do j = 1, n
		if (i < j)
			m(i, j) = -1
		else if (i == j)
			m(i, j) = 0
		else
			m(i, j) = +1
.P2
sets the upper triangle of
.UL m
to \-1, the diagonal to zero, and the lower triangle to +1.
(The operator == is ``equals'', that is, ``.EQ.''.)
In each case, the statement that follows the
.UL do
is logically a
.ul
single
statement, even though complicated,
and thus needs no braces.
.SH
``break'' and ``next''
.PP
Ratfor
provides a statement for leaving a loop early,
and one for beginning the next iteration.
.UL "break"
causes an immediate exit from the
.UL do ;
in effect it is a branch to the statement
.ul
after
the
.UL do .
.UL next
is a branch to the bottom of the loop,
so it causes the next iteration to be done.
For example, this code skips over negative values in an array:
.P1
do i = 1, n {
	if (x(i) < 0.0)
		next
	\fIprocess positive element\fP
}
.P2
.UL break
and
.UL next
also work in the other Ratfor looping constructions
that we will talk about in the next few sections.
.PP
.UL break
and
.UL next
can be followed by an integer to indicate breaking or iterating
that level of enclosing loop; thus
.P1
break 2
.P2
exits from two levels of enclosing loops,
and
.UL break\ 1
is equivalent to
.UL break .
.UL next\ 2
iterates the second enclosing loop.
(Realistically, 
multi-level
.UL break 's
and
.UL next 's
are
not likely to be much used
because they lead to code that is hard to understand
and somewhat risky to change.)
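.PP
For illustration only,
here is a sketch of a search through a two-dimensional array
(the names
.UL m ,
.UL target ,
and
.UL found
are assumptions made for the example):
.P1
found = 0
do i = 1, n
	do j = 1, n
		if (m(i, j) == target) {
			found = 1
			break 2
		}
.P2
The
.UL break\ 2
leaves both
.UL do
loops at once;
a plain
.UL break
would leave only the inner one.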
.SH
The ``while'' Statement
.PP
One of the problems with the Fortran
.UC DO
statement
is that it generally insists upon being done once,
regardless of its limits.
If a loop begins
.P1
DO I = 2, 1
.P2
this will typically be done once with
.UL I
set to 2,
even though common sense would suggest that perhaps it shouldn't be.
Of course a
Ratfor
.UL do
can easily be preceded by a test
.P1
if (j <= k)
	do i = j, k  {
		_ _ _
	}
.P2
but this has to be a conscious act,
and is often overlooked by programmers.
.PP
A more serious problem with the
.UC DO
statement
is that it encourages programs to be written
in terms of an arithmetic progression
with small positive steps,
even though that may not be the best way to write it.
If code has to be contorted to fit the requirements
imposed by the Fortran
.UC DO ,
it is that much harder to write and understand.
.PP
To overcome these difficulties,
Ratfor
provides a
.UL while
statement,
which is simply a loop:
``while some condition is true,
repeat this group of statements''.
It has
no preconceptions about why one is looping.
For example, this routine to compute sin(x)
by the Maclaurin series
combines two termination criteria.
.P1 1
.ta .3i .6i .9i 1.2i 1.5i 1.8i
real function sin(x, e)
	# returns sin(x) to accuracy e, by
	# sin(x) = x - x**3/3! + x**5/5! - ...

	sin = x
	term = x

	i = 3
	while (abs(term)>e & i<100) {
		term = -term * x**2 / float(i*(i-1))
		sin = sin + term
		i = i + 2
	}

	return
	end
.P2
.PP
Notice that
if the routine is entered with
.UL term
already smaller than
.UL e ,
the 
loop will be done
.ul
zero times,
that is, no attempt will be made to compute
.UL x**3
and thus a potential underflow is avoided.
Since the test is made at the top of a
.UL while
loop
instead of the bottom,
a special case disappears _
the code works at one of its boundaries.
(The test
.UL i<100
is the other boundary _
making sure the routine stops after
some maximum number of iterations.)
.PP
As an aside, a sharp character ``#'' in a line
marks the beginning of a comment;
the rest of the line is comment.
Comments and code can co-exist on the same line _
one can make marginal remarks,
which is not possible with Fortran's ``C in column 1'' convention.
Blank lines are also permitted anywhere
(they are not in Fortran);
they should be used to emphasize the natural divisions
of a program.
.PP
The syntax of the 
.UL while
statement is
.P1
while (\fIlegal Fortran condition\fP)
	\fIRatfor statement\fP
.P2
As with the
.UL if ,
.ul
legal Fortran condition
is something that can go into
a Fortran Logical
.UC IF ,
and
.ul
Ratfor statement
is a single statement,
which may be multiple statements in braces.
.PP
The
.UL while
encourages a style of coding not normally
practiced by Fortran programmers.
For example, suppose
.UL nextch
is a function which returns the next input character
both as a function value and in its argument.
Then a loop to find the first non-blank character is just
.P1
while (nextch(ich) == iblank)
	;
.P2
A semicolon by itself is a null statement,
which is necessary here to mark the end of the
.UL while ;
if it were not present, the
.UL while
would control the next statement.
When the loop is broken, 
.UL ich
contains the first non-blank.
Of course the same code can be written in Fortran as
.P1 1
100	if (nextch(ich) .eq. iblank) goto 100
.P2
but many Fortran programmers (and a few compilers) believe this line is illegal.
The language at one's disposal
strongly influences how one thinks about a problem.
.SH
The ``for'' Statement
.PP
The
.UL for
statement
is another Ratfor loop, which
attempts to carry the separation of
loop-body from reason-for-looping
a step further
than the
.UL while .
A
.UL for
statement allows explicit initialization
and increment steps as part of the statement.
For example, 
a
.UC DO
loop is just
.P1
for (i = 1; i <= n; i = i + 1) ...
.P2
This is equivalent to
.P1
i = 1
while (i <= n) {
	...
	i = i + 1
}
.P2
The initialization and increment of
.UL i
have been moved into the
.UL for
statement,
making it easier to see at a glance
what controls the loop.
.PP
The
.UL for
and
.UL while
versions have the advantage that they will be done zero times
if
.UL n
is less than 1; this is not true of the
.UL do .
.PP
The loop of the sine routine in the previous section
can be re-written
with a
.UL for
as
.P1 3
for (i=3; abs(term) > e & i < 100; i=i+2) {
	term = -term * x**2 / float(i*(i-1))
	sin = sin + term
}
.P2
.PP
The syntax of the
.UL for
statement is
.P1
for ( \fIinit\fP ; \fIcondition\fP ; \fIincrement\fP )
	\fIRatfor statement\fP
.P2
.ul
init
is any single Fortran statement, which gets done once
before the loop begins.
.ul
increment
is any single Fortran statement,
which gets done at the end of each pass through the loop,
before the test.
.ul
condition
is again anything that is legal in a logical 
.UC IF .
Any of 
.ul
init,
.ul
condition,
and
.ul
increment
may be omitted,
although the semicolons
.ul
must
always be present.
A non-existent
.ul
condition
is treated as always true,
so
.UL "for(;;)"
is an indefinite repeat.
(But see the
.UL repeat-until
in the next section.)
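.PP
A minimal sketch of an indefinite loop written this way,
assuming a hypothetical function
.UL getval
that returns a negative number when the input is exhausted
(its argument
.UL dummy
is a placeholder, and
.UL sum
is assumed to have been set to zero beforehand):
.P1
for (;;) {
	v = getval(dummy)
	if (v < 0.0)
		break
	sum = sum + v
}
.P2
Both semicolons are present even though all three parts are empty.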
.PP
The
.UL for
statement is particularly
useful for
backward loops, chaining along lists,
loops that might be done zero times,
and similar things which are hard to express with a 
.UC DO
statement,
and obscure to write out 
with
.UC IF 's
and
.UC GOTO 's.
For example,
here is a
backwards
.UC DO
loop
to find the last non-blank character on a card:
.P1
for (i = 80; i > 0; i = i - 1)
	if (card(i) != blank)
		break
.P2
(``!='' is the same as 
.UC ``.NE.'' ).
The code scans the columns from 80 through to 1.
If a non-blank is found, the loop
is immediately broken.
.UL break \& (
and
.UL next
work in
.UL for 's
and
.UL while  's
just as in 
.UL do 's).
If 
.UL i
reaches zero,
the card is all blank.
.PP
This code is rather nasty to write with a regular Fortran
.UC DO ,
since the loop must go forward,
and we must explicitly set up proper conditions
when we fall out of the loop.
(Forgetting this is a common error.)
Thus:
.P1 1
.ta .3i .6i .9i 1.2i 1.5i 1.8i
	DO 10 J = 1, 80
		I = 81 - J
		IF (CARD(I) .NE. BLANK) GO TO 11
10	CONTINUE
	I = 0
11	...
.P2
The version that uses the
.UL for
handles the termination condition properly for free;
.UL i
.ul
is
zero when we fall out of the
.UL for
loop.
.PP
The increment
in a
.UL for
need not be an arithmetic progression;
the following program walks along a list
(stored in an integer array
.UL ptr )
until a zero pointer is found,
adding up elements from a parallel array of values:
.P1
sum = 0.0
for (i = first; i > 0; i = ptr(i))
	sum = sum + value(i)
.P2
Notice that the code works correctly if the list is empty.
Again, placing the test at the top of a loop
instead of the bottom eliminates a potential boundary error.
.SH
The ``repeat-until'' statement
.PP
In spite of the dire warnings,
there are times when one really needs a loop that tests at the bottom
after one pass through.
This service is provided by the
.UL repeat-until :
.P1
repeat
	\fIRatfor statement\fP
until (\fIlegal Fortran condition\fP)
.P2
The
.ul
Ratfor statement
part is done once,
then the condition is evaluated.
If it is true, the loop is exited;
if it is false, another pass is made.
.PP
The
.UL until
part is optional, so a bare
.UL repeat
is the cleanest way to specify an infinite loop.
Of course such a loop must ultimately be broken by some
transfer of control such as
.UL stop ,
.UL return ,
or
.UL break ,
or an implicit stop such as running out of input with
a
.UC READ
statement.
.PP
As a matter of observed fact [8], the
.UL repeat-until
statement is
.ul
much
less used than the other looping constructions;
in particular, it is typically outnumbered ten to one by
.UL for
and
.UL while .
Be cautious about using it, for loops that test only at the
bottom often don't handle null cases well.
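.PP
As a sketch, the scan for a non-blank character shown earlier with
.UL while
could also be written with a bottom test,
using the same assumed function
.UL nextch
and variable
.UL iblank
(the variable
.UL k
is introduced only for this example):
.P1
repeat
	k = nextch(ich)
until (k != iblank)
.P2
Here the body is always executed at least once,
which is what is wanted,
since a character must be read before it can be tested.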
.SH
More on break and next
.PP
.UL break
exits immediately from 
.UL do ,
.UL while ,
.UL for ,
and
.UL repeat-until .
.UL next
goes to the test part of
.UL do ,
.UL while
and
.UL repeat-until ,
and to the increment step of a
.UL for .
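.PP
A small sketch
(the array
.UL x
and the count
.UL npos
are assumed for illustration)
shows why the distinction matters:
.P1
npos = 0
for (i = 1; i <= n; i = i + 1) {
	if (x(i) <= 0.0)
		next
	npos = npos + 1
}
.P2
Because
.UL next
goes to the increment step,
.UL i
is still advanced when an element is skipped,
and the loop cannot stall on a non-positive value.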
.SH
``return'' Statement
.PP
The standard Fortran mechanism for returning a value from a function uses the name of the function as a variable which can be assigned to;
the last value stored in it 
is the function value upon return.
For example, here is a routine
.UL equal
which returns 1 if two arrays are identical,
and zero if they differ.
The array ends are marked by the special value \-1.
.P1 1
.ta .3i .6i .9i 1.2i 1.5i 1.8i
# equal _ compare str1 to str2;
#	return 1 if equal, 0 if not
	integer function equal(str1, str2)
	integer str1(100), str2(100)
	integer i

	for (i = 1; str1(i) == str2(i); i = i + 1)
		if (str1(i) == -1) {
			equal = 1
			return
		}
	equal = 0
	return
	end
.P2
.PP
In many languages (e.g., PL/I)
one instead says
.P1
return (\fIexpression\fP)
.P2
to return a value from a function.
Since this is often clearer, Ratfor provides such a
.UL return
statement _
in a function
.UL F ,
.UL return (expression)
is equivalent to
.P1
{ F = expression; return }
.P2
For example, here is
.UL equal
again:
.P1 1
.ta .3i .6i .9i 1.2i 1.5i 1.8i
# equal _ compare str1 to str2;
#	return 1 if equal, 0 if not
	integer function equal(str1, str2)
	integer str1(100), str2(100)
	integer i

	for (i = 1; str1(i) == str2(i); i = i + 1)
		if (str1(i) == -1)
			return(1)
	return(0)
	end
.P2
If there is no parenthesized expression after
.UL return ,
a normal
.UC RETURN 
is made.
(Another version of
.UL equal
is presented shortly.)
.SH
Cosmetics
.PP
As we said above,
the visual appearance of a language
has a substantial effect
on how easy it is to read and understand
programs.
Accordingly, Ratfor provides a number of cosmetic facilities
which may be used to make programs more readable.
.SH
Free-form Input
.PP
Statements can be placed anywhere on a line;
long statements are continued automatically,
as are long conditions in
.UL if ,
.UL while ,
.UL for ,
and
.UL until .
Blank lines are ignored.
Multiple statements may appear on one line,
if they are separated by semicolons.
No semicolon is needed at the end of a line,
if
Ratfor
can make some reasonable guess about whether the statement
ends there.
Lines ending with any of the characters
.P1
=    +    -    *    ,    |    &    (    \(ru
.P2
are assumed to be continued on the next line.
Underscores are discarded wherever they occur;
all others remain as part of the statement.
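.PP
For instance, in the following sketch
(all of the names are invented for the example)
the first line ends with a plus sign
and the third with an ampersand,
so each is continued without any explicit marker:
.P1
total = subtot + tax +
	ship
if (total > 100.0 &
    n > 0)
	call report(total)
.P2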
.PP
Any statement that begins with an all-numeric field is
assumed to be a Fortran label,
and placed in columns 1-5 upon output.
Thus
.P1
write(6, 100); 100 format("hello")
.P2
is converted into
.P1
	write(6, 100)
100	format(5hhello)
.P2
.SH
Translation Services
.PP
Text enclosed in matching single or double quotes
is converted to
.UL nH...
but is otherwise unaltered
(except for formatting _ it may get split across card boundaries
during the reformatting process).
Within quoted strings, the backslash `\e' serves as an escape character:
the next character is taken literally.
This provides a way to get quotes (and of course the backslash itself) into
quoted strings:
.P1
"\e\e\e\(fm"
.P2
is a string containing a backslash and an apostrophe.
(This is
.ul
not
the standard convention of doubled quotes,
but it is easier to use and more general.)
.PP
Any line that begins with the character `%'
is left absolutely unaltered  
except for stripping off the `%'
and moving the line one position to the left.
This is useful for inserting control cards,
and other things that should not be transmogrified
(like an existing Fortran program).
Use `%' only for ordinary statements,
not for the condition parts of
.UL if ,
.UL while ,
etc., or the output may come out in an unexpected place.
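.PP
As a small sketch (the statement itself is only an example),
a Fortran
.UC DATA
statement can be passed through with its own column layout intact:
.P1
%      data x /1.0/
.P2
After the `%' is stripped and the line moved left one position,
the statement begins in column 7, where Fortran expects it.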
.PP
The following character translations are made, 
except within single or double quotes
or on a line beginning with a `%'.
.P1
.ta .5i 1.5i 2i
==	.eq.	!=	.ne.
>	.gt.	>=	.ge.
<	.lt.	<=	.le.
&	.and.	|	.or.
!	.not.	^	.not.
.P2
In addition, the following translations are provided
for input devices with restricted character sets.
.P1
.ta .5i 1.5i 2i
[	{	]	}
$(	{	$)	}
.P2
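.PP
As an illustration of the operator translations
(the variable names are arbitrary),
the condition in
.P1
if (a >= b & !(c == d | n != 0))
	k = 1
.P2
would have its operators rewritten so that the Fortran compiler sees
.P1
a .ge. b .and. .not.(c .eq. d .or. n .ne. 0)
.P2
as the test;
the spacing and layout of the generated line are only suggestive.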
.SH
``define'' Statement
.PP
Any string of alphanumeric characters can be defined as a name;
thereafter, whenever that name occurs in the input
(delimited by non-alphanumerics)
it is replaced by the rest of the definition line.
(Comments and trailing white space are stripped off.)
A defined name can be arbitrarily long,
and must begin with a letter.
.PP
.UL define
is typically used to create symbolic parameters:
.P1
define	ROWS	100
define	COLS	50
.if t .sp 5p
dimension a(ROWS), b(ROWS, COLS)
.if t .sp 5p
	if (i > ROWS  \(or  j > COLS) ...
.P2
Alternatively, definitions may be written as
.P1
define(ROWS, 100)
.P2
In this case, the defining text is everything after the comma up to the balancing
right parenthesis;
this allows multi-line definitions.
.PP
It is generally a wise practice to use symbolic parameters
for most constants, to help make clear the function of what
would otherwise be mysterious numbers.
As an example, here is the routine
.UL equal 
again, this time with symbolic constants.
.P1 3
.ta .3i .6i .9i 1.2i 1.5i 1.8i
define	YES		1
define	NO		0
define	EOS		-1
define	ARB		100

# equal _ compare str1 to str2;
#	return YES if equal, NO if not
	integer function equal(str1, str2)
	integer str1(ARB), str2(ARB)
	integer i

	for (i = 1; str1(i) == str2(i); i = i + 1)
		if (str1(i) == EOS)
			return(YES)
	return(NO)
	end
.P2
.SH
``include'' Statement
.PP
The statement
.P1
	include file
.P2
inserts the file
found on input stream
.ul
file
into the
Ratfor
input in place of the
.UL include
statement.
The standard usage is to place 
.UC COMMON
blocks on a file,
and
.UL include
that file whenever a copy is needed:
.P1
subroutine x
	include commonblocks
	...
	end

subroutine y
	include commonblocks
	...
	end
.P2
This ensures that all copies of the 
.UC COMMON
blocks are identical.
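.PP
As a sketch, the file
.UL commonblocks
used above might contain nothing more than the declarations themselves;
the block name
.UL cparam
and the variables in it are hypothetical:
.P1
	common /cparam/ nrows, ncols
	integer nrows, ncols
.P2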
.SH
Pitfalls, Botches, Blemishes and other Failings
.PP
Ratfor catches certain syntax errors, such as missing braces,
.UL else
clauses without an
.UL if ,
and most errors involving missing parentheses in statements.
Beyond that, since Ratfor knows no Fortran, 
any errors you make will be reported by the Fortran compiler,
so you will from time to time have to relate a Fortran diagnostic back
to the Ratfor source.
.PP
Keywords are reserved _
using
.UL if ,
.UL else ,
etc., as variable names will typically wreak havoc.
Don't leave spaces in keywords.
Don't use the Arithmetic
.UC IF .
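.PP
A three-way branch that would be written in Fortran with an Arithmetic
.UC IF
can be expressed with
.UL if-else
instead.
In this sketch the variable names
.UL x
and
.UL k
are hypothetical:
.P1
	if (x < 0)
		k = -1
	else if (x == 0)
		k = 0
	else
		k = 1
.P2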
.PP
The Fortran
.UL nH
convention is not recognized anywhere by Ratfor;
use quotes instead.
.NH
IMPLEMENTATION
.PP
Ratfor
was originally written in
C [4]
on the
.UX
operating system [5].
The language is specified by a context free grammar
and the compiler constructed using
the
.UC YACC
compiler-compiler [6].
.PP
The
Ratfor
grammar is simple and straightforward, being essentially:
.P1
prog	: stat 
	| prog   stat
stat	: \f3if\fP (...) stat 
	| \f3if\fP (...) stat \f3else\fP stat
	| \f3while\fP (...) stat
	| \f3for\fP (...; ...; ...) stat
	| \f3do\fP ... stat
	| \f3repeat\fP stat
	| \f3repeat\fP stat \f3until\fP (...)
	| \f3switch\fP (...) { \f3case\fP ...: prog ...
			\f3default\fP: prog }
	| \f3return\fP
	| \f3break\fP
	| \f3next\fP
	| digits   stat
	| { prog }
	| anything unrecognizable
.P2
The observation
that
Ratfor
knows no Fortran
follows directly from the rule that says a statement is
``anything unrecognizable''.
In fact most of Fortran falls into this category,
since any statement that does not begin with one of the keywords
is by definition ``unrecognizable.''
.PP
Code generation is also simple.
If the first thing on a source line is
not a keyword
(like
.UL if ,
.UL else ,
etc.)
the entire statement is simply copied to the output
with appropriate character translation and formatting.
(Leading digits are treated as a label.)
Keywords cause only slightly more complicated actions.
For example, when
.UL if
is recognized, two consecutive labels L and L+1
are generated and the value of L is stacked.
The condition is then isolated, and the code
.P1
if (.not. (condition)) goto L
.P2
is output.
The 
.ul
statement
part of the
.UL if
is then translated.
When the end of the 
statement is encountered
(which may be some distance away and include nested \f3if\fP's, of course),
the code
.P1
L	continue
.P2
is generated, unless there is an
.UL else
clause, in which case
the code is
.P1
	goto L+1
L	continue
.P2
In this latter case,
the code
.P1
L+1	continue
.P2
is produced after the
.ul
statement
part of the
.UL else .
Code generation for the various loops is equally simple.
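.PP
To make the scheme concrete, here is a sketch of the translation
of a complete
.UL if-else ,
with 100 and 101 standing in for the generated labels L and L+1.
The label values and the variable names are arbitrary;
an actual implementation will choose its own labels and spacing.
.P1
	if (a > b)
		big = a
	else
		big = b
.P2
becomes
.P1
	if (.not. (a .gt. b)) goto 100
	big = a
	goto 101
100	continue
	big = b
101	continue
.P2
The loops can follow the same two-label pattern.
As one plausible rendering
(an assumption, not a transcript of actual output),
a loop such as
.P1
	while (i <= n)
		i = i + 1
.P2
could come out as
.P1
102	if (.not. (i .le. n)) goto 103
	i = i + 1
	goto 102
103	continue
.P2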
.PP
One might argue that more care should be taken
in code generation.
For example,
if there is no trailing
.UL else ,
.P1
	if (i > 0) x = a
.P2
should be left alone, not converted into
.P1
	if (.not. (i .gt. 0)) goto 100
	x = a
100	continue
.P2
But what are optimizing compilers for, if not to improve code?
It is a rare program indeed where this kind of ``inefficiency''
will make even a measurable difference.
In the few cases where it is important,
the offending lines can be protected by `%'.
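.PP
A protected version of the statement above might be written as follows.
This is a sketch that assumes the usual Fortran card layout,
with the statement starting in column 7 once the `%' has been removed;
since no character translation is done on such a line,
the relational operator must be spelled out in Fortran form.
.P1 0
%      if (i .gt. 0) x = a
.P2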
.PP
The use of a compiler-compiler is definitely the preferred method
of developing a language processor of this kind.
The language is well-defined,
with few syntactic irregularities.
Implementation is quite simple;
the original construction took under a week.
The language
is sufficiently simple, however, that an
.ul
ad hoc
recognizer can be readily constructed to do the same job
if no compiler-compiler is available.
.PP
The C version of 
Ratfor
is used on
.UX
and on the Honeywell
.UC GCOS
systems.
C compilers are not as widely available as Fortran, however,
so there is also a
Ratfor
written in itself
and originally bootstrapped with the C version.
The
Ratfor
version
was written so as to translate into the portable subset
of Fortran described in [1],
so it is portable,
having been run essentially without change
on at least twelve distinct machines.
(The main restrictions of the portable subset are:
only one character per machine word;
subscripts in the 
form
.ul
c*v\(+-c;
avoiding expressions in places like
.UC DO
loops;
consistency in subroutine argument usage,
and in 
.UC COMMON
declarations.
Ratfor
itself will not gratuitously generate non-standard Fortran.)
.PP
The
Ratfor
version is about 1500 lines of
Ratfor
(compared to about 1000 lines of C);
this compiles into 2500 lines of Fortran.
This expansion ratio is somewhat higher than average,
since the compiled code contains unnecessary occurrences
of
.UC COMMON
declarations.
The execution time of the
Ratfor
version is dominated by
two routines that read and write cards.
Clearly these routines could be replaced
by machine coded local versions;
unless this is done, the efficiency of other parts of the translation process
is largely irrelevant.
.NH
EXPERIENCE
.SH
Good Things
.PP
``It's
so much better than Fortran''
is the most common response of users
when asked how well
Ratfor
meets their needs.
Although cynics might consider this to be vacuous,
it does seem to be true that 
decent control flow and cosmetics convert Fortran
from a bad language into quite a reasonable one,
assuming that Fortran data structures are adequate
for the task at hand.
.PP
Although there are no quantitative results,
users feel that coding in
Ratfor
is at least twice as fast as in Fortran.
More important, debugging and subsequent revision
are much faster than in Fortran.
Partly this is simply because the code can be
.ul
read.
The looping statements
which test at the top instead of the bottom
seem to eliminate or at least
reduce the occurrence of a wide class of
boundary errors.
And of course it is easy to do structured programming in 
Ratfor;
this self-discipline also contributes
markedly to reliability.
.PP
One interesting and encouraging fact is that
programs written in
Ratfor
tend to be as readable as programs
written in more modern languages
like Pascal.
Once one is freed from the shackles of Fortran's
clerical detail and rigid input format,
it is easy to write code that is readable, even esthetically pleasing.
For example,
here is a
Ratfor
implementation of the linear table search discussed by
Knuth [7]:
.P1
A(m+1) = x
for (i = 1; A(i) != x; i = i + 1)
	;
if (i > m) {
	m = i
	B(i) = 1
}
else
	B(i) = B(i) + 1
.P2
A large corpus (5400 lines) of Ratfor, including a subset of
the Ratfor preprocessor itself,
can be found in
[8].
.SH
Bad Things
.PP
The biggest single problem is that many Fortran syntax errors
are not detected by
Ratfor
but by the local Fortran compiler.
The compiler then prints a message
in terms of the generated Fortran,
and in a few cases this may be difficult
to relate back to the offending
Ratfor
line,
especially if the implementation conceals the generated Fortran.
This problem could be dealt with
by tagging each generated line with some indication
of the source line that created it,
but this is inherently implementation-dependent,
so no action has yet been taken.
Error message interpretation
is actually not so arduous as might be thought.
Since Ratfor generates no variables,
only a simple pattern of
.UC IF 's
and
.UC GOTO 's,
data-related errors like missing
.UC DIMENSION
statements
are easy to find in the Fortran.
Furthermore, there has been a steady improvement
in Ratfor's ability to catch trivial syntactic
errors like unbalanced parentheses and quotes.
.PP
There are a number of implementation weaknesses
that are a nuisance, especially to new users.
For example,
keywords are reserved.
This rarely makes any difference, except for those hardy souls
who want to use an Arithmetic 
.UC IF .
A few standard Fortran
constructions are not accepted by 
Ratfor,
and this is perceived as a problem by users with a large corpus
of existing Fortran programs.
Protecting every line with a `%' is not really a
complete solution, although it serves as a stop-gap.
The best long-term solution is provided by the program
Struct [9],
which converts arbitrary Fortran programs into Ratfor.
.PP
Users who export programs often complain that the generated Fortran is
``unreadable'' because it is not 
tastefully formatted and contains extraneous
.UC CONTINUE
statements.
To some extent this can be ameliorated
(Ratfor now has an option to copy Ratfor comments into
the generated Fortran),
but it has always seemed that effort is better spent
on the input language than on the output esthetics.
.PP
One final problem is partly attributable to success _
since Ratfor is relatively easy to modify,
there are now several dialects of Ratfor.
Fortunately, so far most of the differences are in character set,
or in invisible aspects like code generation.
.NH
CONCLUSIONS
.PP
Ratfor
demonstrates that with modest effort
it is possible to convert Fortran
from a bad language into quite a good one.
A preprocessor 
is clearly a useful way to extend or ameliorate
the facilities of a base language.
.PP
When designing a language,
it is important to concentrate on
the essential requirement of providing
the user with the best language possible
for a given effort.
One must avoid throwing in
``features'' _
things which the user may trivially construct within the existing
framework.
.PP
One must also avoid getting sidetracked on irrelevancies.
For instance it seems pointless for
Ratfor
to prepare a neatly formatted
listing of either its input or its output.
The user is presumably capable of the self-discipline required
to prepare neat input
that reflects his thoughts.
It is much more important that the language provide free-form input
so he
.ul
can
format it neatly.
No one should read the output anyway
except in the most dire circumstances.
.SH
Acknowledgements
.PP
C. A. R. Hoare
once said that
``One thing [the language designer] should not do
is to include untried ideas of his own.''
Ratfor
follows this precept very closely _
everything in it has been stolen from someone else.
Most of the control flow structures
are taken directly from the language C [4]
developed by Dennis Ritchie;
the comment and continuation
conventions are adapted from Altran [10].
.PP
I am grateful to Stuart Feldman,
whose patient simulation of an innocent user
during the early days of Ratfor
led to several design improvements
and the eradication of bugs.
He also translated the C parse-tables
and
.UC YACC 
parser
into Fortran for the
first
Ratfor
version of
Ratfor.
.bp
.SH
Appendix: Usage on
.UX .
.PP
Beware _
local customs vary.
Check with a native before going into the jungle.
.PP
The program
.UL ratfor
is the basic translator; it takes either a list
of file names or the standard input and writes
Fortran on the standard output.
Options include
.UL \-6x ,
which uses
.UL x
as a continuation character in column 6
.UC UNIX "" (
uses 
.UL & 
in column 1),
and
.UL \-C ,
which causes Ratfor comments to be copied into
the generated Fortran.
.PP
The program
.UL rc
provides an interface to the
.UL ratfor
command which is much the same as
.UL cc .
Thus
.P1
rc [options] files
.P2
compiles the files specified by
.UL files .
Files with names ending in
.UL \&.r
are Ratfor source; other files are assumed to
be for the loader.
The flags
.UL \-C 
and
.UL \-6x
described above are recognized, as are
.P1
-c	compile only; don't load
-f	save intermediate Fortran .f files
-r	Ratfor only; implies -c and -f
-2	use big Fortran compiler
		(for large programs)
-U	flag undeclared variables
		(not universally available)
.P2
Other flags are passed on to the loader.
.sp 10i
.SH
References
.IP [1]
B. G. Ryder,
``The PFORT Verifier,''
.ul
Software_Practice & Experience,
October 1974.
.IP [2]
American National Standard Fortran.
American National Standards Institute,
New York, 1966.
.IP [3]
.ul
For-word: Fortran Development Newsletter,
August 1975.
.IP [4]
B. W. Kernighan and D. M. Ritchie,
.ul
The C Programming Language,
Prentice-Hall, Inc., 1978.
.IP [5]
D. M. Ritchie and K. L. Thompson,
``The UNIX Time-sharing System.''
\fICACM\fP, July 1974.
.IP [6]
S. C. Johnson,
``YACC _ Yet Another Compiler-Compiler.''
Bell Laboratories Computing Science Technical Report #32,
1978.
.IP [7]
D. E. Knuth,
``Structured Programming with goto Statements.''
\fIComputing Surveys\fP, December 1974.
.IP [8]
B. W. Kernighan and P. J. Plauger,
.ul
Software Tools,
Addison-Wesley, 1976.
.IP [9]
B. S. Baker,
``Struct _ A Program which Structures Fortran,''
Bell Laboratories internal memorandum, December 1975.
.IP [10]
A. D. Hall,
``The Altran System for Rational Function Manipulation _
A Survey.''
\fICACM\fP, August 1971.
.sp
.I "May 1979"