
.NH
If; relational operators; compound statements
.PP
The basic conditional-testing statement in C
is the
.UL if
statement:
.E1
c = getchar( );
if( c \*= '?' )
	printf("why did you type a question mark?\\n");
.E2
The simplest form of
.UL if
is
.E1
if (expression) statement
.E2
.PP
The condition to be tested is any expression enclosed in parentheses.
It is followed by a statement.
The expression is evaluated, and if its value is non-zero,
the statement is executed.
There's an
optional
.UL else
clause, to be described soon.
.PP
The character sequence `==' is one of the relational operators in C;
here is the complete set:
.E1
\*=	equal to (\*.EQ\*. to Fortraners)
!=	not equal to
>	greater than
<	less than
>=	greater than or equal to
<=	less than or equal to
.E2
.PP
The value of
.UL ``expression
.UL relation
.UL expression''
is 1 if the relation is true,
and 0 if false.
Don't forget that the equality test is `==';
a single `=' causes an assignment, not a test,
and invariably leads to disaster.
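.PP
A minimal sketch of both points, using a hypothetical function
.UL isquestion
(the name is ours, chosen only for illustration).
It returns the value of the relational expression itself,
and the comment spells out what the single `=' would have done.

```c
/* Returns the value of the relational expression itself. */
int isquestion(int c)
{
    return c == '?';    /* 1 if c is a question mark, 0 otherwise */
}

/*
 * The pitfall: writing  if (c = '?')  would assign '?' to c and then
 * test the assigned value, which is non-zero, so the test would
 * always succeed no matter what c held before.
 */
```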
.PP
Tests can be combined with the operators 
.UL `&&'
.UC (AND),
.UL `\*|'
.UC (OR),
and
.UL `!'
.UC (NOT).
For example, we can test whether a character is blank or tab or newline
with
.E1
if( c\*=' ' \*| c\*='\\t' \*| c\*='\\n' ) \*.\*.\*.
.E2
C guarantees that
.UL `&&'
and
.UL `\*|'
are evaluated left to right _
we shall soon see cases where this matters.
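.PP
One such case, sketched with a hypothetical function
.UL ratioexceeds
(the name and its parameters are assumptions made for this example).
The left operand of `&&' is evaluated first,
and when it is false the right operand is not evaluated at all,
so the division is never attempted with a zero divisor.

```c
/* Returns 1 if a/b is greater than limit, 0 otherwise (including b == 0). */
int ratioexceeds(int a, int b, int limit)
{
    /* b != 0 is tested first; a / b is evaluated only if that test is true */
    return b != 0 && a / b > limit;
}
```

Reversing the two operands would divide before checking, which is exactly the mistake the ordering guarantee lets us avoid.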
.PP
One of the nice things about C is that the
.UL statement
part of an 
.UL if
can be made arbitrarily complicated
by enclosing a set of statements
in {}.
As a simple example,
suppose we want to ensure that
.UL a
is bigger than
.UL b,
as part of a sort routine.
The interchange of
.UL a
and
.UL b
takes three statements in C,
grouped together by {}:
.E1
.ne 5
if (a < b) {
	t = a;
	a = b;
	b = t;
}
.E2
.PP
As a general rule in C, anywhere you can use a simple statement,
you can use any compound statement, which is just a number of simple
or compound ones enclosed in {}.
There is no semicolon after the } of a compound statement,
but there
.ul
is
a semicolon after the last non-compound statement inside the {}.
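.PP
As a sketch, the interchange can be wrapped in a hypothetical function
.UL gap
(our name, not part of C),
which orders its two arguments and returns their difference.
Note where the semicolons fall.

```c
/* Returns the non-negative difference between a and b. */
int gap(int a, int b)
{
    int t;

    if (a < b) {        /* make sure a is the larger of the two */
        t = a;
        a = b;
        b = t;          /* semicolon after the last simple statement */
    }                   /* no semicolon after the closing brace */
    return a - b;
}
```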
.PP
The ability to replace single statements by complex ones at will
is one feature that makes C much more pleasant to use than Fortran.
Logic (like the exchange in the previous example) which would require several GOTO's and labels in Fortran can
and should
be
done in C without any, using compound statements.
.NH
While Statement; Assignment within an Expression; Null Statement
.PP
The basic looping mechanism in C is the
.UL while
statement.
Here's a program that copies its input to its output
a character at a time.
Remember that `\\0' marks the end of file.
.E1
main(~) {
	char c;
	while( (c=getchar(~)) != '\\0' )
		putchar(c);
}
.E2
The
.UL while
statement is a loop, whose general form
is
.E1
while (expression) statement
.E2
Its meaning is
.E1
(a) evaluate the expression
(b) if its value is true (i\*.e\*., not zero)
		do the statement, and go back to (a)
.E2
Because the expression is tested before the statement
is executed,
the statement part can be executed zero times,
which is often desirable.
As in the
.UL if
statement, the expression and the statement can both be
arbitrarily complicated, although we haven't seen that yet.
Our example gets the character,
assigns it to
.UL c,
and then tests if it's a `\\0'.
If it is not a `\\0',
the statement part of the 
.UL while 
is executed,
printing the character.
The
.UL while
then repeats.
When the input character is finally a `\\0',
the
.UL while
terminates,
and so does
.UL main\*.
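.PP
To see why a loop that can run zero times is useful,
here is a small sketch that does not depend on any input.
The function
.UL ndigits
is hypothetical (the name is ours) and assumes its argument is not negative.

```c
/* Counts the decimal digits of n; assumes n is not negative. */
int ndigits(int n)
{
    int count;

    count = 0;
    while (n > 0) {     /* tested before the body, so n == 0 skips the loop */
        n = n / 10;
        count = count + 1;
    }
    return count;
}
```

With an argument of 0 the test fails the first time and the body is never executed, so the count stays at 0.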
.PP
Notice that we used an assignment statement
.E1
c = getchar(~)
.E2
within an expression.
This is a handy notational shortcut which often produces clearer code.
(In fact it is often the only way to write the code cleanly.
As an exercise, re-write the file-copy without
using an assignment inside an expression.)
It works because an assignment statement has a value, just as any
other expression does.
Its value is the value of the right hand side.
This also implies that we can use multiple assignments like
.E1
x = y = z = 0;
.E2
Evaluation goes from right to left.
.PP
By the way, the extra parentheses in the assignment statement
within the conditional were really necessary:
if we had said
.E1
c = getchar(~) != '\\0'
.E2
.UL c
would be set to 0 or 1 depending on whether the character fetched
was an end of file or not.
This is because `!=' has higher precedence than the assignment operator `=':
in the absence of parentheses the comparison is done first, and its result is what gets assigned.
When in doubt, or even if not,
parenthesize.
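.PP
The difference can be sketched with two hypothetical functions,
.UL keepchar
and
.UL flagonly
(both names are ours).
The parameter
.UL ch
stands in for the value that
.UL getchar
would return, so that the sketch needs no input.

```c
/* With the extra parentheses: c receives the character itself. */
int keepchar(int ch)
{
    int c;

    if ((c = ch) != '\0')   /* c holds ch; the comparison yields 0 or 1 */
        return c;           /* an ordinary character */
    return 0;               /* the end-of-file marker */
}

/* Without them: c receives only the result of the comparison. */
int flagonly(int ch)
{
    int c;

    c = ch != '\0';         /* parsed as c = (ch != '\0') */
    return c;
}
```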
.PP
Since
.UL putchar(c)
returns
.UL c
as its function value,
we could also copy the input to the output by nesting the calls to
.UL getchar
and
.UL putchar:
.E1
main(~) {
	while( putchar(getchar(~)) != '\\0' ) ;
}
.E2
What statement is being repeated?~
None, or technically, the
.ul
null
statement,
because all the work is really done within the test part of the
.UL while\*.
This version is slightly different from the previous one,
because the final `\\0' is copied to the output
before we decide to stop.
.NH
Arithmetic
.PP
The arithmetic operators are the usual `+', `\(mi', `*', and `/'
(truncating integer division if the operands are
both
.UL int),
and the remainder or mod operator `%':
.E1
x = a%b;
.E2
sets
.UL x
to the remainder after
.UL a
is divided by
.UL b
(i.e.,
.UL a
.UL mod
.UL b)\*.
The results are machine dependent unless
.UL a
and
.UL b
are both positive.
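.PP
For positive operands the quotient and remainder fit together in the obvious way:
17/5 is 3 and 17%5 is 2, and 3*5+2 gives back 17.
A minimal sketch, using a hypothetical function
.UL rebuild
(our name) that assumes its first argument is not negative and its second is positive:

```c
/* Rebuilds a from its quotient and remainder; assumes a >= 0 and b > 0. */
int rebuild(int a, int b)
{
    int q, r;

    q = a / b;          /* truncating integer division */
    r = a % b;          /* remainder, 0 <= r < b for these operands */
    return q * b + r;   /* equals a */
}
```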
.PP
In arithmetic,
.UL char
variables can usually be treated like
.UL int
variables.
Arithmetic on characters
is quite legal, and often makes sense:
.E1
c = c + 'A' - 'a';
.E2
converts a single lower-case ASCII character stored in
.UL c
to upper case,
making use of the fact that
corresponding ASCII letters are a fixed distance apart.
The rule governing this arithmetic is that all
.UL chars
are converted to
.UL int
before the arithmetic is done.
Beware that conversion may involve sign-extension _
if the leftmost bit of a character is 1,
the resulting integer might be negative.
(This doesn't happen with genuine characters on any current machine.)
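.PP
If a non-negative value is wanted regardless of the leftmost bit,
one way is to convert through the type
.UL unsigned
.UL char,
a facility of later C that this tutorial does not describe.
The sketch below assumes 8-bit characters; the function name
.UL charvalue
is ours.

```c
/* Returns the value of c as a non-negative integer. */
int charvalue(char c)
{
    /* converting through unsigned char avoids sign-extension */
    return (unsigned char) c;
}
```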
.PP
So to convert a file into lower case:
.E1
main( ) {
	char c;
	while( (c=getchar( )) != '\\0' )
		if( 'A'<=c && c<='Z' )
			putchar(c+'a'-'A');
		else
			putchar(c);
}
.E2
Characters have different sizes on different machines.
Further, this code won't work
on an IBM machine,
because the letters
in the EBCDIC alphabet are not contiguous.
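.PP
A version of the conversion that does not depend on the letters being contiguous
can be sketched with the library routines
.UL isupper
and
.UL tolower
from the header <ctype.h> of later C.
These routines are not described in this tutorial,
and the function name
.UL lower
is ours.

```c
#include <ctype.h>

/* Returns the lower-case form of c if c is an upper-case letter, else c. */
int lower(int c)
{
    if (isupper(c))         /* true only for upper-case letters */
        return tolower(c);
    return c;
}
```

The loop in the program above would then call putchar(lower(c)) for every character.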
.NH
Else Clause; Conditional Expressions
.PP
We just used an
.UL else
after an 
.UL if\*.
The most general form of
.UL if
is
.E1
if (expression) statement1 else statement2
.E2
The
.UL else
part is optional, but often useful.
The canonical example sets
.UL x
to the
minimum of
.UL a
and
.UL b:
.E1
.ne 4
if (a < b)
	x = a;
else
	x = b;
.E2
Observe that there's a semicolon after
.UL x=a\*.
.PP
C provides an alternate form of conditional which is often more
concise.
It is called the ``conditional expression'' because it is a conditional
which actually has a value
and can be used anywhere an expression can.
The value of
.E1
a<b ? a : b
.E2
is
.UL a
if
.UL a
is less than
.UL b;
it is
.UL b
otherwise.
In general, the form
.E1
expr1 ? expr2 : expr3
.E2
means
``evaluate
.UL expr1\*.
If it is not zero, the value of the whole thing is
.UL expr2;
otherwise the value is
.UL expr3\*.''
.PP
To set
.UL x
to the minimum of
.UL a
and
.UL b,
then:
.E1
x = (a<b ? a : b);
.E2
The parentheses aren't necessary because 
.UL `?:'
is evaluated before `=',
but safety first.
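.PP
Because the conditional expression has a value,
it can also appear directly in a
.UL return
statement.
A short sketch with two hypothetical functions,
.UL minof
and
.UL absof
(both names are ours):

```c
/* Returns the smaller of a and b. */
int minof(int a, int b)
{
    return a < b ? a : b;
}

/* Returns the absolute value of x. */
int absof(int x)
{
    return x < 0 ? -x : x;
}
```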
.PP
Going a step further,
we could write the loop in the lower-case program as
.E1
while( (c=getchar( )) != '\\0' )
	putchar( ('A'<=c && c<='Z') ? c-'A'+'a' : c );
.E2
.PP
.UL If's
and
.UL else's
can be used to construct logic that
branches one of several ways and then rejoins, a common
programming structure, in this way:
.E1
if(\*.\*.\*.)
	{\*.\*.\*.}
else if(\*.\*.\*.)
	{\*.\*.\*.}
else if(\*.\*.\*.)
	{\*.\*.\*.}
else
	{\*.\*.\*.}
.E2
The conditions are tested in order,
and exactly one block is executed _ either the first one whose
.UL if
is satisfied,
or the one for the last
.UL else\*.
When this block is finished,
the next statement executed is the one after the last
.UL else\*.
If no action is to be taken for the ``default'' case,
omit
the last
.UL else\*.
.PP
For example, to count letters, digits and others in a file,
we could write
.E1
.ne 9
main( ) {
	int let, dig, other, c;
	let = dig = other = 0;
	while( (c=getchar( )) != '\\0' )
		if( ('A'<=c && c<='Z') \*| ('a'<=c && c<='z') )  \*+let;
		else if( '0'<=c && c<='9' )  \*+dig;
		else  \*+other;
	printf("%d letters, %d digits, %d others\\n", let, dig, other);
}
.E2
The
`++' operator means
``increment
by 1'';
we will get to it in the next section.
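.PP
The same chain of tests can be sketched as a function that classifies a single character,
which makes the one-branch-only behavior easy to check.
The function
.UL classify
and its return codes are assumptions made for this example;
like the counting program, it assumes the ASCII character set.

```c
/* Returns 'L' for a letter, 'D' for a digit, 'O' for anything else.
   Assumes ASCII, as the counting program does. */
int classify(int c)
{
    if (('A' <= c && c <= 'Z') || ('a' <= c && c <= 'z'))
        return 'L';
    else if ('0' <= c && c <= '9')
        return 'D';
    else
        return 'O';         /* the default case */
}
```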