[TUHS] yacc for Unix v5

scj at yaccman.com scj at yaccman.com
Sun Sep 28 14:48:27 AEST 2014


> I've been looking into the history of yacc (yet another compiler
> compiler).    ....


The Yacc history was not particularly clean in the early days.  When Unix
came along, I inherited a B compiler from Dennis for the GE (later
Honeywell) mainframe at Bell Labs.  It was a recursive descent compiler
for statements with a bolted on precedence mechanism for expressions.  I
wanted to add some features (notably, some way to do exclusive OR). 
Dennis and I settled on the ^ character and after a lot of sweat I got it
to work on the GE, and Dennis put the change in the Unix B as well.  But
it was very painful.

There was a compiler/compiler in use at the Labs, imported I think by Doug
McIlroy, called TMG.  It was a backtracking recursive descent compiler
that could compile almost anything but was unbearably slow and, if it
encountered a syntax error would backtrack back to the first character of
the input file and say, in effect, "something's wrong in here somewhere".

Al Aho overheard some of the discussions about adding the operator and the
problems with TMG, and suggested that we explore this new LR parsing that
Knuth had invented a few yuears earlier.  Al offered to build a table to
handle the expressions in B, and I took him up on the offer and gave him
an expression grammar with about 35 rules.  Al labored by hand for two
days, and gave me a table.  I typed it in and it didn't work.  We debugged
it, and Al spent another handful of hours and gave me another table.  It
was better, but still didn't work.  After a couple of more go-arounds, I
said "maybe you could show me what you are doing, and I could get the
computer to do it."  And Yacc was born.

The original version of Yacc, in B, was a huge CPU hog, and couldn't do
very large grammars.  I remember it once taking 20 minutes to process a
grammar.  And it brought the Unix system to its knees.  So after the first
version, most of the activity was in improving the algorithm and data
structure, and overlaying temporary arrays so we could do bigger grammars
on the PDP-11.  I estimated that I sped Yacc up by a factor of 10,000 in
its first several years.

In the middle of this work, I spent a 9-month sabbatical at the University
of Waterloo in Canada.  I did some work on Yacc while there, but when I
returned to Bell Labs, I discovered that B had been supplanted by C, and
an MIT co-op student had converted the B compiler to a C compiler, and had
also translated Yacc from B to C.  He had also made some substantial
changes in the code generation algorithm, improving much of the code but
also introducing a bunch of fairly obscure bugs.

One thing about the early Unix releases. Most of the work of preparing the
releases was done by a very small number of people (1-3).  Unix in those
days operated in a state of continuous development and integration.  For
example, Dennis did most of his compiler work between midnight and 4 AM. 
We would come in almost every day to find a new C compiler, with
yesterday's compiler tucked away in a known place.  Most of the time we
never noticed.  But if there was a problem we sent email to Dennis and
used the previous compiler.  When Dennis came in after lunch, the bug was
usually fixed in an hour or two.  The rest of us operated in a similar
fashion.

So the decision of what source code for Yacc or PCC was copied into a
given release was often invisible to the author of the code.  The only
discussion I remember relating to the code I had written was a lunchtime
discussion about whether the lawyers would let us distribute PCC.  This
was resolved by putting it in the release and not saying anything and
nothing happened...





More information about the TUHS mailing list