[TUHS] Yacc binary on 4th edition tape
Paul Ruizendaal via TUHS
tuhs at tuhs.org
Wed Jan 21 16:57:26 AEST 2026
> Alan Snyder was at Bell labs in this time period, and brought C and yacc
> back to MIT. The input file to his yacc looks rather archaic, perhaps
> influenced by the B yacc? I asked Johnson about it, but he didn't
> recognize it. Here's an example, the grammar for Snyder's C compiler.
>
> https://github.com/PDP-10/its/blob/master/src/c/c.grammr
That is an interesting link. It uses some of the syntax that is mentioned as deprecated in the Johnson Yacc paper on your site:
<quote>
This appendix mentions synonyms and features which are supported for historical continuity, but, for various reasons, are not encouraged.
1. Literals may be delimited by double quotes ‘‘"’’ as well as single quotes ‘‘'’’.
2. Literals may be more than one character long. If all the characters are alphabetic, numeric, or _, the type number of the literal is defined, just as if the literal did not have the quotes around it. Otherwise, it is difficult to find the value for such literals.
3. The use of multi-character literals is likely to mislead those unfamiliar with Yacc, since it suggests that Yacc is doing a job which must be actually done by the lexical analyzer.
4. Most places where % is legal, backslash ‘‘\’’ may be used. In particular, \\ is the same as %%, \left the same as %left, etc.
There are a number of other synonyms:
%< is the same as %left
%> is the same as %right
%binary and %2 are the same as %nonassoc
%0 and %term are the same as %token
%= is the same as %prec
5. The curly braces ‘‘{’’ and ‘‘}’’ around an action are optional if the action consists of a single C statement. (They are always required in Ratfor).
</quote>
These old forms are still accepted by the Yacc in the surviving V6 source files. The Snyder grammar uses backslash instead of percent and also the short forms (e.g. “\>” for “%right”). It does not put “\0” in front of the token definitions, and it does use multi-character literals. When I have time, I will attempt to disassemble/reverse the file “y2.c” from the 1974 binary as well (this has the lexer / parser part of yacc). This should give a view on Yacc grammar in mid-1974.
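To make the synonym list quoted above concrete, here is a rough sketch of it as a translation table, in Python. The names SYNONYMS, DIRECTIVE and modernize are invented for this illustration; this is not how Yacc itself processes these forms, and the sketch makes no attempt to skip C actions or quoted literals properly (it only looks at whitespace-delimited words).

```python
import re

# Synonyms from the appendix of the Johnson Yacc paper, keyed by their
# '%' spelling. A leading backslash is turned into '%' before lookup.
SYNONYMS = {
    "%<": "%left",
    "%>": "%right",
    "%binary": "%nonassoc",
    "%2": "%nonassoc",
    "%0": "%token",
    "%term": "%token",
    "%=": "%prec",
}

# Assumption: a directive is a whitespace-delimited word that starts with
# '%' or '\' and continues with '%', '\', one of < > =, or alphanumerics.
DIRECTIVE = re.compile(r"(?<!\S)[%\\](?:[%\\]|[<>=]|[A-Za-z0-9]+)(?!\S)")


def modernize(line):
    """Rewrite historical directive spellings on one line to the % forms."""

    def fix(match):
        word = "%" + match.group(0)[1:]   # '\left' -> '%left', '\>' -> '%>'
        if word[1] == "\\":               # '\\' (or '%\') is the '%%' separator
            word = "%%"
        return SYNONYMS.get(word, word)   # unknown words pass through

    return DIRECTIVE.sub(fix, line)


if __name__ == "__main__":
    for old in ["\\> '='", "%term NAME NUMBER", "\\\\"]:
        print(old, "=>", modernize(old))
```

With this, a line such as "\> '='" comes out as "%right '='", which is the kind of short form the Snyder grammar uses.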
As to the grammar itself, I am a little confused by the single-letter tokens ‘l’ through ‘s’, which don’t appear to be used in the grammar, and I’m intrigued by the use of the name “.expression” for a rule to allow empty expressions in the FOR statement: the Eh grammar from Waterloo uses that name as well for this purpose (suggesting a common root).
===
I was focused on the optimizer earlier and missed two relevant files from the PWB1 yacc source tree:
https://www.tuhs.org/cgi-bin/utree.pl?file=PWB1/sys/source/s2/yacc.d/INDEX
https://www.tuhs.org/cgi-bin/utree.pl?file=PWB1/sys/source/s2/yacc.d/yaccdiffs
“The archive file contains information for testing and installing Version 2 of Yacc.”
So at the time the two were seen as Yacc 1 and Yacc 2, perhaps not surprising considering the material improvement in performance (much smaller tables, much faster parsing). The earlier Johnson papers appear to talk about Yacc 1 and the later ones about Yacc 2. The source in the V6 tree is Yacc 1 and the source in the PWB1 tree is Yacc 2.