[TUHS] B compiler restored

Angelo Papenhoff aap at papnet.eu
Thu Jul 13 04:09:46 AEST 2023


So there's been quite a bit of talk about B recently (mostly from my
side) and right now I feel that I've reached an interesting enough
milestone to warrant a separate thread for this here.

First of all, I want to stress that this is still WIP,
but everything can be found here now:
https://github.com/aap/b/tree/master/unix1_bdir

In this repo you will find the following:
- bc and ba that can build themselves
	(I've included .s files so everything can be bootstrapped
	easily).
- libb and bilib in source form, restored from object/library and
	binary files of the s2 tape
- brt1 and brt2 restored from binary files of the s2 tape
- olibb, obilib and obrt1, older versions of the above
- a version of ba that does not generate threaded code but an
	interpreted code more like the pdp-7 code.
	ken told me such a thing existed at one point and indeed it is
	the only way to fit the compiler into 8kb/4kw
- an implementation of this interpreted code. With this bc and ba
	fit into 8kb
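
To illustrate the difference between the two code styles mentioned above,
here is a minimal sketch in Python. It is a toy stack machine, not the
actual encoding used by ba: the operations, the one-byte opcodes and the
assumption of 2 bytes per threaded cell are all invented for illustration.
Threaded code is a sequence of routine addresses, while interpreted code
is a sequence of small opcodes looked up in a table.

```python
# Toy stack machine contrasting threaded code with interpreted code.
# All names and encodings here are hypothetical, not taken from ba.

class VM:
    def __init__(self, code):
        self.code = code
        self.pc = 0
        self.stack = []

def op_push(vm):
    # The operand is stored in the cell following the operation.
    vm.stack.append(vm.code[vm.pc])
    vm.pc += 1

def op_add(vm):
    b = vm.stack.pop()
    a = vm.stack.pop()
    vm.stack.append(a + b)

def op_mul(vm):
    b = vm.stack.pop()
    a = vm.stack.pop()
    vm.stack.append(a * b)

def op_halt(vm):
    vm.pc = None

def run_threaded(code):
    """Each operation cell holds a reference to the routine itself
    (an address on a real machine), so dispatch is a plain call."""
    vm = VM(code)
    while vm.pc is not None:
        routine = vm.code[vm.pc]
        vm.pc += 1
        routine(vm)
    return vm.stack.pop()

# Opcode table for the interpreted variant: opcode n selects OPS[n].
OPS = [op_push, op_add, op_mul, op_halt]
PUSH, ADD, MUL, HALT = range(4)

def run_interpreted(code):
    """Each operation cell holds a small opcode that the interpreter
    looks up in a table before calling the routine."""
    vm = VM(code)
    while vm.pc is not None:
        opcode = vm.code[vm.pc]
        vm.pc += 1
        OPS[opcode](vm)
    return vm.stack.pop()

# (2 + 3) * 4 in both representations
threaded = [op_push, 2, op_push, 3, op_add, op_push, 4, op_mul, op_halt]
interpreted = bytes([PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT])

if __name__ == "__main__":
    print(run_threaded(threaded), run_interpreted(interpreted))
    # Assumed sizes: 2 bytes per threaded cell, 1 byte per opcode cell.
    print(len(threaded) * 2, "bytes vs", len(interpreted), "bytes")
```

Under these assumed cell sizes the interpreted form is half as large, at
the cost of an extra table lookup per operation. The real saving depends
on the actual encoding, which this sketch does not attempt to reproduce.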

Note I have only tested this under apout so far. The version I used [1]
needed two tweaks, but see my README.

With this I was able to build the recently reversed B programs [2] and
produce exact matches to the originals (modulo assembler differences).
In the process I found and fixed a few mistakes I had made; the programs
are now exact.

I want to thank everyone who was of help in this endeavour in one way or
another:
Ken Thompson, Phil Budne, Robert Swierczek, Steve Johnson, Warren Toomey

What's left to do now is to actually run this under UNIX v1 proper,
preferably even on a real machine. I've been too lazy for that so far.

Also there are inaccuracies and unknowns in the compiler and assembler.
Right now the intermediate code is a binary code that's easy to generate
and to parse, but if I understood ken correctly the intermediate code
was more like something the PDP-7 assembler could deal with.

I'm also rather unsure how to handle the conditional ?: operator. The
printf.o file shows that it produces labels that are in line with all
the other labels. Now the C compiler uses labels starting at L10000
for the ones generated in the second pass.  So it *feels* like the
conditional should be generated by bc directly and not by ba but this
leads to other problems, which I won't go into now.
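
The label question can be made concrete with a small sketch in Python.
It is purely illustrative: the pseudo-assembly mnemonics, the
LabelAllocator class and the first-pass starting value of 1 are
assumptions, and only the L10000 base comes from the C compiler behaviour
described above. It shows why labels that are "in line" with the others
suggest the conditional was expanded by the pass that owns the main label
counter.

```python
import itertools

class LabelAllocator:
    """Hands out numbered labels L<n>, L<n+1>, ... (hypothetical helper)."""
    def __init__(self, start):
        self._counter = itertools.count(start)

    def new(self):
        return "L%d" % next(self._counter)

def emit_conditional(labels, cond, if_true, if_false):
    """Return pseudo-assembly for `cond ? if_true : if_false`.
    The mnemonics are invented for this sketch."""
    else_label = labels.new()
    end_label = labels.new()
    return [
        "\teval " + cond,
        "\tbranch-if-zero " + else_label,
        "\teval " + if_true,
        "\tbranch " + end_label,
        else_label + ":",
        "\teval " + if_false,
        end_label + ":",
    ]

# Option A: the first pass expands ?: itself, so its labels continue the
# same sequence as every other label (starting value 1 is an assumption).
first_pass = LabelAllocator(1)
earlier_label = first_pass.new()   # stands in for a label emitted earlier
option_a = emit_conditional(first_pass, "c", "a", "b")

# Option B: the second pass expands ?: with its own counter, as the
# C compiler does with labels starting at L10000.
second_pass = LabelAllocator(10000)
option_b = emit_conditional(second_pass, "c", "a", "b")

if __name__ == "__main__":
    print("\n".join(option_a))
    print("\n".join(option_b))
```

With option A the conditional's labels directly follow the earlier one,
which is the pattern described for printf.o; with option B they would
stand out as a separate range.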

Finally the code should probably be a bit closer to the C compiler than
it currently is.

Cheers,
Angelo

[1] https://github.com/philbudne/pdp11-B/tree/pb/tools/apout
[2] http://squoze.net/B/programs/

