[TUHS] B compiler restored
aap at papnet.eu
Thu Jul 13 04:09:46 AEST 2023
So there's been quite a bit of talk about B recently (mostly from my
side) and right now I feel that I've reach an interesting enough
milestone to warrant a separate thread for this here.
First of all, I want to stress that this is still WIP,
but everything can be found here now:
In this repo you will find the following:
- bc and ba that can build themselves
(I've included .s files so everything can be bootstrapped
- libb and bilib in source form from object/library and binary files of
the s2 tape
- brt1 and brt2 restored from binary files of the s2 tape
- olibb, obilib and obrt1, older versions of the above
- a version of ba that does not generate threaded code but an
interpreted code more like the pdp-7 code.
ken told me such a thing existed at one point and indeed it is
the only way to fit the compiler into 8kb/4kw
- an implementation of this interpreted code. With this bc and ba
fit into 8kb
Note I have only tested this under apout so far. The version I used 
needed two tweaks, but see my README.
With this I was able to build the recently reversed B programs  and
produce exact matches to the originals (modulo assembler differences).
In that process I found a few mistakes I made, now the programs are
I want to thank everyone who was of help in this endeavour in one way or
Ken Thompson, Phil Budne, Robert Swierczek, Steve Johnson, Warren Toomey
What's left to do now is to actually run this under UNIX v1 proper,
preferably even on a real machine. I've been too lazy for that so far.
Also there are inaccuracies and unknowns in the compiler and assembler.
Right now the intermediate code is a binary code that's easy to generate
and to parse, but if I understood ken correctly the intermediate code
was more like something the PDP-7 assembler could deal with.
I'm also rather unsure how to handle the conditional ?: operator. The
printf.o file shows that it produces labels that are in line with all
the other labels. Now the C compiler uses labels starting at L10000
for the ones generated in the second pass. So it *feels* like the
conditional should be generated by bc directly and not by ba but this
leads to other problems, which I won't go into detail now.
Finally the code should probably be a bit closer to the C compiler than
it currently is.
More information about the TUHS