[TUHS] Software written in B

Phil Budne phil at ultimate.com
Thu Jun 8 13:31:53 AEST 2023


An unpleasant aspect of B on the PDP-11 seemed to be that data
addresses were stored as "word addresses" (divided by two).  Address
"fix ups" were done before starting any user or other run-time code.
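
To make the "divided by two" concrete, here is a minimal sketch in
Python of the conversion between byte and word addresses.  The helper
names are mine and purely illustrative; nothing here is taken from the
B runtime itself.

```python
# Illustrative only: these helper names are hypothetical, not from brt1/brt2.

def byte_to_word(byte_addr):
    """Convert an even PDP-11 byte address to a word address (divide by two)."""
    if byte_addr % 2:
        raise ValueError("only even byte addresses have a word address")
    return byte_addr >> 1

def word_to_byte(word_addr):
    """Convert a word address back to a 16-bit byte address (multiply by two)."""
    return (word_addr << 1) & 0o177777

fixed = byte_to_word(0o1000)
print(oct(fixed))                # 0o400
print(oct(word_to_byte(fixed)))  # 0o1000
```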

I wrote a comment about this at
https://github.com/philbudne/pdp11-B/blob/pb/source/brt/brt1.s#L67

(which is my reconstruction of the brt files).  Alas, I didn't note
the origin of the SCJ recollection of DMR's hack.

B code from "libb" (disassembled by Angelo Papenhoff?) shows the
initial branch:

http://squoze.net/B/libb/printf.s
http://squoze.net/B/libb/printn.s

Although neither file has any fixups.

The signature I would expect from binary B code of this era would be
that the generated code from each source file starts with a branch (or
jmp) around the contents of the file, to a "jsr r5, chain" followed by
a zero terminated list of addresses (which I guessed were addresses of
address words that needed to be fixed up).

I would expect the code at "chain" to loop through the words
referenced by (r5)+ "fixing" them, and finally returning using "rts r5",
something like the code I wrote at
https://github.com/philbudne/pdp11-B/blob/pb/source/brt/brt1.s#L102

chain:	mov	(r5)+,r0	// fetch pointer pointer
	beq	1f		// quit on zero word
	asr	(r0)		// adjust the referenced word
	br	chain

1:	rts	r5		// return to end of file, fall into next
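
For anyone who wants to step through that loop without a PDP-11, here
is a rough Python model of it.  It is only a sketch under my own
assumptions: memory is a dict keyed by even byte address, and the names
(asr, chain, the example addresses) are made up for illustration.

```python
# A toy model of the "chain" loop above; not the actual runtime code.

def asr(word):
    """PDP-11 ASR on a 16-bit word: shift right one bit, keeping the sign bit."""
    word &= 0o177777
    return (word >> 1) | (word & 0o100000)

def chain(mem, r5):
    """Walk the zero-terminated list at r5, halving each referenced word.

    Returns the address just past the zero word, which is where
    "rts r5" would resume execution.
    """
    while True:
        r0 = mem[r5]            # mov (r5)+,r0
        r5 += 2
        if r0 == 0:             # beq 1f
            return r5           # rts r5
        mem[r0] = asr(mem[r0])  # asr (r0)

# Hypothetical layout: two address words at 0100 and 0102, list at 0200.
mem = {0o100: 0o2000, 0o102: 0o2004,
       0o200: 0o100, 0o202: 0o102, 0o204: 0}
resume = chain(mem, 0o200)
print(oct(mem[0o100]), oct(mem[0o102]), oct(resume))  # 0o1000 0o1002 0o206
```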

If the utilities you mention were in fact written in B, that would
offer us the chance to recover the actual code used in brt1 and brt2.

The signature I would expect looks VERY MUCH like what you describe:
> This "signature" I refer to being a few properties of the a.out files
> and initial flow of the entry compared with other binaries of known
> source code origin.  First, these are all magic number 405(8)
> binaries, so V1 era a.out.  Second, in each case, the initial branch
> is to a jump vector which then performs a r5-relative subroutine call
> followed by a halt in the case of fallthrough.  In other words:
>
>     br  _start  / 405(8)
>     ...
> _start:
>     jmp innerstart    / some faraway place
>     ...
> innerstart:
>     jsr r5,main    / always 004567 000042
>     halt
>     ...
> main:
>     inc somevalue   / always 005267 000136 or 005267 000140
>     ...

The fact that the jsr r5 always points to a small, fixed address is
likely because it points to B runtime code loaded at the start of
memory, which doesn't exactly match what's described in section 10.0
in https://www.bell-labs.com/usr/dmr/www/kbman.html:

	ld object /etc/brt1 -lb /etc/bilib /etc/brt2

The initial jmp is the file prologue emitted by the B compiler, and
the code at "innerstart" is the epilogue, which I would expect to be
"jsr r5, chain".

I believe the "halt" is a literal zero word (terminating the fixup
list) and not a halt instruction, and that the chain routine (auto)
increments r5, until it sees a zero word, and then returns
(likely via "rts r5") to the word after the zero word.
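
The reason a disassembler would show a list terminator as "halt" is
that HALT is encoded as the all-zero word on the PDP-11, so the two are
indistinguishable without context.  A small Python sketch that decodes
the words quoted above (the decode function and its tuple format are my
own invention, and it only knows the three encodings discussed here):

```python
# Minimal decoder for the three word patterns in the quoted listing.

def decode(word):
    """Return a tuple describing a PDP-11 instruction word, or None."""
    word &= 0o177777
    if word == 0:
        # HALT is 000000, the same bit pattern as a zero data word.
        return ("halt-or-zero-word",)
    if word & 0o177000 == 0o004000:   # JSR: 004RDD
        return ("jsr", (word >> 6) & 7, (word >> 3) & 7, word & 7)
    if word & 0o177700 == 0o005200:   # INC: 0052DD
        return ("inc", (word >> 3) & 7, word & 7)
    return None

print(decode(0o004567))  # ('jsr', 5, 6, 7)
print(decode(0o005267))  # ('inc', 6, 7)
print(decode(0o000000))  # ('halt-or-zero-word',)
```

So 004567 is "jsr r5" with a mode 6, register 7 (PC-relative) operand,
005267 is "inc" with the same operand mode, and the word after the jsr
operand decodes as halt simply because it is zero.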


