I've assembled some notes from old manuals and other sources
on the formats used for on-disk file systems through the
Additional notes, comments on style, and whatnot are welcome.
(It may be sensible to send anything in the last two categories
directly to me, rather than to the whole list.)
Hi all, I received an e-mail looking for the ksh-88 source code. A quick
search for it on-line doesn't reveal it. Does anybody have a copy?
I recently built a PiDP11 and have been enjoying going back in time
to 2.11BSD.. I was at UC Davis in the the early 1980's and we had
a few PDP-11/70's running 2.8/2.9 BSD. Back then we reached out to
David Korn and he sent us the source for KSH -- this would have been
in 1985ish if I remember, and we compiled it for 2.9 & 4.1BSD, Xenix,
and some other variants that used K&R C. It may have been what was
later called ksh88. I wish I still had the files from then..
I was wondering if you might know if there's an older version like this
or one that's been ported for 2.11BSD?
The last article of the latest issue of the Communications of the ACM
that appeared electronically earlier today is a brief interview with
this year's ACM Turing Award winners, Al Aho and Jeff Ullman.
The article is
Last byte: Shaping the foundations of programming languages
Comm. ACM 64(6), 120, 119, June 2021.
and it includes a picture of the two winners sitting on Dennis
I liked this snippet from Jeff Ullman, praising fellow list member
Steve Johnson's landmark program, yacc:
>> At the time of the first Fortran compiler, it took several
>> person-years to write a parser. By the time yacc came around,
>> you could do it in an afternoon.
- Nelson H. F. Beebe Tel: +1 801 581 5254 -
- University of Utah FAX: +1 801 581 4148 -
- Department of Mathematics, 110 LCB Internet e-mail: beebe(a)math.utah.edu -
- 155 S 1400 E RM 233 beebe(a)acm.org beebe(a)computer.org -
- Salt Lake City, UT 84112-0090, USA URL: http://www.math.utah.edu/~beebe/ -
>From what I can gather the only way to reasonably examine the disassembly
of a program in the early days of Unix was adb. Is this true? Was there a
way to easily produce a full disassembly? I'll confess to being fairly
ignorant of adb use since I always had dbx or the equivalent available.
The first tool I'm aware of to purposefully/primarily produce a full
listing is MIPS dis (ca. 1986?) but there must have been something before
that for other systems, no?
> From: Paul Riley <paul(a)rileyriot.com>
>> (I wrote a note, BITD, explaining how all this worked; I'll upload it
>> to the CHWiki when I get a chance.)
along with simple examples of args and auto variables, which are both
referenced via the FP.
> As a non-C consumer of printf, should I point R5 at some space for a
> stack and call printf in the same manner as the C example I cited?
Not necessary to do anything with R5 (you can leave it blank); the only things
a PDP-11 C routine needs are:
- a good stack
- the arguments, and return point, on the top of the stack
csv will set up the frame pointer, making no assumptions about the old
contents of R5 - see the source:
although it does save the old R5 contents, and restore them on exit.
> From: Paul Riley
> I want to use printf from an assembly language program, in V6. ... the
> substitutional arguments for printf are pushed onto the stack in reverse
> order, then the address of the string, and then printf is called. After
> this, 6 is added to the stack pointer.
This is all down to the standard C environment / calling sequence on the
PDP-11 (at least, in V6 C; other compilers may do it differently). Calls to
printf() are in no way special.
Very briefly, there's a 'frame pointer' (R5) which points to the current stack
frame; all arguments and automatics are relative to that. A pair of special
routines, csv and cret (I couldn't find the source on TUHS, but it happens to
if you want to see it), set up and tear down the frame on entry/exit to that
routine. The SP (R6) points to a blank location on the top (i.e. lower address;
PDP-11 stacks grow down) of the stack.
To call a subroutine, the arguments are pushed, the routine is called (which
pushes the return PC), and on return (which pops the return PC), the arguments
are discarded by the caller.
(I wrote a note, BITD, explaining how all this worked; I'll upload it to the
CHWiki when I get a chance.)
> I assume that the printf routine pops the address of the string off the
> stack, but leaves the other values on the stack
No, all C routines (including printf()) leave the stack more or less alone,
except for CSV/CRET hackery, allocating space for automatic variables on
routine entry (that would be at L1; try looking at the .s for a routine with
automatic variables), and popping the return PC on exit. The exception to this
is the stuff around calling _enother_ routine (sketched above).
Another exception is alloca() (source here:
again, couldn't find it in TUHS), which allocated a block of memory on
the stack (automatically discarded when the routine which called alloca()
returns). Note that since all automatic variables and incoming arguments
are relative to the FP, alloca is _really_ simple; justs adjusts the
SP, and it's done.
> What troubles me is that the stack pointer is not decremented before the
> first mov, in the example below. Is this some C convention? I would
> assume that the first push in my example would overwrite the top of the
That's right; that's because in the C runtime environment, the top location
on the stack is a trash word (set up by csv).
> I understand db only works on files like a.out or core dumps. If I
> wanted to break the assembly language program to examine values, how can
> I force a termination and core dump elegantly, so I can examine some
> register values?
which can do interactive debugging.
I want to use printf from an assembly language program, in V6. It seems
that the Unix Programmer's Manual doesn't show how to use it from assembly,
so I wrote a short C program and captured the assembler output, for some
clues. Listings below.
In my example, the substitutional arguments for printf are pushed onto the
stack in reverse order, then the address of the string, and then printf is
called. After this, 6 is added to the stack pointer. I assume that the
printf routine pops the address of the string off the stack, but leaves the
other values on the stack, hence the need to add 2x3=6 to the stack after
calling printf in my example.
What troubles me is that the stack pointer is not decremented before the
first mov, in the example below. Is this some C convention? I would assume
that the first push in my example would overwrite the top of the stack.
Perhaps I'm not used to PDP-11 stack conventions.
I understand db only works on files like a.out or core dumps. If I wanted
to break the assembly language program to examine values, how can I force a
termination and core dump elegantly, so I can examine some register values?
int a, b, c;
a = 1;
b = 2;
c = 3;
printf("A = %d, B = %d, C = %d", a, b, c);
> The demand paging code for SysVR2 was written by Keith A. Kelleman and Steven J. Buroff, and in contemporary conference talks they were saying that they wanted to combine the best parts of demand-paged 32V and BSD. They may have some additional memories that could help with getting a better understanding of the final version of 32V.
> Does anybody have contact details for these two gentlemen?
I’ve managed to contact Keith Kelleman and he had some interesting remarks. The paging code in SVR2 was all new code, with a focus the 3B dual processor. It does not derive at the code level from 32V and in fact he does not recall working with the 32V paging code. This kills the hope that the SVR2 code had clues about the 32V code. Keith did suggest that I tried to contact Tom Raleigh, who might have worked with the later 32V code base. Anybody with suggestions for locating him?
Besides functionality, the people that remember paged 32V all recall it being very fast. I wonder what made it fast.
First to consider is “faster than what?”. Maybe Rob P. has a specific memory, but I would assume faster than 4BSD: if the comparison was with the "scatter loading, partial swapping” version of 32V people would have expected the better performance and not remember it as impressive 40 years later. Possibly the comparison is with 8th Edition which would have used the 4BSD paging code by then.
If the comparison is with 4BSD, then the CoW feature in paging 32V would have been mostly matched by the vfork mechanism in 4BSD: it covers 80% of the use and it leaves the code paths simpler. If the comparison is with 8th edition, this may be the difference that people remember.
The next possibility is that paging 32V had a better page-out algorithm. Joy/Babaoglu mention that the cost of the clock process is noticable. Maybe paged 32V used a VMS-like FIFO/second chance algorithm that did not need a separate kernel process/thread. Arguably this is not enough for a convincing speed difference.
It is also possible that JFR found a more clever way to do LRU approximation. He remembers that his code used ‘strict LRU’, but not the algorithm. On Tenex - his conceptual reference - that was natural to do, because the MMU hardware maintains a table with 4 words of 36 bits for each frame with statistical data. With the VAX hardware it is a challenge. Considering his mathematical prowess it is maybe plausible that JFR found an efficient way. A slightly better page hit rate gives a significant speed improvement.
All speculation of course: only finding the source will truly tell.
It has always bugged me that the bsd-family-tree file got 2.8BSD slightly
It has the relationship V6 -> 1BSD -> 2BSD -> 2.79BSD -> 2.8BSD
with V7 -> 32V -> 3BSD ... -> 4.1BSD -> 2.8BSD
Now, as far as it goes, that's not terrible. But it's missing something.
First, there was no V6 code in 1BSD or 2BSD through 2.79BSD. It was for V6
at first then for both V6 and V7, but w/o V7 code. There weren't even
patches for V6 or new drivers on the early 1BSD and 2BSDs. However,
starting with 2.8BSD, there's a V7 kernel. Or should I say a heavily
patched V7 kernel with a ton of #ifdefs for all the fixes and enhancements
collected by Berkeley and a few minor build system tweaks.
Also, the code from 4.1BSD that's in 2.8 is rather minimal from what I can
tell with my analysis so far, except indirectly in some of the patches to
the V7 kernel appear to also be in 4.1BSD. The biggest thing that's in
2.8BSD from 4.1BSD is the job control (confined to its own directory with
big warnings that basically say you'll need to update the code and even
then the system is unstable). 2.9BSD has much better 4.xBSD integration,
but 2.8 was quite early days for rejoining the two lines. 4.1BSD didn't
have many berkeley rewrites of userland code, and the 2.8 tape has only a
few of them (eg ls). So although it's not as complete as one would hope,
there was a decent amount of code from 4.1BSD flowing into 2.8BSD.
Now, my request. I've created a code review for FreeBSD to fix this.
https://reviews.freebsd.org/D30883 is the review. We use phabricator in the
FreeBSD project. Anybody can view this, but if you don't want to create an
account, please send me email with a comment about the change and/or the
commit message. It just adds an arc from V7 to 2.8BSD.
Thanks for any time and/or insight you might have here. I'm judging the
above entirely on the archived code rather than any historical knowledge...