[TUHS] Early multiprocessor Unix

Paul Ruizendaal pnr at planet.nl
Sun Aug 6 16:52:20 AEST 2023

Thank you for sharing; very interesting context!

It would seem to me that there was only limited work on multi-processor Unix prior to 1985, with the work by Bach/Buroff/Kelleman around 1983 being the defining effort.

When I was doing my thesis with Prof. Van de Goor (who had earlier worked at DEC with the PDP-8 and PDP-11 design teams), another student down the corridor was working on multi-processor Unix ca. 1984-1985. There is a short paper about that work here: https://dl.acm.org/doi/pdf/10.1145/6592.6598

From today’s perspective the paper’s conclusions seem odd. It describes going through the source code end-to-end as a huge job, but the core SysIII kernel is only some 7,000 SLOC, with maybe two dozen shared data structures. On the other hand, with the tooling of the early 1980s, debugging this stuff must have been very hard. In contrast, on my porting projects today it is easy to generate a kernel trace of millions of instructions in a few seconds; even billions is doable. The edit/compile/test cycle for a kernel with some debug code compiled in is similarly a matter of seconds. Back in the day a single test must have taken an hour or more, assuming one had exclusive access to the machine.

Also, its observation that the Unix kernel is “not highly structured” seems unfair. I find the 1980-era Unix kernel rather well structured, with the exception of the memory management code, which is indeed spread over multiple files and therefore not so easy to fully grasp. Maybe that is what my fellow student was referring to in his MSc thesis. Note also that a Dutch MSc thesis took only 6-12 months.

> On 6 Aug 2023, at 06:00, scj at yaccman.com wrote:
> When I left Bell Labs in 1986, I joined Ardent Computer in California.  We built a multiprocessor Unix system with up to four processors based on ECL technology (faster than the computer chips of the time).  The CPU had a standard set of registers, and there were four vector registers that would each hold 1000 floating-point numbers, plus vector instructions to do arithmetic on them.
> So that meant that we had compiler work to do.  Luckily, Randy Allen had just graduated and signed on to do the parallelism: the parallelizing C and FORTRAN compilers.  I took on the lower-level stuff: the assembler, the loader,
> the a.out format design, and I even wrote a bug tracking system.  Randy's compiler was excellent, but there were other problems.  The Sun workstations had a quirk: from time to time they would page in a page of all zeros due to a timing problem.  Unhappily, zero was the halt opcode!  We addressed that by adding code to the kernel to verify that no code page was all zeros before executing it.  AT&T and Sun and MIPS and all the hardware makers had problems like this with early chips.  One thing I had told the team from the beginning was that we were going to have to patch around hardware problems in the early versions.
> The most serious early hardware bug in our machine was that when the MIPS chip took a page fault, the CPU started executing the new page before it was fully present.  Only the first two or three instructions of the page could be missing.  We settled on a strategy of generating the a.out file so that the first four instructions of each code page were no-ops.  This solved the MIPS problem.
> Now we faced the problem of how to take a standard a.out file and redo it so that the first four instructions in each code page are NOPs.  We built an "editor" for a.out files that would read the file in, respond to a series of requests, relocate the instructions correctly, and then branch to the line of code that it had been about to execute.  One good thing about this was that when the chip got fixed we would not have to change any code -- it would just work.
> And then we got creative.  We could use the "editor" to find the basic blocks in the code, introduce counting instructions at the head of each block, and produce a profiler without recompiling.  We probably found about 20 things we could do with this mechanism, including optimization after loading, timing the code without having to recompile everything, collecting parallelism statistics, etc.
> ---
> On 2022-11-28 05:24, Paul Ruizendaal wrote:
>> The discussion about the 3B2 triggered another question in my head:
>> what were the earliest multi-processor versions of Unix and how did
>> they relate?
>> My current understanding is that the earliest one is a dual-CPU VAX
>> system with a modified 4BSD done at Purdue. This would have been late
>> 1981 or early 1982. I think one CPU acted as master with exclusive
>> kernel access, while the other ran only user-mode code.
>> Then I understand that Keith Kelleman spent a lot of effort to make
>> Unix run on the 3B2 in a SMP setup, essentially going through the
>> source and finding all critical sections and surrounding those with
>> spinlocks. This would be around 1983, and became part of SVr3. I
>> suppose that the “spl()” calls only protected critical sections that
>> were shared between the main thread and interrupt handlers, so that a
>> manual review was necessary to consider each kernel data structure for
>> parallel access issues in the case of two CPUs.
>> Any other notable work in this area prior to 1985?
>> How was the SMP implementation in SVr3 judged back in its day?
>> Paul

More information about the TUHS mailing list