[TUHS] UNIX v4 Source Code Commentary - complete book now available

Briam Rodriguez via TUHS tuhs at tuhs.org
Fri Jan 16 05:47:46 AEST 2026


George,

To be clear, I'm not advocating patching v4 to work around emulator 
bugs. My email was purely diagnostic - trying to figure out /where/ the 
bug actually lives.

If the issue is SIMH not correctly handling overlapped seeks and 
interrupt coalescing, then yes, SIMH should be fixed. But before anyone 
fixes anything, it helps to understand the failure mode. That's all I 
was doing: reading the driver code and theorizing about what timing 
assumptions might not hold under emulation.

The suggestion to instrument rkintr() was to /confirm/ whether the 
emulator is the culprit, not to change the driver logic.
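
A minimal sketch of that instrumentation, written in modern C rather than
the v4 dialect so it can be compiled and checked outside the kernel. The
helper names (rk_drive, rk_is_seek, rk_logline) are hypothetical and do not
appear in the driver. The SEEKCMP value of 020000 is an assumption that
should be checked against the defines in the v4 rk.c; the drive-ident field
in bits 13-15 of rkds is taken from the description quoted below.

```c
#include <stdio.h>
#include <stdint.h>

/* Assumed bit layout: verify against the v4 rk.c defines and the RK11 manual. */
#define SEEKCMP   020000u   /* rkcs: seek complete (assumed to be bit 13) */
#define DRV_SHIFT 13        /* rkds: drive ident occupies bits 13-15 */
#define DRV_MASK  07u

/* Drive number that rkds reports as the source of the interrupt. */
static unsigned rk_drive(uint16_t rkds)
{
    return (rkds >> DRV_SHIFT) & DRV_MASK;
}

/* Nonzero for a seek-completion interrupt, zero for a transfer completion. */
static int rk_is_seek(uint16_t rkcs)
{
    return (rkcs & SEEKCMP) != 0;
}

/* Format one log record per interrupt, registers shown in octal. */
static int rk_logline(char *buf, size_t n, uint16_t rkcs, uint16_t rkds)
{
    return snprintf(buf, n, "rkintr: rkcs=%06o rkds=%06o %s drive=%u",
                    (unsigned)rkcs, (unsigned)rkds,
                    rk_is_seek(rkcs) ? "seek" : "xfer", rk_drive(rkds));
}
```

With two drives busy, a healthy trace should show one "seek" record per
drive, each followed later by an "xfer" record. A drive that was sent a SEEK
but never shows up in a "seek" record would point at the emulator dropping
or merging the completion. Logging from the interrupt path changes timing,
so a trace like this can only confirm the theory, not rule it out.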

cheers, Briam

On 1/15/26 2:44 PM, George Michaelson via TUHS wrote:
> If you patch an OS to run on a buggy simulator, you're corrupting source to
> fix nonexistent bugs in that source.
>
> The fix is to get simh to respect interrupt signalling surely?
>
> Pragmatism says fix the v4 code, sure.
>
> G
>
> On Fri, 16 Jan 2026, 4:08 am Briam Rodriguez via TUHS,<tuhs at tuhs.org>
> wrote:
>
>> Angelo,
>>
>> I took a look at the v4 RK driver source to see what might be going on
>> here.
>>
>> The v4 driver does something clever with multiple disks: overlapped
>> seeks. When rkstart() runs, it iterates through all four drive queues
>> and fires off SEEK commands for every drive that has pending work. The
>> idea is that while drive 0 is seeking, drive 1 can be transferring data.
>> Good for throughput on real hardware.
>>
>> The tricky part is the interrupt handler. When an interrupt comes in,
>> rkintr() checks the SEEKCMP bit in rkcs to figure out if this is a "seek
>> finished" or "transfer finished" interrupt. For seek completion, it
>> reads bits 13-15 of rkds to determine which drive just finished seeking,
>> then kicks off the actual data transfer for that drive.
>>
>> There's a global variable rk_ap that tracks which drive queue is
>> currently mid-transfer. It gets set when a seek completes and used when
>> the subsequent transfer completes. This is the state that ties the
>> two-phase operation together.
>>
>> My theory: SIMH isn't correctly emulating the behavior when multiple
>> seeks complete at the same (or nearly the same) simulated time. On real
>> hardware, you'd get separate interrupts as each drive's seek finishes,
>> with rkds properly reflecting which drive triggered each interrupt. But
>> in emulation, if two seeks "complete simultaneously," you might only get
>> one interrupt, or rkds might only reflect one of the drives.
>>
>> If that happens, the other drive's seek completion never triggers
>> devstart(), so its read/write never actually happens. The process
>> waiting in iowait() sleeps forever on B_DONE that never gets set. Hung
>> process, exactly as you're seeing.
>>
>> The busy-waits in the driver (waiting for CTLRDY after commands, waiting
>> for DRY|ARDY in error recovery) could also be problematic if the
>> emulated status bits don't update with the timing the code expects.
>>
>> This would explain why v5 works: if they rewrote it to serialize
>> operations more conservatively, or changed the state machine to not
>> depend on precise interrupt ordering, the emulation timing issues
>> wouldn't matter as much.
>>
>> Might be worth instrumenting rkintr() to log the rkcs and rkds values on
>> each interrupt, and see if the drive identification is coming through
>> correctly when multiple disks are in use.
>>
>> cheers
>>
>> -- Briam R.
>>
>> On 1/15/26 8:41 AM, Angelo Papenhoff via TUHS wrote:
>>> I was wondering about the RK driver, because there are issues with it.
>>> The v4 RK11 driver does not work with simh emulation correctly when
>>> using multiple disks. I've had to use the v5 driver for my v4
>>> installation guide to get a usable system. Looking at the code i had the
>>> impression that the v4 driver is fine (it also matches the nsys RK
>>> driver, which i've also had trouble with recently. haven't tried v5 RK
>>> with nsys yet). What i suspect is happening is that the seek/read dance
>>> isn't working correctly, either in simh, or there were faulty
>>> assumptions that only work on real hardware more or less accidentally
>>> (the rewrite in v5 might suggest the latter).
>>> In any case it looks like blocks aren't being read correctly, and
>>> whatever process is waiting for the read will hang.
>>>
>>> Would be nice to get to the bottom of this.
>>>
>>> cheers,
>>> aap

