[TUHS] Version 256 of systemd boasts '42% less Unix philosophy' The Register

Clem Cole clemc at ccc.com
Tue Jun 18 02:00:13 AEST 2024


typo...  like the VFS layer (not CFS layer)

On Mon, Jun 17, 2024 at 11:56 AM Clem Cole <clemc at ccc.com> wrote:

>
>
> On Mon, Jun 17, 2024 at 1:51 AM Bakul Shah via TUHS <tuhs at tuhs.org> wrote:
>
>> Forgot to mention LOCUS, which was the only distributed Unix-compatible
>> OS I am aware of. From anyone who has user/implementer experience, I would
>> love to hear what worked well, what didn't, what was easy to implement,
>> what was very hard, and what you wished had been added to it.
>>
> Jerry and Bruce's book is the complete reference:
> https://www.amazon.com/Distributed-System-Architecture-Computer-Systems/dp/0262161028
>
> There were basically three or four versions...  the original version for
> the PDP-11, which is the SOSP paper and which morphed to include a VAX at
> UCLA; IBM's AIX/370 and AIX/PS2, which included TCF (Transparent Computing
> Facility); and LCC's TNC (Transparent Network Computing) "product," which
> was the set of 14 core technologies used to build it.  Some of those
> technologies landed in other systems, from Tru64, HP-UX, and the Paragon
> to even, later, a Linux implementation (which sadly was done on the 2.x
> kernel and so was lost when Linus did not understand it).
>
> What worked well were the different flavors of the DFS and the later core
> idea of the VPROCS layer, which allowed process migration - it worked
> well, and boy did I miss it later in my career.  Admin of a Locus-based
> system was a dream because it was just one system, for up to 4096 nodes in
> a Paragon.  It also meant you could migrate processes off a node, take the
> node down, reboot/change it, and bring it back. Very cool.  After the
> first system was installed, adding a node was trivial, by the way.  You
> booted the node, "joined" the cluster, and were up. AIX used file
> replication to then build the local disks as needed.  BTW: "checkpointing"
> was a freebie -- you just migrated the process to disk.
>
> Mixing ISAs like the 370 and PS/2 was a mixed bag -- I'll let Charlie
> comment.  With TNC we redid that model a bit; I'm not sure we ever got it
> 100% right.  The HP-UX version was probably the best.
>
> The biggest implementation issue is that UNIX has too many different
> namespaces, with all sorts of rules particular to each.  For all of the
> concept of "everything is a file," when you start to try to bring it all
> together, you discover new and weird^H^H^H^H^Hinteresting namespaces, from
> System V IPC to signals to FIFOs and Named Pipes (similar but different).
> It seemed like everywhere we looked, we would find another namespace we
> needed to handle, and when we started to look at non-UNIX process layers,
> it got even stranger.  The original UNIX protection model is a tad weak,
> but most people had started to add ACLs, and POSIX was in the throes of
> standardizing them -- so we based ours on an early POSIX proposal (mostly
> based on HP-UX, since they had ACLs before the others did).
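>
> A small, runnable C example of the point - three of those namespaces side
> by side, each with its own naming and removal rules (these are ordinary
> POSIX/System V calls, nothing LOCUS-specific):
>
>     #include <stdio.h>
>     #include <sys/types.h>
>     #include <sys/stat.h>
>     #include <sys/ipc.h>
>     #include <sys/msg.h>
>     #include <unistd.h>
>
>     int main(void)
>     {
>         /* Namespace 1: the filesystem tree.  A FIFO is a name in a
>            directory, so a distributed FS carries it cluster-wide. */
>         mkfifo("/tmp/ns_demo_fifo", 0600);
>
>         /* Namespace 2: System V IPC.  Keys live in a flat, per-kernel
>            numeric space with no tie to the filesystem, so a single-
>            system-image layer has to distribute it separately. */
>         int qid = msgget(ftok("/tmp/ns_demo_fifo", 'L'), IPC_CREAT | 0600);
>         printf("FS name: /tmp/ns_demo_fifo; SysV queue id: %d\n", qid);
>
>         /* Namespace 3: process IDs.  kill(2) names a process in yet
>            another flat space, with its own permission rules. */
>         printf("PID-space name for this process: %d\n", (int)getpid());
>
>         msgctl(qid, IPC_RMID, NULL);  /* and each namespace has its   */
>         unlink("/tmp/ns_demo_fifo");  /* own way to remove a name ... */
>         return 0;
>     }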
>
> To be more specific, the virtual process layer (VPROC) attempted to do for
> the process layer of the core kernel what VFS had done for the FS layer.
> If you look at both of the original Locus schemes, process control was ad
> hoc and thus very messy.  LCC realized that if we were going to succeed,
> we needed to make that cleaner.  But that still took major surgery -
> although, like the VFS layer, things were a lot clearer once done.  Bruce,
> Roman, and I came up with VPROCs.  BTW: one of the cool parts of VPROC is
> that, like VFS, it conceptually made it possible to have other process
> models. We did a prototype for OS/2 running inside of the OSF uK and were
> trying to get a contract from DEC to do it for Tru64, adding VMS, before
> we got sold (we had already developed CFS for DEC as part of Tru64 - which
> is TNC's Cluster File System). Truth is, cheap VMs killed the need for
> this idea, but it worked fairly well.
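>
> A minimal sketch in C of the shape of that layer, by analogy with VFS -
> an ops table per process object, just as VFS keeps one per vnode.  The
> names are illustrative, not the actual TNC interfaces:
>
>     #include <signal.h>
>     #include <stdio.h>
>     #include <stddef.h>
>     #include <sys/types.h>
>     #include <unistd.h>
>
>     struct vproc;
>
>     /* Everything the kernel proper does to a process goes through
>        this table, so it never needs to know whether the process is
>        local, remote, or not a UNIX process at all. */
>     struct vproc_ops {
>         int (*deliver_signal)(struct vproc *vp, int sig);
>         int (*migrate)(struct vproc *vp, int target_node);
>     };
>
>     struct vproc {
>         pid_t                   vp_pid;  /* cluster-wide pid */
>         int                     vp_node; /* node currently hosting it */
>         const struct vproc_ops *vp_ops;  /* local, remote, OS/2, ... */
>     };
>
>     /* A "local UNIX" implementation of one op, as a stand-in. */
>     static int local_signal(struct vproc *vp, int sig)
>     {
>         return kill(vp->vp_pid, sig);
>     }
>
>     static const struct vproc_ops local_ops = { local_signal, NULL };
>
>     int main(void)
>     {
>         struct vproc self = { getpid(), 0, &local_ops };
>         /* Callers dispatch through the table, VFS-style: */
>         int rc = self.vp_ops->deliver_signal(&self, 0); /* 0 = probe */
>         printf("signal sent via vproc ops table: rc=%d\n", rc);
>         return 0;
>     }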
>
> After the core VPROCs layer, the hardest things were distributed
> shared memory (DSM) and the distributed lock manager (DLM).  DSM was an
> example that offered pure transparency in operation, *i.e.,* test-and-set
> worked (operationally) correctly across the DSM, but it was not "speed
> transparent."  But if you rewrote to use the DLM, then you could get full
> transparency and speed.  The DLM is one of the TNC technologies that lives
> on today.  It ended up in a number of systems - Oracle wrote their own
> based on the specs for the DEC DLM we built for the CFS for Tru64 (which
> is from TNC). I believe a few other folks used it.  It was in OSF's DCE,
> and ISTR Microsoft picked it up.
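>
> The contrast, sketched in C: the spinlock half is standard C11 atomics,
> while the dlm_* calls are hypothetical stand-ins for a real lock
> manager's interface, stubbed just to make the shape visible:
>
>     #include <stdatomic.h>
>     #include <stdio.h>
>
>     /* Imagine this flag lives in a page managed by the DSM. */
>     static atomic_flag lock_word = ATOMIC_FLAG_INIT;
>
>     /* Operationally correct across DSM, but every failed attempt can
>        bounce the whole page between nodes -- transparent in behavior,
>        not in speed. */
>     static void dsm_spin_lock(void)
>     {
>         while (atomic_flag_test_and_set(&lock_word))
>             ;  /* each spin may be remote page traffic */
>     }
>
>     /* With a DLM you name the resource and let the lock master
>        arbitrate: one message exchange instead of page storms. */
>     static int dlm_lock(const char *resource)
>     {
>         printf("request EXCLUSIVE on \"%s\" from lock master\n", resource);
>         return 0;  /* a real DLM would block until granted */
>     }
>
>     static int dlm_unlock(const char *resource)
>     {
>         printf("release \"%s\"\n", resource);
>         return 0;
>     }
>
>     int main(void)
>     {
>         dsm_spin_lock();            /* works, but not speed-transparent */
>         atomic_flag_clear(&lock_word);
>         dlm_lock("inode-4711");     /* hypothetical resource name */
>         dlm_unlock("inode-4711");
>         return 0;
>     }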
>
> So a good question is: if TNC was so cool, why did Beowulf (a real hack in
> comparison) stick around and TNC die?  Well, a few things.  LCC/HP did not
> open-source the code until it was too late.  So Beowulf, which was around,
> was what folks (like me) used to build big scientific clusters. And Popek
> was "right" -- it takes something like Locus/TNC to make a cluster fully
> transparent.  Beowulf ignored the seams, and in the end, that was "good
> enough."  But it makes setup and admin a PITA, and the programmer needs to
> be careful -- the dragons are all over the place. So, when I went to
> Intel, I was the Architect of Cluster Ready, which defined away many of
> those seams and then provided tools to test for them and help you admin.
>
> Tools like the Cluster Checker and the whole Cluster Ready program would
> not be needed if TNC had "stuck," and I think clusters in general - a
> cluster of small computers on a LAN, not just clusters on a
> high-speed/special interconnect like a supercomputer - would be more
> available today.
>
>
> Clem
>