TUHS June 2020

tuhs@tuhs.org

68 participants
61 discussions

by Doug McIlroy

Interesting. My "speak" program had a trivial lexer that recognized literal tokens, many of which were prefixes of others, by maximum-munch binary search in a list of 1600 entries. Entries gave token+translation+rewrite. The whole thing fit in 15K.

Many years later I wrote a regex recognizer that special-cased alternations of lots of literals. I believe Gnu's regex.c does that, too. (My regex also supported conjunction and negation--legitimate regular-language operations--implemented by continuation-passing to avoid huge finite-state machines.)

We have here a case of imperfect communication in 1127. Had I been conscious of the lex-explosion problem, I might have thought of speak and put support for speak-like tables into lex. As it happened, I only used yacc/lex once, quite successfully, for a small domain-specific language.

Doug

Steve Johnson wrote:

I also gave up on lex for parsing fairly early. The problem was reserved words. These looked like identifiers, but the state machine to pick out a couple of dozen reserved words out of all identifiers was too big for the PDP-11.

When I wrote spell, I ran into the same problem. I had some rules that wanted to convert plurals to singular forms that would be found in the dictionary. Writing a rule to recognize .*ies and convert the "ies" to "y" blew out the memory after only a handful of patterns. My solution was to pick up words and reverse them before passing them through lex, so I looked for the pattern "sei.*", converted it to "y" and then reversed the word again. As it turned out, I only owned spell for a few weeks because Doug and others grabbed it and ran with it.
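The two table-driven tricks in this message, maximum-munch lookup by binary search and matching a suffix as a prefix of the reversed word, can be sketched roughly as below. This is an illustration in Python, not a reconstruction of speak or spell: the table entries, the function names, and the choice to pass unmatched characters through one at a time are all assumptions made for the example.

```python
from bisect import bisect_left


def longest_match(table, max_len, text, start):
    """Return the longest entry of the sorted list `table` that is a
    prefix of text[start:], or None if no entry matches."""
    limit = min(max_len, len(text) - start)
    # Try the longest candidate first, so the first hit is the maximum munch.
    for length in range(limit, 0, -1):
        candidate = text[start:start + length]
        i = bisect_left(table, candidate)  # binary search in the sorted table
        if i < len(table) and table[i] == candidate:
            return candidate
    return None


def tokenize(table, text):
    """Split text into table entries by maximum munch; characters not
    covered by any entry are passed through one at a time (an assumption)."""
    table = sorted(table)
    max_len = max((len(t) for t in table), default=0)
    tokens, pos = [], 0
    while pos < len(text):
        tok = longest_match(table, max_len, text, pos) or text[pos]
        tokens.append(tok)
        pos += len(tok)
    return tokens


def ies_to_y(word):
    """Rewrite a trailing "ies" as "y" by matching a prefix of the
    reversed word, in the spirit of the "sei.*" rule."""
    rev = word[::-1]
    if rev.startswith("sei"):
        rev = "y" + rev[3:]
    return rev[::-1]


# Hypothetical table: several entries are prefixes of others.
TABLE = ["t", "th", "the", "a", "ab", "tion"]
```

With this table, tokenize(TABLE, "thetab") picks "the" rather than "t" or "th" at the start, and ies_to_y("flies") gives "fly". The point of the reversal is that a fixed prefix is cheap to recognize, whereas a rule that begins with ".*" is not.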

5 years

Accessing the PDP-11/70 MMU registers and the kernel's u area

by Diomidis Spinellis

From the 2.11 BSD sources I understand that the PDP-11/70 MMU address and data registers, KDSA and KDSD, start at 0172360 and 0172320 respectively [1]. Yet, when I read the register contents I don't get what I would expect to see: values increasing by 0200 for KDSA and the same constant value for KDSD [2]. I checked this by looking at /dev/mem.

    # od -o /dev/mem 0172360 | head -1
    0172360 000002 000016 001403 012700 000400 000402 012700 000200
    # od /dev/mem 0172320 | head -1
    0172320 101016 005064 000026 005067 175456 016467 000006 175430

I get the same results when I examine the memory through SIMH:

    sim> examine 172360
    172360: 000002
    sim> examine 172362
    172362: 000016
    sim> examine 172364
    172364: 001403
    sim> examine 172320
    172320: 101016
    sim> examine 172322
    172322: 005064

The MMU kernel instruction registers, KISA and KISD, contain similarly nonsensical values, as do the registers located at a different memory location (077320, 0772360) indicated in another source [3]. What am I missing?

My goal is to access from the console the kernel's u area. According to mem(4) and the symbols in /unix, this should be at address 0140000. Indeed, accessing it through /dev/kmem I get the expected results for e.g. u_comm and u_uid. However, I have been unable to find it in the machine's physical memory, hence my question regarding the MMU's operation.

[1] https://github.com/RetroBSD/2.11BSD/blob/master/usr/sys/pdpstand/M.s#L346
[2] https://github.com/RetroBSD/2.11BSD/blob/master/usr/sys/pdpstand/M.s#L247
[3] https://gunkies.org/wiki/PDP-11_Memory_Management

Diomidis
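The address arithmetic behind this question can be sketched as follows. It is a minimal Python sketch, assuming the usual PDP-11 scheme in which the top three bits of a 16-bit virtual address select one of eight page address registers and each register holds a base in units of 64 bytes. The function names are invented for the example, and the register value in the usage note is made up; it does not come from a running 2.11BSD system.

```python
KDSA_BASE = 0o172360  # first kernel D-space address register, as quoted in the message


def par_address(virtual_addr, base=KDSA_BASE):
    """Address of the page address register that maps virtual_addr.
    Bits 15-13 of the virtual address pick one of eight word-sized registers."""
    page = (virtual_addr >> 13) & 0o7
    return base + 2 * page


def translate(virtual_addr, par_value):
    """Physical address for virtual_addr, given the contents of its page
    address register: the register counts 64-byte blocks, and the low
    13 bits of the virtual address are the offset within the page."""
    return (par_value << 6) + (virtual_addr & 0o17777)


def identity_pars():
    """Register values for a one-to-one mapping of eight 8 KB pages:
    8192 / 64 = 0200 blocks per page, hence steps of 0200."""
    return [page * 0o200 for page in range(8)]
```

Under these assumptions the u area at virtual 0140000 falls in page 6, so par_address(0o140000) is 0172374, and if that register held the made-up value 01234, translate(0o140000, 0o1234) would give physical 0123400. The step of 0200 returned by identity_pars() is the pattern the message expected to find in KDSA.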

5 years

non-blocking IO

by Paul Ruizendaal

This time looking into non-blocking file access. I realise that the term has wider application, but right now my scope is “communication files” (tty’s, pipes, network connections).

As far as I can tell, prior to 1979 non-blocking access did not appear in the Spider lineage, nor did it appear in the NCP Unix lineage. First appearance of non-blocking behaviour seems to have been with Chesson’s multiplexed files where it is marked experimental (an experiment within an experiment, so to say) in 1979.

The first appearance resembling the modern form appears to have been with SysIII in 1980, where open() gains an O_NDELAY flag and appears to have had two uses: (i) when used on TTY devices it makes open() return without waiting for a carrier signal (and subsequent read() / write() calls on the descriptor return with 0, until the carrier/data is there); and (ii) on pipes and fifo’s, read() and write() will not block on an empty/full pipe, but return 0 instead. This behaviour seems to have continued into SysVR1. I’m not sure when EAGAIN came into use as a return value for this use case in the SysV lineage. Maybe with SysVR3 networking?

In the Research lineage, the above SysIII approach does not seem to exist, although the V8 manual page for open() says under BUGS “It should be possible [...] to optionally call open without the possibility of hanging waiting for carrier on communication lines.” In the same location for V10 it reads “It should be possible to call open without waiting for carrier on communication lines.”

The July 1981 design proposals for 4.2BSD note that SysIII non-blocking files are a useful feature and should be included in the new system. In Jan/Feb 1982 this appears to be coded up, although not all affected files are under SCCS tracking at that point in time. Non-blocking behaviour is changed from the SysIII semantics, in that EWOULDBLOCK is returned instead of 0 when progress is not possible. The non-blocking behaviour is extended beyond TTY’s and pipes to sockets, with additional errors (such as EINPROGRESS). At this time EWOULDBLOCK is not the same error number as EAGAIN.

It would seem that the differences between the BSD and SysV lineages in this area persisted until around 2000 or so.

Is that a fair summary?

- - -

I’m not quite sure why the Research lineage did not include non-blocking behaviour, especially in view of the man page comments. Maybe it was seen as against the Unix philosophy, with select() offering sufficient mechanism to avoid blocking (with open() the hard corner case)?

In the SysIII code base, the FNDELAY flag is stored on the file pointer (i.e. with struct file). This has the effect that the flag is shared between processes using the same pointer, but can be changed in one process (using fcntl) without the knowledge of others. It seems more logical to me to have made it a per-process flag (i.e. with struct user) instead. In this aspect the SysIII semantics carry through to today’s Unix/Linux. Was this semantic a deliberate design choice, or simply an overlooked complication?
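Two of the behaviours discussed above, an error return instead of blocking and the flag living with the shared open file rather than with the descriptor, can be observed on a present-day system. The following is a minimal sketch, assuming a POSIX platform and Python 3.5 or later; the function names are invented for the example, and it shows the modern EAGAIN/EWOULDBLOCK behaviour, not the SysIII return of 0.

```python
import errno
import fcntl
import os


def errno_on_empty_pipe():
    """Read from an empty non-blocking pipe and return the errno raised."""
    r, w = os.pipe()
    try:
        os.set_blocking(r, False)  # sets O_NONBLOCK on the read end
        try:
            os.read(r, 1)
        except BlockingIOError as exc:
            return exc.errno
        return None  # not expected: the pipe is empty and still has a writer
    finally:
        os.close(r)
        os.close(w)


def nonblock_visible_through_dup():
    """Set O_NONBLOCK via one descriptor and report whether a duplicate
    descriptor, which shares the same open file, sees the flag too."""
    r, w = os.pipe()
    twin = os.dup(r)
    try:
        flags = fcntl.fcntl(r, fcntl.F_GETFL)
        fcntl.fcntl(r, fcntl.F_SETFL, flags | os.O_NONBLOCK)
        return bool(fcntl.fcntl(twin, fcntl.F_GETFL) & os.O_NONBLOCK)
    finally:
        for fd in (r, w, twin):
            os.close(fd)
```

The second function uses dup() within one process for simplicity; a descriptor inherited across fork() refers to the same open file in the same way, which is the sharing between processes that the message asks about.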

5 years

Unix V6: Assembler Listings

by Paul Ruizendaal

> I am now writing code in assembly for the PDP-11. I remember reading
> somewhere that the output from "AS" (my caps) is a bit meagre. I can't find
> an option to produce a text listing. Is it possible from AS, using command
> options (I can't see one) or perhaps from "LD"?
>
> Paul
>
> *Paul Riley*

I had the same problem. As I was porting to a different mini I had to write a new assembler. As you have undoubtedly seen, early ‘as’ was written in assembler and not so easy to use as a base. Hence I used Richard Miller’s AS for the Interdata as a base (available on TUHS):
https://www.tuhs.org/cgi-bin/utree.pl?file=Interdata732/usr/source/as

Later I discovered that the TUHS archive has source code for the original ‘as’ rewritten in C, a work by Roger Jaeger:
https://minnie.tuhs.org/Archive/Distributions/USDL/Mini-Unix/

Maybe adding a listing module to this version of ‘as’ is another possible route.

5 years, 1 month

Re: [TUHS] Unix V6: Assembler Listings

by Clem Cole

below...

On Thu, Jun 11, 2020 at 9:04 AM Paul Riley <pdr0663(a)icloud.com> wrote:

> Clem,
>
> Thanks for that. So this would compile on modern machines to a cross
> compiler for V6 also running on a modern machine? I note you say macro11,
> so not a Unix “as” style syntax, is that right?

Yes - the AT&T syntax was much simpler/less sugar than the DEC assembler. But the differences are pretty easy to see. IIRC that assembler generates DEC style linker objects and there is a companion linker that can create DEC binary objects (*i.e.* 'obj' files) as well as traditional UNIX a.out format. The entire tool suite was created originally to move code from RT-11 to UNIX at Harvard and passed around the nascent USENIX community.

IIRC that version was forked from a BSD 2.x/NetBSD source repository and folks were adding some fields/features in the DEC obj format that RSX supported that RT-11 did not. Go hunting and see what you find. My memory was that with the BSD 2.x project, somebody added a DEC obj to UNIX binary (a.out) converter tool, so that you could use ld(1) instead of using the DEC style linker that had been included in the original. It has been >>years<< since I was really familiar with any of this stuff. A question about it came up last fall/winter on the simh mailing list, which is where I found the URL.

FWIW: I offered the modern port, assuming you might want to run some of it as a cross-systems on a newer OS with a modern compiler. But if you are content running this on V6, then you might just want to go back to the original. As I said, my memory is that's in the original USENIX Harvard tape. All that should be in Warner's archives if not other places on the Internet. Just remember that a big problem with the original code is that it will be written in pre-'White Book' C (that many of us learned years ago - not even ANSI or Second Edition - this used Lesk's portable C library etc.). It sometimes looks a little strange to modern eyes.

Also if you go looking, IIRC, someone at Harvard ported the DEC Macro RT-11 library to UNIX v6. In the late 1970s, I remember tjk, Danny Klein, Tron McConnell and I, plus some of the folks over in the bio-med group (whose names I have forgotten) had to port a number of assembler codes that had been written for the earlier RT-11 systems to Unix for one of the projects we had. Some of it got re-written in C, but I do remember we managed to use the Harvard assembler somehow for parts of it. If my memory is correct, early VMS and messing with BLISS compatibility could have been mixed up in the project somehow, but I've long forgotten the details of what we were doing at the time.

Have fun.

5 years, 1 month

Unix V6: Assembler Listings

by Paul Riley

Team, I am now writing code in assembly for the PDP-11. I remember reading somewhere that the output from "AS" (my caps) is a bit meagre. I can't find an option to produce a text listing. Is it possible from AS, using command options (I can't see one) or perhaps from "LD"? Paul *Paul Riley*

5 years, 1 month

UNESCO's Software Heritage Foundation

by Clem Cole

I'm seeding this URL to TUHS as one would expect them to be interested in the work from Warren and friends. FWIW: I tried to browse their archives and was not impressed (I couldn't find anything). https://www.softwareheritage.org/

5 years, 1 month

Re: [TUHS] History of popularity of C

by Doug McIlroy

> Steve Johnson's position paper on optimising compilers may amuse you:
> https://dl.acm.org/doi/abs/10.1145/567532.567542

Indeed. This passage struck a particular chord:

"I contend that the class of applications that depend on, for example, loop optimization and dead code elimination for their efficient solution is of modest size, growing smaller, and often very susceptible to expression in applicative languages where the optimization is built into the individual applicative operators."

I don't know whether I saw that note at the time, but since then I've come to believe, particularly in regard to C, that one case of dead-code elimination should be guaranteed. That case is if(0), where 0 is the value of a constant expression. This guarantee would take the place of many--possibly even most--ifdefs. Every ifdef is an ugly intrusion and a pain to read. Syntactically it occurs at top level completely out of sync with the indentation and flow of text. Conversion to if would be a big win.

Doug

5 years, 1 month

History of popularity of C

by Tyler Adams

Does anybody have any good resources on the history of the popularity of C? I'm looking for data to resolve a claim that C is so prolific and influential because it's so easy to write a C compiler. Tyler

5 years, 1 month

My BSDcan talk

by Paul Ruizendaal

> It's another similar to the last two. I've uploaded a version to youtube until the conference has theirs ready. It's a private link, but should work for anybody that has it. Now that I've given my talk it's cool to share more widely.
> The link at the end is wrong. https://github.com/bsdimp/bsdcan2020-demos is the proper link.
> Please let me know what you think.

Watched it & liked it a lot!

I have one nit-pick in the section on early networking: BBN's VAX TCP did not allow the ‘/dev/net/host’ syntax. That particular semantic comes from UoI’s NCP Unix, where the 8-bit host number was encoded in the minor number of character special file ‘host’ - but it did not carry through to the BBN code.

Other systems used something similar. The Chaos network code made namei() break when it recognised the Chaos driver and left the remainder of the path for the driver to fetch & parse. I’m also being told that Greg Chesson experimented with using the given name of a Datakit channel device as the connection string for the switch, but that this approach was abandoned early on.

In my view, exposing the host names through integration in the Unix file name space makes a lot of conceptual sense, but it unfortunately falls down on the practicalities, with the host name set being hard to enumerate (it is large, distributed and not stable - even back then).

A question mark is the hard pin-pointing of the start of Unix networking to V4 / 1974. Yes, that is the earliest evidence we currently have. However, Sandy Fraser says that Spider came into operation in 1972 and it must have connected to something. Maybe that something was a lab-bench test setup, but it could have been a computer - maybe even one running Unix.

There is another candidate for earliest Unix networking as well. The tech memos from Heinz Lycklama include one on the Glance terminal. That memo includes a section on the network used, referencing a 1973 report by D.R. Weller, “A High-Speed I/O Loop Communication System for the DEC PDP-11 Computer”. That computer appears to be an 11/45 running Unix and the loop is not Spider (nor the Pierce loop discussed in 1970/71 BSTJ). I have an off-list question outstanding to better understand this.

5 years, 1 month
