TUHS January 2016

tuhs@tuhs.org

41 participants
51 discussions

by norman＠nose.cita.utoronto.ca

I've assembled some notes from old manuals and other sources on the formats used for on-disk file systems through the Seventh Edition: http://www.cita.utoronto.ca/~norman/old-unix/old-fs.html Additional notes, comments on style, and whatnot are welcome. (It may be sensible to send anything in the last two categories directly to me, rather than to the whole list.)

3 years, 1 month

Re: [TUHS] Short history of 'grep'

by Doug McIlroy

> I'm still trying to get my around about how a program such as "egrep" > which handles complex patterns can be faster than one that doesn't > Is there a simple explanation, involving small words? First, the big-word explanation. Grep is implemented as a nondeterministic finite automaton, egrep as a deterministic finite automaton. Theory folk abbreviate these terms to "NFA" and "DFA", but these are probably not what you had in mind as "small words". The way grep works is to keep a record of possible partial parsings as it scans the subject text; it doesn't know which (if any) will ultimately be the right one, hence "nondeterministic". For example, suppose grep seeks pattern '^a.*bbbc' in text 'a.*bbbbbbc'. When it has read past 3 b's, the possible partial parses are 'a.*', 'a.*b', 'a.*bb' and 'a.*bbb'. If it then reads another b, it splits the first partial parse into 'a.*' and 'a.*b', updates the next two to 'a.*bb' and 'a.*bbb', and kills off the fourth. If instead it reads a c, recognition is successful; if anything else, all partials are killed and recognition fails. Egrep, by preprocessing the expression, produces separate code for each of several possible states: "no b's", "1 b", 2 b's" and "3 bs". When it's in state "1 b", for example, it switches on the next character into "2 b's" or fails, depending on whether the character is b or not--very fast. Grep, on the other hand, has to update all the live parses. So if egrep is so fast, why do we have grep? One reason is that grep only needs to keep a list of possible progress points in the expression. This list can't be longer than the expression. In egrep, though, each state is a subset of progress points. The number of possible subsets is exponential in the length of the expression, so the recognition machine that egrep constructs before attempting the parse can explode--perhaps overflowing memory, or perhaps taking so long in construction that grep can finish its slow parse before egrep even begins its fast parse. To revert to the words of theory folks, grep takes time O(m*n) and space O(m) worst case, while egrep takes time and space O(2^m+n). (2^m is an overestimate, but you get the idea.) That's the short story. In real life egrep overcomes the exponential by lazily constructing the machine--not generating a state until it is encountered in the parse, so no more than n states get constructed. It's a complex program, though, for the already fancy preprocessing must be interleaved with the parsing. Egrep is a tour de force of classical computer science, and pays off big on scanning big files. Still, there's a lot to say for the simple elegance of grep (and the theoretical simplicity of nondeterministic automata). On small jobs it can often win. And it is guaranteed to work in O(m) memory, while egrep may need O(n). ------------------------------------------------- Ken Thompson invented the grep algorithm and published it in CACM. A pointy-headed reviewer for Computing Reviews scoffed: why would anybody want to use the method when a DFA could do the recognition in time O(n)? Of course the reviewer overlooked the potentially exponential costs of constructing a DFA. Some years later Al Aho wrote the more complicated egrep in the expectation that bad exponential cases wouldn't happen in everyday life. But one did. This job took 30 seconds' preprocessing to prepare for a fraction of a second's parsing. Chagrined, Al conceived the lazy-evaluation trick to overcome the exponential and achieved O(n) run time, albeit with a big linear constant. In regard to the "short history of grep", I have always thought my request that Ken liberate regular expressions from ed caused him to write grep. I asked him one afternoon, but I can't remember whether I asked in person or by email. Anyway, next morning I got an email message directing me to grep. If Ken took it from his hip pocket, I was unaware. I'll have to ask him. Doug

9 years, 4 months

Short history of 'grep'

by Warren Toomey

Norman put this link up on twitter: https://medium.com/@rualthanzauva/grep-was-a-private-command-of-mine-for-qu… All good, except I'm not sure I would have written "One guy McIlroy claimed grep was invented for him." :-) Cheers, Warren (a lesser guy)

9 years, 5 months

32V, 3BSD and 4.0BSD on VAX-11/780

by nao

Hi, I successfully made SIMH VAX-11/780 emulator run 32V, 3BSD and 4.0BSD. Details are on my web site (thogh rather tarse): http://zazie.tom-yam.or.jp/starunix/ Enjoy! Naoki Hamada nao(a)tom-yam.or.jp

9 years, 5 months

Links to relevant Unix Usenet newgroups?

by Warren Toomey

Hi all, does anybody know of on-line historical Usenet archives that I can link to, especially if they have unpacked articles (visible subject lines would be better)? What newsgroups are relevant? net.v7bugs, comp.sources.unix, comp.sources.misc, net.sources, mod.sources, comp.sources.bugs? What about platform or system-specific newsgroups? I'll put the links here: http://wiki.tuhs.org/doku.php?id=publications:newsgroups Thanks, Warren P.S A good history of the legal side of Unix is here: http://wiki.tuhs.org/doku.php?id=publications:theses

9 years, 5 months

Working e-mail for Henry Spencer?

by Warren Toomey

Does anybody have a working e-mail address for Henry Spencer? I've tried his "zoo.utoronto..." address but the box is refusing SMTP connections (from me, at least). Alternatively, could someone e-mail him and see if he would be interested in joining the TUHS list? And ditto for any other old Unix users! Cheers, Warren

9 years, 5 months

Fwd: [SIGCIS-Members] Need help identifying a photo

by Al Kossow

Can someone here ID the mystery person? Embarrassingly, CHM has the person misidentified as well. -------- Forwarded Message -------- Subject: [SIGCIS-Members] Need help identifying a photo Date: Wed, 27 Jan 2016 16:46:45 +0000 From: Ceruzzi, Paul <CeruzziP(a)si.edu> To: members(a)lists.sigcis.org <members(a)lists.sigcis.org> There is a famous photo on Wikimedia commons, of what purports to be Ken Thompson & Dennis Ritchie in front of a PDP-11, presumably working on UNIX. The problem is that the seated person doesn’t look like either of them. And he is clean-shaven. Could it be Bjarne Stroustrup? Does anyone recall seeing T&R w/o facial hair? Any help in tracking this down would be much appreciated! The photo has been reprinted in many places, and I’d like to track this down before I inadvertently propagate an error. https://commons.wikimedia.org/wiki/File:Ken_Thompson_(sitting)_and_Dennis_R… Paul E. Ceruzzi Curator, Division of Space History National Air and Space Museum MRC 311, PO Box 37012 Smithsonian Institution Washington, DC 20013-7012 www.ceruzzi.com <http://www.ceruzzi.com> ceruzzip(a)si.edu <mailto:ceruzzip@si.edu> 202-633-2414

9 years, 5 months

Re: [TUHS] Fwd: [SIGCIS-Members] Need help identifying a photo

by Doug McIlroy

No question about it. That is Ken Thompson. I saw him just like that dozens, perhaps hundreds, of times in that very seat. doug

9 years, 5 months

And, contacts in the UK or Europe?

by Warren Toomey

[I feel like I'm spamming my own list] I've tried to make contact with people in the UK that might have copies of the UKUUG and EUUG newsletters: Peter Collinson, Sunil Das, Bruce Anderson. No luck with this. There are newsletters back to 1992 at http://www.ukuug.org/newsletter/ but I'm after the ones in the 1970s and 1980s. The current secretary doesn't know about the earlier newsletters. Who else can I contact? Cheers, Warren

9 years, 5 months

Re: [TUHS] Scan of "Edition 0" manual

by Doug McIlroy

> we can probably substitute part of the db(1) man page from 1st > Edition Unix for the missing page A7 That would be appropriate--properly documented, of course. doug

9 years, 5 months

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

2005

2004

2003

2002

2001

2000

1999

1998

1997

1996

1995

1994

1993

1992

1991

1990

TUHS January 2016