Someone on the TUHS list mailed me privately, prompting me to
write this lengthy apology (in the classical sense) of why groff doesn't
make a certain application easier. I have slightly revised my response.
This message also may serve as a summary of the challenges that need to
be overcome if someone else wants to tackle the job, and potentially
contribute it to groff.
[person creates PDFs of historical Unix documents (many of which are
written using the ms macros) and wishes groff ms made the task easier]
I sympathize. I sometimes render historical documents, so I prescribed
in groff ms's documentation the approach that I take myself. I decided
against trying to support a "-matt" or "-msatt" option in groff because
it's flatly impossible to know which definition of `UX` to use. Even a
date declaration in the document sheds little light, as we then have to
consider the question of whether we want fidelity to the actual state of
the mark at the time of that declared date, or to what would have been
rendered in the author's environment--and they may have been using an ms
that wasn't "up to date" in the same respect. That information, too, is
not recorded in the document.[1]
Providing all the macros _except_ `UX` didn't seem likely to satisfy
users since that's the most important one! It shows up in body text
whereas all the others seldom do--if you can live without the cover page
then, often, you're golden. Except for `UX`.
Finally there is the name collision problem with Berkeley. 4.2BSD and
later ms defined `CT` and `TM` macros (aspects of their "thesis mode")
and once again there's no declarator within the document to tell you
which dialect of ms is in use. This one can be heuristically figured
out with pretty good odds, I suspect, but troff works as a filter--what
was I going to do, write a preprocessor just for this?
(Hmm, maybe grog(1) could do it, and that would be in its wheelhouse.
But there's no point until and unless we reimplement support for
Berkeley thesis mode in the first place [so that grog has an option
argument to report], and that is an undertaking I have demurred.[2])
It seemed like a moderate amount of work for almost zero upside. It's
also hard to validate/verify my work. The only historical troffs to
which I have access are Seventh Edition Unix troff (1979, before
Kernighan) and DWB 3.3 (early 1990s). It's a right pain in the butt to
inspect typesetter output on V7 because I have nothing that emulates a
C/A/T or translates it to device-independent troff output for a
"ditroff"-style device description that Kernighan troff, DWB/Heirloom
Doctools troff, or GNU troff could use.
And even if I had either of those, they'd have to be vetted to a _high_
degree of quality before they'd be fit for purpose; else I wouldn't know
whether I was chasing bugs in the groff ms macros or the C/A/T
emulator/translator.
So, to summarize, I confine my compatibility efforts to _nroff_ output,
and rule the Bell Labs "site" macros out of scope. I feel there is not
much more I can do, and have confidence in my results, without resources
that I'm lacking.
I hope this sheds some light on my reasoning.
Regards,
Branden
[1] Still, if someone wants to start, I'd start here.
https://minnie.tuhs.org/cgi-bin/utree.pl?file=V10/vol2/ms/tmac.s
[2] One person, ever, has requested it, 20 years ago. And I have no
specimens of input or corresponding model output rendered by an
"authentic" BSD troff [formatter executable PLUS support files]
against which to develop a reconstruction. (On the bright side, the
Berkeley modifications to the once-encumbered AT&T "tmac.s" are, of
themselves, presumably BSD-licensed.)
https://savannah.gnu.org/bugs/?64455
All, I got this e-mail and thought many of you would appreciate the link.
Cheers, Warren
----- Forwarded message from Poul-Henning Kamp -----
I stumbled over this:
https://www.telecomarchive.com/lettermemo.html
is the TUHS crew aware of that resource ?
----- End forwarded message -----
I'm wondering if there are places where people who were in the Unix
Room wrote about the origins and evolution of what people (at least
used to(*)) refer to as "Unix Philosophy", and since some are in THIS
(TUHS) room, what they might have to say about it.
How much was in reaction to the complexity of Multics, and how much
was simply a response to the limited address spaces of
available and affordable hardware?
Eric S. Raymond wrote in "The Art of Unix Programming" quoting
Doug McIlroy and Rob Pike:
http://www.catb.org/esr/writings/taoup/html/ch01s06.html
And I wonder if they care to comment on it?
I have trouble taking ESR as authoritative, as it seems to me that
Research Unix was more a product of the "Cathedral" (or at least a
contained community) than the "Bazaar" (at least the modern bazaar,
where everyone needs to leave a new feature graffito on the town
walls), and ESR
A side question for Rob Pike: is the "Not only is UNIX dead, it's
starting to smell really bad." quote accurate? Was it in reaction to
BSD, GNU, or all of the above?
(*) I say "used to", because, for the most part, minimalism seems to
have left the building. I can't look at modern GNU utilities, and
many, if not most, open source packages and not think they've gone WAY
past classic Unix minimalism, especially since I remember hearing that
Bell Research had happily stripped excess features (removal of "cat -s"
sticks in my mind) from latter-day research Unix, and because Stallman
is said to have coined the term "New Jersey" style as a synonym for
what Richard P. Gabriel called "Worse is Better", which seems an
attack on minimalism (nothing less than "the right thing" is acceptable).
Worse is.... readings:
https://dreamsongs.com/WorseIsBetter.html
https://dreamsongs.com/RiseOfWorseIsBetter.html
https://dreamsongs.com/Files/IsWorseReallyBetter.pdf
https://dreamsongs.com/Files/worse-is-worse.pdf
Anti-flamage disclaimers:
Inclusion of links above does not imply any agreement on my part! My
apologies in advance for any offense, misquote, or misunderstanding on
my part.
> From: Rik Farrow <rik(a)rikfarrow.com>
> Was the brevity typical of Unix command names a function of the tiny
> disk and memory available? Or more a function of having a Teletype 33
> for input?
I'm not sure the answer was ever written down (e.g. in a memo); we will
probably have to rely on memory - and memories that far back are fairly
thin on the ground by now. Perhaps Mr. McIlroy (or Mr. Thompson, if we're
_really_ lucky) will humor us? :-)
I have the impression that some of the names are _possibly_ inherited from
Multics (which the early Unicians all used before Unix existed) - but maybe
not. The command to list a directory, on Multics, is 'ls' (but see below) -
but the Multics command to remove a file is 'del' (not 'rm'); and change working
directory is 'cwd'. So maybe 'ls' is just chance?
Multics had a 'feature' where a segment (file) could have additional names (to
the main name), and this is used to add short aliases to many commands, so the
'base name' for the directory list command is 'list'; 'ls' is a short
alias. A list of Multics commands (with short forms) is available here:
https://www.multicians.org/multics-commands.html
I'm not sure how early that alias mechanism came in, though; my copy of
"Introduction to Multics" (February, 1974) doesn't have short names (or, at
least, it doesn't use them).
It won't have anything to do with disk and memory. Having used a Teletype, I
can say it takes noticeably longer to type in a longer name! It's also more
effort. I would expect those are the reasons for the short names.
Noel
> I wonder what happened to the amazing library at Murray Hill.
Last I knew, the Bell Labs archives were intact under supervision of a
professional archivist. Formally speaking, the archives and the library
were distinct entities. The library, which was open to self service 24
hours a day, declined rapidly after the bean counters decreed that it
should henceforth support itself on rental fees. Departments immediately
turned to buying books rather than borrowing them. It's very likely that
this was bad for the Labs' bottom line, but the cost (both monetary and
intellectual) was not visible as a budgetary line item.
The 24-hour library contributed to one of Ken's programming feats. Spurred
by a lunchtime remark that it would be nice to have a unit-conversion
program, Ken announced units(1) the next morning. Right from the start, the
program knew more than 200 units, thanks to a book Ken grabbed from the
library in the middle of the night.
Doug
> That CSTR number 1 is nicely formatted, is that troff?
The archive's CSTR 1 is ersatz. It's a 1973 journal article obtained from
JSTOR. I imagine the manuscript was largely copied from the CSTR, but the
printed paper certainly differs in meta-content and in layout, to say nothing
of font. Having gone through the usual route of journal submission and
revision, the body text is probably not word-for-word identical to the CSTR
either.
Doug
Clem Cole:
Interesting -- 'Jason' had always been a Pascal hacker when the strip was
first created. As I recall, Berkeley Breathed had Wendell (his hacker
character) comment on that during the time of Pascal/C Wars.
====
But Jason later was revealed to be wearing Unix underpants:
https://www.gocomics.com/foxtrot/2002/02/25
Norman Wilson
Toronto ON