Hi everyone,
I've cobbled together a crude Teletype Model 37 emulator that generates PDF files (https://github.com/TheBrokenPipe/Teletype-37-PDF). It produces sane-looking PDFs for most (all?) of the early UNIX ROFF/NROFF documents.
The biggest advantage of this over something like "roff $1 | enscript -c -f Courier12 -l -M Letter --margins=67:-9:0:-9 -p $1.ps -s -0.05" is that it supports half (forward/reverse) line feeds, which enscript does not. Early ROFF stuff like the UNIX manuals and memos made extensive use of subscripts (and superscripts), making them rather painful to typeset.
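To illustrate why half line feeds matter, here is a minimal sketch of how a renderer might track vertical position in half-line units. It is not taken from the emulator linked above. It assumes the escape codes conventionally used for Model 37 style output (ESC 7 for a reverse line feed, ESC 8 for a reverse half line feed, ESC 9 for a forward half line feed) and treats "\n" as carriage return plus line feed. The function name place_characters is made up for this example.

```python
ESC = "\x1b"

def place_characters(stream):
    """Return a list of (row, col, char), with row counted in half-line units.

    Assumed codes: ESC 7 = reverse line feed, ESC 8 = reverse half line
    feed, ESC 9 = forward half line feed. Other ESC pairs are skipped.
    """
    row = 0      # vertical position in half lines (2 per full line)
    col = 0      # horizontal position in character cells
    placed = []
    i = 0
    while i < len(stream):
        ch = stream[i]
        if ch == ESC and i + 1 < len(stream):
            code = stream[i + 1]
            if code == "7":
                row -= 2
            elif code == "8":
                row -= 1
            elif code == "9":
                row += 1
            i += 2
            continue
        if ch == "\n":       # assumption: newline acts as CR + LF
            row += 2
            col = 0
        elif ch == "\r":
            col = 0
        elif ch == "\b":     # backspace, used for overstriking
            col = max(col - 1, 0)
        else:
            placed.append((row, col, ch))
            col += 1
        i += 1
    return placed

# "x" followed by a superscript "2", then back to the baseline for "y".
cells = place_characters("x" + ESC + "8" + "2" + ESC + "9" + "y")
```

A filter that only understands whole lines has no row to put the "2" on, which is why superscripts and subscripts are lost or misplaced without half line feed support.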
As an experiment, I re-set Ken Thompson's "Users' Reference to B" memo from early 1972 (https://github.com/TheBrokenPipe/kbman-reset). I picked this one because it contains a BNF-alike description of the grammar as well as fractions in code comments, both of which make extensive use of sub/superscripts. I went to the extent of overlaying the re-set pages on top of the originals to make sure everything lined up.
I'd really appreciate it if someone could review my work on the B manual. If everything looks good, I may tackle other documents, starting with low-hanging fruit like the "V0" manual and potentially moving on to re-setting the V1 and V2 manuals in the future (building on aap's work).
Sincerely,
Yufeng
> From: Yufeng Gao
> The s1 kernel is, to date, the earliest machine-readable UNIX kernel,
> sitting between V1 and V2.
It will be interesting to see what it reveals, as it's in the UNIX 'dark age'
between V1 and V4. Working from hints and clues in the extant 'UNIX
Programmer's Manual: Second Edition', I had tried to figure out how V2
differed from V1:
https://gunkies.org/wiki/UNIX_Second_Edition
but I was mostly interested in 'big picture' issues (like how a process's
address space was laid out), not details like 'the foo() call was added', or
'how exec() differs'. (If someone _does_ create lists of the calls in V1 and
V2, and their details, and compares them, that _will_ be of value, don't get
me wrong; I was mostly just trying to work out how the mysterious KS11
worked.)
> It's somewhat picky about the environment. So far, aap's PDP-11/20
> emulator .. is the only one capable of booting the kernel. SIMH and
> Ersatz-11 both hang before reaching the login prompt.
It would be very interesting to know what fails. By 'hang', do you mean
'ceases making progress', or 'halts'?
If the former, since I've almost always had good experiences with Ersatz-11,
my _guess_ would be a problem with the RF11 emulation. (The RF11 was a very
early, and small, disk, so I wouldn't be surprised if there hasn't been a
lot of software run on those emulators that uses it, to flush out bugs. It's
also kind of an odd duck; it's word-oriented, not block-oriented.) So, for
instance, a 'lost' disk interrupt would produce this symptom. Are there any
RF11 diagnostics online? That would be the thing I would start with.
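As a rough illustration of what 'word-oriented' means for software that thinks in blocks, here is a minimal sketch of the address arithmetic involved. It assumes 16-bit words and 512-byte file-system blocks. The function name is made up for this example, and nothing here models the actual RF11 registers.

```python
WORD_BYTES = 2                               # a PDP-11 word is 16 bits
BLOCK_BYTES = 512                            # assumed file-system block size
WORDS_PER_BLOCK = BLOCK_BYTES // WORD_BYTES  # 256 words per block

def block_to_word_range(block):
    """Return (starting word address, word count) for a block number."""
    if block < 0:
        raise ValueError("block number must be non-negative")
    return block * WORDS_PER_BLOCK, WORDS_PER_BLOCK

start, count = block_to_word_range(3)
```

On a word-addressed device the driver has to do this conversion itself, so an emulator that gets word addressing or word counts slightly wrong could misbehave in ways that block-oriented disks never exercise.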
And I guess this system doesn't include the KS11; a pity, code that uses it
would allow re-creation of the programming manual (the way the:
https://gunkies.org/wiki/ANTS/ISI_IMP_Interface
programming instructions were re-created).
> From: Angelo Papenhoff
> So the next step would be to restore the assembly source? :)
Having only the binary to work from (to start with) is not optimal; those
early versions of UNIX ran on a number of very different hardware
configurations (e.g. with or without the KS11), with conditional assembly to
handle different configurations. Having only the dis-assembled code for _this_
configuration would obviously leave the code for the others missing.
Still, having _this_ source _would_ be useful; e.g. the 'hang-up' problem
above; the easiest way to debug that would be to put 'print' statements in the
code, where a disk operation was started, and completes. If it's 'losing' a
disk interrupt completion, that will show right up. (Been there, done that, on
the RK11 hardware emulator Bridgham and I built, when UNIX wouldn't boot, just
hung.) Although I suppose one could put break-points there. Trying to debug it
any other way would be painful beyond belief.
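A minimal sketch of how such a trace could be checked, assuming the 'print' statements (or break-points) yield a sequence of (kind, block) events, where kind is "start" when a disk operation is issued and "done" when its completion interrupt is handled. The event format and the function name are hypothetical, not anything the kernel actually emits.

```python
def find_lost_completions(events):
    """Return the blocks whose "start" was never matched by a "done".

    events is an iterable of (kind, block) pairs in the order observed.
    """
    pending = []
    for kind, block in events:
        if kind == "start":
            pending.append(block)
        elif kind == "done" and block in pending:
            pending.remove(block)
    return pending

# Hypothetical trace: the operation on block 2 never completes.
trace = [("start", 1), ("done", 1), ("start", 2)]
lost = find_lost_completions(trace)
```

If the system hangs with a non-empty result, a lost disk interrupt becomes the prime suspect. An empty result points the search elsewhere.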
Noel
Hi everyone,
First-time poster here. Near the end of last year, I did some forensic analysis on the DMR tapes (https://www.tuhs.org/Archive/Applications/Dennis_Tapes) and had some fun playing around with them. Warren forwarded a few of my emails to this list at the end of last year and the beginning of this year, but it was never my intention for him to be my messenger, so I'm posting here myself now.
Here's an update on my work with the s1/s2 tapes - I've managed to get a working system out of them. The s1 tape is a UNIX INIT DECtape containing the kernel, while s2 includes most of the distribution files.
The s1 kernel is, to date, the earliest machine-readable UNIX kernel, sitting between V1 and V2. It differs from the unix-jun72 kernel in the following ways:
- It supports both V1 and V2 a.outs out of the box, whereas the unmodified unix-jun72 kernel supports only V1.
- The core size has been increased to 16 KiB (8K words), while the unmodified unix-jun72 kernel has an 8 KiB (4K word) user core.
On the other hand, its syscall table matches that of V1 and the unix-jun72 kernel, lacking all V2 syscalls. Since it aligns with V1 in terms of syscalls, has the V2 core size and can run V2 binaries, I consider it a "V2 beta".
login: root
root
# ls -la
total 42
41 sdrwrw 7 root 80 Jan 1 00:02:02 .
41 sdrwrw 7 root 80 Jan 1 00:02:02 ..
43 sdrwrw 2 root 620 Jan 1 00:01:30 bin
147 l-rwrw 1 root 16448 Jan 1 00:33:51 core
42 sdrwrw 2 root 250 Jan 1 00:01:51 dev
49 sdrwrw 2 root 110 Jan 1 00:01:55 etc
54 sdrwrw 2 root 50 Jan 1 00:00:52 tmp
55 sdrwrw 7 root 80 Jan 1 00:00:52 usr
# ls -la usr
total 8
55 sdrwrw 7 root 80 Jan 1 00:00:52 .
41 sdrwrw 7 root 80 Jan 1 00:02:02 ..
56 sdrwrw 2 28 60 Jan 1 00:02:22 fort
57 sdrwrw 2 jack 50 Jan 1 00:02:39 jack
58 sdrwrw 2 6 30 Jan 1 00:02:36 ken
59 sdrwrw 2 root 120 Jan 1 00:00:52 lib
60 sdrwrw 2 sys 50 Jan 1 00:02:45 sys
142 s-rwrw 1 jack 54 Jan 1 00:52:29 x
# ed
a
main() printf("hello world!\n");
.
w hello.c
33
q
# cc hello.c
I
II
# ls -l a.out
total 3
153 sxrwrw 1 root 1328 Jan 1 00:02:12 a.out
# a.out
hello world!
#
It's somewhat picky about the environment. So far, aap's PDP-11/20 emulator (https://github.com/aap/pdp11) is the only one capable of booting the kernel. SIMH and Ersatz-11 both hang before reaching the login prompt. This makes installation from the s1/s2 tapes difficult, as aap's emulator does not support the TC11. The intended installation process involves booting from s1 and restoring files from s2.
What I did was I extracted the files from the s1 tape and placed them on an empty RF disk, then installed the unix-jun72 kernel. After booting from the RF under SIMH, I extracted the remaining files from s2. Finally, I replaced the unix-jun72 kernel with the s1 kernel using a hex editor, resulting in an RF disk image containing only files from s1/s2. This RF image is bootable under aap's emulator but not SIMH.
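The kernel swap was done by hand in a hex editor. As a rough sketch of the same idea in code, the function below overwrites a known region of a disk image in place and refuses to proceed if the region does not hold the expected old contents. The offset, the zero padding, and the function name are assumptions for illustration only. Where the kernel sits in the RF image, and how its size is recorded, are not stated here.

```python
def replace_region(image, offset, old, new):
    """Return a copy of image with old (at offset) replaced by new.

    The image length is preserved: new must not be longer than old,
    and any leftover space is zero-filled (an assumption).
    """
    if len(new) > len(old):
        raise ValueError("replacement is larger than the region")
    if image[offset:offset + len(old)] != old:
        raise ValueError("region does not contain the expected bytes")
    padded = new + bytes(len(old) - len(new))
    return image[:offset] + padded + image[offset + len(old):]

# Toy example with made-up bytes standing in for the two kernels.
patched = replace_region(b"AAAAoldkBBBB", 4, b"oldk", b"new")
```

Checking the old contents first guards against patching the wrong offset, which is easy to do when working on a raw image.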
The RF disk image can be downloaded from here (https://github.com/TheBrokenPipe/Research-UNIX-V2-Beta).
Direct link - https://github.com/TheBrokenPipe/Research-UNIX-V2-Beta/raw/refs/heads/main/…
Interestingly, its init(7) program does not mount the RK to /usr, suggesting that /usr was stored on the RF.
Sincerely,
Yufeng
Tom Van Vleck just posted to the multicians mailing list that he is
doing an update to the Unix page at multicians.org and is soliciting
feedback. I figure some folks here may have useful suggestions.
His draft is here: https://multicians.org/unix2.html
Comments directly to Tom, I suppose, but if interested parties would
rather discuss here I'd be happy to summarize and send to him as well.
- Dan C.
For the non-TUHS folks who don't know me, I worked in
Center 1127 (the Bell Labs Computing Science Research
Center) 1984-1990, and had some hand in 9th and 10th
Edition Manuals and what passed for the V8-V10
`distributions.'
To answer Branden's points:
A. I do know what version of troff was used to typeset
the 8th through 10th Edition manuals. It was the version
we were using in 1127 at the time, which was indeed
Kernighan's. The macro packages probably matter more
than the particular troff edition.
For the 10th Edition (which files I have at hand), there
was an individual mkfile (mk(1)) for each paper, so
in principle there was no fixed formatting package,
but in practice everything appears to have used troff -mpm,
with various preprocessors according to the paper: prefer,
tbl, pic, ideal, and in some cases additional macros and even
odds and ends of sed and awk.
If you wanted to re-render things from scratch you'd
want all the tools. But if you have the real troff
sources you'll have all the mkfiles--things were stored
one paper per directory.
-mpm (mpm(6) in 10/e vol 1) was a largely ms-compatible
package with special expertise in page layout.
B. There was no such thing as a `release' after V7.
In fall 1984 we made a single V8 snapshot. Making
that involved a lot of fiddly work, because we didn't
normally try to build systems from scratch; when we
brought in a new computer we cloned it from an existing
one. So there was lots of fiddly work to make sure
every program in /bin and /usr/bin on the tape compiled
correctly from the source code that would be on the tape
when the cc and as and ld and libraries on the tape were
used.
We sent V8 tapes to about a dozen external places, few
of which did anything with it (many probably never even
installed it). Which makes sense; by then we really
weren't a central source for Unix even within AT&T, let
alone to the world. Neither did we want the support
burden that would have carried--the group's charter was
research, after all, not software support. So the 9th
and 10th editions existed as manuals, but not as releases.
We did occasionally make one-off snapshots for other parts
of AT&T, and maybe for a university or two. (I definitely
remember taking a snapshot to help the official AT&T System N
Unix people set up a Research system at one point, and have
a vague memory that I may have carried a tape to a university
under a special one-off license letter.)
On the other hand, troff wasn't a rapid moving target, and
unlike the stars of the modern software world, we tried not
to break things unless there was a real reason to do so.
So I suspect the troff from any system of that era would
render the Volume 2 papers properly, and am all but certain
the 10th-edition-era troff would do so even for older manuals.
C. Just to be clear, the official 10th Edition manuals
published by Saunders College Publishing were made from
camera-ready copy prepared by us in 1127 (Doug McIlroy
did all the final work, I think) and printed on our
phototypesetter. We didn't ship them troff source, nor
even Postscript. We did everything including the tables
of contents and indexes and page numbering.
D. troff is indeed not TeX, and some of us think of that
as a feature, not a bug.
I think the odds are fairly good (but not 100%) that
groff would do a reasonable job of rendering the papers;
as I said, the hard part is the macro packages. I'm
not sure -mpm ever made it out of Research.
And there are probably copyright issues not just with
the software but with the papers themselves. The published
manuals bear a copyright notice, after all.
Norman Wilson
Toronto ON
(A much nicer place than suburban NJ, which is why
I left the Labs when I did)
Although I edited the v7 through v10 manuals, I have no recollection of
why "system" crept into the title between v7 and v8. Resistance to
trademark edicts did grow. In v10, the cover and the man pages proclaimed
"Unix". However, the fossilized spelling, "UNIX", still appeared in the
introduction to Volume 1 and scattered throughout Volume 2.
Doug
So in most technical circles and indeed in the research communities surrounding
UNIX, the name of the system was just that, UNIX, prefixed often with some
descriptor of which stream, be it Research, USG, BSD/Berkeley, but in any case
the name UNIX itself was descriptive of the operating system for many of its
acolytes and disciples.
However, in AT&T literature and media, addition of "System" to the end of the
formal name seemed to become de facto if not de jure. This can be seen for
instance in manual edits in the early 80s with references to just "UNIX" being
replaced with variations on "The UNIX System", sometimes haphazardly as if done
via a search and replace with little review. This too is evident in some
informative films published by AT&T, available on YouTube today as
"The UNIX Operating System" and "UNIX: Making Computers Easier to Use"[1][2].
Discrepancies in the titles of the videos notwithstanding, throughout it seems
there are several instances where audio of an interviewee saying
"The UNIX System" was edited over what I presume were instances of them simply
saying UNIX.
I'm curious if anyone has the scoop on whether this was an attempt to echo the
"One Bell System" and related terminology, marketing tag lines like
"The System is the Solution", and/or the naming of the revisions themselves as
"System <xyz>". On the other hand, could it have simply been for clarity, with
the uninitiated not being able to glean from the product name anything about it,
making the case for adding "System" in formal descriptions to give them a little
bit of a hint.
Bell Labs folks especially, was there ever some grand thou shalt call it
"The UNIX System" in all PR directive or was it just something that organically
happened over time as bureaucratic powers that be got their hands on a part of the
steering wheel?
- Matt G.
[1] - https://www.youtube.com/watch?v=tc4ROCJYbm0
[2] - https://www.youtube.com/watch?v=XvDZLjaCJuw
> My understanding is that Unix V8-V10 were not full distributions but
> patches.
"Patch" connotes individually distributed small fixes, not complete
working systems. I don't believe Branden meant that v8 was only a patch on
v7, but that's the natural interpretation of the statement.
V8-v10 were snapshots, yes, possibly not perfectly in sync with the
printed editions. But this was typical of Research editions, and especially
of Volume 2, which was originally called something like "Documents for Use
with Unix".
Doug
[looping in TUHS so my historical mistakes can be corrected]
Hi Alex,
At 2025-02-13T00:59:33+0100, Alejandro Colomar wrote:
> Just wondering... why not build a new PDF from source, instead of
> scanning the book?
A. I don't think we know for sure which version of troff was used to
format the V10 manual. _Probably_ Kernighan's research version,
which was similar to a contemporaneous DWB troff...but what
"contemporaneous" means in the 1989-1990 period is a little fuzzy.
Also, Kernighan may not have a complete source history of his
version of troff, it is presumably still encumbered by AT&T
copyrights, and he's been using groff for at least his last two
books (his Unix memoir and the 2nd edition of the AWK book).
B. It is hard to recreate a Research Unix V10 installation. My
understanding is that Unix V8-V10 were not full distributions but
patches. And because troff was commercial/proprietary software at
that (the aforementioned DWB troff), I don't know if Kernighan's
"Research troff" escaped Bell Labs or how consistently it could be
expected to be present on a system. Presumably any of a variety of
DWB releases would have "worked fine". How much they would have
varied in extremely fiddly details of typesetting is an open
question. I can say with some confidence that the mm package saw
fairly significant development. Of troff itself (and the
preprocessors one bumps into in the Volume 2 white papers) I'm much
more in the dark.
C. Getting a scan out there tells us at least what one software
configuration deemed acceptable by producers of the book generated,
even if it's impossible to identify details of that software
configuration. That in turn helps us to judge the results of
_known_ software configurations--groff, and other troffs too.
D. troff is not TeX. Nothing like trip.tex has ever existed. A golden
platonic ideal of formatter behavior does not exist except in the
collective, sometimes contentious minds of its users.
> Doesn't groff(1) handle the Unix sources?
Assuming the full source of a document is available, and no part of its
toolchain requires software that is unavailable (like Van Wyk's "ideal"
preprocessor) then if groff cannot satisfactorily render a document
produced by the Bell Labs CSRC, then I'd consider that presumptively a
bug in groff. It's a rebuttable presumption--if one document in one
place relied upon a _bug_ in AT&T troff to produce correct rendering, I
think my inclination would be to annotate the problem somewhere in
groff's documentation and leave it unresolved.
For a case where groff formats a classic Unix document "better" (in
the sense of not unintentionally omitting a formatted equation) than
AT&T troff, see the following.
https://github.com/g-branden-robinson/retypesetting-mathematics
> I expect the answer is not licenses (because I expect redistributing
> the scanned original will be as bad as generating an apocryphal PDF in
> terms of licensing).
I've opined before that the various aspects of Unix "IP" ownership
appear to be so complicated and mired in the details of decades-old
contracts in firms that have changed ownership structures multiple
times, that legally valid answers to questions like this may not exist.
Not until a firm that thinks it holds the rights decides it's worth the
money to pay a bunch of archivists and copyright attorneys to go on a
snipe hunt.
And that decision won't be made unless said firm thinks the probability
is high that they can recover damages from infringers in excess of their
costs. Otherwise the decision simply sets fire to a pile of money.
...which isn't impossible. Billionaires do it every day.
> I sometimes wondered if I should run the Linux man-pages build system
> on the sources of Unix manual pages to generate an apocryphal PDF book
> of Volume 1 of the different Unix systems. I never ended up doing so
> for fear of AT&T lawyers (or whoever owns the rights to their manuals
> today), but I find it would be useful.
It's the kind of thing I've thought about doing. :)
If you do, I very much want to know if groff appears to misbehave.
Regards,
Branden
Dave Horsfall:
Silent, like the "p" in swimming? :-)
===
Not at all the same. Unix smelled much better
than its competitors in the 1970s and 1980s.
Norman Wilson
Toronto ON