[TUHS] *roff history as told to GNU

G. Branden Robinson g.branden.robinson at gmail.com
Thu Jan 13 04:06:21 AEST 2022


Hi, Dan,

At 2022-01-12T11:33:35-0500, Dan Cross wrote:
> I have some questions about the earlier history.

I've been collecting a detailed narrative history not just of the *roff
_programs_ but also of the development of the language in the roff(7)
manual page.  Below I'll share a current chunk of it that is planned for
the next release (groff 1.23).  It has been heavily revised since
groff 1.22.4.  Many of my revisions have been motivated by accounts from
this list, from the "history of man pages" (more of a history of troff)
at manpages.bsd.lv, and the minnie TUHS archive.

> As I understand it, in the beginning there was RUNOFF, which I believe
> originated on CTSS? The CTSS sources contain a RUNOFF program that's
> made up of ~1100 lines of MAD and ~1300 lines of assembler.

This is a detail I hadn't encountered before; instead I've read claims
that distorted it into being a solely high-level language project.

> There is certainly a RUNOFF in Multics, written in BCPL (there's a
> small "outer module transfer vector" program in ALM).

As I understand it, _this_ RUNOFF is undisputedly Doug McIlroy's.

> This is where it gets muddy for me; I understand this was roughly
> ported to unix as `roff` by Ken and that at this point,

It is hard to find an account of this period that _isn't_ muddy.  Claims
from Murray Hill luminaries suggest that V0 and V1 Unix roffs were the
collective work of Thompson, Ritchie, Ossanna, a fourth person who
contributed the hyphenation algorithm (does someone have the name?), and
McIlroy, because in Ritchie's words[1], this roff was "transliterated"
from Doug's BCPL codebase.

> formatting was fairly primitive: suitable for hardcopy terminals and
> line printers, and could do things like center lines and so forth, but
> nothing fancy (https://www.cs.dartmouth.edu/~doug/reader.pdf).

Yes.  My contributions to groff's roff(7) page attempt to shed more
light on this sort of thing.  Sadly, at crucial periods sources and even
documentation are missing.  For example, there is an nroff entry in the
Unix V2 manual table of contents, but no man page is present.  In other
early editions the reader is asked to see Ossanna for documentation, and
it seems the corresponding artifacts might be lost to time.

> Ossanna then took over and greatly expanded the capabilities of
> `roff`, adding macros and traps and making it Turing-complete; this
> was `nroff`, which grew to become `troff` once the C/A/T typesetter
> was acquired.

Yes.

Here's what I have, though it looks better typeset[2].  Corrections from
witnesses are warmly welcomed.

Name
       roff - concepts and history of roff typesetting

Description
       The term roff describes a family of document formatting systems
       known by names like troff, nroff, ditroff, and groff.  A roff
       system consists of an extensible text formatting language and a
       set of programs for printing and converting to other text
       formats.  Unix-like operating systems often distribute a roff
       system as a core package.

[snip]

History
       Computer-driven document formatting dates back to the 1960s.  The
       roff system itself is intimately connected with the Unix
       operating system, but its roots go back to the earlier operating
       systems CTSS and Multics.

   The predecessor--RUNOFF
       roff's ancestor RUNOFF was written in the MAD language by Jerry
       Saltzer to prepare his Ph.D. thesis using the Compatible Time
       Sharing System (CTSS), a project of the Massachusetts Institute
       of Technology (MIT).  The program is generally referred to in
       full capitals, both to distinguish it from its many descendants,
       and because bits were expensive in those days; five- and six-bit
       character encodings were still in widespread usage, and mixed-
       case alphabetics seen as a luxury.  RUNOFF introduced a syntax of
       inlining formatting directives amid document text, by beginning a
       line with a period (an unlikely occurrence in human-readable
       material) followed by a "control word".  Control words with
       obvious meaning like ".line length n" were supported as well as
       an abbreviation system; the latter came to overwhelm the former
       in popular usage and later derivatives of the program.  A sample
       of control words from a RUNOFF manual of December 1966 <http://
       web.mit.edu/Saltzer/www/publications/ctss/AH.9.01.html> was
       documented as follows (with the parameter notation slightly
       altered).  The abbreviations will be familiar to roff veterans.

                        Abbreviation   Control word
                                 .ad   .adjust
                                 .bp   .begin page
                                 .br   .break
                                 .ce   .center
                                 .in   .indent n
                                 .ll   .line length n
                                 .nf   .nofill
                                 .pl   .paper length n
                                 .sp   .space [n]

       In 1965, MIT's Project MAC teamed with Bell Telephone
       Laboratories and General Electric (GE) to inaugurate the Multics
       <http://www.multicians.org> project.  After a few years, Bell
       Labs discontinued its participation in Multics, famously
       prompting the development of Unix.  Meanwhile, Saltzer's RUNOFF
       proved influential, seeing many ports and derivations elsewhere.

       In 1969, Doug McIlroy wrote one such reimplementation of RUNOFF
       in the BCPL language for a GE 645 running GECOS at the Bell Labs
       location in Murray Hill, New Jersey.  In its manual, the control
       commands were termed "requests", their two-letter names were
       canonical, and the control character was configurable with a .cc
       request.  Other familiar requests emerged at this time: no-adjust
       (.na), need (.ne), page offset (.po), tab configuration (.ta,
       though it worked differently), temporary indent (.ti), character
       translation (.tr), and automatic underlining (.ul; on RUNOFF you
       had to backspace and underscore in the input yourself).  .fi to
       enable filling of output lines got the name it retains to this
       day.

   Unix and roff
       roff was one of the first Unix programs.  McIlroy's runoff was,
       in Dennis Ritchie's term, "transliterated" from BCPL to DEC PDP-7
       assembly language for the fledgling Unix operating system.  It
       saw its name shortened to roff (perhaps under the influence of
       Ken Thompson), while adding support for automatic hyphenation
       with .hc and .hy requests; a generalization of line spacing
       control with the .ls request; and what later roffs would call
       diversions, with "footnote" requests.  This roff indirectly
       funded operating systems research at Murray Hill, for it was used
       to prepare patent applications for AT&T to the U.S. government.
       This arrangement enabled the group to acquire a PDP-11; roff
       promptly proved equal to the task of typesetting the first
       edition of the manual for what would later become known as "Unix
       Version 1", dated November 1971.

       Output from all of the foregoing programs was limited to line
       printers and paper terminals such as the IBM 2741 (based on the
       Selectric line of typewriters) and the Teletype Corporation Model
       37.  Proportionally-spaced type was unknown.

   New roff and Typesetter roff
       The first years of Unix were spent in rapid evolution.  The
       practicalities of preparing standardized documents like patent
       applications (and Unix manual pages), combined with McIlroy's
       enthusiasm for macro languages, perhaps created an irresistible
       pressure to make roff extensible.  Joe Ossanna's nroff, literally
       a "new roff", was the outlet for this pressure.  By the time of
       Unix Version 3 (February 1973)--and still in PDP-11 assembly
       language--it sported a swath of features now considered essential
       to roff systems: definition of macros (.de), diversion of text
       into them (.di), and removal thereof (.rm); trap planting (.wh;
       "when") and relocation (.ch; "change"); conditional processing
       (.if); and environments (.ev).  Incremental improvements included
       assignment of the next page number (.pn); no-space mode (.ns) and
       restoration of vertical spacing (.rs); the saving (.sv) and
       output (.os) of vertical space; specification of replacement
       characters for tabs (.tc) and leaders (.lc); configuration of the
       no-break control character (.c2); shorthand to disable automatic
       hyphenation (.nh); a condensation of what were formerly six
       different requests for configuration of page "titles" (headers
       and footers) into one (.tl) with a length controlled separately
       from the line length (.lt); automatic line numbering (.nm);
       interactive input (.rd), which necessitated buffer-flushing
       (.fl), and was made convenient with early program cessation
       (.ex); source file inclusion in its modern form (.so; though
       RUNOFF had an ".append" control word for a similar purpose) and
       early advance to the next file argument (.nx); ignorable content
       (.ig); and programmable abort (.ab).

       Third Edition Unix also brought the pipe(2) system call, the
       explosive growth of a componentized system based around it, and a
       "filter model" that remains perceptible today.  Equally
       importantly, the Bell Labs site in Murray Hill acquired a Graphic
       Systems C/A/T phototypesetter, and with it came the necessity of
       expanding the capabilities of a roff system to cope with
       proportionally-spaced type, multiple type sizes, and a variety of
       fonts.  Ossanna wrote a parallel implementation of nroff for the
       C/A/T, dubbing it troff (for "typesetter roff").  Unfortunately,
       surviving documentation does not illustrate what requests were
       implemented at this time for C/A/T support; the troff(1) man page
       in Fourth Edition Unix (November 1973) does not feature a request
       list, unlike nroff(1).  Apart from typesetter-driven features,
       Unix Version 4 roffs added string definitions (.ds); made the
       escape character configurable (.ec); and enabled the user to
       write diagnostics to the standard error stream (.tm).  Around
       1974, empowered with multiple type sizes, italics, and a symbol
       font specially commissioned by Bell Labs from Graphic Systems,
       Brian Kernighan and Lorinda Cherry implemented eqn for
       typesetting mathematics.  In the same year, for Fifth Edition
       Unix, Ossanna combined and reimplemented the two roffs in C,
       using preprocessor conditions of that language to generate both
       from a single source tree.

       Ossanna documented the syntax of the input language to the nroff
       and troff programs in the "Troff User's Manual", first published
       in 1976, with further revisions as late as 1992 by Kernighan.
       (The original version was entitled "Nroff/Troff User's Manual",
       which may partially explain why roff practitioners have tended to
       refer to it by its AT&T document identifier, "CSTR #54".)  Its
       final revision serves as the de facto specification of AT&T
       troff, and all subsequent implementors of roff systems have worked
       in its shadow.

       A small and simple set of roff macros was first used for the
       manual pages of Unix Version 4 and persisted for two further
       releases, but the first macro package to be formally described
       and installed was ms by Mike Lesk in Version 6.  He also wrote a
       manual, "Typing Documents on the Unix System", describing ms and
       basic nroff/troff usage, updating it as the package accrued
       features.  Sixth Edition additionally saw the debut of the tbl
       preprocessor for formatting tables, also by Lesk.

       For Unix Version 7 (January 1979), McIlroy designed, implemented,
       and documented the man macro package, introducing most of the
       macros described in groff_man(7) today, and edited volume 1 of
       the Version 7 manual using it.  Documents composed using ms
       featured in volume 2, edited by Kernighan.

       Ossanna had passed away unexpectedly in 1977, and after the
       release of Version 7, with the C/A/T typesetter becoming
       supplanted by alternative devices such as the Mergenthaler
       Linotron 202, Kernighan undertook a revision and rewrite of troff
       to generalize its design.  To implement this revised
       architecture, he developed the font and device description file
       formats and the device-independent output format that remain in
       use today.  He described these novelties in the article "A
       Typesetter-independent TROFF", last revised in 1982, and like the
       troff manual itself, it is widely known by a shorthand, "CSTR
       #97".

       Kernighan's innovations prepared troff well for the introduction
       of the Adobe PostScript language in 1984 and a vibrant market in
       laser printers with built-in interpreters for it.  An output
       driver for PostScript, dpost, was swiftly developed.  However,
       due to AT&T software licensing practices, Ossanna's troff, with
       its tight coupling to the capabilities of the C/A/T, remained in
       parallel distribution with device-independent troff throughout
       the 1980s, leading some developers to contrive translators for
       C/A/T-formatted documents to other devices.  An example was
       vtroff for Versatec and Benson-Varian plotters.  Today, however,
       all actively maintained troffs follow Kernighan's device-
       independent design.

Regards,
Branden

[1] "The Evolution of the Unix Time-Sharing System", Ritchie, 1984
[2] Formatted with:
      groff -man -P-c -Tascii -rLL=72n -rHY=0 -dAD=l build/man/roff.7
    (The `AD` string is new to groff 1.23 man(7).)