A tangential connection to early Unix experience:
My collection of early computer manuals includes Brinch Hansen's manual
for the RC 4000, which stands out for its precise description of the
CPU logic--in Algol 60! It's the only manual I have seen that offers a
good-to-the-last-bit formal description of the hardware.
DEC presented something of the sort for the PDP-11, but punted where
the woods got thick. When I wanted to know how they computed the last
bit of floating-point results, I got no satisfaction. Amidst a thorough
description of addressing came this formulation of the actual computation:
"form floating point result".

Why would anyone be interested in an old regex package that never was
a part of any Unix distro?
The driving force was Posix, whose regex spec was quite inscrutable. Could
there be a reference implementation? It was easy to fool every
implementation I could get my hands on, including Gnu's over-the-top

But as I got into it, I got fascinated by regexes per se. In making a
recognizer, there's a tradeoff between construction time and execution
time. Linear execution can be achieved, but at a potentially exponential
cost in construction time (and space). Backreferencing takes the regex
languages out of the class of regular languages.
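
The construction-time blowup can be seen in a standard textbook example,
sketched here in Python. This is an illustration, not code from the package.
The regex (a|b)*a(a|b)^(n-1), "the n-th symbol from the end is a", has an
NFA with n+1 states, but the subset construction yields a DFA with 2^n
states. The function names below are made up for the sketch.

```python
from collections import deque

def nfa_nth_from_last(n):
    """NFA for (a|b)*a(a|b)^(n-1), with states 0..n and accepting state n.

    State 0 loops on every symbol and, on 'a', may also guess that this
    'a' is the n-th symbol from the end. States 1..n-1 count what follows.
    """
    delta = {(0, 'a'): {0, 1}, (0, 'b'): {0}}
    for i in range(1, n):
        delta[(i, 'a')] = {i + 1}
        delta[(i, 'b')] = {i + 1}
    return delta, 0, {n}

def subset_construction(delta, start, alphabet="ab"):
    """Return the reachable DFA states (frozensets of NFA states) and the DFA transitions."""
    start_set = frozenset([start])
    seen = {start_set}
    queue = deque([start_set])
    dfa = {}
    while queue:
        current = queue.popleft()
        for ch in alphabet:
            nxt = frozenset(t for s in current for t in delta.get((s, ch), ()))
            dfa[(current, ch)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen, dfa

def dfa_size(n):
    """Number of reachable DFA states for the n-th-from-last language."""
    delta, start, _ = nfa_nth_from_last(n)
    states, _ = subset_construction(delta, start)
    return len(states)

if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(n, "NFA states:", n + 1, "DFA states:", dfa_size(n))
```

Once the DFA is built, matching costs one table lookup per input symbol.
Simulating the NFA directly avoids the blowup but tracks a set of states
at every step.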
Recalling that regular languages are closed under intersection and
negation, I wondered about how to implement new regex operators, &
and -. I came up with a scheme for this optional non-Posix feature that
involved layering continuation-passing over more traditional methods. And
while I was at it, I broke out smaller sublanguages for special treatment
(as does Gnu), all the way down to Knuth-Morris-Pratt for expressions
in which the only operation is catenation.
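
The closure facts recalled above have a short textbook demonstration on
DFAs, sketched below in Python. This is not the continuation-passing
scheme used in the package, only the classical argument: negation swaps
accepting and non-accepting states of a complete DFA, and intersection
runs two DFAs in lockstep (the product construction). The class and
function names are invented for the sketch.

```python
from itertools import product

class DFA:
    """A complete DFA: delta must be defined for every (state, symbol) pair."""

    def __init__(self, states, alphabet, delta, start, accepting):
        self.states = set(states)
        self.alphabet = set(alphabet)
        self.delta = dict(delta)
        self.start = start
        self.accepting = set(accepting)

    def accepts(self, text):
        state = self.start
        for ch in text:
            state = self.delta[(state, ch)]
        return state in self.accepting

def complement(d):
    """Negation: same machine, accepting set flipped."""
    return DFA(d.states, d.alphabet, d.delta, d.start, d.states - d.accepting)

def intersection(a, b):
    """Product construction: a state is a pair, accepting iff both halves accept."""
    assert a.alphabet == b.alphabet
    states = set(product(a.states, b.states))
    delta = {((p, q), ch): (a.delta[(p, ch)], b.delta[(q, ch)])
             for (p, q) in states for ch in a.alphabet}
    accepting = {(p, q) for (p, q) in states
                 if p in a.accepting and q in b.accepting}
    return DFA(states, a.alphabet, delta, (a.start, b.start), accepting)

# Example machines over {a, b}: an even number of a's, and strings ending in b.
even_as = DFA({0, 1}, "ab",
              {(0, 'a'): 1, (0, 'b'): 0, (1, 'a'): 0, (1, 'b'): 1},
              0, {0})
ends_b = DFA({0, 1}, "ab",
             {(0, 'a'): 0, (0, 'b'): 1, (1, 'a'): 0, (1, 'b'): 1},
             0, {1})

both = intersection(even_as, ends_b)   # plays the role of an & operator
odd_as = complement(even_as)           # plays the role of a - operator

if __name__ == "__main__":
    for s in ("aab", "ab", "b", ""):
        print(repr(s), both.accepts(s), odd_as.accepts(s))
```

The product machine has up to |A| * |B| states, which is one reason an
implementation might prefer to combine the operand matchers at run time
rather than build the product up front.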
And finally, having followed the development of C++ from its infancy,
I wanted to try out its new template facility, so there's a bit of
that in the package, too. Arnold has discovered that not only has C++
evolved, but also that without the discipline of -Wall to force clean
code, I was rather cavalier about casting, both explicitly and implicitly.

The only real customer the code ever had was the AST project, which
translated it to C. After the C++ had sat idle for a half-dozen years, I
thought to revive it in Linux, but found it riddled with incompatibilities
with that new environment and gave up. Arnold deserves a citation for
bravery in pushing that through 15 years further on.