[TUHS] Happy birthday, Ken Thompson!

Doug McIlroy doug at cs.dartmouth.edu
Tue Feb 5 09:42:08 AEST 2019


> Ken wrote ... ed(before regexp ed)

Actually Ken wrote a regexp qed (for Multics) before he wrote ed.
He wrote about it here, before the birth of Unix:
Programming Techniques: Regular expression search algorithm 
Ken Thompson 
June 1968 Communications of the ACM: Volume 11 Issue 6, June 1968 

This is the nondetermistic regexp recognizer that's been used 
ever since. Amusingly a reviewer for Computing Reviews panned
the article on the grounds that everybody already knew how to
write a deterministic recognizer that runs in linear time.
There's no use for this slower program. What the reviewer failed
to observe is that it may take time exponential in the size of
the regexp (and ditto for space) to make such a recognizer.
In real life for a one-shot recognizer that can easily be the
dominant cost.

The problem of exponential construction time arose in Al Aho's
egrep. I was an early adopter--for the calendar(1) daemon. The
daemon generated a date recognizer that accepted most any
(American style) date. The regular expresssions were a couple
of hundred bytes long, full of alternations. Aho was chagrinned
to learn that it took about 30 seconds to make a recognizer
that would be used for less than a second. That led Al to the
wonderful invention of a lazily-constructed recognizer that
would only construct the states that were actually visited
during recognition. At last a really linear-time algorithm!

This is one of my favorite examples of the synergy of having
sytems builders and theoreticians together in one small 
department.

Doug


More information about the TUHS mailing list