[COFF] [TUHS] Unix gre, forgotten successor to grep (was: forth on early unix)

Tue Oct 7 03:39:10 AEST 2025

Since QED predated Unix, I'm redirecting this to COFF.

Ken's CACM article evoked an unusually harsh response in Computing
Reviews. The reviewer said roughly that everybody knows one can make a
deterministic recognizer that runs in linear time, so why waste our
time talking about an NFA recognizer? This moved me to write a letter
to the editor of CR in Ken's defense. I pointed out that a
deterministic recognizer can have  exponentially more states than the
number of symbols in the regular expression. This might well overflow
memories of the time and in any event would take exponential time to
construct and completely wipe out the advantage of linear recognition
time. (Al Aho had not yet invented the egrep algorithm, which
constructs only the states encountered during recognition.)

Computing Reviews did not have a letters section, so, as far as I
know, the off-base review still stands unrebutted in the literature.

Doug

On Mon, Oct 6, 2025 at 12:04 AM Thalia Archibald <thalia at archibald.dev> wrote:
>
> Ken,
>
> Your email reminds me of a comment you made in a 1989 interview with Mike
> Mahoney, that suggests something earlier than QED:
>
> > I did a lot of compiling. Even in college and out of college I did a lot of
> > on-the-fly compilers. Ah. ah. I wrote a GREP-like program. It would... You
> > type in …, you’d say what you wanted it to look for, and a sed-like thing
> > also. That you’d say, I want to do a substitute of A for B or some block of
> > text. What it would do is compile a program that would look for A and
> > substitute in B and then run the compiled program so that one level removed
> > from it do I direct my (unclear) and the early languages, the regular
> > expression searching stuff in ED and its predecessors on CTSS and those things
> > were in fact compilers for searches. They in fact compiled regular...
>
> https://www.tuhs.org/Archive/Documentation/OralHistory/transcripts/thompson.htm
>
> By anyone's history of regular expressions, your matcher in QED was the first
> software implementation of regular expressions. Was this grep-like program you
> wrote in college something earlier than that? Could you share more about it? Do
> you somehow still have the source for these? I'd love to study it.
>
> Thalia
>
> On Sep 23, 2025, at 11:40, Ken Thompson wrote:
> > i think the plan9 grep is the fastest.
> > it is grep, egrep, fgrep also.
> > i think it is faster than boyer-moore.
> > the whole program is a jit dfa
> >
> >   read block
> >   for c in block
> >   {
> >      s=s.state[c]
> >      if s == nil do something occasionally
> >   }
> >
> > it is a very few cycles per byte. all of the
> > time is reading a block. i cant imagine b/m
> > could be faster. the best b/m could do is
> > calculate the skip and then jump over
> > bytes that you have already read.
> >
> >
> > russ cox used it to do the (now defunct) code
> > search in google.
>
>