[TUHS] regex early discussions
Will Senn
will.senn at gmail.com
Tue Mar 5 03:05:38 AEST 2024
To close the loop a bit...
I really appreciate the anecdotes and background. It's helpful to those
of us who didn't live it.
On the best resources front:
The Unix Programmer's Manual for v7 contains:
"A Tutorial Introduction to the UNIX Text Editor" by B. W. Kernighan -
excellent coverage of Context Searching using a limited subset of regex.
"Advanced Editing on UNIX" by B. W. Kernighan - lots of examples.
"ed(1)" by authors of the manpages - super concise but thorough coverage
of the regex rules (great followup to the tutorial).
Articles:
"Regular Expression Search Algorithm", by K. Thompson - an Algol-60
implementation of regex described in 4 pages... in 1968... I was 2 1/2.
"Regular Expression Matching Can Be Simple and Fast", by Russ Cox - how
can an article be both simple and deep? Great concision.
Other Books:
"The AWK Programming Language" by A. V. Aho, B. W. Kernighan, & P. J.
Weinberger - the discussion on pp. 28-31, Regular Expressions, is the
best I've seen.
"Chapter 9. Regular Expressions" in the XBD section of the SUS (IEEE
Std 1003.1-2017) - Comprehensive presentation of the spec (good stuff,
even if nobody perfectly implements it).
There are plenty more, but with the tutorial, ed(1), and AWK book in
hand, I think a beginner is covered.
BTW, awk is awesome (particularly with the new csv additions) - I don't
"need" the new unicode support, but it's nice. I didn't get awk at
first, but when I figured out you could do this:
    awk '/SYS.*\(write\,/, /\)/' */*
    SYSCALL_DEFINE3(write, unsigned int, fd, const char __user *, buf,
    size_t, count)
in the kernel source, I was sold. I've never really wrapped my head
around how to efficiently search over multiple lines, but awk's range
patterns... just make sense :). Even if it looks crazy, it works.
Ranges bounded by regexes... who'd have thunk it?
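For anyone who wants to try the range pattern without a kernel tree handy, here is a minimal sketch. The sample file and its contents are made up for illustration (only the two SYSCALL_DEFINE3 lines come from the output above), and the regex is written with a plain comma, since a comma isn't special in an awk regex and doesn't need the backslash.

```shell
# Hypothetical sample file standing in for a kernel source file.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
static int unrelated(void)
{
    return 0;
}
SYSCALL_DEFINE3(write, unsigned int, fd, const char __user *, buf,
size_t, count)
{
    return ksys_write(fd, buf, count);
}
EOF

# Range pattern "start, end": printing begins at a line matching the
# first regex and continues through the next line matching the second.
# With no action given, awk prints every line in the range.
result=$(awk '/SYS.*\(write,/, /\)/' "$tmp")
printf '%s\n' "$result"

rm -f "$tmp"
```

On this sample it should print just the two lines of the SYSCALL_DEFINE3 declaration. The range closes on the first line containing ")", so a declaration with a parenthesis on its first line would come back as a single line.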
Will
On 3/3/24 8:03 PM, Marc Rochkind wrote:
> Will, here's my recollection, when I got to UNIX in late 1972 or
> thereabouts:
>
> First, there was ed. grep and sed were derived from ed, so came along
> later. awk came along way later.
>
> There were only manual pages. You typed "man ed" and there it was. The
> man pages were very accurate, very clear, and very authoritative. Many
> found them too succinct, especially as UNIX got more popular, but all
> of us back in the day found them perfect. Maybe you had to read the
> man page a few times to understand it, but at least that's all you had
> to read. No need to hunt around for more documentation!
>
> (Well, there was more documentation: The source code, which was all
> online. But reading the ed source to understand regular expressions
> was impossible. It was in assembler, and Ken was generating code on
> the fly as the expression was compiled.)
>
> Also, it should be noted that ed produced a single error message: a
> question mark. No wasting of teletype paper!
>
> The motivation for learning regular expressions was that that's how
> you edited files. ed was the only game in town.
>
> (sh used a greatly restricted form of regular expressions, which were
> documented on the sh man page.)
>
> Marc Rochkind
>
> On Sun, Mar 3, 2024 at 6:31 PM Will Senn <will.senn at gmail.com> wrote:
>
> Hi All,
>
> I was wondering, what were the best early sources of information
> for regexes and why did folks need to know them to use unix? In my
> recent explorations, I have needed to have a better understanding
> of them, so I'm digging in... awk's my most recent thing and it's
> deeply associated with them, so here we are. I went to the
> bookshelf to find something appropriate and as usual, I've traced
> to primary sources to some extent. I started with Mastering
> Regular Expressions by Friedl, and I won't knock it (it's one of
> the bestsellers in our field), but it's much too long for my
> personal taste and it's not quite as systematic as I would like
> (the author himself notes that his interests are less technical
> than authors preceding him on the subject). So, back to the
> shelves... Bourne's, The Unix Environment, and Kernighan & Pike's,
> The Unix Programming Environment both talk about them in the
> context of grep, ed, sed, and awk. Going further back, the Unix
> Programmer's Manual v7 - ed, grep, sed, awk...
>
> After digging around it seems like folks needed regexes for ed,
> grep, sed and awk... and any other utility that leveraged the
> wonderful nature of these handy expressions. Fine. Where did folks
> go learn them? Was there a particularly good (succinct and
> accurate) source of information that folks kept handy? I'm
> imagining (based on what I've seen) that someone might cut out the
> ed discussion or the grep pages of the manual and tape them to
> their monitors, but maybe I'm stooopid and they didn't need no
> stinkin' memory device for regexes - surely they're intuitive
> enough that even a simpleton could pick them up after seeing a few
> examples... but if that were really the case, Friedl's book would
> have been a flop and it wasn't :). So seriously, if you remember
> that far back - what was the definitive source of your regex
> knowledge and what were the first motivators for learning them?
>
> Thanks,
>
> Will
>
>
>
> --
> /My new email address is mrochkind at gmail.com/