[TUHS] regex early discussions

Mon Mar 4 18:43:48 AEST 2024

On Mon, 4 Mar 2024, 08:27 Rob Pike, <robpike at gmail.com> wrote [to Larry]

Oh happy days. Hi Rob, loved the book.

> If that's really true, that you learned from Spencer's library, then you
> didn't learn the most important thing about them, which is the automata
> theory that guarantees their performance is always linear. Not to take
> anything away from Henry, who admitted at the time that it could be slow
> for bad expressions, but we're still paying the price for refusing to
> connect "regex" with the theory that created them, ignoring it in fact.
>

I once got into a bunfight with a Googler on the topic of coding interview
questions, on a related matter. He was promulgating a regular expression to
correctly match/parse-out legitimate dotted-quad IPv4 addresses, including
bounds-checking the octets to be in the range 0..255, and arguing that,
since it was going to be run through a DFA, it was a sunk cost for
efficiency and therefore perfect.

The result looked like line noise, and he was perturbed that I said I would
prefer to take a much simpler (NFA?) RE, parse out the ints and
bounds-check them, just to reduce cognitive load and increase
maintainability of code.
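
For illustration only, here is a rough Python sketch of the two approaches.
It is not the expression from that argument; the pattern and function names
are made up, and both versions are assumed to accept leading zeros such as
"010". Note also that Python's re module is a backtracking matcher rather
than a DFA, so this shows only the readability trade-off, not the
performance one.

```python
import re

# Approach 1: one RE that matches the dotted quad AND bounds-checks
# every octet to 0..255 inside the pattern itself.
OCTET = r"(?:25[0-5]|2[0-4][0-9]|[01]?[0-9]?[0-9])"
STRICT_RE = re.compile(r"(?:" + OCTET + r"\.){3}" + OCTET)

def is_ipv4_regex_only(s):
    """Return True if s is a dotted quad, checked entirely by the RE."""
    return STRICT_RE.fullmatch(s) is not None

# Approach 2: a much simpler RE that only pulls out four runs of
# one to three digits; the range check is done on the parsed ints.
SIMPLE_RE = re.compile(
    r"([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})"
)

def is_ipv4_parse_and_check(s):
    """Return True if s is a dotted quad, with bounds checked in code."""
    m = SIMPLE_RE.fullmatch(s)
    if m is None:
        return False
    return all(0 <= int(octet) <= 255 for octet in m.groups())
```

Under those assumptions the two functions agree on inputs such as
"192.168.0.1" and "256.1.1.1"; the difference is in where the 0..255 rule
lives and how easy it is to read and change later.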

We didn't really come to an agreement.

-a