[TUHS] Origins of globbing

John Cowan cowan at ccil.org
Wed Oct 7 01:17:58 AEST 2020


On Tue, Oct 6, 2020 at 5:54 AM Tyler Adams <coppero1237 at gmail.com> wrote:

> How did globbing come about in unix?

It's been present at least since the PDP-11 migration. The Thompson shell,
used in the 1st through 6th Editions, used a separate program called
/etc/glob to do the dirty work, presumably in order to keep /bin/sh as
small as possible. Unfortunately, glob never got its own man page, so its
protocol for communicating with the shell is lost, unless someone remembers
it and writes it down (hint, hint).

> Related, as regexes were already well known because of qed/ed, why wasn't a
> subset of regular expressions used instead?

The use of * and ?, along with file extensions preceded by a dot (as in
".c" and ".o"), is, or so it seems to me, an inheritance from the DEC
operating systems, starting with Monitor (later called TOPS-10) in 1964 and
going right through OpenVMS.  In the file systems used by those OSes, the
"filename" (typically up to 6 characters) and the "extension" (typically up
to 3 characters) were stored separately both on disk and in memory, and the
separating dot was parsed by user programs before invoking the appropriate
kernel routine.  (That is why it is still true in Windows that "foo" and
"foo." refer to the same file.)

Because dot was not in any way magic to the Unix file system, and because
file names were limited to 14 characters, extensions were kept short.
However, the path that leads from DEC OSes to CP/M to MS-DOS to Windows has
kept the 3-letter extension alive, and we now see plenty of it in
Unix-style OSes.  Thus using dot to mean "any character" would seriously
collide with this well-established usage as the extension separator.
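
To make the collision concrete, here is a minimal Python sketch (an
illustration only, not anything taken from the historical shells).  It uses
the standard fnmatch and re modules; the file names are hypothetical and
chosen just to show the difference between a glob and a regex in which the
dot is left unescaped.

```python
import fnmatch
import re

# Hypothetical file names, chosen only for illustration.
names = ["main.c", "main.o", "mainxc", "abc"]

# Glob: the dot is an ordinary character, so "*.c" means "ends in .c".
glob_hits = [n for n in names if fnmatch.fnmatchcase(n, "*.c")]

# Regex: an unescaped dot matches any character, so ".*.c" also
# accepts names that have no literal dot before the final "c".
naive_regex_hits = [n for n in names if re.fullmatch(r".*.c", n)]

# The regex equivalent of the glob needs the dot escaped.
escaped_regex_hits = [n for n in names if re.fullmatch(r".*\.c", n)]

print(glob_hits)
print(naive_regex_hits)
print(escaped_regex_hits)
```

With glob syntax the common case ("all the C files") needs no escaping at
all, whereas the regex form has to escape every extension separator.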

Globbing was uninterpreted by the shell-equivalent in the DEC OSes, and was
understood only by a few programs, those responsible for listing
directories and copying, renaming, and deleting files.  Universal globbing
in the shell was AFAIK original with Unix, though Prime Computer's PRIMOS
also had it and may have been earlier by a year or two.  "It steam-engines
when it comes steam-engine time."  Both were direct descendants of Multics;
I have not been able to find out anything about

TIL that GNU find(1) supplements the standard -name option (which globs
against the filename) with -regex (which matches the regex against the
whole path).
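
The difference between the two options can be sketched in Python.  This is
only a model of the semantics described above, not find's implementation:
the helper names name_matches and regex_matches are made up for this
example, the paths are hypothetical, and Python's regex syntax is not the
same as the syntax GNU find uses by default.

```python
import fnmatch
import os
import re

def name_matches(path, pattern):
    """Model of -name: glob against the last path component only."""
    return fnmatch.fnmatchcase(os.path.basename(path), pattern)

def regex_matches(path, pattern):
    """Model of -regex: the regex must match the whole path."""
    return re.fullmatch(pattern, path) is not None

# Hypothetical paths, as find might print them when started from ".".
paths = ["./src/main.c", "./src/util.h", "./doc/main.c.txt"]

by_name = [p for p in paths if name_matches(p, "*.c")]
by_regex = [p for p in paths if regex_matches(p, r".*\.c")]
# A regex that describes only the file name matches nothing, because
# it has to account for the leading directories as well.
by_short_regex = [p for p in paths if regex_matches(p, r"main\.c")]

print(by_name)
print(by_regex)
print(by_short_regex)
```

The rough command-line equivalents would be find . -name '*.c' and
find . -regex '.*\.c'.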



John Cowan          http://vrici.lojban.org/~cowan        cowan at ccil.org
The Imperials are decadent, 300 pound free-range chickens (except they have
teeth, arms instead of wings, and dinosaurlike tails).  --Elyse Grasso