[TUHS] Command line options and complexity

Doug McIlroy doug at cs.dartmouth.edu
Wed Mar 11 04:42:53 AEST 2020


> This begs questions of stability

Astute question. I had that in my original draft, but eliminated
it for what I thought was clarity. Anyway, depending on the implementation
of sort, you may need sort -s. Of course it doesn't matter which copy
among several equal lines uniq produces, nor does it matter in sort
when there are no comparison options--they're all the same.
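
A minimal sketch of the stability point, assuming a sort that accepts
-s (GNU and BSD sort do; the option is not in POSIX). The sample data
and the variable names are invented for illustration.

```shell
# Four records: field 1 is the key, field 2 is payload that the
# comparison ignores. Input order within each key is deliberate.
input='b 2
a 1
b 1
a 2'

# -k1,1 compares only the first field. -s turns off the last-resort
# whole-line comparison, so lines with equal keys keep their input order.
stable=$(printf '%s\n' "$input" | LC_ALL=C sort -s -k1,1)
printf '%s\n' "$stable"
```

With -s the two "b" lines should come out in input order ("b 2" before
"b 1"). Without it, an implementation that breaks ties by comparing
whole lines, as GNU sort does, would likely reorder them.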

> I don't know enough about the
> internals of sed to know even what algorithm it uses 
> (... a disk-based merge sort?)

sed is not a sorting program--basically it copies input to
output, making line-by-line editing changes. That's the
way I meant to use it in sed s/nonkeys//|sort -keys|uniq.
(I have added options to sort, hopefully for clarity).
The argument to sed here means substitute the empty
string for the nonkey fields (specified by a regular expression).
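
One concrete reading of that pipeline, as a sketch only: it assumes the
key is the first blank-separated field, so the regular expression ' .*'
stands in for "nonkeys". The data and variable names are hypothetical,
and plain sort is used because no particular key options are given above.

```shell
# Each line is "key payload"; only the key should survive.
input='pear 3
apple 7
pear 9
apple 1'

# sed deletes everything from the first space onward (the nonkey part),
# sort brings equal keys together, and uniq keeps one copy of each.
keys=$(printf '%s\n' "$input" | sed 's/ .*//' | LC_ALL=C sort | uniq)
printf '%s\n' "$keys"
```

Because sed has already reduced every line to its key, whole-line
comparison in sort and uniq is the same as key comparison here.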


If "sed" was a typo for "sort", all versions of sort that
I know of use an internal sorting algorithm for big chunks
of the file, then combine the chunks by merge. But internal
sorting varies all over the map--variations on quicksort,
radix sort, merge sort, ...
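
The chunk-then-merge shape can be imitated by hand, as a rough sketch
rather than a description of what any particular sort does internally.
It assumes mktemp -d is available; the three-line chunk size and the
file names are arbitrary stand-ins for "as much as fits in memory".

```shell
tmp=$(mktemp -d)
printf '%s\n' 5 3 9 1 8 2 7 > "$tmp/input"

# Cut the input into chunks of three lines each.
split -l 3 "$tmp/input" "$tmp/chunk."

# Sort every chunk independently, writing the result back in place.
for f in "$tmp"/chunk.*; do
    sort -n -o "$f" "$f"
done

# -m merges inputs that are already sorted instead of sorting again.
merged=$(sort -n -m "$tmp"/chunk.*)
printf '%s\n' "$merged"

rm -r "$tmp"
```

The merge step only ever needs the current head line of each chunk,
which is why this approach scales to files larger than memory.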

Doug

