[TUHS] Likely a one-liner in Unix

Ralph Corderoy ralph at inputplus.co.uk
Tue Jun 11 18:05:06 AEST 2024


Hi James,

> > >    "Show me the last 5 files read in a directory tree"

Given sort(1) gained -u for efficiency, I've often wondered why, in
those constrained times, it didn't have a ‘-m n’ to output only the
n ‘minimums’, i.e. the effect of ‘sort | sed ${n}q’.  With ‘-m 5’, sort
could track the current fifth entry and discard any input line which
was bigger, so avoiding both storing many unwanted lines and finding
each new line's place among them.
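
A minimal sketch of that bounded selection, to make the idea concrete.
It is not how any real sort works: min_n is a hypothetical name, and
awk's string comparison stands in for sort's collation, so the order
may differ from sort's outside the C locale.

```shell
# min_n N: print the N smallest input lines in ascending order while
# holding at most N lines in memory.  (Hypothetical helper.)
min_n() {
    awk -v n="$1" '
        { s = $0 "" }                  # force a string comparison
        # The table is full and this line cannot displace its largest
        # entry, so drop it without storing or searching.
        k == n && s >= m[k] { next }
        {
            if (k < n) k++             # grow until n lines are held
            # Insertion step: shift larger entries up one slot.
            for (i = k; i > 1 && m[i - 1] > s; i--)
                m[i] = m[i - 1]
            m[i] = s
        }
        END { for (i = 1; i <= k; i++) print m[i] }
    '
}

printf '%s\n' d b e a c f | min_n 3
```

Once the table is full, most lines of a large input fail the first
test and cost a single comparison each.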


> OK, I'll bite (NB: using GNU find):

I think the POSIX way of getting the atime would be ‘LC_TIME=C ls -lu’
and then parsing the two possible date formats.  So non-POSIX find is
simpler.  Also, GNU find shows me the sub-second part but ls doesn't.
Neither does GNU ‘stat -c '%X %n'’.

> find "$directory_tree" -type f -printf "%A+ %p\n" | sort -r | cut -d' ' -f2 | head -5

- I'd switch the atime format to seconds since epoch for easier
  formatting given it's discarded.
- When atimes tie, sort's -r will give file Z before A so I'd add some
  -k's so A comes first.
- I'd move the head to before the cut so cut processes fewer lines...
- But on so few lines, I'd just use sed to do both in one.

    find "$@" -type f -printf '%A@ %p\n' |
    sort -k1,1nr -k2 |
    sed 's/^[^ ]* //; 5q'
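
The tie-breaking point can be checked without touching any real
atimes.  A small sketch on made-up ‘atime path’ lines; the ties
function and its data are only for illustration.

```shell
# Three made-up "atime path" lines; two share the same atime.
ties() { printf '%s\n' '100.5 ./b' '200.0 ./z' '200.0 ./a'; }

# Whole-line reverse sort: the tied pair comes out ./z then ./a.
ties | LC_ALL=C sort -r

# Numeric reverse on the atime only, then ascending on the rest:
# the tied pair comes out ./a then ./z.
ties | LC_ALL=C sort -k1,1nr -k2
```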


Remaining issues...

If tied entries bridge the top-five border then this isn't shown.
Is the real requirement to show files with the five most recent distinct
atimes?  Run between the sort and the sed, while $1 is still the atime:

    awk '!s[$1]++ && ++t > 5 {exit} 1'

Though this might give many lines.  Instead, an ellipsis could show
a tie bridged the cut-off.

    awk 't {if ($1 == l) print "..."; exit} NR == 5 {l = $1; t = 1} 1'
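
A worked run of both ideas on made-up input, as a sketch.  It assumes
the filters sit between the sort and the sed, so each line is still
‘atime path’ and the atime is field 1; sample, distinct5 and ellipsis5
are hypothetical names.

```shell
# Already-sorted sample: the fifth and sixth lines tie on atime 6.
sample() {
    printf '%s\n' '9 a' '8 b' '8 c' '7 d' '6 e' '6 f' '5 g' '4 h'
}

# Every line carrying one of the five most recent distinct atimes.
distinct5() { awk '!seen[$1]++ && ++n > 5 {exit} 1'; }

# The first five lines, then "..." if line six ties with line five.
ellipsis5() {
    awk 'NR > 5 {if ($1 == last) print "..."; exit} {last = $1} 1'
}

sample | distinct5    # seven lines, '9 a' through '5 g'
sample | ellipsis5    # five lines, '9 a' through '6 e', then "..."
```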

Paths can contain linefeeds.  Some versions of these tools can,
tediously, be told to use NUL-terminated records instead.

    find "$@" -type f -printf '%A@ %p\0' |
    sort -z -k1,1nr -k2 |
    sed -z 's/^[^ ]* //; 5q' |
    tr \\0 \\n
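
The final tr makes the output ambiguous again for any path holding
a linefeed, so when another program consumes the list it may be better
to keep the NULs.  A sketch assuming GNU find, sort, and sed, and an
xargs which understands -0; last_read0 is a hypothetical name.

```shell
# last_read0 N DIR...: the N most recently read files under DIR...,
# one NUL-terminated path per record.  (Hypothetical wrapper.)
last_read0() {
    n=$1; shift
    find "$@" -type f -printf '%A@ %p\0' |
    LC_ALL=C sort -z -k1,1nr -k2 |
    sed -z "s/^[^ ]* //; ${n}q"
}

# Example use: hand the paths on intact, linefeeds and all.
#     last_read0 5 . | xargs -0 ls -ldu --
```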


David Wheeler has a nice article he maintains on unusual characters in
filenames: how to cope, and what other systems do, e.g. Plan 9.

    Fixing Unix/Linux/POSIX filenames: control characters (such as
        newline), leading dashes, and other problems
    David A. Wheeler, 2023-08-22 (originally 2009-03-24)
    https://dwheeler.com/essays/fixing-unix-linux-filenames.html

As he writes, Linux already returns EINVAL for some paths on some
filesystem types.  A mount option which had a syscall return an error on
meeting an insensible path would be useful.  It avoids any attempt at
escapement and its greater risk of implementation errors.  I could
always re-mount some old volume without the option to list the directory
and fix up its entries.  The second-best day to plant a tree is today.

-- 
Cheers, Ralph.

