[TUHS] Likely a one-liner in Unix
Ralph Corderoy
ralph at inputplus.co.uk
Tue Jun 11 18:05:06 AEST 2024
Hi James,
> > > "Show me the last 5 files read in a directory tree"
Given sort(1) gained -u for efficiency, I've often wondered why, in
those constrained times, it didn't have a ‘-m n’ to output only the
n ‘minimums’, i.e. as if piped into ‘sed ${n}q’. With ‘-m 5’, this would let sort track
the current fifth entry and discard input which was bigger, so avoiding
both storing many unwanted lines and finding the current line's location
within them.
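
A rough sketch of that bounded idea, as a shell function wrapping awk. The name ‘topn’ is made up for illustration, it compares whole lines as strings rather than honouring any of sort's keys or flags, and it is not how any real sort(1) works.

```sh
# topn N: print the N smallest input lines in ascending order while
# holding at most N lines in memory.  Hypothetical stand-in for the
# imagined 'sort -m N'; comparison is plain string comparison.
topn() {
    awk -v n="$1" '
    {
        # Once N lines are held, drop input not smaller than the N-th.
        if (c == n && $0 "" >= m[n] "") next
        # Insert: grow if not yet full, else overwrite the old N-th,
        # shifting larger held lines up by one.
        i = (c < n) ? ++c : n
        while (i > 1 && m[i - 1] "" > $0 "") { m[i] = m[i - 1]; i-- }
        m[i] = $0
    }
    END { for (i = 1; i <= c; i++) print m[i] }'
}
```

For example, ‘printf '%s\n' d b e a c | topn 3’ prints a, b, and c on separate lines. Most input is rejected by the single comparison against the held N-th line, which is the saving described above.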
> OK, I'll bite (NB: using GNU find):
I think the POSIX way of getting the atime would be ‘LC_TIME=C ls -lu’
and then parsing the two possible date formats. So non-POSIX find is
simpler. Also, GNU find shows me the sub-second part but ls doesn't.
Neither does GNU ‘stat -c '%X %n'’.
> find "$directory_tree" -type f -printf "%A+ %p\n" | sort -r | cut -d' ' -f2 | head -5
- I'd switch the atime format to seconds since epoch for easier
formatting given it's discarded.
- When atimes tie, sort's -r will give file Z before A so I'd add some
-k's so A comes first.
- I'd move the head to before the cut so cut processes fewer lines...
- But on so few lines, I'd just use sed to do both in one.
find "$@" -type f -printf '%A@ %p\n' |
sort -k1,1nr -k2 |
sed 's/^[^ ]* //; 5q'
Remaining issues...
If tied entries bridge the top-five border then this isn't shown.
Is the real requirement to show files with the five most recent distinct
atimes?
awk '{t += !s[$1]; s[$1] = 1} t == 6 {exit} 1'
Though this might give many lines. Instead, an ellipsis could show
a tie bridged the cut-off.
awk 't {if ($1 == l) print "..."; exit} NR == 5 {l = $1; t = 1} 1'
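
As a sketch only, the tie marker and the stripping of the atime can be folded into one awk step that replaces the final sed. The function name ‘last5’ is made up here. It assumes its input is ‘atime path’ lines already sorted newest first, and it compares atimes as strings so that timestamps differing only in their last sub-second digits are not rounded together.

```sh
# last5: read "atime path" lines sorted newest first; print the first
# five paths, then "..." if the sixth line shares the fifth's atime.
last5() {
    awk '
    NR == 6 { if ($1 "" == t) print "..."; exit }
    {
        t = $1 ""               # remember this atime as a string
        sub(/^[^ ]* /, "")      # strip the atime, keep the whole path
        print
    }'
}
```

It would sit at the end of the pipeline, e.g. ‘find "$@" -type f -printf '%A@ %p\n' | sort -k1,1nr -k2 | last5’, and is still line-based, so paths containing linefeeds remain a problem.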
Paths can contain linefeeds, and some versions of these tools accept
NUL-terminated records, which can be tediously employed instead.
find "$@" -type f -printf '%A@ %p\0' |
sort -z -k1,1nr -k2 |
sed -z 's/[^ ]* //; 5q' |
tr \\0 \\n
David Wheeler has a nice article he maintains on unusual characters in
filenames: how to cope, and what other systems do, e.g. Plan 9.
Fixing Unix/Linux/POSIX filenames: control characters (such as
newline), leading dashes, and other problems
David A. Wheeler, 2023-08-22 (originally 2009-03-24)
https://dwheeler.com/essays/fixing-unix-linux-filenames.html
As he writes, Linux already returns EINVAL for some paths on some
filesystem types. A mount option which had a syscall return an error on
meeting an insensible path would be useful. It avoids any attempt at
escapement and its greater risk of implementation errors. I could
always re-mount some old volume without the option to list the directory
and fix up its entries. The second-best day to plant a tree is today.
--
Cheers, Ralph.