[COFF] Requesting thoughts on extended regular expressions in grep.
Ralph Corderoy
ralph at inputplus.co.uk
Sat Mar 4 20:07:17 AEST 2023
Hi Grant,
> Suppose I have the following two lines:
>
> aaa aaa
> aaa bbb
>
> Does the following RE w/ back-reference introduce a big performance
> penalty?
>
> (aaa|bbb) \1
>
> As in:
>
> % echo "aaa aaa" | egrep "(aaa|bbb) \1"
> aaa aaa
You could measure the number of CPU instructions and experiment.
$ echo xyzaaa aaaxyz >f
$ ticks() { LC_ALL=C perf stat -e instructions egrep "$@"; }
$
$ ticks '(aaa|bbb) \1' <f
xyzaaa aaaxyz
Performance counter stats for 'egrep (aaa|bbb) \1':
2790889 instructions:u
0.009146904 seconds time elapsed
0.009178000 seconds user
0.000000000 seconds sys
$
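With a one-line input, the count above is likely to include a good deal of process start-up and regexp compilation, not just matching. A rough sketch of scaling the input up so that matching carries more of the total; the line count and temporary file are made up for illustration and were not measured in this thread, and grep -E stands in for egrep:

```shell
# Hypothetical size and file name, chosen only for illustration.
big=$(mktemp)
n=10000

# Repeat the single test line n times.
awk -v n="$n" 'BEGIN { for (i = 0; i < n; i++) print "xyzaaa aaaxyz" }' >"$big"

# Every line contains "aaa aaa", so the match count should equal n.
matches=$(LC_ALL=C grep -Ec '(aaa|bbb) \1' "$big")
echo "$matches of $n lines matched"

# Measure only if perf is installed; grep's own output is discarded
# and perf's statistics arrive on stderr.
if command -v perf >/dev/null 2>&1; then
    LC_ALL=C perf stat -e instructions grep -E '(aaa|bbb) \1' "$big" >/dev/null
fi
```

Dividing the difference between two such runs by the difference in line counts gives a per-line figure that leaves the fixed costs out.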
Bear in mind that egreps differ, and even a single implementation, GNU egrep say, changes over time.
$ LC_ALL=C perf stat -e instructions egrep '(aaa|bbb) \1' f
xyzaaa aaaxyz
...
2795836 instructions:u
...
$ LC_ALL=C perf stat -e instructions perl -ne '/(aaa|bbb) \1/ and print' f
xyzaaa aaaxyz
...
2563488 instructions:u
...
$ LC_ALL=C perf stat -e instructions sed -nr '/(aaa|bbb) \1/p' f
xyzaaa aaaxyz
...
610213 instructions:u
...
$
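Because the alternatives in the question are a short fixed list, the back-reference can also be spelled out by hand, which gives a back-reference-free pattern to measure against. A minimal sketch, assuming a grep whose -E accepts \1 (GNU grep does); the sample lines and variable names are hypothetical:

```shell
# Sample input, made up for illustration: two lines that should match
# and one ("aaa bbb") that should not.
f=$(mktemp)
printf '%s\n' 'xyzaaa aaaxyz' 'aaa bbb' 'bbb bbb' >"$f"

with_ref='(aaa|bbb) \1'       # back-reference, as in the question
no_ref='(aaa aaa|bbb bbb)'    # same matches, alternatives written out

# Check that both patterns select the same lines before comparing cost.
LC_ALL=C grep -E "$with_ref" "$f" >"$f.ref"
LC_ALL=C grep -E "$no_ref" "$f" >"$f.noref"
cmp -s "$f.ref" "$f.noref" && echo same

# Compare instruction counts if perf is installed; otherwise skip.
if command -v perf >/dev/null 2>&1; then
    for re in "$with_ref" "$no_ref"; do
        LC_ALL=C perf stat -e instructions grep -E "$re" "$f" >/dev/null
    done
fi
```

Any difference between the two counts is specific to the grep being run, so it says little about other implementations.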
--
Cheers, Ralph.