[COFF] Requesting thoughts on extended regular expressions in grep.

Grant Taylor via COFF coff at tuhs.org
Sat Mar 4 05:36:35 AEST 2023


On 3/3/23 9:12 AM, Dave Horsfall wrote:
> I can't help but provide an extract from my antispam log summariser 
> (AWK):
> 
>      # Yes, I have a warped sense of humour here.
>      /^[JFMAMJJASOND][aeapauuuecoc][nbrrynlgptvc] [ 0123][0-9] / \
>      {
> 	date = sprintf("%4d/%.2d/%.2d",
> 	    year, months[substr($0, 1, 3)], substr($0, 5, 2))

Thank you for sharing that Dave.

> Etc.  The idea is not to validate so much as to grab a line of interest 
> to me and extract the bits that I want.

Fair enough.

Using bracket expressions for the three letters is definitely another 
idea that I hadn't considered.

But I believe I prefer what I'm going to describe as the more precise 
alternation that lists out each month.  (Jan|Feb|Mar...

Such an alternation is not going to match Jer like the three bracket 
expressions will.  I also believe that the alternation will be easier to 
maintain in the future, especially by someone other than me who has 
less experience with REs.
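
A quick way to see the difference, as a sketch only: the sample log lines 
and the names BRACKETS, ALTERNATION, sample_lines and count_matches are 
made up for illustration, and the full twelve-month list is an assumption 
about how the alternation would be completed.  The bracket pattern is 
copied from Dave's AWK extract.

```shell
#!/bin/sh
# Hypothetical sample lines; only the leading month and day matter here.
sample_lines() {
    printf '%s\n' \
        'Mar  3 09:12:01 host sendmail[123]: example entry' \
        'Jer  3 09:12:01 host sendmail[123]: bogus month' \
        'Dot 14 09:12:01 host sendmail[123]: bogus month'
}

# The three bracket expressions from the quoted AWK pattern.
BRACKETS='^[JFMAMJJASOND][aeapauuuecoc][nbrrynlgptvc] [ 0123][0-9] '

# The alternation with every month written out (assumed completion).
ALTERNATION='^(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) [ 0123][0-9] '

# Count how many of the sample lines a given ERE matches.
count_matches() {
    sample_lines | grep -E -c -e "$1"
}

echo "bracket expressions match: $(count_matches "$BRACKETS")"
echo "alternation matches:       $(count_matches "$ALTERNATION")"
```

With these samples the bracket expressions accept all three lines, 
because each letter of "Jer" and "Dot" appears in the corresponding 
bracket, while the alternation accepts only the "Mar" line.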

> In this case I trust the source (the Sendmail log), but of course 
> that is not always the case...

I trust that syslog will produce consistent line beginnings more than I 
trust the data that is provided to syslog.  But I'd still like to be 
able to detect "Jer" or "Dot" if syslog ever tosses its cookies.
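
One possible way to do that detection, sketched under the assumption 
that every sane line starts with a month name and a space-padded day: 
invert the alternation with grep -v so that only lines with an 
unexpected prefix are printed.  VALID_PREFIX, odd_lines and the sample 
input are hypothetical names and data, not anything from a real log.

```shell
#!/bin/sh
# Expected prefix: month name, space, space-padded day, space.
VALID_PREFIX='^(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) [ 0123][0-9] '

# Print only the lines that do NOT start with the expected prefix.
odd_lines() {
    grep -E -v -e "$VALID_PREFIX"
}

# Hypothetical input: one sane line and two garbled ones.
printf '%s\n' \
    'Mar  3 09:12:01 host sendmail[123]: example entry' \
    'Jer  3 09:12:01 host sendmail[123]: bogus month' \
    'Dot 14 09:12:01 host sendmail[123]: bogus month' | odd_lines
```

The same inversion with the three bracket expressions would print 
nothing for this input, since they accept "Jer" and "Dot".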

> When doing things like this, you need to ask yourself at least the 
> following questions:
> 
> 1) What exactly am I trying to do?  This is fairly important :-)

Filter out log entries that are known to be okay.
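
As a rough sketch of that kind of filter (the daemon name "exampled", 
the "heartbeat ok" message, and the names PREFIX and filter_known_ok 
are all invented for illustration, and the timestamp and host layout is 
an assumption about the log format), each known-okay pattern is 
anchored on the month alternation and removed with grep -v:

```shell
#!/bin/sh
# Assumed line start: month, space-padded day, HH:MM:SS, host name.
PREFIX='^(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) [ 0123][0-9] [0-9:]{8} [^ ]+ '

# Drop entries known to be okay; print everything else.
# More known-okay patterns could be added as further -e arguments.
filter_known_ok() {
    grep -E -v -e "${PREFIX}exampled\[[0-9]+\]: heartbeat ok\$"
}

# Hypothetical input: an okay entry, a real problem, and a garbled prefix.
printf '%s\n' \
    'Mar  3 09:12:01 host exampled[42]: heartbeat ok' \
    'Mar  3 09:12:02 host exampled[42]: disk failure' \
    'Jer  3 09:12:03 host exampled[42]: heartbeat ok' | filter_known_ok
```

Because the okay pattern is anchored on the precise month list, the 
"Jer" line is not filtered out and still shows up alongside the disk 
failure.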

> 2) Can I trust the data?  Bobby Tables, Reflections on Trusting 
> Trust...

Given that I'm effectively negating things and filtering out log entries 
that I don't want to see (because they are okay), I'm comfortable with 
trusting the data from syslog.

Brown M&Ms come to mind.

> 3) Etc.
> 
> And let's not get started on the difference betwixt "trusted" and 
> "trustworthy" (that distinction keeps security bods awake at night).

ACK



-- 
Grant. . . .
unix || die
