Rules for Mail Filtering
Posted: Mon 28 Jul 2014 15:30
In a previous topic, http://british-caving.org.uk/phpBB3/vie ... =31&t=1254, I noted some useful filter rules, including
- Body contains Content-Type: application/x-zip-compressed "or" Body contains Content-Type: application/zip "or" Body matches regex name=".{1,20}\.zip to get rid of ZIP attachments (which often contain nasty payloads). See note 1 and the next posting
- Any Header Matches regex \nX-mailer[^\n]+?(sourceforge|[a-z]{4,6}[^ \n][0-9]{2}|[A-Z][a-z]* v[0-9].[0-9])\n to get rid of suspect mailer programs
- Any header matches regex (?s:(some-string.*?,.*?){3,}) to match any message with three or more occurrences of the same (specified) string (see note 2)
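As a rough illustration of the first rule above, here is a minimal Python sketch of the three "or" conditions for spotting ZIP attachments. The function name looks_like_zip and the sample bodies are made up for the example, and Python's re module stands in for whatever regex engine the mail filter actually uses, so details such as case sensitivity may differ.

```python
import re

# The two literal strings and the regex are taken from the rule above.
ZIP_CONTENT_TYPES = (
    "Content-Type: application/x-zip-compressed",
    "Content-Type: application/zip",
)
ZIP_NAME = re.compile(r'name=".{1,20}\.zip')


def looks_like_zip(body):
    """Return True if any of the three conditions matches (hypothetical helper)."""
    if any(marker in body for marker in ZIP_CONTENT_TYPES):
        return True
    return ZIP_NAME.search(body) is not None


# Made-up sample bodies, for illustration only.
zip_by_type = "Content-Type: application/zip\n\n(encoded data)"
zip_by_name = 'Content-Type: application/octet-stream; name="invoice.zip"\n\n(encoded data)'
plain_text = "Content-Type: text/plain\n\nSee you at the meeting."

for sample in (zip_by_type, zip_by_name, plain_text):
    print(looks_like_zip(sample))
```

Note that the name= regex only allows up to 20 characters before ".zip", so an attachment with a longer file name would have to be caught by one of the Content-Type conditions instead.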
Another useful regular expression: I wanted to discard all messages to the BCRA Trustees list where the sender was not in a .uk, .com or .net domain. (Non-members of the list are rejected anyway, but I wanted to reduce my admin burden.) Multiple negatives are difficult to express in regular expressions. In this case, what is needed is a "lookbehind" construction, but this is not always supported. Rather than mess around with an expression that I didn't know would be valid, I used a "lookahead" construction instead, viz: "List of non-member addresses whose postings will be automatically rejected": ^.*(?!(\.uk|com|net)$)...$. The lookahead can only succeed at the point where exactly three characters remain, so the expression matches any address whose last three characters are not ".uk", "com" or "net". This is a bit more convoluted but still pretty neat. You can (if you're familiar with the concepts) see why a lookbehind would be neater.
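To make the comparison concrete, here is a small Python sketch that runs the lookahead expression against a few made-up addresses, alongside one possible lookbehind version. The lookbehind pattern, the helper name is_rejected and the example.* addresses are my own assumptions for illustration, and Python's re module may not behave identically to the list software's regex engine.

```python
import re

# Pattern quoted in the post: matches when the last three characters
# are anything other than ".uk", "com" or "net".
LOOKAHEAD = re.compile(r"^.*(?!(\.uk|com|net)$)...$")

# Assumed lookbehind equivalent (not from the post). Python accepts it
# because every alternative is exactly three characters wide.
LOOKBEHIND = re.compile(r"^.*(?<!\.uk|com|net)$")


def is_rejected(address, pattern):
    """Return True if the pattern matches, i.e. the posting would be rejected."""
    return pattern.match(address) is not None


# Made-up addresses, for illustration only.
addresses = [
    "trustee@example.co.uk",
    "member@example.com",
    "member@example.net",
    "someone@example.org",
    "spammer@example.ru",
]

for addr in addresses:
    print(addr, is_rejected(addr, LOOKAHEAD), is_rejected(addr, LOOKBEHIND))
```

Both versions test only the final three characters, and only the ".uk" alternative includes the dot, so neither checks for a full ".com" or ".net" suffix. The lookbehind form is shorter because it can simply sit at the end of the string and look back, with no need for the trailing "..." to position the test.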
Notes
- To accept ZIPs, ask your sender to include an authorisation string in the subject line, and use a rule like Subject Begins Authorised <password> Stop Processing
- I wanted a rule that discarded any message where the To: address list contained three or more occurrences of the same (specified) string (e.g. for spammers using lists of similar addresses). My problem was that the To: list may be split over several lines, and a 'dot' matches any character other than a new-line. You cannot alter the 'mode' of the regexp, because that is fixed by the software that processes the user-specified regexp, but you can switch modes on and off within sub-expressions, so you just need to encapsulate your regexp in (?s:<your regexp>). So, for example, to discard any email where the string "d.gibson" appears three or more times, you would write (?s:(d\.gibson.*?,.*?){3,})
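The effect of the (?s:...) wrapper in the last note can be sketched in Python, whose re module supports the same scoped-flag syntax from version 3.6 onwards. The header text, the helper name and the comparison pattern without the wrapper are made up for illustration; the filter software's own regex engine may differ in detail.

```python
import re

# Pattern from the note: "dot matches new-line" is switched on only
# inside the (?s:...) group.
SCOPED = re.compile(r"(?s:(d\.gibson.*?,.*?){3,})")

# The same expression without the wrapper, for comparison (assumption:
# this is what the rule would look like before the fix).
UNSCOPED = re.compile(r"(d\.gibson.*?,.*?){3,}")

# Made-up To: header, folded over two lines as mail headers often are.
folded_header = (
    "To: d.gibson1@example.com, d.gibson2@example.com,\n"
    " d.gibson3@example.com, other@example.com"
)

# Only two occurrences of the string, so the rule should not fire.
two_only = "To: d.gibson1@example.com, d.gibson2@example.com,\n other@example.com"


def has_three_or_more(header, pattern):
    """Return True if the pattern finds three or more occurrences (hypothetical helper)."""
    return pattern.search(header) is not None


print(has_three_or_more(folded_header, SCOPED))
print(has_three_or_more(folded_header, UNSCOPED))
print(has_three_or_more(two_only, SCOPED))
```

Without the wrapper, the gap between the second and third address contains a new-line that the dot cannot cross, so the unscoped pattern fails on the folded header. One thing to be aware of: the expression expects a comma after each occurrence, so a third matching address at the very end of the list, with no comma after it, would not be counted.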