Custom SpamAssassin scores and rules [message #114168]
Fri, 13 June 2014 04:33
MarkK
Messages: 342 Registered: April 2007
Just sharing this with everyone. As I have said in my past posts, I am an advocate for setting up custom scores in SpamAssassin. Some of the default values, such as 0.001, are more of an 'informative score' than a score that actually builds up a spam score.
This will be a little bit long, but I have tried to make it easy to understand after reading it a couple of times. You can probably do some of the same things with the built-in Custom Spam rules in Connect Admin, but it may take more individual rules to accomplish the same thing.
Tired of still seeing spam make it through the filter with maybe only a hit on the RDNS (reverse DNS lookup) test, I decided it was time to learn how to write some scoring rules of my own. Basic rules are not very hard. That is as far as I have gotten; I need to learn some of the Perl scripting language's structure in order to get more advanced.
So I thought I would share the basic rule writing here. These are additional lines that you would put into your custom .cf file, such as zlocal.cf. DON'T change the .cf files that came with Kerio / SpamAssassin; those will be overwritten the next time you upgrade. It would also be a good idea to keep a duplicate copy of your custom .cf file, even if it is in your disk/tape backup.
You can find this info and a little more at http://wiki.apache.org/spamassassin/WritingRules
To write one basic rule, you will need to add at least 2 lines to your file, and a 3rd line if you want to add an informative description of that particular rule.
body {RuleName} {Rule}
OR
header {RuleName} {HeaderName} =~ {Rule}
score {RuleName} {Score}
describe {RuleName} {RuleDescription}
Figure out a name for your local rule. To avoid naming your rule the same as one of the existing SA rules, start it with LOCAL_; for example LOCAL_DOCOZ.
Figure out if you will be looking at the Body or at one of the headers, such as Subject. Determine what score you want to assess if the rule is true.
In this example, I will add a score of 2.0 for emails that have reference to DocOz in the body.
body LOCAL_DOCOZ /docoz/i
score LOCAL_DOCOZ 2.0
describe LOCAL_DOCOZ Found DocOz in email body
Line 1 - Looking in the email "body", the rule name is "LOCAL_DOCOZ", and we are searching for the text "docoz" (the characters between the forward slashes / /) anywhere in the body. The trailing "i" makes the match case insensitive, so it will match 'docoz' or 'DocOz'. This will also match DocOzzie or DocOzborn, since I have not specified any word breaks.
Line 2 - If this rule LOCAL_DOCOZ finds a hit, it will add 2.0 to the spam score. You can set any number you want. Personally, I try not to go overboard with the scoring numbers.
Line 3 - The describe statement contains the text which will be placed into the verbose report, if verbose reports are used. I don't believe they are used in Kerio. You can omit this line if you want.
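A quick way to see what a pattern like this will and will not catch is to try it outside of SpamAssassin first. The sketch below uses Python's re module as a stand-in for the Perl regex engine that SpamAssassin uses. The two behave the same for a simple pattern like this one, but treat that as an assumption for anything fancier. The function name body_hit and the sample lines are made up for illustration.

```python
import re

# Stand-in for:  body LOCAL_DOCOZ /docoz/i
# re.IGNORECASE plays the role of the trailing "i" flag.
LOCAL_DOCOZ = re.compile(r"docoz", re.IGNORECASE)

def body_hit(text):
    """Return True if the pattern is found anywhere in the text."""
    return LOCAL_DOCOZ.search(text) is not None

# Made-up sample lines, not taken from real mail.
samples = [
    "See what DocOz says about this",
    "docoz approved",
    "Regards, DocOzzie",      # no word breaks in the pattern, so this hits too
    "Nothing to see here",
]

for line in samples:
    print(body_hit(line), "-", line)
```

Only the last sample comes back False; the DocOzzie line is the kind of unintended hit that word breaks are meant to prevent.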
That is a simple functional rule. But DocOz can be referenced in a number of ways in spams, such as 'Doc Oz' or 'Dr Oz'. This can be handled easily by adding ( ) and | characters.
body LOCAL_DOCOZ /(DocOz|Doc Oz|DrOz|Dr Oz|Dr\. Oz)/i
Everything between the ( ) characters is a list of alternatives, separated by the | character. The period in "Dr. Oz" is written as \. because a bare . matches any single character. So now we will have a hit if the body contains "DocOz", "Doc Oz", "DrOz", "Dr Oz", or "Dr. Oz"; all case insensitive. But we will still match names such as "Dr Ozzie" or "Dr Ozborn".
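To see which alternative in the list is actually doing the matching, the same kind of check can report the matched text. This is again a Python sketch standing in for SpamAssassin's Perl regex, with made-up sample lines and a hypothetical helper name, which_alternative.

```python
import re

# Stand-in for the alternation rule, with the period escaped (\.) so that
# it matches only a literal "." and not any character.
pattern = re.compile(r"(DocOz|Doc Oz|DrOz|Dr Oz|Dr\. Oz)", re.IGNORECASE)

def which_alternative(text):
    """Return the piece of text that matched, or None if nothing matched."""
    m = pattern.search(text)
    return m.group(1) if m else None

print(which_alternative("A tip from dr oz"))       # dr oz
print(which_alternative("Ask Dr. Oz today"))       # Dr. Oz
print(which_alternative("Call Dr Ozborn"))         # Dr Oz  (still a hit)
print(which_alternative("Dr Smith will see you"))  # None
```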
You can add \b, which matches at a word break (the position between a letter, digit, or underscore and any other character, or the start or end of the text), so that the match is more exact.
body LOCAL_DOCOZ /\b(DocOz|Doc Oz|DrOz|Dr Oz|Dr\. Oz)\b/i
Now each variation has to stand on its own as a whole word or phrase:
"DocOz"
"Doc Oz"
"DrOz"
"Dr Oz"
"Dr. Oz"
None of these will match "Dr Ozborn" any longer, since the character after the "z" is a letter and so there is no word break at that point.
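The effect of \b is easiest to see by trying a few strings. The sketch below is a Python stand-in for the rule above (Python's \b works the same way as Perl's for plain ASCII text like this); the sample strings and the helper name hit are made up.

```python
import re

# Stand-in for the \b version of the rule.
pattern = re.compile(r"\b(DocOz|Doc Oz|DrOz|Dr Oz|Dr\. Oz)\b", re.IGNORECASE)

def hit(text):
    """Return True if the rule would find a match in the text."""
    return pattern.search(text) is not None

print(hit("Try the Dr Oz diet"))        # True:  a space follows "Oz"
print(hit("Dr. Oz, on losing weight"))  # True:  a comma is a word break too
print(hit("DocOz"))                     # True:  start and end of text count
print(hit("Call Dr Ozborn"))            # False: "b" follows the "z"
print(hit("MyDocOz"))                   # False: "y" comes right before "D"
```

Note that \b does not use up a character. It only checks the position, so punctuation or the end of a line works just as well as a space.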
To match a header:
header LOCAL_DOCOZ_SUBJECT Subject =~ /\b(DocOz|Doc Oz|DrOz|Dr Oz|Dr\. Oz)\b/i
score LOCAL_DOCOZ_SUBJECT 2.0
Here we are looking at the Subject header (don't forget the =~ characters), searching for the same DocOz variations. If found, 2.0 is added to the score.
You can examine any of the headers. You can find a header's name by looking at the email's headers or by viewing the message's source.
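A header rule only looks at the one header you name, not at the body or the other headers. The Python sketch below imitates that with the standard email module. It is only an approximation of what SpamAssassin does (for instance, it ignores any header decoding), and the message, the addresses, and the helper name header_hit are made up.

```python
import re
from email import message_from_string

# Stand-in for the Subject rule above.
pattern = re.compile(r"\b(DocOz|Doc Oz|DrOz|Dr Oz|Dr\. Oz)\b", re.IGNORECASE)

# A made-up message; the addresses use the reserved example.com domain.
raw = (
    "From: Someone <someone@example.com>\n"
    "To: user@example.com\n"
    "Subject: What Dr Oz does not want you to know\n"
    "\n"
    "Body text that never mentions him.\n"
)

def header_hit(raw_message, header_name):
    """Check only the named header, the way a header rule does."""
    msg = message_from_string(raw_message)
    value = msg.get(header_name, "")
    return pattern.search(value) is not None

print(header_hit(raw, "Subject"))  # True
print(header_hit(raw, "From"))     # False
```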
This is just a quick basic lesson on writing your own rules. Once you have saved the .cf file that you put your rules in, you will need to restart Kerio Connect in order for them to start being used.
May all your spam rules find their mark and kill the nasty little beasts.
Re: Custom SpamAssassin scores and rules [message #114254 is a reply to message #114168]
Mon, 16 June 2014 18:30
MarkK
For those that might be wondering, after a couple of days of looking at the spams that were passing through the filters, here are the custom rules that I have written so far. They have caught a majority of the spams slipping through.
Of course, you have to be careful about what words or phrases you pick out so that you are not blocking what might be good emails for your particular line of industry. I would suggest finding out who gets the most spam, getting their permission, and having copies of their emails forwarded into a private public folder (by private, I mean access restricted to just those who need it). Then look at what spam there is and what you can pick out that identifies spam specifically and not good emails as well. Then write a new rule or add the text to an existing rule.
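Once you have a pile of collected mail, a candidate phrase can be checked against both the good mail and the spam before it goes into a rule. The sketch below shows the idea in Python. The sample strings stand in for real messages, and false_positive_report is a made-up helper name, not part of SpamAssassin or Kerio.

```python
import re

def false_positive_report(pattern, good_samples, spam_samples):
    """Return (good mail wrongly hit, spam missed) for a candidate pattern."""
    rx = re.compile(pattern, re.IGNORECASE)
    wrongly_hit = [s for s in good_samples if rx.search(s)]
    missed = [s for s in spam_samples if not rx.search(s)]
    return wrongly_hit, missed

# Made-up sample text standing in for collected mail.
good = [
    "Your auto warranty claim #1234 has been processed",
    "Meeting moved to 3pm",
]
spam = [
    "Extend your auto warranty today!!!",
    "Lose weight rapidly, no workout needed",
]

wrongly_hit, missed = false_positive_report(r"auto warranty", good, spam)
print("good mail that would be hit:", wrongly_hit)
print("spam that would be missed:", missed)
```

In this made-up case the phrase also hits a legitimate claim notice, which is the sort of thing that suggests a lower score or a more specific phrase.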
Some of these rules probably could be combined and named a little more generically. This is just the result of learning how to write my own basic rules kind of one at a time at first, then adding more hits to a rule I already wrote.
These lines would go into your custom .cf rules file. I wouldn't use local.cf, since Kerio will overwrite it at your next version upgrade. I suggest creating your own local file, such as zlocal.cf, since .cf files are processed in alphabetical file name order. Starting your file name with a "z" will make it the last file processed and should allow it to survive a Kerio version update.
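Why being processed last matters can be sketched like this, assuming that a later score line for a rule replaces an earlier one. The file names and score values below are only illustrative, and Python's sorted() is used as a rough stand-in for the order in which the files are read.

```python
# Illustrative file names and score values only; a real rules folder will differ.
files = {
    "zlocal.cf": {"RDNS_NONE": 1.5},
    "50_scores.cf": {"RDNS_NONE": 0.8},
    "local.cf": {},
}

def effective_scores(files):
    """Apply score settings file by file, in sorted file name order."""
    scores = {}
    for name in sorted(files):
        scores.update(files[name])   # a later file overrides an earlier one
    return scores

print(sorted(files))            # ['50_scores.cf', 'local.cf', 'zlocal.cf']
print(effective_scores(files))  # {'RDNS_NONE': 1.5}
```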
===
body lcl_BODY_05 /(facelift|face lift)/i
score lcl_BODY_05 0.5
body lcl_BODY_30 /(Public Records Now Available|Hollywood celebrity|Celeb Gossip|auto warranty|Pet stores are so -overpriced|monthly AutoInsurance payments were just-reduced|Dealers have teamed up and are offering a special Clearance|Score-Report Team|Score Report Team|Your Pre-Approved Card|weight gain|no workout needed|lose weight rapidly)/i
score lcl_BODY_30 3.0
header lcl_SUBJECT_30 Subject =~ /(Belly Bulge|Restore Vision loss|Look Years-Younger|Someone has run a background|background-check on you|background check on you)/i
score lcl_SUBJECT_30 3.0
body lcl_BODY_35 /(Ondemand Research|\bICANCiANCg==\b)/i
score lcl_BODY_35 3.5
header lcl_SUBJECT_35 Subject =~ /(slim-fast|Home Depot Replacement Windows|Cigars|Satellite Internet|High Speed Internet)/i
score lcl_SUBJECT_35 3.5
header lcl_FROM_35 From =~ /spammer/i
score lcl_FROM_35 3.5
body lcl_BODY_40 /(losing( |-)pound|Transform Your Body|melt fat|melt away fat|melting fat|need cash fast|free profit|Profit Maker|easy trick to save you|To YOUR Success|per day part-time|weird food|never eat this food|DocOz|Doc Oz|DrOz|Dr Oz|Dr\. Oz|Dr-Oz)/i
score lcl_BODY_40 4.0
header lcl_SUBJECT_40 Subject =~ /(DocOz|Doc Oz|DrOz|Dr Oz|Dr\. Oz|Dr-Oz)/i
score lcl_SUBJECT_40 4.0
body lcl_BODY_45 /dirty little/i
score lcl_BODY_45 4.5
body lcl_BODY_50 /(ashleymadison|ashley madison|eHarmony|Fountain of Youth)/i
score lcl_BODY_50 5.0
header lcl_FROM_50 From =~ /(ashleymadison|ashley madison|eHarmony)/i
score lcl_FROM_50 5.0
[Updated on: Tue, 17 June 2014 18:56]
Re: Custom SpamAssassin scores and rules [message #114366 is a reply to message #114288]
Fri, 20 June 2014 20:29
MarkK
Just an FYI update for anyone who might be following this.
With the above rules, and a couple of additional ones that I have added since, we are pretty much free of real spam (the spam that is just pure junk) being delivered to users' inboxes.
Some of my users need to realize the difference between real spam, and the unsolicited emails that you can actually unsubscribe from. I consider the latter to be more of annoyance spam.
I know that this status can change at any moment, and will require some rule changes, but it feels good at the moment to have it under control.
Re: Custom SpamAssassin scores and rules [message #116280 is a reply to message #114168]
Wed, 24 September 2014 13:38
yukiomishima
Messages: 116 Registered: July 2006
howdy
we seem to be getting a fair amount of spam creeping thru again (it seems to come in waves throughout the day)
i am hoping to get my own rules written for spamAssassin per your fab tutorial
is there any chance you can:
- upload your .cf file so that i can use / modify it to my needs
- let me know where i need to put it on an OSX install
huge thanks for your brilliant work on the tutorial
yukioMishima
Re: Custom SpamAssassin scores and rules [message #116292 is a reply to message #114168]
Wed, 24 September 2014 18:17
MarkK
Here is a copy of pretty much what I am using. BUT, this is geared towards both my industry and the spam that we receive. So your industry and the spam you receive will probably require you to tweak things.
I do not use OSX, so I don't know the exact folder where it should go on your server, but all you need to do is find the other .cf files and put your modified version of zMyRules.cf with them, then restart Connect for it to start using the new rules.
Remember, the rule files are processed in alphanumeric order, which is why the file name starts with a "z": it lands at the end of the processing list, and the file is not overwritten during the next Connect update. You should probably keep a copy of the file elsewhere, just in case the Connect update deletes the old rules folder and creates a fresh one, thereby wiping out your custom rule file. I haven't had that happen, but better safe than sorry.
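Keeping that spare copy can be automated. Below is a minimal Python sketch that copies a rule file into a backup folder with the date in the name. The helper name backup_rules is made up, and the demonstration runs in a throwaway temporary folder because the real rules folder differs per install.

```python
import shutil
import tempfile
from datetime import date
from pathlib import Path

def backup_rules(rules_file, backup_dir):
    """Copy the rule file into backup_dir with today's date in the name."""
    rules_file = Path(rules_file)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = f"{date.today():%Y%m%d}"
    target = backup_dir / f"{rules_file.stem}-{stamp}{rules_file.suffix}"
    shutil.copy2(rules_file, target)  # copy2 also keeps the file timestamps
    return target

# Demonstration in a throwaway folder; point these at your real paths instead.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "zMyRules.cf"
    src.write_text("body LOCAL_DOCOZ /docoz/i\nscore LOCAL_DOCOZ 2.0\n")
    copy = backup_rules(src, Path(tmp) / "backup")
    print(copy.name)
    print(copy.read_text() == src.read_text())
```

Running it before each Connect update would leave a dated copy to restore from if the rules folder gets recreated.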
Attachment: zMyRules.cf (Size: 5.36KB)
Re: Custom SpamAssassin scores and rules [message #116294 is a reply to message #116292]
Wed, 24 September 2014 18:30
yukiomishima
MarkK
HUGE thanks for the file
i believe i have found the location for the .cf file on an OSX install... so i will take a look at what you have and modify as per the spam we are currently receiving (we seem to be getting a lot with the words MILK and CANCER recently)
i will post back with results once i have optimised the file to our needs
might be good for others that have created custom .cf files to share?
just when i think i am on top of the spam... a new wave seems to hit
huge thanks again for all
yukioMishima
Re: Custom SpamAssassin scores and rules [message #116296 is a reply to message #116294]
Wed, 24 September 2014 18:43
MarkK
The MILK-CANCER one is in there. I haven't added anything for the spams that we have gotten on VEGETABLES-CANCER, or my new favorite, WATER-CANCER. I didn't read the water one to see what it was suggesting as a replacement.
Re: Custom SpamAssassin scores and rules [message #116299 is a reply to message #116298]
Wed, 24 September 2014 19:53
yukiomishima
btw.... which blacklists are you using... and what scores do you have associated with them..... i have a fair few... which used to be quite good with tagging the spam... but... they seem less successful now
we had to disable greylisting for a few days... it has been reenabled... but.. since it was off/on... there has been a noticeable increase in spam
also
we seem to be getting spam addressed to employees that have left the firm years ago (and weren't getting spam til recently)... weird
thanks
yukioMishima
Re: Custom SpamAssassin scores and rules [message #116300 is a reply to message #114168]
Wed, 24 September 2014 20:08
MarkK
I tag at 5, block at 8.
Lashback was giving me too many false positives. URIBL seems to hit its limit of providing feedback pretty quickly, so quite often it ends up always getting triggered in the ratings.
Basically, you end up doing what some people pay companies to do; monitor the spams and make adjustments to the filters as the characteristics change. Once you get it functioning pretty good, you will get a few weeks off, but then the spams change and you start tweaking again for the new stuff. Just won't take as long this time.
What I do is see what the scores are on un-tagged spams, find a characteristic that identifies that spam (such as some word(s) in the Subject and/or body), and add that to the lowest-scoring rule that will boost it up over a score of 5.
Though I have some instant-5 rules in there, I try to start out low so that any valid emails that may have that trait will hopefully come in without being mis-tagged.
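The arithmetic behind picking "the lowest-scoring rule that gets it over 5" can be sketched in a few lines of Python. The thresholds are the tag-at-5 and block-at-8 settings mentioned above, and the custom rule scores come from the list earlier in the thread. The 1.5 for the built-in RDNS_NONE rule is a made-up value, and treating the thresholds as "greater than or equal" is an assumption.

```python
TAG_AT = 5.0    # tag as spam at this score
BLOCK_AT = 8.0  # block outright at this score

def classify(hits, scores):
    """Add up the scores of the rules that hit and compare to the thresholds."""
    total = sum(scores[name] for name in hits)
    if total >= BLOCK_AT:
        return total, "block"
    if total >= TAG_AT:
        return total, "tag"
    return total, "deliver"

# RDNS_NONE's value is made up; the lcl_ scores are from the posted rules.
scores = {"RDNS_NONE": 1.5, "lcl_BODY_30": 3.0, "lcl_SUBJECT_40": 4.0}

print(classify(["RDNS_NONE"], scores))                    # (1.5, 'deliver')
print(classify(["RDNS_NONE", "lcl_SUBJECT_40"], scores))  # (5.5, 'tag')
print(classify(["RDNS_NONE", "lcl_BODY_30", "lcl_SUBJECT_40"], scores))  # (8.5, 'block')
```

A spam that arrives with only the RDNS hit needs roughly 3.5 more points to get tagged, so adding its telltale phrase to a 3.5 or 4.0 rule is enough, without jumping straight to an instant 5.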
Re: Custom SpamAssassin scores and rules [message #116466 is a reply to message #114168]
Thu, 02 October 2014 14:42
yukiomishima
markK
apologies for the delay in getting back to you.. i have been away from my desk for the past few days... putting out some IT fires
i have indeed implemented... and it appears to be working wonders
seems like there is a constant barrage of new spam getting thru though... so it seems i will be adding to the file as time goes by
how often are you adding / modifying the file... or do you feel that where you have it now does a pretty good job for you
HUGE thanks again for all... your help & sharing has made a HUGE difference to the effectiveness of our fight against spam
yukioMishima