Difference between revisions of "DSPAM"

From SME Server
Revision as of 16:32, 4 January 2010


Maintainer

This contrib has been developed by Jesper Knudsen

Description

I have long used SME's built-in SpamAssassin with a few custom additions to get rid of most of my spam. Recently I noticed that the DSPAM project was alive again, and I have since heard from many sources that it does a great job for them. I did not want to get rid of SpamAssassin but wanted to combine the strengths of the two spam engines. One of the "weaknesses" of DSPAM is that it requires a significant amount of training before it provides reliable results; I use SpamAssassin scoring to provide this training.

I have therefore made this DSPAM plug-in which works in co-operation with SpamAssassin to get rid of even more spam.

This contrib consists mainly of two items:

  • A qpsmtpd plugin which handles the training of the DSPAM engine based on SpamAssassin results and which also, when training is complete, ensures that emails are classified with DSPAM for later scoring.
  • A SpamAssassin plugin which uses the DSPAM classification results to provide additional SpamAssassin scoring.

Installation

The package needs a working DSPAM installation and the sme-dspam contrib.

wget \
http://mirror.contribs.org/smeserver/contribs/swerts-knudsen/SME7/sme-dspam/sme-dspam-1.0.2-5.noarch.rpm \
http://mirror.contribs.org/smeserver/contribs/swerts-knudsen/SME7/sme-dspam/dspam-3.9.0-RC2.sme7.i386.rpm \
http://mirror.contribs.org/smeserver/contribs/swerts-knudsen/SME7/sme-dspam/libdspam-3.9.0-RC2.sme7.i386.rpm \
http://mirror.contribs.org/smeserver/contribs/swerts-knudsen/SME7/sme-dspam/libdspam-mysql-3.9.0-RC2.sme7.i386.rpm
yum localinstall \
sme-dspam-1.0.2-5.noarch.rpm \
dspam-3.9.0-RC2.sme7.i386.rpm \
libdspam-3.9.0-RC2.sme7.i386.rpm \
libdspam-mysql-3.9.0-RC2.sme7.i386.rpm

Uninstall

You can simply remove the package again with the usual yum command.

yum remove sme-dspam

Configuration

The contrib initially does DSPAM training and will continue to do so until DSPAM claims that training is complete. It monitors the output of "dspam_stats -H" to see when training has completed and will then switch to scoring/tagging mode. When training is complete the admin will receive an email notification. Until it reaches this mode you will not see any DSPAM benefits.

The training of DSPAM is based on SpamAssassin scores. By default it will train as spam if SpamAssassin rejects the email and the score is above 9. It will train as ham (in DSPAM terminology, innocent) when the mail scores lower than 5 in SpamAssassin.

These two values can be configured through the config system:

config setprop dspam hamlevel xx (default: 5)
config setprop dspam spamlevel xx (default: 9)

and then do a:

signal-event email-update
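
The threshold rule described above can be sketched as a small shell function. This is only an illustration and not the plugin's own code: the function name classify_for_training is hypothetical, the defaults mirror the hamlevel and spamlevel values given above, and the sketch ignores the additional condition that SpamAssassin must also reject the mail before it is trained as spam.

```shell
#!/bin/sh
# Hypothetical helper, not part of sme-dspam: decide how a SpamAssassin
# score would be used for DSPAM training under the rule described above.
# Usage: classify_for_training SCORE [HAMLEVEL] [SPAMLEVEL]
classify_for_training() {
    score=$1
    hamlevel=${2:-5}     # assumed default, from the hamlevel property
    spamlevel=${3:-9}    # assumed default, from the spamlevel property
    # awk is used because POSIX sh cannot compare decimal numbers.
    awk -v s="$score" -v h="$hamlevel" -v p="$spamlevel" 'BEGIN {
        if (s + 0 > p + 0)      print "spam"
        else if (s + 0 < h + 0) print "innocent"
        else                    print "none"
    }'
}

classify_for_training 32.3    # above spamlevel
classify_for_training -2.3    # below hamlevel
classify_for_training 7.0     # between the two levels: not used for training
```

Scores that fall between the two levels are reported as "none" here, on the assumption that such mail is simply not used for training.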

Statistics

You can follow how DSPAM is doing with the dspam_stats command. Below is an example where I started the tagging process before training was complete. Here you can see that 4 emails were reported as False Negatives, meaning DSPAM claimed they were ham and SpamAssassin scored them as spam (above spamlevel).

[root@mx]# dspam_stats -H

qpsmtpd:
               TP True Positives:                    71
               TN True Negatives:                    66
               FP False Positives:                    0
               FN False Negatives:                    4
               SC Spam Corpusfed:                  5890
               NC Nonspam Corpusfed:                872
               TL Training Left:                   1562
               SHR Spam Hit Rate                 94.67%
               HSR Ham Strike Rate:               0.00%
               PPV Positive predictive value:   100.00%
               OCA Overall Accuracy:             97.16%
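
The percentages in this output are consistent with the usual definitions of these rates, which can be checked by recomputing them from the four counts. The sketch below is a hypothetical helper (dspam_rates is not a DSPAM command), and the formulas are inferred from the figures shown rather than taken from the DSPAM source.

```shell
#!/bin/sh
# Hypothetical helper: recompute the rates from the TP/TN/FP/FN counts
# found in "dspam_stats -H" style output read from standard input.
dspam_rates() {
    awk '
        $1 == "TP" { tp = $NF }
        $1 == "TN" { tn = $NF }
        $1 == "FP" { fp = $NF }
        $1 == "FN" { fn = $NF }
        END {
            total = tp + tn + fp + fn
            # Each rate is printed only when its denominator is non-zero.
            if (tp + fn > 0) printf "SHR %.2f%%\n", 100 * tp / (tp + fn)
            if (fp + tn > 0) printf "HSR %.2f%%\n", 100 * fp / (fp + tn)
            if (tp + fp > 0) printf "PPV %.2f%%\n", 100 * tp / (tp + fp)
            if (total > 0)   printf "OCA %.2f%%\n", 100 * (tp + tn) / total
        }'
}

# The counts from the example output above.
dspam_rates <<'EOF'
               TP True Positives:                    71
               TN True Negatives:                    66
               FP False Positives:                    0
               FN False Negatives:                    4
EOF
```

With the counts above this gives 71/75 for the spam hit rate and 137/141 for the overall accuracy, matching the 94.67% and 97.16% reported. On a live system the function could presumably be fed directly with "dspam_stats -H | dspam_rates".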


When the contrib is in training mode you should see the following type of event in your qpsmtpd log when issuing the command:

tail -f /var/log/qpsmtpd/current | tai64nlocal | grep dspam
2010-01-04 16:05:43.495837500 24369 dspam plugin: Training email as spam (32.3 > 9)
2010-01-04 16:06:12.922243500 24460 dspam plugin: Training email as spam (26.2 > 9)
2010-01-04 16:08:30.707928500 24571 dspam plugin: Training email as spam (40.2 > 9)
2010-01-04 16:15:09.209315500 25154 dspam plugin: Training email as spam (28.7 > 9)
2010-01-04 16:15:12.657721500 25093 dspam plugin: Training email as innocent (-2.3 < 5)
2010-01-04 16:15:31.505187500 25230 dspam plugin: Training email as innocent (1.0 < 5)
2010-01-04 16:15:56.084894500 25261 dspam plugin: Training email as spam (33.2 > 9)
2010-01-04 16:16:35.734852500 25302 dspam plugin: Training email as innocent (0.1 < 5)
2010-01-04 16:16:37.373583500 25297 dspam plugin: Training email as spam (39.5 > 9)
2010-01-04 16:17:50.398104500 25284 dspam plugin: Training email as spam (30.2 > 9)
2010-01-04 16:18:13.514300500 25412 dspam plugin: Training email as spam (23.2 > 9)
2010-01-04 16:18:41.653611500 25396 dspam plugin: Training email as spam (35.2 > 9)
2010-01-04 16:20:05.432484500 25486 dspam plugin: Training email as spam (24.6 > 9)
2010-01-04 16:20:07.036783500 25528 dspam plugin: Training email as innocent (1.7 < 5)
2010-01-04 16:21:04.378237500 25766 dspam plugin: Training email as innocent (1.0 < 5)
2010-01-04 16:21:21.849091500 25797 dspam plugin: Training email as innocent (-2.6 < 5)
2010-01-04 16:22:32.693008500 25860 dspam plugin: Training email as spam (30.3 > 9)
2010-01-04 16:28:22.610804500 26245 dspam plugin: Training email as spam (24.3 > 9)
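
To get a quick impression of how balanced the training is, the training events in the log can be tallied. The following is a minimal sketch that assumes the log lines have exactly the wording shown above; count_training is a hypothetical name, not a command shipped with the contrib.

```shell
#!/bin/sh
# Hypothetical helper: count spam and innocent training events in
# qpsmtpd log lines read from standard input.
count_training() {
    awk '
        /dspam plugin: Training email as spam/     { spam++ }
        /dspam plugin: Training email as innocent/ { ham++ }
        END { printf "spam=%d innocent=%d\n", spam, ham }'
}

# Three lines taken from the example log above.
count_training <<'EOF'
2010-01-04 16:15:09.209315500 25154 dspam plugin: Training email as spam (28.7 > 9)
2010-01-04 16:15:12.657721500 25093 dspam plugin: Training email as innocent (-2.3 < 5)
2010-01-04 16:15:56.084894500 25261 dspam plugin: Training email as spam (33.2 > 9)
EOF
```

Because the function only matches on the message text, it should work equally well on the raw log, for example "count_training < /var/log/qpsmtpd/current", without piping through tai64nlocal.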

When the contrib is in tagging mode you will see the following type of output from the same command:

tail -f /var/log/qpsmtpd/current | tai64nlocal | grep dspam
2010-01-04 16:14:27.830989500 21955 dspam plugin: dspam result: Spam with Confidence of 0.99 and Probability of 1.0000 (4b4205d3219672044083174)
2010-01-04 16:15:57.446155500 22065 dspam plugin: dspam result: Spam with Confidence of 0.99 and Probability of 1.0000 (4b42062d220731786917372)
2010-01-04 16:20:55.422770500 22430 dspam plugin: dspam result: Spam with Confidence of 0.99 and Probability of 1.0000 (4b420757224401732614111)
2010-01-04 16:21:05.836167500 22453 dspam plugin: dspam result: Innocent with Confidence of 0.99 and Probability of 0.0000 (4b420761224588618216848)
2010-01-04 16:21:20.033604500 22330 dspam plugin: dspam result: Spam with Confidence of 0.80 and Probability of 1.0000 (4b420770224877713217748)
2010-01-04 16:24:41.615738500 22636 dspam plugin: dspam result: Innocent with Confidence of 0.76 and Probability of 0.0000 (4b420839226414726512081)
2010-01-04 16:24:43.453742500 22636 dspam plugin: Retraining email as spam classification (14.9 > 9)
2010-01-04 16:25:34.647693500 22729 dspam plugin: dspam result: Spam with Confidence of 0.99 and Probability of 1.0000 (4b42086e227377747245261)
2010-01-04 16:25:38.648186500 22743 dspam plugin: dspam result: Spam with Confidence of 0.99 and Probability of 1.0000 (4b420872227551892345671)
2010-01-04 16:26:04.702731500 22773 dspam plugin: dspam result: Innocent with Confidence of 1.00 and Probability of 0.0000 (4b42088c227818922614116)
2010-01-04 16:26:06.441017500 22770 dspam plugin: dspam result: Spam with Confidence of 0.99 and Probability of 1.0000 (4b42088e227882615116573)

Notice the retraining of DSPAM that took place after DSPAM classified an email as Innocent although its total SpamAssassin score was 14.9.
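
Retraining events like this one can be picked out of a longer log together with the DSPAM verdict they correct. The sketch below is a hypothetical helper (show_retrained is not part of the contrib). It assumes the tai64nlocal-formatted layout shown above, that the number after the timestamp identifies the qpsmtpd process handling the mail, and that a retraining line refers to the most recent result logged with that same number.

```shell
#!/bin/sh
# Hypothetical helper: report each retraining event together with the
# DSPAM result previously logged under the same process number.
# Expected fields: date, time, number, "dspam", "plugin:", message...
show_retrained() {
    awk '
        $6 == "dspam" && $7 == "result:" {
            verdict[$3] = $8      # Spam or Innocent
            conf[$3]    = $12     # value after "Confidence of"
        }
        $6 == "Retraining" && ($3 in verdict) {
            printf "%s: DSPAM said %s (confidence %s), retrained as %s\n",
                   $3, verdict[$3], conf[$3], $9
        }'
}

# The two relevant lines from the example log above.
show_retrained <<'EOF'
2010-01-04 16:24:41.615738500 22636 dspam plugin: dspam result: Innocent with Confidence of 0.76 and Probability of 0.0000 (4b420839226414726512081)
2010-01-04 16:24:43.453742500 22636 dspam plugin: Retraining email as spam classification (14.9 > 9)
EOF
```

A run of such reports with high confidence values would suggest that DSPAM and SpamAssassin disagree strongly on some class of mail, which may be worth a closer look.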

FAQ

Can I force it to start scoring even though training hasn't completed?

Yes, you can do this by changing config:

config setprop dspam action tag
signal-event email-update

Can I alter the score given to DSPAM classified emails?

Yes, you have to manually edit the /etc/mail/spamassassin/dspam.cf file. Note that a later upgrade of sme-dspam will overwrite your modifications. When you have made your modification, issue a:

signal-event email-update

How do I report a problem or a suggestion?

This contrib has not yet been created in the bugtracker, so just send an email to contribs@swerts-knudsen.dk