DSPAM
Revision as of 09:53, 25 May 2013
Maintainer
This contrib has been developed by Jesper Knudsen from SME Optimizer (http://smeoptimizer.swerts-knudsen.dk).
Description
I have long used SME's built-in SpamAssassin, with a few custom additions, to get rid of most of my spam. Recently I noticed that the DSPAM project was alive again, and I have since heard from many sources that it does a great job for them. I did not want to get rid of SpamAssassin but wanted to combine the strengths of the two spam engines. One of the "weaknesses" of DSPAM is that it requires a significant amount of training before it provides reliable results; I use SpamAssassin scoring to provide this training.
I have therefore made this DSPAM plug-in which works in co-operation with SpamAssassin to get rid of even more spam.
This contrib consists mainly of two items:
- A qpsmtpd plugin which handles the training of the DSPAM engine based on SpamAssassin results and which also, when training is complete, ensures that emails are classified with DSPAM for later scoring.
- A SpamAssassin plugin which uses the DSPAM classification results to provide additional SpamAssassin scoring.
Installation
The contrib needs a working DSPAM installation in addition to the sme-dspam package; the commands below install both. Only SME 7.x is supported; SME 8.x is not supported.
wget \
    http://sme.swerts-knudsen.dk/downloads/DSPAM/sme-dspam-1.0.2-5.noarch.rpm \
    http://sme.swerts-knudsen.dk/downloads/DSPAM/dspam-3.9.0-sme7.i386.rpm \
    http://sme.swerts-knudsen.dk/downloads/DSPAM/libdspam-3.9.0-sme7.i386.rpm \
    http://sme.swerts-knudsen.dk/downloads/DSPAM/libdspam-mysql-3.9.0-sme7.i386.rpm
yum localinstall \
    sme-dspam-1.0.2-5.noarch.rpm \
    dspam-3.9.0-sme7.i386.rpm \
    libdspam-3.9.0-sme7.i386.rpm \
    libdspam-mysql-3.9.0-sme7.i386.rpm
Uninstall
You can simply remove the package again with the usual yum command.
yum remove sme-dspam
Configuration
The contrib initially does DSPAM training and will continue to do so until DSPAM claims that training is complete. It monitors the output of "dspam_stats -H" to see when training has completed and will then switch to scoring/tagging mode. When training is complete the admin will receive an email notification. Until the contrib reaches this mode you will not see any DSPAM benefits.
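If you want to check the remaining training yourself, the counter can be pulled out of the dspam_stats output. This is a minimal sketch, not the contrib's own code: the helper name training_left is hypothetical, and the pattern assumes the "TL Training Left:" label appears exactly as in the example output in the Statistics section.

```shell
#!/bin/sh
# Hypothetical helper (not part of sme-dspam): read "dspam_stats -H" output
# on stdin and print the number that follows "TL Training Left:".
# Assumes the label is spelled as in the example output on this page.
training_left() {
    sed -n 's/.*TL Training Left:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Demo on a sample line; on a live system you would pipe the real output.
printf 'TL Training Left: 1562\n' | training_left   # prints 1562
```

On a server with DSPAM installed this would be used as: dspam_stats -H | training_left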
DSPAM is trained on SpamAssassin scores. By default, an email is trained as spam if SpamAssassin rejects it and its score is above 9. It is trained as ham ("innocent" in DSPAM terminology) when SpamAssassin scores it lower than 5.
These two values can be changed through the config system:
config setprop dspam hamlevel xx (default: 5)
config setprop dspam spamlevel xx (default: 9)
and then do a:
signal-event email-update
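To see how the two levels interact, the rule can be written out as a small function. This is an illustrative sketch only, not the plugin's implementation: the name training_action is hypothetical, the "SpamAssassin rejects the email" condition is not modelled, and treating scores between the two levels as "skip" (no training) is an assumption.

```shell
#!/bin/sh
# Hypothetical illustration of the training rule described above.
# Usage: training_action SCORE HAMLEVEL SPAMLEVEL
# Assumption: scores between the two levels trigger no training ("skip").
training_action() {
    awk -v score="$1" -v ham="$2" -v spam="$3" 'BEGIN {
        if (score > spam)      print "spam"
        else if (score < ham)  print "innocent"
        else                   print "skip"
    }'
}

training_action 32.3 5 9   # prints spam
training_action -2.3 5 9   # prints innocent
training_action 7.0 5 9    # prints skip
```

Raising hamlevel or lowering spamlevel narrows the "skip" band, so more mail is used for training, at the cost of training on less clear-cut messages.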
Statistics
DSPAM Specific Statistics
You can follow how DSPAM is doing with the dspam_stats command. Below is an example where I started the tagging process before training was complete. Here you can see that 4 emails were reported as False Negatives, meaning DSPAM claimed they were ham while SpamAssassin scored them as spam (above spamlevel).
[root@mx]# dspam_stats -H
qpsmtpd:
    TP True Positives:                 71
    TN True Negatives:                 66
    FP False Positives:                 0
    FN False Negatives:                 4
    SC Spam Corpusfed:               5890
    NC Nonspam Corpusfed:             872
    TL Training Left:                1562
    SHR Spam Hit Rate              94.67%
    HSR Ham Strike Rate:            0.00%
    PPV Positive predictive value: 100.00%
    OCA Overall Accuracy:          97.16%
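The four percentages can be recomputed from the TP/TN/FP/FN counters. The sketch below uses the standard definitions of these rates, which reproduce the figures in the example above; that DSPAM computes them in exactly this way is an assumption, and the function name dspam_rates is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: recompute the derived rates from the raw counters.
# Usage: dspam_rates TP TN FP FN
# Note: no guard against zero denominators (e.g. a corpus with no spam yet).
dspam_rates() {
    awk -v tpos="$1" -v tneg="$2" -v fpos="$3" -v fneg="$4" 'BEGIN {
        printf "SHR %.2f%%\n", 100 * tpos / (tpos + fneg)   # spam caught
        printf "HSR %.2f%%\n", 100 * fpos / (fpos + tneg)   # ham wrongly hit
        printf "PPV %.2f%%\n", 100 * tpos / (tpos + fpos)   # spam verdicts that were right
        printf "OCA %.2f%%\n", 100 * (tpos + tneg) / (tpos + tneg + fpos + fneg)
    }'
}

# Counters from the example output above.
dspam_rates 71 66 0 4
# prints:
# SHR 94.67%
# HSR 0.00%
# PPV 100.00%
# OCA 97.16%
```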
When the contrib is in training mode you should see the following type of events in your qpsmtpd log when issuing the command:
tail -f /var/log/qpsmtpd/current | tai64nlocal | grep dspam
2010-01-04 16:05:43.495837500 24369 dspam plugin: Training email as spam (32.3 > 9)
2010-01-04 16:06:12.922243500 24460 dspam plugin: Training email as spam (26.2 > 9)
2010-01-04 16:08:30.707928500 24571 dspam plugin: Training email as spam (40.2 > 9)
2010-01-04 16:15:09.209315500 25154 dspam plugin: Training email as spam (28.7 > 9)
2010-01-04 16:15:12.657721500 25093 dspam plugin: Training email as innocent (-2.3 < 5)
2010-01-04 16:15:31.505187500 25230 dspam plugin: Training email as innocent (1.0 < 5)
2010-01-04 16:15:56.084894500 25261 dspam plugin: Training email as spam (33.2 > 9)
2010-01-04 16:16:35.734852500 25302 dspam plugin: Training email as innocent (0.1 < 5)
2010-01-04 16:16:37.373583500 25297 dspam plugin: Training email as spam (39.5 > 9)
2010-01-04 16:17:50.398104500 25284 dspam plugin: Training email as spam (30.2 > 9)
2010-01-04 16:18:13.514300500 25412 dspam plugin: Training email as spam (23.2 > 9)
2010-01-04 16:18:41.653611500 25396 dspam plugin: Training email as spam (35.2 > 9)
2010-01-04 16:20:05.432484500 25486 dspam plugin: Training email as spam (24.6 > 9)
2010-01-04 16:20:07.036783500 25528 dspam plugin: Training email as innocent (1.7 < 5)
2010-01-04 16:21:04.378237500 25766 dspam plugin: Training email as innocent (1.0 < 5)
2010-01-04 16:21:21.849091500 25797 dspam plugin: Training email as innocent (-2.6 < 5)
2010-01-04 16:22:32.693008500 25860 dspam plugin: Training email as spam (30.3 > 9)
2010-01-04 16:28:22.610804500 26245 dspam plugin: Training email as spam (24.3 > 9)
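To get a quick tally of how the training is split between spam and innocent mail, the log lines can be counted. This is a small sketch that assumes the message wording shown in the log excerpt above; the function name count_training is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count training events in qpsmtpd log lines read from
# stdin. Assumes the "Training email as ..." wording shown on this page.
count_training() {
    awk '/dspam plugin: Training email as spam/     { spam++ }
         /dspam plugin: Training email as innocent/ { ham++ }
         END { printf "spam=%d innocent=%d\n", spam, ham }'
}

# Demo on three lines taken from the excerpt above.
count_training <<'EOF'
2010-01-04 16:05:43.495837500 24369 dspam plugin: Training email as spam (32.3 > 9)
2010-01-04 16:15:12.657721500 25093 dspam plugin: Training email as innocent (-2.3 < 5)
2010-01-04 16:15:56.084894500 25261 dspam plugin: Training email as spam (33.2 > 9)
EOF
# prints: spam=2 innocent=1
```

Against the live log this would be: tai64nlocal < /var/log/qpsmtpd/current | count_training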
When the contrib is in tagging mode you will see the following type of output from the command:
tail -f /var/log/qpsmtpd/current | tai64nlocal | grep dspam
2010-01-04 16:14:27.830989500 21955 dspam plugin: dspam result: Spam with Confidence of 0.99 and Probability of 1.0000 (4b4205d3219672044083174)
2010-01-04 16:15:57.446155500 22065 dspam plugin: dspam result: Spam with Confidence of 0.99 and Probability of 1.0000 (4b42062d220731786917372)
2010-01-04 16:20:55.422770500 22430 dspam plugin: dspam result: Spam with Confidence of 0.99 and Probability of 1.0000 (4b420757224401732614111)
2010-01-04 16:21:05.836167500 22453 dspam plugin: dspam result: Innocent with Confidence of 0.99 and Probability of 0.0000 (4b420761224588618216848)
2010-01-04 16:21:20.033604500 22330 dspam plugin: dspam result: Spam with Confidence of 0.80 and Probability of 1.0000 (4b420770224877713217748)
2010-01-04 16:24:41.615738500 22636 dspam plugin: dspam result: Innocent with Confidence of 0.76 and Probability of 0.0000 (4b420839226414726512081)
2010-01-04 16:24:43.453742500 22636 dspam plugin: Retraining email as spam classification (14.9 > 9)
2010-01-04 16:25:34.647693500 22729 dspam plugin: dspam result: Spam with Confidence of 0.99 and Probability of 1.0000 (4b42086e227377747245261)
2010-01-04 16:25:38.648186500 22743 dspam plugin: dspam result: Spam with Confidence of 0.99 and Probability of 1.0000 (4b420872227551892345671)
2010-01-04 16:26:04.702731500 22773 dspam plugin: dspam result: Innocent with Confidence of 1.00 and Probability of 0.0000 (4b42088c227818922614116)
2010-01-04 16:26:06.441017500 22770 dspam plugin: dspam result: Spam with Confidence of 0.99 and Probability of 1.0000 (4b42088e227882615116573)
Notice the retraining of DSPAM that took place after DSPAM classified an email as Innocent while its total SpamAssassin score was 14.9.
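The retrained email above had been classified with a Confidence of only 0.76. If you want to review such uncertain verdicts, they can be filtered out of the log. This is a sketch under assumptions: the helper name low_confidence is hypothetical, the limit of 0.9 is an arbitrary example value, and the parsing relies on the "Confidence of N" wording shown in the excerpt.

```shell
#!/bin/sh
# Hypothetical helper: print "dspam result" log lines whose Confidence value
# is below the given limit. Usage: low_confidence LIMIT < logfile
low_confidence() {
    awk -v limit="$1" '/dspam result:/ {
        for (i = 1; i <= NF; i++)
            # The value sits two fields after the word "Confidence".
            if ($i == "Confidence" && $(i + 2) + 0 < limit + 0)
                print
    }'
}

# Demo on three lines from the excerpt above; the 0.80 and 0.76 lines print.
low_confidence 0.9 <<'EOF'
2010-01-04 16:21:05.836167500 22453 dspam plugin: dspam result: Innocent with Confidence of 0.99 and Probability of 0.0000 (4b420761224588618216848)
2010-01-04 16:21:20.033604500 22330 dspam plugin: dspam result: Spam with Confidence of 0.80 and Probability of 1.0000 (4b420770224877713217748)
2010-01-04 16:24:41.615738500 22636 dspam plugin: dspam result: Innocent with Confidence of 0.76 and Probability of 0.0000 (4b420839226414726512081)
EOF
```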
SpamAssassin General Statistics
You can monitor which rules are fired by SpamAssassin for both spam and ham with this little script, which runs through the /var/log/spamd/current log file.
cd /usr/bin/
wget http://sme.swerts-knudsen.dk/downloads/DSPAM/sa-stats
chmod +x sa-stats
./sa-stats
The output will look something like this.
Email:  2895  Autolearn:  2591  AvgScore:  22.54  AvgScanTime:  3.74 sec
Spam:   2165  Autolearn:  2075  AvgScore:  33.86  AvgScanTime:  3.44 sec
Ham:     730  Autolearn:   516  AvgScore: -11.05  AvgScanTime:  4.64 sec

Time Spent Running SA:       3.01 hours
Time Spent Processing Spam:  2.07 hours
Time Spent Processing Ham:   0.94 hours

TOP SPAM RULES FIRED
----------------------------------------------------------------------
RANK  RULE NAME                  COUNT  %OFMAIL  %OFSPAM  %OFHAM
----------------------------------------------------------------------
   1  RCVD_IN_APEWSL2             1809    67.05    83.56   18.08
   2  RCVD_IN_BRBL                1789    62.04    82.63    0.96
   3  RAZOR2_CHECK                1786    61.93    82.49    0.96
   4  BAYES_99                    1780    61.49    82.22    0.00
   5  RAZOR2_CF_RANGE_51_100      1759    61.00    81.25    0.96
   6  DIGEST_MULTIPLE             1656    57.37    76.49    0.68
   7  DCC_CHECK                   1567    56.93    72.38   11.10
   8  URIBL_BLACK                 1528    53.26    70.58    1.92
   9  RCVD_IN_XBL                 1494    51.64    69.01    0.14
  10  RAZOR2_CF_RANGE_E8_51_100   1485    51.47    68.59    0.68
  11  RCVD_IN_JMF_BL              1484    51.68    68.55    1.64
  12  PYZOR_CHECK                 1445    50.36    66.74    1.78
  13  RCVD_IN_PBL                 1413    48.95    65.27    0.55
  14  URIBL_JP_SURBL              1347    46.53    62.22    0.00
  15  URIBL_SBL                   1320    45.60    60.97    0.00
  16  URIBL_WS_SURBL              1294    44.70    59.77    0.00
  17  DSPAM_SPAM_99               1147    39.62    52.98    0.00
  18  SEM_URIRED                  1135    39.79    52.42    2.33
  19  SEM_URI                     1002    34.78    46.28    0.68
  20  HTML_MESSAGE                 981    52.92    45.31   75.48
----------------------------------------------------------------------

TOP HAM RULES FIRED
----------------------------------------------------------------------
RANK  RULE NAME                  COUNT  %OFMAIL  %OFSPAM  %OFHAM
----------------------------------------------------------------------
   1  BAYES_00                     715    25.98     1.71   97.95
   2  DSPAM_HAM_99                 696    25.01     1.29   95.34
   3  HTML_MESSAGE                 551    52.92    45.31   75.48
   4  SPF_PASS                     329    13.68     3.09   45.07
   5  RCVD_IN_JMF_W                145     5.11     0.14   19.86
   6  RCVD_IN_APEWSL2              132    67.05    83.56   18.08
   7  MIME_HTML_ONLY               131    14.82    13.76   17.95
   8  SPF_HELO_PASS                 96     3.52     0.28   13.15
   9  DCC_CHECK                     81    56.93    72.38   11.10
  10  RCVD_IN_DNSWL_MED             63     2.18     0.00    8.63
  11  RCVD_IN_DNSWL_LOW             62     2.14     0.00    8.49
  12  SARE_SUB_ENC_UTF8             59     3.56     2.03    8.08
  13  MPART_ALT_DIFF                55     2.63     0.97    7.53
  14  USER_IN_WHITELIST             48     1.66     0.00    6.58
  15  MIME_HTML_MOSTLY              43     2.00     0.69    5.89
  16  MIME_QP_LONG_LINE             31     2.56     1.99    4.25
  17  EXTRA_MPART_TYPE              31     1.52     0.60    4.25
  18  MIME_BASE64_BLANKS            31     1.07     0.00    4.25
  19  HTML_IMAGE_RATIO_06           29     1.04     0.05    3.97
  20  MISSING_MID                   28     1.52     0.74    3.84
----------------------------------------------------------------------
FAQ
Can I force it to start scoring even though training hasn't completed?
Yes, you can do this by changing config:
config setprop dspam action tag
signal-event email-update
Can I alter the score given to DSPAM classified emails?
Yes, you have to manually edit the /etc/mail/spamassassin/dspam.cf file. Note that a later upgrade of sme-dspam will overwrite your modifications. When you have made your modifications, issue:
signal-event email-update
How do I report a problem or a suggestion?
This contrib has not yet been created in the bugtracker, so just send an email to contribs@swerts-knudsen.dk.