Line 1: |
Line 1: |
− | === Zarafa Bayesian learning ===
| + | == Zarafa Bayesian learning == |
| | | |
| This howto enables SpamAssassin Bayesian learning for [[:Zarafa]]. | | This howto enables SpamAssassin Bayesian learning for [[:Zarafa]]. |
Line 5: |
Line 5: |
| The DMZS script (LGPL) works over IMAP. It reads the mail from two folders (LearnAsSpam and LearnAsHam) and feeds it to SpamAssassin's sa-learn. Here the script is set up to use public folders in Zarafa. | | The DMZS script (LGPL) works over IMAP. It reads the mail from two folders (LearnAsSpam and LearnAsHam) and feeds it to SpamAssassin's sa-learn. Here the script is set up to use public folders in Zarafa. |
| | | |
− | ==== Installation ====
| + | === Installation === |
− | wget http://www.dmzs.com/tools/files/spam/DMZS-sa-learn.pl
| |
− | mv DMZS-sa-learn.pl /usr/bin/
| |
| | | |
− | Create a user-account in Zarafa for reading the public spam-folders. Replace the <MyPassword> with a proper strong password.
| + | ====Bayes==== |
− | zarafa-admin -c 'SpamAdmin' -p '<MyPassword>' -f 'Spam Administration Account' -e root@localhost
| + | yum install perl-Mail-IMAPClient --enablerepo=smecontribs |
| | | |
− | Now we'll edit the script and replace the Server, User and Password values. We will also have to replace two folder names throughout the script:
| + | Create a new script-file: |
− | pico /usr/bin/DMZS-sa-learn.pl
| + | nano -w /usr/bin/DMZS-sa-learn.pl |
| | | |
− | Replace the values so it looks like below, replace <MyPassword> for the password you have chosen in a previous step:
| + | Paste the code below into this script file and replace <tt>'SpamAdminPassword'</tt> with a strong password: |
− | | + | #!/usr/bin/perl |
− | my $imap = Mail::IMAPClient->new( Server=> '127.0.0.1:8143',
| + | # |
− | User => 'SpamAdmin',
| + | # Process mail from imap server shared folder 'Public folders/LearnAsSpam' & 'Public folders/LearnAsHam' through spamassassin sa-learn |
− | Password => '<MyPassword>',
| + | # dmz@dmzs.com - March 19, 2004 |
− | Debug => $debug);
| + | # http://www.dmzs.com/tools/files/spam.phtml |
− | | + | # http://www.dmzs.com/tools/files/spam/DMZS-sa-learn.pl [modified for SMEServer] |
− | Throughout the script (be aware of the quotes):
| + | # LGPL |
− | replace: 'spam' -> with: 'Public folders/LearnAsSpam' | + | |
− | replace: 'not-spam' -> with: 'Public folders/LearnAsHam' | + | use Mail::IMAPClient; |
− | remove: --showdots | + | |
| + | my $debug=0; |
| + | my $salearn; |
| + | |
| + | # # # # # # # # # # EDIT USER AND PASSWORD # # # # # # # # # # |
| + | |
| + | my $imap = Mail::IMAPClient->new( Server=> '127.0.0.1:8143', |
| + | User => 'SpamAdmin', |
| + | Password => 'SpamAdminPassword', |
| + | Debug => $debug); |
| + | |
| + | if (!defined($imap)) { die "IMAP Login Failed"; } |
| + | |
| + | # If debugging, print out the total counts for each mailbox |
| + | if ($debug) { |
| + | my $spamcount = $imap->message_count('Public folders/LearnAsSpam'); |
| + | print $spamcount, " Spam to process\n"; |
| + | |
| + | my $nonspamcount = $imap->message_count('Public folders/LearnAsHam'); |
| + | print $nonspamcount, " Notspam to process\n" if $debug; |
| + | } |
| + | |
| + | # Process the spam mailbox |
| + | $imap->select('Public folders/LearnAsSpam'); |
| + | my @msgs = $imap->search("ALL"); |
| + | for (my $i=0;$i <= $#msgs; $i++) |
| + | { |
| + | # I put it into a file for processing, doing it into a perl var & piping through sa-learn just didn't seem to work |
| + | $imap->message_to_file("/tmp/salearn",$msgs[$i]); |
| + | |
| + | # execute sa-learn w/data |
| + | if ($debug) { $salearn = `/usr/bin/sa-learn -D --no-sync --spam /tmp/salearn`; } |
| + | else { $salearn = `/usr/bin/sa-learn --no-sync --spam /tmp/salearn`; } |
| + | print "-------\nSpam: ",$salearn,"\n-------\n" if $debug; |
| + | |
| + | # delete processed message |
| + | $imap->delete_message($msgs[$i]); |
| + | unlink("/tmp/salearn"); |
| + | } |
| + | $imap->expunge(); |
| + | $imap->close(); |
| + | |
| + | # Process the not-spam mailbox |
| + | $imap->select('Public folders/LearnAsHam'); |
| + | @msgs = $imap->search("ALL"); # reuse the array declared for the spam mailbox |
| + | for (my $i=0;$i <= $#msgs; $i++) |
| + | { |
| + | $imap->message_to_file("/tmp/salearn",$msgs[$i]); |
| + | # execute sa-learn w/data |
| + | if ($debug) { $salearn = `/usr/bin/sa-learn -D --no-sync --ham /tmp/salearn`; } |
| + | else { $salearn = `/usr/bin/sa-learn --no-sync --ham /tmp/salearn`; } |
| + | print "-------\nNotSpam: ",$salearn,"\n-------\n" if $debug; |
| + | |
| + | # delete processed message |
| + | $imap->delete_message($msgs[$i]); |
| + | unlink("/tmp/salearn"); |
| + | } |
| + | $imap->expunge(); |
| + | $imap->close(); |
| + | |
| + | $imap->logout(); |
| + | |
| + | # integrate learned stuff |
| + | my $sarebuild = `/usr/bin/sa-learn --sync`; |
| + | print "-------\nRebuild: ",$sarebuild,"\n-------\n" if $debug; |
| | | |
| Set proper permissions on the script: | | Set proper permissions on the script: |
| chmod 555 /usr/bin/DMZS-sa-learn.pl | | chmod 555 /usr/bin/DMZS-sa-learn.pl |
| | | |
− | Create a file for the script to write some temporary output to: | + | ====Zarafa==== |
− | touch /tmp/salearn | + | Create a user-account in Zarafa for reading the public spam-folders. |
| + | |
| + | If Zarafa uses the db method, create the account as follows, replacing <MyPassword> with a strong password: |
| + | zarafa-admin -c 'SpamAdmin' -p '<MyPassword>' -f 'Spam Administration Account' -e root@localhost |
| + | If you have configured Zarafa to use the unix method and enable Zarafa on a per-user basis: |
| + | db accounts setprop SpamAdmin zarafa enabled |
| + | /etc/e-smith/events/actions/qmail-update-user |
| | | |
| Log in to Zarafa with an account that has admin rights and create two new folders, LearnAsSpam and LearnAsHam, under Public folder > Public folders. | | Log in to Zarafa with an account that has admin rights and create two new folders, LearnAsSpam and LearnAsHam, under Public folder > Public folders. |
Line 48: |
Line 116: |
| {{Note box| Dropping mail in the public 'LearnAsHam' folder may pose a privacy problem if permissions are set less restrictively than shown above!}} | | {{Note box| Dropping mail in the public 'LearnAsHam' folder may pose a privacy problem if permissions are set less restrictively than shown above!}} |
| | | |
| + | ====Cron==== |
| Create a new crontab fragment: | | Create a new crontab fragment: |
− | pico /etc/e-smith/templates/etc/crontab/91_SpamAssasinLearn | + | nano -w /etc/e-smith/templates/etc/crontab/91_SpamAssasinLearn |
| | | |
| Add the following to the template (change the execution times to your liking -- [http://en.wikipedia.org/wiki/Cron Wikipedia on Cron]): | | Add the following to the template (change the execution times to your liking -- [http://en.wikipedia.org/wiki/Cron Wikipedia on Cron]): |
Line 58: |
Line 127: |
| expand-template /etc/crontab | | expand-template /etc/crontab |
| | | |
− | ==== Configuration ====
| + | === Configuration === |
− | Bayesian learning has to be enabled and configured in SME, you can read how to do this in: [[Email#Setup_Blacklists_.26_Bayesian_Autolearning | E-mail - Bayesian Autolearning]] | + | SpamAssassin has to be enabled in the Email panel.
| + | |
| + | Bayesian learning also has to be enabled and configured in SME with the following commands: |
| | | |
− | ==== Usage ====
| + | config setprop spamassassin UseBayes 1 |
| + | config setprop spamassassin BayesAutoLearnThresholdSpam 6.00 |
| + | config setprop spamassassin BayesAutoLearnThresholdNonspam 0.10 |
| + | expand-template /etc/mail/spamassassin/local.cf |
| + | sa-learn --sync --dbpath /var/spool/spamd/.spamassassin -u spamd |
| + | chown spamd:spamd /var/spool/spamd/.spamassassin/bayes_* |
| + | chown spamd:spamd /var/spool/spamd/.spamassassin/bayes.mutex |
| + | chmod 640 /var/spool/spamd/.spamassassin/bayes_* |
| + | signal-event email-update |
| + | |
| + | These commands will: |
| + | * enable the Bayesian filter |
| + | * 'autolearn' as SPAM any email with a score above 6.00 |
| + | Note: SpamAssassin requires at least 3 points from the header and 3 points from the body |
| + | to auto-learn a message as spam. |
| + | Therefore, the minimum working value for this option is 6; change it in increments of 3. |
| + | A value of 12 is considered a good working value. |
| + | * 'autolearn' as HAM any email with a score below 0.10 |
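
The auto-learn rules listed above can be sketched as a small shell function. This is a simplified illustration only, not SpamAssassin's actual implementation: the function name <tt>autolearn_decision</tt> and its three arguments (total score, header points, body points) are hypothetical, and the real scorer applies further conditions that are not modelled here.

```shell
#!/bin/sh
# Simplified sketch of the auto-learn decision described above.
# Hypothetical helper, NOT part of SpamAssassin.
# The thresholds mirror the values set with "config setprop".
SPAM_THRESHOLD=6.00
HAM_THRESHOLD=0.10

# autolearn_decision SCORE HEADER_POINTS BODY_POINTS
# Prints "spam", "ham" or "none".
autolearn_decision() {
    awk -v s="$1" -v h="$2" -v b="$3" \
        -v st="$SPAM_THRESHOLD" -v ht="$HAM_THRESHOLD" 'BEGIN {
        # Spam needs a score above the threshold AND at least
        # 3 points each from header and body rules.
        if (s > st && h >= 3 && b >= 3) print "spam"
        # Ham only needs a score below the non-spam threshold.
        else if (s < ht)                print "ham"
        else                            print "none"
    }'
}
```

For example, <tt>autolearn_decision 7.2 6.0 1.2</tt> prints <tt>none</tt>: the score is above 6.00, but the body contributes fewer than 3 points, so the message is not learned as spam.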
| + | |
| + | === Usage === |
| {{Warning box| All mail dropped in the LearnAsSpam and LearnAsHam folders will be automatically deleted!}} | | {{Warning box| All mail dropped in the LearnAsSpam and LearnAsHam folders will be automatically deleted!}} |
| | | |
Line 68: |
Line 158: |
| | | |
| After the messages have been processed they will be deleted to save your valuable space. | | After the messages have been processed they will be deleted to save your valuable space. |
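
To check that the processed messages actually reached the Bayes database, the counters reported by <tt>sa-learn --dump magic</tt> can be inspected. The sketch below is a minimal example under two assumptions: the database lives in the path used in the Configuration section, and the dump prints the counters as lines ending in <tt>nspam</tt> and <tt>nham</tt> with the value in the third column. The helper name <tt>bayes_counts</tt> is hypothetical.

```shell
#!/bin/sh
# Assumed database path, taken from the Configuration commands.
BAYES_DBPATH=/var/spool/spamd/.spamassassin

# bayes_counts: hypothetical helper that reads "sa-learn --dump magic"
# output on stdin and prints the learned spam and ham message counts.
bayes_counts() {
    awk '$NF == "nspam" { spam = $3 }
         $NF == "nham"  { ham  = $3 }
         END { printf "nspam=%d nham=%d\n", spam, ham }'
}

# Usage on the server (as root):
#   sa-learn --dump magic --dbpath "$BAYES_DBPATH" | bayes_counts
```

Both counters should grow after each cron run that found mail in the learning folders. With default settings SpamAssassin only starts applying Bayes scores once 200 spam and 200 ham messages have been learned.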
| + | |
| + | |
| + | [[Category:Howto]] |
| + | [[Category:Groupware]] |