Excellent, I can now run the sa-learn command successfully.
Apparently the DB_File module had become corrupt on my box as well as in my profile. Support fixed it and all is well now :D:D
Cheers
J
I have been getting a lot of spam that is not being caught by the filters, but I don't fully understand how to teach SA what is spam and what is not. Could you please start at step 1 and explain it a bit slower for the rest of us? :confused: Do I need to get my mail via the web interface to move it? What if I am pulling it via POP?
Thanks
I'm also interested in sa-learn, and found this. It looks like sa-learn isn't too hard to use, but I'll have to try it.
http://spamassassin.apache.org/full/.../sa-learn.html
It's pretty simple. I use imap, and I file spam that slips through in a mailbox named Junk. I run this cron job once a week (mailing myself the rather verbose output):
Code:
sa-learn -D --mbox --spam mail/mydomain.com/myaddress/Junk
cat </dev/null >mail/mydomain.com/myaddress/Junk
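One caveat with the two-line job above: the second line empties Junk even when sa-learn has failed. Here is a minimal sketch of a guarded version, assuming a POSIX shell. The function name learn_and_empty and the SA_LEARN override variable are made up for illustration and are not part of SpamAssassin.

```shell
#!/bin/sh
# Sketch: feed an mbox file to sa-learn as spam, then empty the file
# only if sa-learn reported success.
# SA_LEARN is a hypothetical override so the logic can be dry-run tested.
SA_LEARN="${SA_LEARN:-sa-learn}"

learn_and_empty() {
    mbox="$1"
    # Nothing to do if the file is missing or already empty.
    [ -s "$mbox" ] || return 0
    if "$SA_LEARN" --mbox --spam "$mbox"; then
        : > "$mbox"          # truncate, same effect as cat </dev/null >file
    else
        echo "sa-learn failed, leaving $mbox untouched" >&2
        return 1
    fi
}
```

From a cron script you would then call it as: learn_and_empty mail/mydomain.com/myaddress/Junk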
Does anyone know how to enable the Spam Box? For some reason it is marked as disabled, and the button to enable the Spam Box is missing, and nothing seems to be happening, even though I've done my training.
Found the problem. I needed to setup the Email Filtering to trigger off of the SpamAssassin header and move the mail to a discard or elsewhere as desired.
Quote:
Originally Posted by richard
My recommendation would be to nice down the process to keep the load down on the server.
Code:
nice -n 19 sa-learn -D --mbox --spam mail/mydomain.com/myaddress/Junk
You can also enable the Spam Box by giving BlueHost a call (they disabled activating it from the web site).
Quote:
Originally Posted by userwaldo
Jansen
Resurrecting an old thread here, but it's a good one. How do you go about using sa-learn? How do you pipe your missed spams and ham messages that were classified as spam through sa-learn?
Those of you that are deleting your SA-tagged spam (rather than refiling it), how confident are you that you aren't deleting HAM? I haven't had SA misfire on a HAM in a while, but I guess the fear is still there. I guess I am getting tired of going through my spam inbox and don't really see it as that much of a time saver over seeing it in my regular inbox.
I'm curious if there's a way to use MY spam as fodder for the ENTIRE domain in training using sa-learn?
Also, is there a way to upload my already-popped emails from Outlook BACK to the server for sa-learn training?
Thanks for your help!
Greg
Some of this may be elsewhere, but I thought it best to put it all in one spot for those of us who are not experts.
History and Relevance
This is for those with lots of people using different email addresses on one BlueHost account, with different methods of checking and sending email. SpamAssassin (SA) uses rules and a Bayesian classifier to identify spam. Training SpamAssassin lets Bayes mark your email more intelligently, both ham and spam. The setup below lets your users redirect mislabeled messages so that SA learns them properly as either spam or ham.
Terms
You will need to change the CAPITALIZED TERMS to make things work for you.
USERNAME = Your BlueHost username
SITE.COM = Your domain
"SPAM@SITE.COM" = Email to redirect spam to (Steps 2 and 4)
"HAM@SITE.COM" = Email to redirect ham to (Steps 2 and 4)
1. Setting Up SpamAssassin
First, turn on SpamAssassin: cPanel->Email Manager->Spam Assassin-> "Enable Spam Assassin"
Next, turn on Bayes (auto learning for SA): "Configure Spam Assassin" -> check the box for "use-bayes".
Note: You may also want to tweak other settings like the required score (I like 4.0) and the scores assigned to each test (see http://spamassassin.apache.org/tests_3_1_x.html for a list of all tests and their default values). After training SA and Bayes I made BAYES_40 and up worth more than their default amounts.
2. Set up Email Addresses for Spam and Ham
Create 2 email addresses, one for spam and one for ham. I will call them "SPAM@SITE.COM" and "HAM@SITE.COM" for simplicity, but it is suggested that you don't use Ham@..., since a spammer might guess that address and send spam to it, which would cause SpamAssassin to start learning spam as ham.
Users will redirect/resend mismarked messages to these email accounts and a cron job will allow SA to learn from them.
3. Setting up Cron Jobs so learning happens automatically
Set up 2 cron jobs, one to learn spam and one to learn ham. SpamAssassin will learn tokens from all the messages at "SPAM@SITE.COM" and "HAM@SITE.COM" as spam or ham respectively.
sa-learn --spam /home/USERNAME/mail/SITE.COM/SPAM/*
where "SPAM@SITE.COM" is where you direct spam. Set it up to run once a day or once a week.
sa-learn --ham /home/USERNAME/mail/SITE.COM/HAM/*
where "HAM@SITE.COM" is where you direct ham that got mislabeled as spam, or that you just want SA to learn as ham. Set it up to run once a day or once a week.
OPTION - Adding "-D" after "sa-learn" will give you a more detailed printout of what it did. Otherwise you will just see that it learned tokens from X messages and examined Y messages. SA will only learn new tokens from emails it has not already examined, so depending on space, you might log in and empty those mailboxes on occasion OR run this cron job to automatically delete the HAM email as often as you want.
rm ~/mail/SITE.COM/HAM/cur/*
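The learn-then-delete steps above can be combined so that mail is only removed after sa-learn has succeeded. This is a rough sketch, not a tested BlueHost recipe: train_box and the SA_LEARN variable are hypothetical names, and it assumes each mailbox keeps its messages under cur/ as in the rm command above.

```shell
#!/bin/sh
# Sketch: learn one mailbox as spam or ham, then clear its cur/ folder
# only when sa-learn exits successfully.
# SA_LEARN is a hypothetical override for dry runs.
SA_LEARN="${SA_LEARN:-sa-learn}"

train_box() {
    kind="$1"    # "spam" or "ham"
    box="$2"     # e.g. /home/USERNAME/mail/SITE.COM/HAM
    case "$kind" in
        spam|ham) ;;
        *) echo "usage: train_box spam|ham MAILBOX" >&2; return 2 ;;
    esac
    if [ ! -d "$box/cur" ]; then
        echo "no cur/ folder under $box" >&2
        return 1
    fi
    # Same invocation as the cron jobs above: sa-learn --spam|--ham BOX/*
    if "$SA_LEARN" "--$kind" "$box"/*; then
        find "$box/cur" -type f -exec rm -f {} +
    else
        echo "sa-learn failed, keeping messages in $box" >&2
        return 1
    fi
}
```

A single cron script could then call train_box spam /home/USERNAME/mail/SITE.COM/SPAM followed by train_box ham /home/USERNAME/mail/SITE.COM/HAM.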
4. Instructions for Users
Your users will then want to Redirect or Resend messages to "SPAM@SITE.COM" or "HAM@SITE.COM" appropriately. This varies depending on how your users check their email.
Web Mail with HORDE - Choose "Redirect"
Outlook (Anyone with a script so this can be done with the push of a button?)
A. Open the message (double click on it)
B. Action-->Resend This Message...
C. Say "Yes" to the warning that you were not the original sender.
D. Change the "To..." line to: "SPAM@SITE.COM" or "HAM@SITE.COM"
E. Hit "Send" (do not change the message)
IMAP - Set up an account for each email and simply move messages (or copies) into the appropriate folder
NOTE - While it may not be at the front of your mind, it is good to teach SA ham as well as spam, since that decreases the risk of falsely marking ham as spam.
5. Advanced Options (like everything else was easy)
5.1 Scores - Review spam and ham yourself and look at the headers (I look at them with the web mail HORDE) to see which test results seem common for ham and spam. If lots of your ham gets caught by TEST_A, turn it off in the Spam Assassin configuration (score TEST_A 0). If lots of spam meets rule TEST_B and none of your real mail (ham) seems to meet it, give it more points (score TEST_B 2.0). You will also notice that just because a message was flagged as spam, it was not necessarily learned as spam. You can redirect it to "SPAM@SITE.COM" to help Bayes even more.
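To see which tests fire most often without opening each message, the headers can be tallied from the shell. The sketch below assumes messages are stored one per file and that SA writes a header of the form "X-Spam-Status: ... tests=NAME,NAME ..."; tally_tests is a made-up helper name, so treat this as a starting point and check it against your own headers.

```shell
#!/bin/sh
# Sketch: count how often each SpamAssassin test name appears in the
# X-Spam-Status headers of the message files under a directory.
tally_tests() {
    find "$1" -type f -exec awk '
        # Print every test name found in the collected header, then reset.
        function flush(   n, i, m, k, f, t) {
            if (hdr == "") return
            gsub(/,[ \t]+/, ",", hdr)          # rejoin a folded test list
            n = split(hdr, f, /[ \t]+/)
            for (i = 1; i <= n; i++) {
                if (f[i] !~ /^tests=/) continue
                m = split(substr(f[i], 7), t, ",")
                for (k = 1; k <= m; k++)
                    if (t[k] != "" && t[k] != "none") print t[k]
            }
            hdr = ""
        }
        FNR == 1           { flush(); body = 0; grab = 0 }  # next file
        body               { next }                         # skip message body
        /^$/               { flush(); body = 1; next }      # headers end here
        /^X-Spam-Status:/  { flush(); grab = 1; hdr = $0; next }
        grab && /^[ \t]/   { hdr = hdr " " $0; next }       # folded line
                           { flush(); grab = 0 }
        END                { flush() }
    ' {} + | sort | uniq -c | sort -rn
}
```

Running tally_tests /home/USERNAME/mail/SITE.COM/HAM and then the same on the SPAM folder gives two lists to compare: a test near the top of the ham list is a candidate for a lower score, and one that only shows up in the spam list is a candidate for a higher one.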
5.2 Protection - Do not list your email addresses on your website. Mask them with ASCII or hexadecimal character codes that appear as normal letters to viewers but not to robots: http://homepages.comnet.co.nz/disguise.html
5.3 Advanced Learning - Hide your "SPAM@SITE.COM" in the source of your website such that humans will not see it but spam robots will. Then they will kindly send you spam to "SPAM@SITE.COM" which SA will learn as spam. (better yet, hide "X@SITE.COM" which auto forwards to "SPAM@SITE.COM" in case the robot is smart enough not to send emails to "SPAM@SITE.COM").
5.4 Risky - If you have a folder full of spam (like the Spam Box, which defaults to ".spam") and are certain that no ham is in it, run this cron job.
sa-learn --spam /home/USERNAME/mail/SITE.COM/EMAIL/.SPAM/*
where .SPAM = Folder where spam is stored and EMAIL = Your email name ("email@site.com").
Or to learn from ALL email accounts on your domain (only do this if you know for certain there is no ham in anyone's ".spam" folder)
sa-learn --spam /home/USERNAME/mail/SITE.COM/*/.SPAM/*