October 10, 2011 – Over the last week or so, we’ve noticed more “false positives” than usual from our spam tagging system. The system is partly based on a Bayesian probability filter, which “learns” what spam looks like. The Bayesian part of the filter appears to have decided that just about everything is spam, so each email is getting a higher spam rating than it normally would. This morning, we decided to clear our Bayesian database to eliminate the problem. As a side effect, more spam may get through until the system has had time to re-learn. We will also be “teaching” the database with some spam that we have collected. Hopefully, the spam filter will be back to normal within the next couple of days.
October 17, 2011 – Our changes seemed to work for a while, but the filter has started getting too aggressive again. We have cleared out the learning database once more and are now “teaching” it with ham (email that is not spam). Hopefully, we’ll have the filter working correctly shortly.
Update – October 20, 2011 – We have identified a problem with our filter that is contributing to the false positives. The filter thinks that the Subject and Date are missing on many emails, and it adds points to each of those emails as a result. Once an email accumulates 5.0 points, our system considers it spam. The supposedly missing Subject and Date contribute 3.2 points, which goes a long way toward getting legitimate emails classified incorrectly.
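To put those numbers in perspective, here is a rough sketch of the arithmetic in Python. It is only an illustration, not our filter's actual code: the function names are made up for this example, and it assumes an email is tagged when its total is at or above 5.0 points.

```python
# Figures taken from the update above.
SPAM_THRESHOLD = 5.0          # points at which an email is considered spam
MISSING_HEADERS_POINTS = 3.2  # combined points for "missing" Subject and Date


def is_tagged_as_spam(other_points, headers_misdetected):
    """Return True if the total reaches the threshold.

    other_points: points from every other test (hypothetical input).
    headers_misdetected: True if the filter wrongly thinks the
    Subject and Date are missing.
    Assumption: tagging happens at or above the threshold.
    """
    total = other_points
    if headers_misdetected:
        total += MISSING_HEADERS_POINTS
    return total >= SPAM_THRESHOLD


def points_left_before_tagging():
    """Points an affected email can pick up elsewhere before it is tagged."""
    return round(SPAM_THRESHOLD - MISSING_HEADERS_POINTS, 1)


# A legitimate email with 2.0 points from other tests:
print(is_tagged_as_spam(2.0, headers_misdetected=False))
print(is_tagged_as_spam(2.0, headers_misdetected=True))
print(points_left_before_tagging())
```

In other words, an affected email needs only 1.8 more points from all of the other tests combined before it is tagged, where an unaffected email would need the full 5.0.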
If you have any emails that are continuing to get tagged with *****SPAM*****, please forward the sender’s address to email@example.com and we will add the address to our whitelist so it no longer gets tagged.