Re: Spam Filter Reset
From: Luke S Crawford
Date: 2009-12-16 @ 05:30
"Zed A. Shaw" <zedshaw@zedshaw.com> writes:
> On Tue, Dec 15, 2009 at 09:26:06AM -0800, Zed A. Shaw wrote:
> > Hi Everyone,
> >
> > Over the weekend I had turned on spam filtering and trained the filter
> > with some spam, but it turned out to be much much too aggressive. I've
> > since scaled it back and made the following changes:
>
> Scratch that, spam filter is totally disabled. Spambayes is a piece of
> junk that has way too many false positives, so I'll be looking for an
> alternative.
The best filtering I've ever gotten was from dspam, but the problem
is that it required manual training. If you made sure to mark every spam
message as spam, it was really awesome: almost no false positives
and well under one percent false negatives.
The problem was feeding the thing. The false negative rate would go up
fairly noticeably if you didn't mark spam. Also, it has been four years
since I used it.
I also like DCC, a project I helped work on while at MAPS[1]. It's
a bit more 'fire and forget': if you reject HTML mail, DCC catches
most of the rest. Now, the problem with DCC is that it doesn't detect
spam, it detects 'bulkness'. The idea is that it takes a cryptographic
checksum (like md5sum) of each message and passes that around to others
who use DCC, so you can then see if anyone else has seen the same
message as you.
Obviously, it marks legitimate mailing lists as 'bulk' - normally
you just whitelist what you want. For this application, though, that
might not be a problem. I mean, you don't want any mailing lists mailing
your lists, right?
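
To make the 'bulkness' idea concrete, here is a minimal sketch in Python.
It is a toy, not the DCC protocol: the real DCC shares counts across a
network of servers and uses fuzzier checksums than a plain MD5. The
BulkCounter class, the BULK_THRESHOLD value and the example addresses
are all made up for illustration.

```python
import hashlib
from collections import Counter

# Hypothetical cutoff: how many sightings make a message "bulk".
BULK_THRESHOLD = 3


class BulkCounter:
    """Toy stand-in for a shared checksum clearinghouse (not real DCC)."""

    def __init__(self, whitelist=()):
        self.counts = Counter()          # checksum -> times seen
        self.whitelist = set(whitelist)  # senders whose bulk mail is wanted

    @staticmethod
    def checksum(body: str) -> str:
        # Collapse whitespace and case so trivial reflowing keeps the digest.
        normalized = " ".join(body.split()).lower()
        return hashlib.md5(normalized.encode("utf-8")).hexdigest()

    def report(self, body: str) -> int:
        """Record one sighting of a message and return the running count."""
        digest = self.checksum(body)
        self.counts[digest] += 1
        return self.counts[digest]

    def is_bulk(self, sender: str, body: str) -> bool:
        """True if enough sightings were reported and the sender is not whitelisted."""
        if sender in self.whitelist:
            return False
        return self.counts[self.checksum(body)] >= BULK_THRESHOLD
```

Note that nothing here looks at whether the content is spammy. A wanted
mailing list trips the same counter, which is why the whitelist is there.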
If you want a 'good enough' solution right now, Mark Perkel of
junkemailfilter.com is getting some free hosting for one of his
backup boxes from me, so if you want I can get you free spam filtering
with that. I'm using it now, and it's not as good as dspam,
but it's 'good enough' and god damn, it's easy. You just set your
MXs to the junkemailfilter MXs and they forward it to your real server
mailhub style, so it's easy to switch out if you want. It's probably
not the best you could do for a long-term solution, if you wanted
to put some time into it, but it sure is easy if you need to stop the
deluge of spam right now.
[1] I say I helped Vernon, and I did, but it might have been more in the
way a kid might help his dad fix a car; I was pretty young. I did some
m4 macros for the sendmail config, though.
--
Luke S. Crawford
http://prgmr.com/xen/ - Hosting for the technically adept
http://nostarch.com/xen.htm - We don't assume you are stupid.
Re: Spam Filter Reset
From: Eric Wong
Date: 2009-12-15 @ 20:10
"Zed A. Shaw" <zedshaw@zedshaw.com> wrote:
> On Tue, Dec 15, 2009 at 09:26:06AM -0800, Zed A. Shaw wrote:
> > Hi Everyone,
> >
> > Over the weekend I had turned on spam filtering and trained the filter
> > with some spam, but it turned out to be much much too aggressive. I've
> > since scaled it back and made the following changes:
>
> Scratch that, spam filter is totally disabled. Spambayes is a piece of
> junk that has way too many false positives, so I'll be looking for an
> alternative.
Hi Zed, I've had good experiences with SpamAssassin (spamc + spamd).
I like there being a combination of manual rules in addition to Bayes,
so a weakness in one approach can get covered by the other and vice
versa.
In my experience, the default threshold score of 5.00 is a bit low for
new installations, so I initially set it to 9.00 and gradually decreased
it over time as the Bayes filter got trained.
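
As a rough sketch of that approach in Python: rule points and Bayes
points add up, and mail is flagged once the total reaches the threshold.
The 9.00 start and 5.00 default come from the numbers above; the
step_every and step values, and both function names, are assumptions
made up for the example. The lowering above was done by hand, not on a
fixed schedule like this.

```python
def required_score(messages_trained: int,
                   start: float = 9.0,
                   floor: float = 5.0,
                   step_every: int = 500,
                   step: float = 0.5) -> float:
    """Lower the threshold by `step` for every `step_every` trained messages.

    `step_every` and `step` are hypothetical; only `start` and `floor`
    correspond to figures mentioned in the text.
    """
    lowered = start - step * (messages_trained // step_every)
    return max(floor, lowered)


def is_spam(rule_score: float, bayes_score: float, threshold: float) -> bool:
    # Manual rules and Bayes each contribute points, so either one can
    # push a message over the line when the other misses it.
    return rule_score + bayes_score >= threshold
```

With a fresh install, a message scoring 4.0 from rules and 3.5 from
Bayes stays under 9.0 and gets through; once the threshold is back down
to 5.0, the same message is flagged.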
--
Eric Wong
Re: Spam Filter Reset
From: Mauricio Pasquier
Date: 2009-12-16 @ 04:12
On Tue, Dec 15, 2009 at 18:10, Eric Wong <normalperson@yhbt.net> wrote:
> "Zed A. Shaw" <zedshaw@zedshaw.com> wrote:
>> On Tue, Dec 15, 2009 at 09:26:06AM -0800, Zed A. Shaw wrote:
>> > Hi Everyone,
>> >
>> > Over the weekend I had turned on spam filtering and trained the filter
>> > with some spam, but it turned out to be much much too aggressive. I've
>> > since scaled it back and made the following changes:
>>
>> Scratch that, spam filter is totally disabled. Spambayes is a piece of
>> junk that has way too many false positives, so I'll be looking for an
>> alternative.
>
> Hi Zed, I've had good experiences with SpamAssassin (spamc + spamd).
I've heard some very good things about MailAvenger[0], which can be
used in combination with SpamAssassin and Bayesian filters too.
By the way, thanks for the effort Zed!
[0]: http://www.mailavenger.org/
> I like there being a combination of manual rules in addition to Bayes,
> so a weakness in one approach can get covered by the other and vice
> versa.
>
> In my experience, the default threshold score of 5.00 is a bit low for
> new installations, so I initially set it to 9.00 and gradually decreased
> it over time as the Bayes filter got trained.
>
> --
> Eric Wong
>