The problem with blacklists and false positives

Christopher Chan christopher.chan at bradbury.edu.hk
Thu May 7 23:35:39 UTC 2009


>>> Filtering by automated content matching will not work.  Too many false
>>> positives, and too many ways around it.  How many different ways do the
>>> spammers discuss the size of your <xxx> ?
>
> And doesn't your software catch it?  I get almost no spam these days.  If 
> bogofilter was used, and allowed to either pass posts or direct them to 
> moderators, it would quickly become quite reliable.


I doubt it. I have used bogofilter against 419 and other scam mail. 
Those messages follow a more or less set pattern, and yet we still had 
to eyeball the results to make sure bogofilter was accurate. This was 
when I worked at an ISP that handled 20 million incoming SMTP 
transactions and about a million outgoing emails a day.

Moderators would have to be able to train bogofilter. In fact, they 
would have to do that from the very start, approving each and every 
mail until bogofilter becomes accurate enough to leave only a small 
workload, if it ever gets to that point.
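
As a rough illustration of that workflow, here is a minimal Python 
sketch. It is not bogofilter: ToyFilter is a hypothetical, very crude 
naive Bayes word counter standing in for it, and the moderate() 
function, the window size and the agreement threshold are all 
assumptions made up for the example. The point is only the shape of the 
loop: every message goes to a moderator, and trains the filter, until 
the filter has agreed with the moderator often enough to be trusted.

```python
import math
import re
from collections import Counter, deque


class ToyFilter:
    """Tiny naive Bayes word filter; a stand-in for bogofilter, not its algorithm."""

    def __init__(self):
        self.words = {"spam": Counter(), "ham": Counter()}
        self.msgs = {"spam": 0, "ham": 0}

    @staticmethod
    def tokens(text):
        # Unique lowercase words; each counts once per message.
        return set(re.findall(r"[a-z0-9$']+", text.lower()))

    def train(self, text, label):
        self.msgs[label] += 1
        self.words[label].update(self.tokens(text))

    def classify(self, text):
        if not self.msgs["spam"] or not self.msgs["ham"]:
            return "unsure"  # nothing to compare against yet
        # Log-odds of spam vs. ham with add-one smoothing.
        score = math.log(self.msgs["spam"] / self.msgs["ham"])
        for w in self.tokens(text):
            p_spam = (self.words["spam"][w] + 1) / (self.msgs["spam"] + 2)
            p_ham = (self.words["ham"][w] + 1) / (self.msgs["ham"] + 2)
            score += math.log(p_spam / p_ham)
        if score > 1.0:
            return "spam"
        if score < -1.0:
            return "ham"
        return "unsure"


def moderate(stream, window=4, required=1.0):
    """stream yields (text, moderator_verdict) pairs.

    Returns how many messages the moderator had to review by hand.
    The verdict is only consulted for messages that are actually reviewed.
    """
    filt = ToyFilter()
    recent = deque(maxlen=window)  # did the filter agree with the moderator?
    reviewed = 0
    for text, verdict in stream:
        guess = filt.classify(text)
        trusted = len(recent) == window and sum(recent) / window >= required
        if trusted and guess != "unsure":
            continue  # filter's decision stands; the moderator never sees it
        reviewed += 1
        recent.append(guess == verdict)
        filt.train(text, verdict)  # moderator's verdict trains the filter
    return reviewed


# Deliberately easy traffic: two fixed messages, alternating.
stream = [("cheap pills now", "spam"), ("kernel upgrade question", "ham")] * 5
print(moderate(stream), "of", len(stream), "messages needed a moderator")
```

Even with traffic this artificial, the moderator handles every message 
until the filter has a track record, and any message the filter is 
unsure about goes back to the moderator and costs it that trust. With 
bogofilter itself, the train() step would correspond to registering the 
message with "bogofilter -s" (spam) or "bogofilter -n" (non-spam).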




More information about the ubuntu-users mailing list