hardware raid solutions?
Eric S. Johansson
esj at harvee.org
Wed Jul 12 01:19:04 UTC 2006
David Abrahams wrote:
> "Eric S. Johansson" <esj at harvee.org> writes:
>
>> sorry, I said that wrong. this is for an open-source project.
>>
>>> [1] python knowledgeable help wanted. Job descriptions available on
>>> request.
>
> Will this do a magically-better job than my current combo of SpamBayes
> and the SpamCop blacklists applied by my sysadmin?
yes. better quality filtering, more opportunities to eliminate false
positives, fewer special case hacks, and most importantly, less work for
you.
note: this filter system runs inside of postfix using the pre- and
post-queue filter stages
geek view starting from the very front:
---- front-end (can be distributed among multiple relay servers) ---
1) Blacklist test?
If blacklisted, does it contain a very large proof of work stamp (i.e.
10 minute) for emergency bypass a.k.a. Brown listing? yes, pass. no, 5xx
return code the message
2) does the to: address exist on the mail server? Enter in local
brownlist if not
3) does e-mail arrive too fast for a given address? if so, 4xx return
code the message
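
A minimal sketch of the three front-end checks above, in Python. The class
name, the rate-limit numbers and the choice to key the rate limit on the
recipient address are assumptions made for illustration, and so is rejecting
mail for unknown recipients with a 5xx; the post only says the sender is
entered in the local brownlist.

```python
import time
from collections import defaultdict, deque

# Hypothetical limits; the post does not give any numbers for "too fast".
MAX_PER_WINDOW = 5
WINDOW_SECONDS = 60.0


class FrontEnd:
    """Illustrative front-end gate: blacklist, recipient check, rate limit."""

    def __init__(self, blacklist, local_users):
        self.blacklist = set(blacklist)
        self.local_users = set(local_users)
        self.brownlist = set()                 # local brownlist of client IPs
        self.arrivals = defaultdict(deque)     # recipient -> arrival times

    def check(self, client_ip, rcpt, has_big_stamp=False, now=None):
        now = time.time() if now is None else now

        # 1) blacklisted senders pass only with a very large stamp
        if client_ip in self.blacklist and not has_big_stamp:
            return "5xx"

        # 2) unknown recipient: remember the sender in the local brownlist
        if rcpt not in self.local_users:
            self.brownlist.add(client_ip)
            return "5xx"  # assumption: the post does not name a return code

        # 3) too many arrivals for this address inside the window: defer
        recent = self.arrivals[rcpt]
        while recent and now - recent[0] > WINDOW_SECONDS:
            recent.popleft()
        if len(recent) >= MAX_PER_WINDOW:
            return "4xx"
        recent.append(now)
        return "pass"
```

Deferring with a 4xx rather than rejecting means a legitimate sender's
server simply retries later.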
----- backend
4) proof of work stamp test. if the sender made a valid stamp, pass
directly to inbox
5) friends "automatic white" list. if I know you, you pass
6) slow white list (match a pattern, don't go to jail)
6a) sumo filter. If the message is bigger than 50k, pass
7) content filter (CRM 114 with three band interpretation of pR)
messages are directed either to inbox, dumpster, or spam trap
where the human can interpret whether the messages are spam or not.
the spam trap user interface is a simple mechanism for recategorizing
messages that could not be classified by any of the other stages. the
recategorization trains the content filter for at first higher, then
different, accuracy.
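
The backend stages 4 through 7 can be sketched as one routing function.
This is only an illustration: the function name, the message dictionary
layout, the band thresholds of plus and minus 10, and the convention that a
higher pR means "more likely legitimate" are all assumptions, and the pR
score is taken as an input rather than computed by calling CRM 114.

```python
import re


def route(msg, friends, whitelist_patterns, pr_score,
          stamp_ok=False, good_above=10.0, spam_below=-10.0):
    """Return 'inbox', 'dumpster' or 'spamtrap' for one message.

    msg is a hypothetical dict with 'from', 'text' and 'size' (bytes).
    """
    # 4) a valid proof of work stamp goes straight to the inbox
    if stamp_ok:
        return "inbox"
    # 5) friends list: if I know you, you pass
    if msg["from"].lower() in friends:
        return "inbox"
    # 6) slow white list: any matching pattern passes
    if any(p.search(msg["text"]) for p in whitelist_patterns):
        return "inbox"
    # 6a) sumo filter: messages bigger than 50k pass (50 * 1024 assumed)
    if msg["size"] > 50 * 1024:
        return "inbox"
    # 7) content filter, three-band interpretation of pR
    if pr_score >= good_above:
        return "inbox"
    if pr_score <= spam_below:
        return "dumpster"
    return "spamtrap"  # uncertain band: a human decides
```

For example, a message from an unknown sender with a pR of 0.0 lands in the
spam trap, while the same message at -50.0 goes to the dumpster.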
I am rather proud of the fact that a friend and I made a spam trap
user interface that is simple enough that an ordinary administrative
assistant could manage the spam trap handling for a company of roughly
a hundred people in only 10 minutes per day at most. That is, as long as
CRM 114 behaves itself. :-)
an interesting side effect of the friends list and the content filter is
that if you look at the score of any message passing the friends list,
if it scores as spam, that's an extremely high probability indication
that the content filter is having problems. Typically I have found that
around 5%-15% of the messages coming through the friends filter are
considered spam.
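
As a rough sketch, that side effect can be turned into a health check on the
content filter: score the mail that already passed the friends list and see
what fraction the filter would have called spam. The function names, the
spam threshold and the 15% alarm level (taken from the top of the range
quoted above) are assumptions for illustration.

```python
def friends_spam_rate(scores, spam_below=-10.0):
    """Fraction of friends-list messages the content filter scores as spam."""
    if not scores:
        return 0.0
    flagged = sum(1 for s in scores if s <= spam_below)
    return flagged / len(scores)


def filter_looks_unhealthy(scores, alarm_rate=0.15):
    """True when more friends mail than usual is being scored as spam."""
    return friends_spam_rate(scores) > alarm_rate
```

Here `scores` is just a list of pR values for recent messages that passed
the friends list.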
on the outbound side, all messages are given stamps if they are going to
people you don't know. Proof of work stamps make a great introducer and
this level of effort is virtually invisible in most organizations but
the benefit is high especially when you consider that SpamAssassin and
a few other tools have the stamp recognition code in place.
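
For readers who have not seen a proof of work stamp, here is a minimal
sketch assuming a hashcash-style stamp (version 1 field layout, SHA-1,
leading zero bits). The post does not say which stamp format the system
uses, so treat the format as an assumption. The counter is written in hex
for simplicity, and date expiry and double-spend checks are left out.

```python
import hashlib
import itertools
import secrets


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the front of a hash digest."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def mint(resource: str, bits: int = 12, date: str = "060712") -> str:
    """Search for a counter that gives the stamp `bits` leading zero bits."""
    rand = secrets.token_hex(8)
    prefix = f"1:{bits}:{date}:{resource}::{rand}:"
    for counter in itertools.count():
        stamp = prefix + format(counter, "x")
        digest = hashlib.sha1(stamp.encode("ascii")).digest()
        if leading_zero_bits(digest) >= bits:
            return stamp


def verify(stamp: str, resource: str, min_bits: int = 12) -> bool:
    """Check layout, recipient, claimed value and the actual hash."""
    fields = stamp.split(":")
    if len(fields) != 7 or fields[0] != "1" or fields[3] != resource:
        return False
    try:
        claimed = int(fields[1])
    except ValueError:
        return False
    digest = hashlib.sha1(stamp.encode("ascii")).digest()
    return claimed >= min_bits and leading_zero_bits(digest) >= claimed
```

Minting costs the sender about 2**bits hash attempts on average, while
verifying costs the receiver a single hash, which is why stamping outbound
mail to strangers is cheap for a normal sender and expensive in bulk.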
There are other features I haven't yet exploited. For example, the
output of the dumpster (clearly spam) can be used to build a brownlist
database and if one is feeling really clever, sharing the brownlist
with others will make for a more inclusive and accurate brownlist.
Interpreting the brownlist entries according to what CIDR block they
share with other brownlist entries could be used as a trigger to force
spam trap interpretation until one has a sufficient number of "spots in
the cidr block" to declare the block bad.
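
A small sketch of that CIDR idea, using Python's standard `ipaddress`
module. The /24 block size, the threshold of three spots and the function
name are made up for illustration; none of this has been built.

```python
import ipaddress


def block_status(ip, brownlisted_ips, prefix_len=24, min_spots=3):
    """Classify the CIDR block containing `ip` by its brownlist "spots".

    Returns 'bad' when the block has enough brownlisted addresses,
    'spamtrap' when it has some but not enough, and 'clean' otherwise.
    """
    # strict=False lets us build the enclosing network from a host address
    block = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
    spots = {b for b in brownlisted_ips
             if ipaddress.ip_address(b) in block}
    if len(spots) >= min_spots:
        return "bad"
    if spots:
        return "spamtrap"  # force human interpretation until we know more
    return "clean"
```

With two brownlisted addresses in 192.0.2.0/24, mail from any other address
in that block would be forced into the spam trap; a third brownlisted
address would tip the whole block over to bad.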
but the one place where I absolutely need the most help is closing the
feedback loop between the e-mail client and my system. When spam leaks
through, I have no way to easily communicate from the client to the
antispam gateway. I just don't have the time to acquire the knowledge
to do something that will work well for the ignorant user.
so that's it in a nutshell. Feel free to contact me off list. Any
further discussion and I will take it to a new thread.