hardware raid solutions?

David Abrahams dave at boost-consulting.com
Wed Jul 12 02:55:54 UTC 2006


"Eric S. Johansson" <esj at harvee.org> writes:

> David Abrahams wrote:
>> "Eric S. Johansson" <esj at harvee.org> writes:
>> 
>>> sorry, I said that wrong.  this is for an open-source project.
>>>
>>>> [1] python knowledgeable help wanted.  Job descriptions available on
>>>> request.
>> Will this do a magically-better job than my current combo of
>> SpamBayes
>> and the SpamCop blacklists applied by my sysadmin?
>
> yes.  better quality filtering, more opportunities to eliminate false
> positives, fewer special case hacks

I usually think the weakness of SpamBayes is that it has NO
special-case hacks.  What hacks are you talking about?

> , and most importantly, less work for you.

Not so fast.

> note: this filter system runs inside of postfix using the pre and post
> queuing filters stages

Note: our server runs Communigate Pro.  The only way I can filter is
to send incoming mail into a program from a CGPro filter.  That
program either has to stuff the message into my IMAP mailboxes
directly or it has to add a header that causes the CGPro filter not to
run.

> geek view starting from the very front:
>
> ---- front-end (can be distributed among multiple relay servers) ---
> 1) Blacklist test?
> If blacklisted, does it contain a very large proof of work stamp
> (i.e. 10 minute) for emergency bypass a.k.a. Brown listing? yes,
> pass. no, 5xx return code the message

I guess I don't know what a single word of that means :)
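For readers equally lost on the "proof of work stamp" in step 1, here is a
minimal Python sketch of the general idea. It assumes a simplified
hashcash-like scheme, and it is not the stamp format Eric's system actually
uses: the "recipient:nonce" layout, the SHA-1 choice, the bit counts, and the
names make_stamp, check_stamp, and blacklist_decision are all hypothetical.
The sender burns CPU time finding a nonce; the receiver verifies it with a
single hash.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the front of a hash digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def make_stamp(recipient: str, bits: int) -> str:
    """Search for a nonce so the stamp's hash starts with `bits` zero bits.

    Expected cost is about 2**bits hashes, which is the "work" being proven.
    """
    for nonce in count():
        stamp = f"{recipient}:{nonce}"
        if leading_zero_bits(hashlib.sha1(stamp.encode()).digest()) >= bits:
            return stamp


def check_stamp(stamp: str, recipient: str, bits: int) -> bool:
    """Verify a stamp with one hash; cheap for the receiver."""
    addr, sep, _nonce = stamp.rpartition(":")
    if not sep or addr != recipient:
        return False
    return leading_zero_bits(hashlib.sha1(stamp.encode()).digest()) >= bits


def blacklist_decision(blacklisted: bool, stamp, recipient: str,
                       bypass_bits: int = 20) -> str:
    """Step 1 as described: a blacklisted sender only gets through with a
    large stamp; bypass_bits is an assumed stand-in for "10 minutes" of work."""
    if not blacklisted:
        return "continue"
    if stamp and check_stamp(stamp, recipient, bypass_bits):
        return "pass"
    return "reject-5xx"
```

Each extra bit doubles the sender's expected work while the receiver's check
stays at one hash, which is why a very large stamp can plausibly serve as an
emergency bypass for a blacklisted sender.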

> 2) does the to: address exist on the mail server.  Enter in local
> Brown list if not
>
> 3) does e-mail arrive too fast for a given address?  if so, 4xx
> return code the message
>
> ----- backend
>
> 4) proof of work stamp test, make a stamp, passed directly to inbox
> 5) friends "automatic white" list.  if I know you, you pass

Sounds like a special case hack, no?

> 6) slow white list (match a pattern, don't go to jail)
> 6a) sumo filter.  (If the message is bigger than 50k, pass)
> 7) content filter (CRM 114 with three band interpretation of pR)
>    messages are directed either to inbox, dumpster, or spam trap
>    where the human can interpret whether the messages are spam or not.

Sounds a lot more complicated than what I'm doing today.  Why will it
do better?
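To make the comparison concrete, the backend stages 4 through 7 quoted above
could be sketched roughly as follows. This is only an illustration of the
ordering, not Eric's code: the Message fields, the friends set, the
subject-pattern whitelist, and the pR thresholds are all assumptions, and the
sketch assumes a positive pR means "looks like good mail".

```python
from dataclasses import dataclass

# All of these values are made up for illustration.
FRIENDS = {"alice@example.org"}          # stage 5: automatic white list
SLOW_WHITELIST = ("[ubuntu-users]",)     # stage 6: patterns that always pass
SUMO_BYTES = 50 * 1024                   # stage 6a: "bigger than 50k, pass"
PR_GOOD, PR_SPAM = 10.0, -10.0           # stage 7: assumed band edges for pR


@dataclass
class Message:
    sender: str
    subject: str
    size: int
    has_valid_stamp: bool = False
    pr: float = 0.0  # content-filter score; assumed positive = good


def route(msg: Message) -> str:
    """Return 'inbox', 'dumpster', or 'spamtrap' for one message."""
    if msg.has_valid_stamp:                              # 4) proof of work
        return "inbox"
    if msg.sender in FRIENDS:                            # 5) friends list
        return "inbox"
    if any(p in msg.subject for p in SLOW_WHITELIST):    # 6) slow white list
        return "inbox"
    if msg.size > SUMO_BYTES:                            # 6a) sumo filter
        return "inbox"
    # 7) three-band interpretation of the content filter score
    if msg.pr >= PR_GOOD:
        return "inbox"
    if msg.pr <= PR_SPAM:
        return "dumpster"
    return "spamtrap"  # uncertain band: a human decides
```

The cheap, high-confidence tests run first, so the content filter only has to
judge mail that nothing else could classify, and only its uncertain middle
band reaches a human.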

> the spam trap user interface is a simple mechanism for recategorizing
> messages that could not be analyzed by any of the other stages.  the
> recategorization trains a content filter for at first higher, then
> different accuracy.
>
> I am rather proud of the fact that myself and a friend made a spam
> trap user interface that is simple enough that an ordinary
> administrative assistant could manage the spam trap handling for a
> company of roughly a hundred people in only 10 minutes per day at most.
> That is, as long as CRM 114 behaves itself.  :-)

CRM114 looks interesting.  Seems like, compared with SpamBayes, it's
full of special-case rules, but that might be a good thing.

> an interesting side effect of the friends list and the content filter
> is that if you look at the score of any message passing the friends
> list, if it scores as spam, that's an extremely high probability
> indication that the content filter is having problems.  Typically I
> have found that around 5%-15% of the messages coming through the
> friends filter are considered spam.
>
> on the outbound side, all messages are given stamps if they are going
> to people you don't know.  Proof of work stamps make a great
> introducer and this level of effort is virtually invisible in most
> organizations but the benefit is high especially when you consider
> that spam assassin and a few other tools have the stamp recognition
> code in place.
>
> There are other features I haven't yet exploited.  For example, the
> output of the dumpster (clearly spam) can be used to build a brownlist
> database and if one is feeling really clever, sharing the
> brownlist with others will make for a more inclusive and accurate
> brownlist.
>
> Interpreting the brownlist entries according to what CIDR block they
> share with other brownlist entries could be used as a trigger to force
> spam trap interpretation until one has a sufficient number of "spots
> in the cidr block" to declare the block bad.
>
> but the one place where I absolutely need the most help is closing the
> feedback between the e-mail client and my system.  When spam leaks
> through, I have no way to easily communicate from the client to the
> antispam gateway.  I just don't have the time to acquire the knowledge
> to do something that will work well for the ignorant user.

Well, I'm way out of my depth here, so don't look at me!
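For what it's worth, the CIDR-block idea quoted above is easy to sketch with
Python's standard ipaddress module. This is a rough illustration only: the
function name classify_blocks, the /24 block size, and both thresholds are
assumptions, not anything taken from Eric's system.

```python
import ipaddress
from collections import Counter


def classify_blocks(brownlisted_ips, prefix_len=24,
                    watch_spots=2, bad_spots=4):
    """Group brownlisted addresses by CIDR block and grade each block.

    Blocks with at least `watch_spots` distinct entries are sent to the
    spam trap for human review; at `bad_spots` the block is declared bad.
    Blocks below `watch_spots` are left out of the result.
    """
    counts = Counter(
        # strict=False lets us pass a host address and get its network back
        ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        for ip in set(brownlisted_ips)  # count each address only once
    )
    return {
        net: ("bad" if n >= bad_spots else "spamtrap")
        for net, n in counts.items()
        if n >= watch_spots
    }
```

With the assumed thresholds, four brownlisted hosts in one /24 mark that
block "bad", two mark it "spamtrap", and a lone entry leaves its block
ungraded.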

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com

More information about the ubuntu-users mailing list