[Bug 320829] Re: Bogofilter seems to fail decoding base64

Loïc Minier lool at dooz.org
Sat Apr 10 10:28:31 BST 2010


Actually, this bug uncovers an important issue with the parsing of the
first line of the body; bumping importance to High.

** Changed in: bogofilter (Ubuntu)
       Status: Confirmed => Fix Committed

** Changed in: bogofilter (Ubuntu)
   Importance: Undecided => Medium

** Changed in: bogofilter (Ubuntu)
     Assignee: (unassigned) => Loïc Minier (lool)

** Also affects: bogofilter (Ubuntu Lucid)
   Importance: Medium
     Assignee: Loïc Minier (lool)
       Status: Fix Committed

** Changed in: bogofilter (Ubuntu Lucid)
   Importance: Medium => High

-- 
Bogofilter seems to fail decoding base64
https://bugs.launchpad.net/bugs/320829
You received this bug notification because you are a member of Ubuntu
Sponsors Team, which is a direct subscriber.

Status in Bogofilter: Bayesian Mail Filtering: Confirmed
Status in “bogofilter” package in Ubuntu: Fix Committed
Status in “bogofilter” source package in Lucid: Fix Committed

Bug description:
Binary package hint: bogofilter-bdb

Description: Ubuntu 8.04.1
Release: 8.04
Package: bogofilter-bdb
Source-Package: bogofilter
Version: 1.1.5-2ubuntu5

Over the last few days I received a lot of similar spam that passed bogofilter marked as ham. Even after tagging many of these mails (>50) as spam, this did not improve, neither for already tagged mails nor for new ones.

Looking at the raw mail text, I found that the mails, although plain text in cp1251, were base64 encoded. I therefore first assumed that bogofilter might be unable to handle base64 encoding. But base64 decoding has actually been supported since version 0.10 and should therefore still be present in 1.1.5-2ubuntu5, which I have installed here.
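
For illustration, here is a minimal Python sketch of what decoding such a mail involves. It is not bogofilter's code; the sample message, addresses and body text are made up for this example and are not taken from the attached mbox. It builds a text/plain message in cp1251 whose body is base64 encoded, then recovers the text with the standard "email" module.

```python
import base64
from email import message_from_string

# Hypothetical sample body; not taken from the mails in this report.
body_text = "Привет, это тест"

# Encode the body the way the spam was described: cp1251 bytes, then base64.
encoded = base64.b64encode(body_text.encode("cp1251")).decode("ascii")

raw_message = (
    "From: sender@example.com\n"
    "Subject: test\n"
    "MIME-Version: 1.0\n"
    'Content-Type: text/plain; charset="windows-1251"\n'
    "Content-Transfer-Encoding: base64\n"
    "\n"
    + encoded + "\n"
)

msg = message_from_string(raw_message)

# decode=True undoes the Content-Transfer-Encoding and returns raw bytes.
decoded_bytes = msg.get_payload(decode=True)

# The charset comes from the Content-Type header (returned in lower case).
charset = msg.get_content_charset()
decoded_text = decoded_bytes.decode(charset)

print(decoded_text)
```

A filter that skips the transfer-decoding step would only ever see the base64 text of the body, not the words it contains.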

A brief test brought up the following:

Test:
I tagged one of the spam mails using a new database with "bogofilter -s" and compared the database contents (retrieved via "bogoutil -d") with those of another new database in which I tagged the same mail, but with decoded body and subject.

Result:
In the first DB only information on header fields was present. In the second DB there was also information regarding the body of the mail.

I therefore conclude that bogofilter failed to decode the mail, whereas KMail decodes it flawlessly.
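
The comparison of the two database dumps described above could be scripted roughly as follows. This is only a sketch: it assumes that each line of "bogoutil -d" output starts with the token, followed by a spam count and a ham count, and the sample dump lines and function names are hypothetical, not the actual output from this test.

```python
def parse_dump(dump_text):
    """Map each token to (spam_count, ham_count).

    Assumes whitespace-separated lines of the form
    "token spam_count ham_count ..."; shorter lines are skipped.
    """
    tokens = {}
    for line in dump_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        tokens[fields[0]] = (int(fields[1]), int(fields[2]))
    return tokens


def tokens_only_in_second(first_dump, second_dump):
    """Return the sorted tokens present in the second dump but not the first."""
    first = parse_dump(first_dump)
    second = parse_dump(second_dump)
    return sorted(set(second) - set(first))


# Made-up sample dumps: the first holds only header tokens, the second
# also holds tokens from the decoded body.
encoded_dump = (
    "subj:offer 1 0 20090124\n"
    "from:example.com 1 0 20090124\n"
)
decoded_dump = encoded_dump + (
    "cheap 1 0 20090124\n"
    "pills 1 0 20090124\n"
)

body_tokens = tokens_only_in_second(encoded_dump, decoded_dump)
print(body_tokens)
```

If the base64 body had been decoded during registration, the difference between the two dumps would be expected to be empty or nearly so.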

I am attaching an mbox file with a selection of these mails.




