Ubuntuforums.org, Message-IDs, and the mail archives
Matthew R. Dempsky
mrd at alkemio.org
Sat May 13 16:17:09 UTC 2006
I noticed that ubuntuforums.org's mailing list gateway doesn't generate
unique Message-IDs for all emails it sends. Their Message-IDs are based
on the poster's username and the time of posting, but with a coarse
enough granularity that if a poster sends multiple emails within a few
minutes, Message-IDs will be reused.
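
To illustrate how that kind of collision can arise, here is a minimal
Python sketch. I don't know the gateway's actual algorithm, so the
function name coarse_message_id, the 300-second bucket, and the base-36
encoding are all assumptions made for the example; only the username and
host name come from a real Message-ID.

```python
from datetime import datetime, timezone

# Assumed granularity; the real gateway's value is unknown.
BUCKET_SECONDS = 300


def to_base36(n):
    """Encode a non-negative integer in base 36 (assumed encoding)."""
    digits = "0123456789abcdefghijklmnopqrstuvwxyz"
    if n == 0:
        return "0"
    out = ""
    while n:
        n, r = divmod(n, 36)
        out = digits[r] + out
    return out


def coarse_message_id(username, posted_at, host="gs1.ubuntuforums.org"):
    """Hypothetical generator: username plus a coarse time bucket."""
    bucket = int(posted_at.timestamp()) // BUCKET_SECONDS
    return "<%s.%s@%s>" % (username, to_base36(bucket), host)


# Two posts by the same user about two and a half minutes apart.
first = datetime(2006, 5, 13, 16, 10, 5, tzinfo=timezone.utc)
second = datetime(2006, 5, 13, 16, 12, 40, tzinfo=timezone.utc)
# A third post a few minutes later lands in the next bucket.
third = datetime(2006, 5, 13, 16, 16, 0, tzinfo=timezone.utc)

id_a = coarse_message_id("linuxcity", first)
id_b = coarse_message_id("linuxcity", second)
id_c = coarse_message_id("linuxcity", third)

print(id_a == id_b)  # the first two posts share one Message-ID
print(id_a == id_c)  # the third post gets a different one
```

Any scheme of this shape reuses an ID whenever the same user posts twice
inside one bucket, no matter which thread each post belongs to.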
Further, Pipermail, the software running the Ubuntu list archives,
doesn't seem to handle duplicate Message-IDs well. It checks for
duplicates simply by comparing Message-IDs, which makes it trivial to
corrupt the archive. In the Threaded view, only the first occurrence is
visible, while in the Subject, Author, and Date views, all occurrences
are visible, although they each link to the same message.
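
The behaviour described above can be modelled with a toy archive. This
is not Pipermail's code; the class TinyArchive and its methods are
hypothetical, and the sketch only assumes that articles are stored keyed
by Message-ID while the index views get one entry per received message.

```python
class TinyArchive:
    """Toy model of an archive that deduplicates on Message-ID alone."""

    def __init__(self):
        self.articles = {}  # Message-ID -> (subject, body) as first stored
        self.by_date = []   # one (subject, Message-ID) entry per message

    def add(self, msgid, subject, body):
        # Duplicate check by Message-ID only: a later message that
        # reuses an ID is never stored as an article of its own.
        if msgid not in self.articles:
            self.articles[msgid] = (subject, body)
        # The listing still records every message that arrived.
        self.by_date.append((subject, msgid))

    def resolve(self, msgid):
        """Follow a listing link to the stored article."""
        return self.articles[msgid]


archive = TinyArchive()
shared_id = "<linuxcity.27lf9z@gs1.ubuntuforums.org>"
archive.add(shared_id, "[Q] upgrade Flight-6 to Flight-7?", "first body")
archive.add(shared_id, "Packages ! get the current ones???", "second body")

# Both posts are listed, but both links lead to the first article.
linked = [archive.resolve(msgid) for _, msgid in archive.by_date]
print(len(archive.by_date), len(archive.articles))
print(linked[0] == linked[1])
```

In this model the second post's body is unreachable from every view,
which matches what the archive shows for the affected posts.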
For example, check [1] and [2] for posts by ``linuxcity''. He posted in
the ``[Q] upgrade Flight-6 to Flight-7?'' and ``Packages ! get the
current ones???'' threads, but only one is present in the archive
because both contained <linuxcity.27lf9z at gs1.ubuntuforums.org> as their
Message-ID.
Is this a known issue? (I noticed this issue when trying to index all
of Ubuntuforums.org's ricochet spam[3], but the problem affects several
legitimate posts as well.)
[1] https://lists.ubuntu.com/archives/ubuntu-users/2006-May/thread.html
[2] https://lists.ubuntu.com/archives/ubuntu-users/2006-May/author.html
[3] http://odin.alkemio.org/ubuntuforums.html