Checksums Done Right
scott at cse.ucdavis.edu
Sat Jun 30 00:04:14 UTC 2007
So I mentioned this on #ubuntu-devel and #ubuntu-mirrors and some people
thought it would be better to discuss it here.
Most Ubuntu packages (about 95%, by my estimate) ship file-level MD5
checksums in a file called md5sums inside control.tar.gz. debsums uses
these (actually the copies cached in /var/lib/dpkg/info/*.md5sums) to
do a *rough* verification that what is installed matches what *should*
be installed. This is great until MD5 collision attacks and kernel-based
rootkits are used against your system (both common these days).
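For readers unfamiliar with the format: each line of an md5sums file is
a hex MD5 digest, whitespace, and a path relative to the filesystem
root. Below is a minimal Python sketch of a debsums-style check under
that assumption. It is not debsums itself, and the function names
(parse_md5sums, md5_of, verify) are made up for illustration. Like
debsums, it trusts both the local checksum file and the running kernel,
which is exactly the weakness described above.

```python
import hashlib
from pathlib import Path


def parse_md5sums(text):
    """Parse md5sums-style lines ("<hex digest>  <relative path>") into a dict."""
    expected = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        digest, path = line.split(None, 1)
        expected[path.strip()] = digest.lower()
    return expected


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify(expected, root="/"):
    """Return {path: status} for files that are missing or whose digest differs."""
    problems = {}
    for rel_path, digest in expected.items():
        full = Path(root) / rel_path
        if not full.is_file():
            problems[rel_path] = "missing"
        elif md5_of(full) != digest:
            problems[rel_path] = "mismatch"
    return problems
```

Something like verify(parse_md5sums(open(f).read())) for one of the
cached *.md5sums files would then report the files that differ; an
empty result only means the local view is self-consistent.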
Tripwire is fine and dandy but assumes you run it on a known good
system, uses a local cache, and has no integration into the mirror or
We have been working on a to-be-open-sourced product we are calling
Checksums Done Right (CDR). A colleague gave a talk last week that
included some notes about CDR. Basically, we've processed the md5sums
files in dapper, edgy, and feisty and dumped them into a database. When
we update our mirror, we update our database. The mirror seems like the
best place to offer this type of verification service. We have used it
to verify binaries on Xen installations by taking LVM snapshots of the
virtualized machine and sending the checksums to the mirror over ssh,
all from the dom0. Our tests show that we can verify a system installation
(libraries, binaries, and kernel modules) of up to 12k files in around 4
seconds. This theoretically scales to 5k full machine scans per mirror
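To make the idea concrete, here is a rough sketch of what a mirror-side
lookup could look like. It is not CDR's code: the SQLite schema, the
table name, and the functions (open_db, load_md5sums, check_batch) are
all hypothetical, and the ssh transport is left out entirely.

```python
import sqlite3


def open_db(location=":memory:"):
    """Open (or create) the checksum database. The schema is an assumption."""
    conn = sqlite3.connect(location)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS checksums ("
        " release TEXT NOT NULL, package TEXT NOT NULL,"
        " path TEXT NOT NULL, md5 TEXT NOT NULL,"
        " PRIMARY KEY (release, path, md5, package))"
    )
    return conn


def load_md5sums(conn, release, package, md5sums_text):
    """Store every "<digest>  <path>" line of one package's md5sums file."""
    rows = []
    for line in md5sums_text.splitlines():
        if line.strip():
            digest, path = line.split(None, 1)
            rows.append((release, package, path.strip(), digest.lower()))
    with conn:  # commit on success, roll back on error
        conn.executemany(
            "INSERT OR IGNORE INTO checksums VALUES (?, ?, ?, ?)", rows
        )


def check_batch(conn, release, reported):
    """Classify client-reported (path, md5) pairs as ok, mismatch, or unknown."""
    results = {}
    for path, digest in reported:
        known = {
            row[0]
            for row in conn.execute(
                "SELECT md5 FROM checksums WHERE release = ? AND path = ?",
                (release, path),
            )
        }
        if not known:
            results[path] = "unknown"  # path not shipped by any known package
        elif digest.lower() in known:
            results[path] = "ok"
        else:
            results[path] = "mismatch"
    return results
```

A path can map to several digests here because different package
versions in the same release may ship different contents for one file.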
With that, I have a few questions.
1. What is the timeline for moving to a more robust checksum algorithm
(say sha256) for files inside packages?
2. Are there any plans to require checksums for all (non-changing)
files in a package, so that every deb has an md5sums file? Currently
around 5% don't.
3. Are there any plans for including an "isconfig" or "isbinary"
flagging system like the one RPM has?
4. Is anyone already doing this?