[ubuntu-art] Recompressing PNGs to save space?

Matt Zimmerman mdz at ubuntu.com
Wed Apr 26 22:06:11 BST 2006

On Wed, Apr 26, 2006 at 09:30:20PM +0200, Frank Schoep wrote:
> Brief Summary
> =============================
> By removing duplicate images and recompressing all image files we can save 
> about 21 megabytes in the uncompressed default Dapper installation and shave 
> off anywhere from 11 to 21 megabytes from the installation CD. This last 
> figure depends on the unpredictable actual compression ratio of the package 
> file format.

Thanks again for doing this analysis.  Note that the package file format
incorporates gzip compression (gzip -9), so you could get a very close
approximation of the package size delta by comparing the gzip-compressed
sizes of the images before and after recompressing (ideally individually,
though a tarball would probably be fairly close).  Likewise for the desktop
CD's filesystem compression.
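
As a rough illustration of that comparison, here is a minimal Python sketch.
It assumes the original and recompressed PNGs sit in two directory trees with
the same relative paths; the function names and that layout are hypothetical
and not part of any existing tool.  It compresses each file individually with
gzip at level 9 and sums the results, so it only approximates what the
package format would actually produce.

```python
import gzip
from pathlib import Path


def gzip_size(data: bytes) -> int:
    """Return the size in bytes of data after gzip compression at level 9."""
    return len(gzip.compress(data, compresslevel=9))


def estimate_delta(original_dir, recompressed_dir):
    """Sum per-file gzip sizes for PNGs that exist in both trees.

    Returns a tuple (gzipped bytes before, gzipped bytes after).
    Symlinks and files missing from the recompressed tree are skipped.
    Note that the "*.png" match is case-sensitive on most Linux filesystems.
    """
    original_dir = Path(original_dir)
    recompressed_dir = Path(recompressed_dir)
    before = after = 0
    for old in original_dir.rglob("*.png"):
        if old.is_symlink() or not old.is_file():
            continue
        new = recompressed_dir / old.relative_to(original_dir)
        if not new.is_file():
            continue
        before += gzip_size(old.read_bytes())
        after += gzip_size(new.read_bytes())
    return before, after
```

Calling estimate_delta("orig", "recompressed") and subtracting the second
value from the first should give a figure close to the real package size
delta.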

> Detailed Summary
> =============================
> There are 12114 filenames matching the PNG extension, of which 9420 are actual 
> files, 2694 are symbolic links. I ignored symbolic links in the calculations 
> because they do not take up "space" themselves and can not be optimized 
> further for distribution.
> In the current situation, all 9420 image files take up a total of 50 MB 
> uncompressed and 39 MB compressed (bzip2).
> To save space I first checked for duplicate images. There were 1888 (!) binary 
> duplicate images in the default installation, which could easily be replaced 
> by symlinks, saving about 8 MB.

Replacing these with symlinks may not be as easy as it appears; consider
that the package with the symlink must depend on the package with the actual
file, and this may not always be desirable or appropriate.

Could you send the list of duplicate images?
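
For reference, one way such a list could be generated is sketched below in
Python.  This is only an assumption about the approach, not necessarily how
the figures above were obtained: it groups regular PNG files (ignoring
symlinks) by the SHA-256 hash of their contents, and the function names are
made up for the example.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicates(root):
    """Return lists of regular PNG files under root with identical contents.

    Symlinks are ignored. Each returned list holds two or more paths.
    """
    groups = defaultdict(list)
    for path in Path(root).rglob("*.png"):
        if path.is_symlink() or not path.is_file():
            continue
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        groups[digest].append(path)
    return [sorted(paths) for paths in groups.values() if len(paths) > 1]


def redundant_bytes(groups):
    """Bytes that would be freed if each group kept only a single copy."""
    return sum(paths[0].stat().st_size * (len(paths) - 1) for paths in groups)
```

Printing each group alongside the package that owns every file would also
show which duplicates live in the same package and so avoid the dependency
problem.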

Using your figures, which suggest that compression shrinks the existing
images by about 20% on average, those 8M of duplicate images are probably
being compressed to around 6M already, but that is still significant enough
to warrant investigation.  We may be able to save a few megabytes with
minimal effort.
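
The arithmetic behind that estimate, as a small Python sketch.  It assumes
the duplicates compress about as well as the image set as a whole (50M
uncompressed, 39M compressed), which is a simplification; the function name
is only illustrative.

```python
def compressed_share(size_mb, total_uncompressed_mb, total_compressed_mb):
    """Scale an uncompressed size by the overall compressed/uncompressed ratio."""
    return size_mb * total_compressed_mb / total_uncompressed_mb


# 8M of duplicates, scaled by the 39M / 50M ratio of the full image set.
print(f"{compressed_share(8, 50, 39):.1f}")  # about 6.2
```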

> Next up I tried recompressing the remaining unique images. This shrank the 
> uncompressed size from 42 MB to 29 MB, and the compressed size from 35 MB 
> to 29 MB.

Another potential 6M gain here, but this one has a much greater cost in
development effort and maintenance which I don't think will be feasible for
Dapper.  The best approach for these images would be to send the
recompressed versions to the upstream maintainer and ask that they be
included in a future release, since for technical reasons it is problematic
for us to include the recompressed versions in the packaging.

 - mdz

