Compression on CDs -- an amusing topic for a Saturday

Loye Young loye.young at iycc.net
Sat Oct 27 20:29:07 UTC 2007


> Your analogy with meteorites is not correct.
> --
> Peter van der Does
>

You are so close to the trees that you are missing the forest. The comparison 
is between a full CD that has low or no compression versus a full CD that has 
very high compression being written using the same physical device. My 
original observation was that there is a tradeoff: higher compression gets 
more onto the CD, while lower compression gives lower failure rates. That is 
consistent with both observation and theoretical analysis.

When speaking of compression, we aren't actually speaking about disk surface 
that doesn't have data written to it. Instead we are really referring to 
whole sections of disk real estate that are filled with zeros or other 
repetitive and unimportant data. If you need to make better use of areas that 
are literally unpopulated, you employ defragmentation, which is a related but 
different technique needed for antiquated file systems.

On a typical CD, the entire CD is populated, as Soren rightly mentioned, with 
zeros and ones. The physical device performs the same number of reads and 
writes whether the data is compressed or not, and the error rate is the same 
either way. On the surface, it would appear that compression doesn't introduce 
any additional errors. 

The difference is that on an uncompressed CD, much of what is written is not 
important. Text files, for instance, contain a great deal of repetition at the 
physical layer. Compression uses algorithms to represent that repetition in 
a shorthand way, so that the device doesn't actually have to write every bit 
of it. This frees up disk real estate for more information. The consequence 
is that the compressed disk has a higher density of important bits and bytes 
on the same disk. 
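
The shorthand idea can be sketched with run-length encoding, one of the 
simplest compression schemes. This is only an illustration in Python, not the 
algorithm any particular CD image or file system uses; the function names and 
the sample data are made up for the example.

```python
from itertools import groupby


def rle_encode(data: bytes) -> list[tuple[int, int]]:
    """Collapse each run of identical bytes into a (byte_value, run_length) pair."""
    return [(value, len(list(run))) for value, run in groupby(data)]


def rle_decode(pairs: list[tuple[int, int]]) -> bytes:
    """Expand (byte_value, run_length) pairs back into the original bytes."""
    return b"".join(bytes([value]) * count for value, count in pairs)


# Hypothetical sample: long runs of zeros around a small piece of real data.
sample = b"\x00" * 1000 + b"data" + b"\x00" * 500
encoded = rle_encode(sample)

# The round trip must reproduce the input exactly.
assert rle_decode(encoded) == sample
print(len(sample), "bytes in,", len(encoded), "runs out")
```

Every pair in the encoded form now matters: damaging one of them can corrupt 
a whole run, whereas damaging one zero in the original damages only that zero.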

Assuming that the device has a constant error rate and assuming that the CD is 
filled to the same capacity, it is more likely that the errors on compressed 
disks will affect something important and cause a failure, simply because 
there is more important data on the CD. 
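
Under those two assumptions the argument reduces to simple arithmetic, which 
can be sketched in Python. The error rate, the disk size, and the "important 
fraction" figures below are invented purely for illustration, and the model 
assumes independent errors and ignores any error correction done by the drive.

```python
import math


def p_important_hit(total_bytes: int, error_rate: float,
                    important_fraction: float) -> float:
    """Probability that at least one error lands on an important byte.

    Assumes each byte fails independently with probability error_rate, so
    the result is 1 - (1 - error_rate) ** important_bytes, computed with
    log1p/expm1 to stay accurate for very small error rates.
    """
    important_bytes = total_bytes * important_fraction
    return -math.expm1(important_bytes * math.log1p(-error_rate))


DISK_BYTES = 700_000_000   # a full CD, roughly (illustrative)
ERROR_RATE = 1e-9          # hypothetical per-byte error rate

# Hypothetical fractions of the disk that hold data that matters.
uncompressed = p_important_hit(DISK_BYTES, ERROR_RATE, 0.3)
compressed = p_important_hit(DISK_BYTES, ERROR_RATE, 0.9)

print(f"uncompressed: {uncompressed:.3f}  compressed: {compressed:.3f}")
```

With the same disk and the same error rate, the disk with the larger share of 
important bytes comes out with the higher chance of a damaging error.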

I actually remember when "floppy disks" were flexible 8 inch disks and how 
amazed everyone was to get so much information on 5 1/4 inch disks. Engineers 
have made remarkable progress over the last 30 years. Much of the heavy 
lifting that made that possible came from improvements in error prevention, 
detection, and correction necessitated by the compression. 



