Compression on CDs -- an amusing topic for a Saturday
Loye Young
loye.young at iycc.net
Sat Oct 27 20:29:07 UTC 2007
> Your analogy with meteorites is not correct.
> --
> Peter van der Does
>
You are so close to the trees that you are missing the forest. The comparison
is between a full CD that has low or no compression versus a full CD that has
very high compression being written using the same physical device. My
original observation was that there is a tradeoff between fitting more on the
CD through higher compression and getting lower failure rates on CDs with
lower compression. That tradeoff is consistent with both observation and
theoretical analysis.
When speaking of compression, we aren't actually speaking about disk surface
that doesn't have data written to it. Instead we are really referring to
whole sections of disk real estate that are filled with zeros or other
repetitive and unimportant data. If you need to make better use of areas that
are literally unpopulated, you employ defragmentation, which is a related but
different technique needed for antiquated file systems.
On a typical CD, the entire CD is populated, as Soren rightly mentioned, with
zeros and ones. The physical device makes the same number of read/writes
whether the data is compressed or not, and the error rate is the same either
way. On the surface, it would appear that compression doesn't introduce any
additional errors.
The difference is that on an uncompressed CD, much of what is written is not
important. Text files, for instance, are mostly a bunch of zeros at the
physical layer. Compression uses algorithms to represent all those zeros in
a shorthand way, so that the device doesn't actually have to write each one
of them. This frees up disk real estate for more information. The consequence
is that the compressed disk has a higher density of important bits and bytes
on the same disk.
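As a rough illustration of the "shorthand" idea, here is a minimal run-length
encoding sketch in Python. Run-length encoding is only a stand-in chosen for
simplicity; the compression actually used on a given CD image is likely a more
sophisticated scheme. The function names and the sample data are made up for
this example.

```python
from itertools import groupby


def rle_encode(data: bytes) -> list[tuple[int, int]]:
    """Collapse each run of identical bytes into a (byte value, run length) pair."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(data)]


def rle_decode(pairs: list[tuple[int, int]]) -> bytes:
    """Expand (byte value, run length) pairs back into the original bytes."""
    return b"".join(bytes([value]) * count for value, count in pairs)


# Hypothetical sample: a little real content surrounded by long runs of zeros.
sample = b"\x00" * 1000 + b"data" + b"\x00" * 1000
encoded = rle_encode(sample)

print(len(sample), "bytes in,", len(encoded), "runs out")
```

Each long run of zeros shrinks to a single pair, so nearly everything left
in the encoded form is information that matters.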
Assuming that the device has a constant error rate and assuming that the CD is
filled to the same capacity, it is more likely that the errors on compressed
disks will affect something important and cause a failure, simply because
there is more important data on the CD.
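The argument can be put in numbers with a small Python sketch. It assumes
that bit errors are independent, that the per-bit error rate is the same for
both disks, and that any error landing on an important bit causes a failure.
The error rate and the "important" fractions below are invented purely for
illustration and are not measurements of any real drive.

```python
import math


def failure_probability(total_bits: float,
                        important_fraction: float,
                        bit_error_rate: float) -> float:
    """Chance that at least one error lands on an important bit.

    Computes 1 - (1 - p)**n, where n is the number of important bits,
    using log1p/expm1 so that very small error rates do not round to zero.
    """
    important_bits = total_bits * important_fraction
    return -math.expm1(important_bits * math.log1p(-bit_error_rate))


# Hypothetical figures: a 700 MB CD and an assumed residual bit error rate.
cd_bits = 700 * 1024 * 1024 * 8
error_rate = 1e-12

uncompressed = failure_probability(cd_bits, 0.3, error_rate)  # mostly filler
compressed = failure_probability(cd_bits, 0.9, error_rate)    # mostly important

print(f"uncompressed: {uncompressed:.6f}")
print(f"compressed:   {compressed:.6f}")
```

With the same disk size and the same error rate, the disk with the larger
share of important bits always comes out with the higher failure probability
under this model.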
I actually remember when "floppy disks" were flexible 8 inch disks and how
amazed everyone was to get so much information on 5 1/4 inch disks. Engineers
have made remarkable progress over the last 30 years. Much of the heavy
lifting that made that possible came from improvements in error prevention,
detection, and correction necessitated by the compression.
More information about the ubuntu-server
mailing list