Merging Live and Install CDs

John Richard Moser nigelenki at
Sat Feb 12 15:02:25 CST 2005


Matt Zimmerman wrote:
> On Fri, Feb 11, 2005 at 11:21:41PM -0500, John Richard Moser wrote:
>>Matt Zimmerman wrote:
>>>And of course some disadvantages as well:
>>>- Somewhat increased memory requirements compared to the traditional
>>>  installer (though very close to those of the installed system)
>>JigDebs wouldn't need more memory; but they'd need some extra space.
>>I'd predict 20-50M, but I don't know really.  I'm basing this off the
>>assumption that the average JigDeb is going to be 20-50k; and that
>>there's about 1000 debs or less on an install CD.
> The additional memory requirement is due to the live environment, which is
> one of the primary advantages of a combined CD.  It has nothing to do with
> the installation method (filesystem copy vs. package installation) which is
> the crux of my proposal.

Ah.  I thought you meant running memory requirement due to . . .
something.  I tend to assume the obvious is assumed.

>>>- Remastering is more complex
>>Remastering is only as complex with JigDebs as with a normal LiveCD,
>>plus one extra step to fetch all used debs and strip them down to make
>>JigDebs, transforming the CD from a LiveCD to a do-it-all.
> This cannot be the case.  Any remastering process would need to maintain
> synchronization between the live image and your proposed pseudopackages.
> This would be much more complex than the existing remastering process.

I believe it would be only slightly complex.  There may be complexity in a
script, but that's a one-time cost.

To synchronize the pseudopackages, basically four major steps have to be
performed:

 - The original packages have to be retrieved
 - The packages have to be compared to the livefs
 - Parts of the packages must be discarded and replaced with small bits
of information
 - The packages must be re-packed and re-signed

While writing the development tools to do this would be complex (about
as complex as writing any other script to do a handful of repetitive
tasks), the script could be an additional automated step, or a step in a
larger 'live_to_install' script.

In essence, once the initial implementation cost has been paid,
remastering gains about one extra step.

>>To recap, here's the JigDeb advantages:
>> - Very simple
> Your scheme is incredibly complex.  It would require massive changes to an
> entire stack of tools in order to operate on a fundamentally different and
> incompatible binary package format with wildly different assumptions.

The new binary package format would be incompatible with the old tools;
but it would be essentially the same thing.  Currently in the DEBIAN
directory of a package there is various control data; the modifications
needed would essentially be one extra control file with several hashes
and paths in the DEBIAN/ directory, generated automatically.  The
referenced files would be discarded from the deb.

The control file would indicate the relative path to the media from the
deb.  This would most usefully indicate the path to the filesystem
image, which could then be mounted (mounting the same thing multiple
times is allowed) by apt.

For its part, apt would need to understand these two new files and use
them to locate the files on the media:

 - apt looks in the deb for DEBIAN/pseudo.list
 - apt reads the first line in pseudo.list:  "image /path/to/fs.img"
 - apt 'mount -o ro /path/to/fs.img /var/apt/pseudofs'
 - apt installs the deb as normal
 - apt then iterates each line in pseudo.list
 - apt verifies and copies each file referenced in pseudo.list from
the mounted image
 - apt umounts /var/apt/pseudofs

Of course constantly mounting and umounting fs.img would be very
expensive; so umounting it would be deferred until one of the following:

 - apt finishes
 - apt encounters a package wanting another fs.img

And of course, if pseudo.list isn't found, apt processes the deb as a
normal deb anyway.
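
As a rough sketch of the verify-and-copy step (not actual apt code), the
loop over pseudo.list could look like the shell function below.  It
assumes the image is already mounted, and that every line after the
"image" line has the form "<md5sum>  /path", which is what the md5sum
call in the repacking idea later in this message would emit.  The name
install_pseudo_files and its arguments are made up for illustration.

```shell
# Sketch only: copy the files listed in pseudo.list from an already-mounted
# image into the install target, verifying each checksum first.
# Assumed line format (md5sum output): "<md5>  /path/to/file".
install_pseudo_files() {
	list=$1       # path to pseudo.list
	src_root=$2   # where the image is mounted, e.g. /var/apt/pseudofs
	dest_root=$3  # root of the system being installed
	while read -r sum path; do
		# Skip the "image /path/to/fs.img" line and blank lines
		[ "$sum" = "image" ] && continue
		[ -z "$sum" ] && continue
		# Hash the file as found on the mounted image
		actual=$(md5sum "$src_root$path" 2>/dev/null | cut -d' ' -f1)
		if [ "$actual" != "$sum" ]; then
			echo "checksum mismatch: $path" >&2
			return 1
		fi
		# Recreate the directory, then copy preserving mode and timestamps
		mkdir -p "$(dirname "$dest_root$path")"
		cp -p "$src_root$path" "$dest_root$path"
	done < "$list"
}
```

A caller would mount the image once, run this for each pseudodeb that
references it, and umount only when apt finishes or a different fs.img
is requested.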

>> - Minimal extra space usage
> Extra space usage is a disadvantage.

Keyword was "minimal."  Extra space usage is bad, but using only a
little extra space is better than using a lot of extra space.  Packaging
full debs would use a lot of extra space, i.e. a DVD-sized live/install
image.

>> - As suitable for LAN installations as using regular debs would be
> Not nearly.  A network installation is based on retrieving binary packages
> over the network, and your scheme would break that because the binary
> packages are incomplete, requiring all of the file content to be retrieved
> by some other means.

I'm limiting my analysis of this problem to PXE-booted installers, and
assuming that the server would run a special process which merges the
data from the filesystem with the pseudodebs to provide rebuilt but
unsigned debs.

In retrospect, this is out of scope.  Writing such a server WOULD be
somewhat complex and generally annoying.  There is also no way around
the reconstructed package being unsigned: because of the fundamentals of
compression you can't regenerate a bit-for-bit copy, so the original
signature would no longer match.

Another way would be to retrieve the paths over the LAN from
http://server/pseudofs/ (in contrast to the previously mentioned
/var/apt/pseudofs).  This would still require explicit code to be
written.  It would also require assumptions, such as that the
pseudodebs all use the same image; however, this is an assumption we can
make in scope.
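
To illustrate the HTTP variant, here is a small sketch that turns
pseudo.list entries into URLs under the server's base path.  The helper
name pseudo_urls is hypothetical, the "<hash>  /path" line format is an
assumption, and no URL escaping is attempted.

```shell
# Sketch only: print one URL per file listed in pseudo.list, assuming the
# server exports the live filesystem under a base URL such as
# http://server/pseudofs/ (the example URL used in this message).
pseudo_urls() {
	list=$1      # path to pseudo.list
	base=${2%/}  # base URL with any trailing slash stripped
	while read -r sum path; do
		# The image line only matters for local mounts, so skip it
		[ "$sum" = "image" ] && continue
		[ -z "$path" ] && continue
		# Paths in the list are absolute, so they append directly
		printf '%s%s\n' "$base" "$path"
	done < "$list"
}
```

The resulting list could be handed to whatever fetcher the installer
already uses; each file would still need its checksum verified after
download.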

In short, making a LAN installation this way is possible, but will also
require special considerations and thus a degree of added work.

>> - Simple to remaster
>> - Simple to develop
>> - Simple to use -- single script could fetch all needed debs upon
>>building a LiveCD and strip them down
> I do not see how these could possibly be true.  The defining attribute of
> this scheme would be additional complexity.

The idea is to only add extra steps which can be fully automated.
Development complexity should ideally be kept low, but in the end we're
all more interested in "does the final product just work?"  If it takes
hours and hours and hundreds of complex commands to try and generate the
stripped down pseudopackages, then it's essentially useless.

>>And disadvantages:
>> - Requires somebody to put forth the manpower to code this
>> - Likely to be slower, though not necessarily noticeably slower
>> - Full debs cannot be regenerated signed
> - Extremely complex
> - Extremely fragile

Fragile being?  The pseudopackages should only be packaged on the
livecd; they're not meant to be portable.

> - Fundamentally incompatible

Any enhancement to any format (deb, png, mp3) is going to be
incompatible with old software.

> - Would require an enormous development effort

How so?  Is it really that complicated to add the extra logic to apt
needed to check for and iterate a control file line by line, mount and
umount filesystem images, check sha1 sums of files, and copy files from
an alternative location specified in a control file?

. . . I realize it's not a one-liner, but seriously.  What am I
underanalyzing here?

As for modifying debs, they're basically tarballs right?

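# unpack_package_to and repack_package_from are placeholders
unpack_package_to package/
# Walk every regular file in the mounted live filesystem
find pseudomount/ -type f | while read -r i; do
	target_file=${i#pseudomount/}
	# Identical to the packaged copy: record its hash, drop it from the deb
	if cmp -s "$i" "package/${target_file}" 2>/dev/null; then
		md5sum "$i" | sed -e "s|pseudomount||" >> pseudo.list
		rm "package/${target_file}"
	fi
done
repack_package_from package/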

Something like that.  Not quite but you get the idea.

> Its advantage compared to the status quo is the possibility for a combined
> installation, upgrade and live media which has a chance to fit on a single
> CD.  However, in pursuit of this goal, it would sacrifice much of what is
> good about the existing system, especially its robustness and versatility.

The changes would not affect the rest of the system, aside from
potential bugs introduced with extra code added to apt.  Existing debs
would be processed fine; new debs would be made in the same way as
existing debs, and thus old apt would still read them.

This is an extension, not a change.  AFAIK, nothing dictates that extra
files may not be placed in DEBIAN/ with the rest of the control
information; they're normally just ignored. (correct me if I'm wrong)

Pseudodebs have no place anywhere but packed on media which also
contains installed copies of the software, which is exactly what a
live/install CD is.

>>> For DVD, of course, we can have the best of both worlds, but CD will be
>>> with us for some time yet (especially as a download option).
>>The DVD should be avoided until two conditions are met:
>> - The average user has a 30mbit/s or higher connection; current
>>connections are 5mbit/s and DVDs are about 4GiB.  We should shoot for
>>the same download timeframe.
>> - DVD burners are a ubiquitous accessory
> Judging by your conditions, I think you meant that "We should not stop
> providing CDs until...".  However, we have no plans to stop supporting CDs.

I meant generally targeting DVDs.  My concern would be that one day
somebody wakes up and says, "Hey, we can pack 4 gigs on DVD!  Uh.
Useful stuff though, CD users would want it too. . . OK, 7 CD
downloads!"  I'm somewhat a minimalist.
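
For reference, the 30mbit/s figure in the quoted condition roughly
equalizes download times between a CD and a DVD.  A quick check,
assuming a 700 MiB CD image (a size not stated in this thread) and
1 mbit = 10^6 bits; download_minutes is a hypothetical helper:

```shell
# Sketch only: download time in whole minutes (rounded down) for an image
# of a given size over a link of a given speed.
download_minutes() {
	size_mib=$1    # image size in MiB
	mbit_per_s=$2  # link speed in Mbit/s, taken as 10^6 bits per second
	# MiB -> bits, divided by bits per second, then seconds -> minutes
	echo $(( size_mib * 1024 * 1024 * 8 / (mbit_per_s * 1000000) / 60 ))
}
```

With these assumptions, "download_minutes 700 5" and
"download_minutes 4096 30" both print 19, while "download_minutes 4096 5"
prints 114.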

> We already produce installation DVD images in parallel with CD images, and a
> combination live+install+upgrade DVD will be produced for Hoary (this is
> simple to build with the current installer and live engine).

--
All content of all messages exchanged herein are left in the
Public Domain, unless otherwise explicitly stated.

    Creative brains are a valuable, limited resource. They shouldn't be
    wasted on re-inventing the wheel when there are so many fascinating
    new problems waiting out there.
                                                 -- Eric Steven Raymond


More information about the ubuntu-devel mailing list