Compressing packages with bzip2 instead of gzip?
Phillip Susi
psusi at cfl.rr.com
Sat Dec 10 19:37:55 GMT 2005
We need to be clear about which use case we are discussing. Eventually it
might be nice to improve the main repositories to use better
compression, but I think we should start with just the install CD, since
it is a simpler case.
In the long term, I think that apt-get should be modified to understand
a new URI scheme where it would invoke an external utility to access the
7zip archive. This external utility would open the 7z archive and begin
decompressing and extracting the requested files (the package lists and
the individual .debs). This utility would need to operate in the
background like a daemon so that it could be given requests for various
packages from apt, without having to start decompressing the archive
over from the beginning for each request. Apt would need to make sure
to request the packages from the daemon in the order that they appear in
the archive, again, so that the data stream only needs to be
decompressed once.
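
To illustrate the "request in archive order, decompress once" idea, here is
a minimal sketch. It is not apt or 7z code: Python's standard library has no
7z reader, so a gzip-compressed tar stream stands in for the solid 7z
archive, and the function names and the .deb names in the usage note are
made up for the example.

```python
import io
import tarfile


def extract_in_archive_order(archive_bytes, wanted):
    """Read a compressed tar stream once, returning the wanted members.

    Returns (order, contents): the member names in the order they were
    encountered in the archive, and a dict mapping each name to its bytes.
    """
    remaining = set(wanted)
    order, contents = [], {}
    # Mode "r|*" treats the input as a forward-only stream: there is no
    # seeking back, so the data is decompressed at most once.
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r|*") as tar:
        for member in tar:
            if member.name in remaining:
                contents[member.name] = tar.extractfile(member).read()
                order.append(member.name)
                remaining.discard(member.name)
                if not remaining:
                    break  # everything requested has been served
    return order, contents


def build_demo_archive(files):
    """Build a small in-memory .tar.gz from a list of (name, bytes) pairs."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in files:
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()
```

If the demo archive holds a.deb, b.deb and c.deb in that order and the
caller asks for c.deb and then a.deb, the members still come back in archive
order (a.deb first), which is the ordering constraint apt would have to
respect when talking to the daemon.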
That seems like quite a bit of work. I'm not sure if the current
installer just uses apt-get to install the packages, but it might be
easier in the specific case of the language packs to modify the
installer to invoke 7z to extract the one language pack that the user
chose to install and pass it to dpkg -i.
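
As a rough sketch of that installer shortcut, the two steps could be built
as command lines like this. The archive name, member name and work directory
are hypothetical, and this is not how the current installer works; it only
shows the 7z "e" (extract without paths) command with the -o and -y
switches, followed by dpkg -i on the extracted file.

```python
import os


def langpack_install_commands(archive, member, workdir):
    """Return the two commands: extract one .deb with 7z, then install it."""
    # "7z e" extracts the named member into workdir, dropping any directory
    # part of the path stored in the archive; -y answers prompts with "yes".
    extract = ["7z", "e", "-y", "-o" + workdir, archive, member]
    # dpkg -i then installs the extracted file from workdir.
    install = ["dpkg", "-i", os.path.join(workdir, os.path.basename(member))]
    return extract, install
```

Each list could then be handed to subprocess.run(cmd, check=True) in turn.
Note that with a solid archive 7z still has to decompress everything stored
before the requested member, so the position of a language pack inside the
archive affects how long the extraction takes.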
For the main block of packages that you don't really choose a subset of
during the installation process, it also might be easy to modify the
installer to invoke 7z to extract them and pipe them to dpkg -i, rather
than modify apt to understand the new 7z scheme.
Then again, one of the goals for Dapper is to have a unified
live/install CD where the system is just copied to the hard drive rather
than installed from packages. That might make things a bit more interesting.
In the case of the language packs, they could probably just be stored in
a 7z archive, and part of the express CD boot-up process would prompt
the user for which one they want to use, extract that one, and install
it with dpkg -i. Other packages, like OO.o or firefox, might present
more of a challenge. Maybe something involving unionfs and lots of
tmpfs to decompress to?
Timo Jyrinki wrote:
>
>
> About the usage of possible space savings: where is it needed? In my
> opinion including all languages' language packs would be the best thing
> to have. I think the limit of the CD size hasn't been a factor in what
> programs are included on the Ubuntu CDs, but the one thing that is
> downloaded from the Internet because of size constraints is the language
> files for most of the not-top-20-spoken languages.
>
> The figures for combined language pack files are very promising:
>
>>.tar.gz: 61,478,589
>>.tar.bz2: 49,982,949
>>.tar.7z: 23,081,869
>
>
> I think all of the language packs could be included in .tar.7z format
> for the size of the current separate bz2 files.
>
> Doing this only for the langpacks would have the benefit of not messing
> with the whole archive (something Canonical might not want to do for
> Dapper??), but the negative side would be to find a solution to how
> language packs would be compressed "as a whole" on the CD and still have
> a way to extract separate language pack deb:s.
>
> -Timo
>
>