LiveCD optimisations

Louis Simard louis.simard at gmail.com
Fri May 21 06:16:39 UTC 2010


At 2010-05-21 04:41 GMT, Martin Owens <doctormo at gmail.com> wrote:
> Hey Louis,

Hey Martin, thanks for the reply!

> Sounds great and looks like a pretty good script, I have some comments:
>
> You may be able to make it a little faster by using the find results in
> one line, like this:
>
> find / -type f -name "*.svg" -print0 | xargs -0 -I FILE sh -c
> '/tmp/scour/scour.py --enable-id-stripping --indent=none -i FILE -o
> FILE-opt && test -s FILE-opt && mv FILE-opt FILE || rm FILE-opt'

I had considered using sh -c to execute the Scouring and renaming,
yes, but didn't know how to go about detecting empty files except with
another 'find'. Thanks for telling me about test -s :)
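
As a rough sketch of how that single pass might look once it lives in a
script file: the function name optimise_tree and the OPTIMISER variable
are made up for illustration, and the Scour options are simply the ones
from the command quoted above. The optimised copy replaces the original
only if Scour exits successfully and 'test -s' finds the output
non-empty; otherwise the copy is discarded and the original is left
alone.

```shell
#!/bin/sh
# Sketch only. OPTIMISER is a hypothetical variable; its default is the
# Scour path used in the quoted one-liner.
OPTIMISER=${OPTIMISER:-/tmp/scour/scour.py}
export OPTIMISER

# optimise_tree DIR: walk DIR once and optimise every *.svg file in place.
optimise_tree() {
    find "$1" -type f -name '*.svg' -exec sh -c '
        for src do
            tmp=$src-opt
            # Keep the result only if the optimiser succeeded AND the
            # output file exists with a size greater than zero.
            if "$OPTIMISER" --enable-id-stripping --indent=none \
                   -i "$src" -o "$tmp" && test -s "$tmp"
            then
                mv -- "$tmp" "$src"
            else
                rm -f -- "$tmp"
            fi
        done
    ' sh {} +
}
```

It would be called as 'optimise_tree /path/to/extracted/filesystem'.
Passing the file names as arguments to the inner shell, rather than
substituting them into the command string, should also keep names
containing spaces or quotes from being misread.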

> Although if you can get all that into a script file, so much the better
> so it's not all on one line. But at least it's not doing a find 3 times
> for the same files.

True. This is a case of optimising the optimiser, which I consider a
micro-optimisation because the later invocations of 'find' are highly
likely to have the needed disk blocks in RAM - but every little bit
helps, just like with these image files. (Speaking of which, Scour.py
imports the Psyco JIT if it's available, but it doesn't help that
much. It makes the Python code itself run faster, yes, but at the cost
of greater startup time for each Scour.py instance, and most files are
optimised in 0.06 seconds anyway.)

> Do you need to chroot into the file system to perform these steps,
> considering that you're downloading code to do it (with bzr, which isn't
> installed on the CD)? Would it not be better to perform these steps
> outside of the squashfs and ISO file system?
>
> For instance, I got name-resolution issues when it tried to do the apt update.

I probably don't. That was part of a script that allowed me to
customise more things, such as updating packages (which I needed to
chroot for), removing the desktop background, updating Linux and all
that; I just trimmed it down for this email. I'll move the chroot
processing to the host.

> Are there no more things that could be optimised? For instance does
> using xmllint with --noblanks on the 12496 xml files save any space?

Will test this shortly. I hadn't thought of that yet, and I'm
flabbergasted by the number of XML files! Seeing as SVG files are also
XML files, and Scour.py seems to pretty-print XML even with
--indent=none, that might save even more, actually.
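
A dry run along these lines might be one way to measure it before
touching anything. This is only a sketch: noblanks_saving is a made-up
name, it assumes xmllint is on the PATH, and it only looks at files
named *.xml. It prints three numbers: total bytes before, total bytes
after 'xmllint --noblanks', and the difference. Files that xmllint
cannot parse are counted at their original size, and nothing on disk is
modified.

```shell
#!/bin/sh
# noblanks_saving DIR: report "before after saved" byte totals for all
# *.xml files under DIR, without changing any of them.
noblanks_saving() {
    tmp=$(mktemp) || return 1
    find "$1" -type f -name '*.xml' -exec sh -c '
        tmp=$1; shift
        for f do
            before=$(wc -c < "$f")
            # Write the stripped document to a scratch file so that the
            # exit status of xmllint can be checked.
            if xmllint --noblanks "$f" > "$tmp" 2>/dev/null; then
                after=$(wc -c < "$tmp")
            else
                after=$before
            fi
            echo "$before $after"
        done
    ' sh "$tmp" {} + |
        awk '{ b += $1; a += $2 } END { printf "%d %d %d\n", b, a, b - a }'
    rm -f -- "$tmp"
}
```

The per-file sizes are summed by awk rather than inside the inner shell
because find may split a large file list across several invocations.
The third number can come out negative for small files, since xmllint
may emit an XML declaration that the original lacked.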

> Finally... should some of these optimisations work their way upstream so
> all packages have optimised files, smaller downloads, smarter mirror
> storage etc?

Of course! :) Working with upstreams would avoid keeping debdiffs
around for the optimised files in Ubuntu repositories, and will help
other distributions too.

I'll attach a modified script to my next email with more testing
results regarding XML.

Regards,
- Louis




More information about the Ubuntu-devel-discuss mailing list