Research on decentralized package management

Katherine Cox cox.katherine.e at gmail.com
Thu Mar 18 17:01:27 GMT 2010


Matt,

I have been discussing this topic with Gustavo Niemeyer who - I believe - is
still an employee of Canonical and also created the "Smart Package Manager".
We have exchanged a few emails now, and I think (you might want to speak
with him, I don't want to put words in his mouth) he believes that the way I
would handle user-facing application distribution is intractable. Having
said that, let me lay out how I would approach it. I hope it adds something
to the conversation. :)

Just to make this absolutely clear, this is literally something I've just
been thinking about lately. I have no agenda, I don't care what Ubuntu does,
and I'm not saying my way is the "best" way or even the right way to do
things. These are just ideas that I have been trying to get industry
feedback on.

Axioms for argument:

   1. Centralized packaging is holding Linux back.
   2. Repackaging efforts draw crucial resources away from working on things
   that could lead to "real progress".
   3. Centralized packaging unnecessarily draws all distributions into the
   development cycle and takes a linear problem and makes it exponential.
   4. Commercial ISVs do not want their software repackaged.
   5. Repackaging and patching of source code by each distribution makes
   support more difficult; some ISVs consider it too hard to bother with.
   6. This problem cannot currently be solved with policies or standards.

Because of axiom 6, I think this has to be a bottom-up approach. It has to
be a well-engineered solution that "just works" on all distributions. If it
proves to be an easier way of installing user-facing software, users will
start using it without prompting, and eventually it will gain critical mass.
At that point standards become a feasible option across Linux, and eventually
a tool like this could be discarded. Here's my attempt at such a solution.

Use Case:

   - User goes to software website (pidgin.im, etc.)
   - User downloads any distribution file (deb, rpm, etc.)
   - User double clicks on distribution file
   - Tool handles dependencies
      - Notification that current system doesn't support dependency
      - Some distribution-specific method of installing said dependency.
      Note this necessitates the ability by the distribution to install/resolve
      multiple versions of libraries.
   - Tool notifies user of success/failure of installation
   - User can then use software.

Of course all the complexity lies in resolving the dependencies and managing
the differences between distributions. Here's how I would suggest resolving
that:

   - Maintain matrices of differences between distributions. These could be
   kept on the internet and downloaded much like package lists.
      - Version of the LSB that the distribution conforms to, if applicable
      - Mapping of directories (user-installed software is /bin in Ubuntu
      and /usr/bin in Fedora)
      - Any other distribution specific details.
   - When software is installed, perform the following steps:
      1. Parse package and derive needed information (name, version,
      dependencies)
      2. Query distribution for version number
      3. Query the managed matrix file for known information about this
      distribution.
      4. Override the known information with information queried from the
      actual system.
      5. Resolve dependencies or notify the user of failure
      6. Install the application or exit.
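
As a rough illustration, steps 3 through 6 could be sketched in Python as
follows. This is only a sketch under stated assumptions: the matrix layout,
the names KNOWN_DISTROS, merge_facts, unmet_dependencies and install, and the
"exampledistro" entry are all hypothetical placeholders, and dependency
matching is reduced to an exact version comparison. Parsing the package file
(step 1) and probing the real system (step 2) are left out; their results are
passed in as plain values.

```python
from dataclasses import dataclass, field

# Hypothetical difference matrix keyed by (distribution, release).
# In the proposal this would be downloaded like a package list.
# The values are made-up placeholders, not facts about any real distribution.
KNOWN_DISTROS = {
    ("exampledistro", "1.0"): {
        "lsb_version": "4.0",
        "app_bin_dir": "/usr/bin",
        "libraries": {"libfoo": "1.2"},
    },
}


@dataclass
class Package:
    name: str
    version: str
    # library name -> required version (exact match, for simplicity)
    dependencies: dict = field(default_factory=dict)


def merge_facts(distro, release, probed):
    """Steps 3-4: start from the matrix, then let probed values override it."""
    facts = dict(KNOWN_DISTROS.get((distro, release), {}))
    libraries = dict(facts.get("libraries", {}))
    libraries.update(probed.get("libraries", {}))
    facts.update({k: v for k, v in probed.items() if k != "libraries"})
    facts["libraries"] = libraries
    return facts


def unmet_dependencies(package, facts):
    """Step 5: names of dependencies the system cannot satisfy."""
    installed = facts["libraries"]
    return sorted(
        name
        for name, wanted in package.dependencies.items()
        if installed.get(name) != wanted
    )


def install(package, distro, release, probed):
    """Step 6: report where the application would go, or why it cannot."""
    facts = merge_facts(distro, release, probed)
    missing = unmet_dependencies(package, facts)
    if missing:
        return "failed: missing " + ", ".join(missing)
    target = facts.get("app_bin_dir", "<unknown>")
    return f"installed {package.name} {package.version} into {target}"
```

Calling install() with an empty probed dictionary relies purely on the
downloaded matrix; anything the live system reports takes precedence, which
is the point of step 4.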

Gustavo has been telling me that managing a matrix of differences between
distributions is intractable. Maybe he's right, but here is how I am looking
at things from a mathematical perspective:

   - Repackaging software
   (# of distributions) * (# of distribution releases) * (# of software
   packages or infinity) * (# of releases of software package) = Infinity (or
   at least very large)
   - Managing differences between distributions
   (# of distributions) * (# of distribution releases) * (# of items that
   are meaningfully different) = Some manageable number
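
To make the comparison concrete, the two products above can be evaluated with
some numbers plugged in. The figures below are purely illustrative
assumptions, not measurements of any real distribution, and the function
names are made up for this example.

```python
def repackaging_effort(distros, releases_per_distro, packages, releases_per_package):
    """Work items when every distribution release repackages every software release."""
    return distros * releases_per_distro * packages * releases_per_package


def difference_matrix_effort(distros, releases_per_distro, differing_items):
    """Work items when only the meaningful differences per distribution release are tracked."""
    return distros * releases_per_distro * differing_items


# Illustrative values only: 10 distributions with 5 releases each,
# 20,000 packages with 4 releases each, 200 meaningful differences.
repack = repackaging_effort(10, 5, 20_000, 4)
matrix = difference_matrix_effort(10, 5, 200)
print(repack, matrix, repack // matrix)  # 4000000 10000 400
```

The ratio does not depend on the number of distributions or their releases,
only on (packages * releases per package) versus the number of items that
meaningfully differ, so the argument stands or falls on that last quantity
staying small.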

It's certainly not an *easy* problem, but it seems to be easier than
packaging all known software and brings with it some real good in my
opinion. Google seems to have had great success by creating a good algorithm
and then running lots of data through it. Problems that seem intractable
then become somewhat manageable (language translation, search relevance,
etc.). I think by making the differences between the distributions dynamic
(i.e., placing them in files that are downloaded from the internet), you can
start out at a bad spot and rapidly iterate until you've built up a
good solution.

Almost everyone I have talked to so far has mentioned that I would have a
hard time convincing users or distributions to adopt this methodology;
however, my entire point is: I don't think anyone should do that. If this
tool worked, it would create a huge gravity-well of benefits, and soon
enough users would use it of their own volition. If it's not the right thing
for Linux, or for users, then no one would use it and it would rightfully be
abandoned.

So that's just what has been rolling around in my head :) I hope it sparks
some discussion or ideas. You all produce a fabulous distribution.

Kate

On Thu, Mar 18, 2010 at 10:15 AM, Matt Zimmerman <mdz at ubuntu.com> wrote:

> On Sun, Mar 07, 2010 at 11:26:25AM -0600, Katherine Cox wrote:
> > Firstly, I apologize if this is not the appropriate venue for this type
> > of discussion, but I wanted feedback from the people who steer Ubuntu's
> > direction instead of meaningless "internet debate" from members of the
> > community. If there is a better way to discuss this, please let me know! :)
> > Secondly, I am making the assumption here that I have missed something in
> > my analysis; I'm not trying to convince anyone, rather I'm trying to
> > educate myself.
> >
> > One of the reasons I prefer Ubuntu is because very often it takes
> > pragmatic stances on issues to allow users to use their computer instead
> > of maintaining it. This is something I think a lot about, and something
> > I've specifically been thinking a lot about lately is package management.
> > It strikes me as the one area that could improve Linux significantly if
> > addressed properly. Having said that, I am not an expert on package
> > management nor Linux, so I was wondering if someone might point out the
> > flaws in my thinking?
> >
> > It seems to me that a disproportionate amount of resources in *any*
> > distribution are consumed by centralized package management. Certainly
> > the distribution needs to take care of fundamental libraries (glibc, the
> > kernel, the windows manager, etc.), but why do we attempt to repackage
> > all other software? Isn't that taking a linear problem and turning it
> > into an exponential problem? As a software engineer, it is also
> > frustrating that every Linux distribution I want to support becomes part
> > of my development cycle. If I fix a bug, I can't just release it to the
> > community, I have to wait until it is repackaged by each distribution. In
> > my eyes, this only means it's more room for error as more people not
> > involved with the project touch it, and much more time until users can
> > get the update.
> >
> > Has the idea of supporting or developing a de-centralized package
> > management system for Ubuntu been discussed? Something along the lines of
> > how OS X/Windows works? I am wondering if the Linux Standards Base has
> > made any progress in this area as well.
>
> I've discussed it in passing with a few people, and have been planning
> for a while to write an article about it on my blog, but that's about
> all I'm aware of so far.
>
> The current model is complex, but the complexity is justified for highly
> interdependent system components which require a lot of care and
> integration.  I'm not sure that the tradeoff is justified in the case of
> many user-facing applications, because they're more loosely coupled to the
> underlying components.  The fact that we treat everything the same way
> (while it has advantages) leads to unneeded complexity at that level.
>
> Changing this model would not be easy, but it's possible that the payoff
> would be worth it.
>
> --
>  - mdz