bzr/LP issues from work discussed at UDS

Thu Dec 3 05:52:25 GMT 2009

Hi,

I'd like to provide some information about some of the discussions that
went on at UDS about UDD, and in particular some open questions related
to bzr and Launchpad.

I have just written up two specs from the sessions:

  https://blueprints.launchpad.net/ubuntu/+spec/foundations-lucid-daily-builds

about all things daily builds, and

  https://blueprints.launchpad.net/ubuntu/+spec/foundations-lucid-distributed-development/

about using bzr for Ubuntu development.

There are a whole bunch of topics tied up in them, so I'd like to pull apart
some of the threads for discussion. Most of these things are open questions,
but some are a “please help do this” request. Some of them will be blocking
things we want to roll out over the next 6 months.

1) Merging unrelated branches in a recipe.

We currently have a rather unfortunate situation, but one we entered into
knowingly. We can associate lp:<project> with an lp:ubuntu/<package> to know
they contain the same code, and this would make it dead easy to set up
the first cut of a daily build. However, as it currently stands the two
branches will, except for a minority of projects, share no revision history,
meaning that they can't be merged, and so can't be combined in a recipe.

There are a couple of main drawbacks to this, namely that starting a daily
build is more work than it could be, and that changes made in the packaging
of the daily build aren't directly mergeable back to the packaging.

We have a plan to rectify this. However, it is a multi-year plan, so we may
want to sidestep the issue somewhat. There are good reasons for it being
a long term plan, but it's not out of the question that this issue, and
others below, force us to re-evaluate this. It should not be done lightly
though.

One way to alleviate the pressure on this issue would be to make it possible
to combine lp:<project> and lp:ubuntu/<package>, even though they are not
mergeable.

https://code.edge.launchpad.net/~spiv/bzr-builder/merge-subdirs-479705/+merge/14979
is said to go some way towards doing this, but as I say within, I think
I am missing something, as it can't be the whole solution on its own.

What I am looking for here is suggestions on how we can elegantly allow
people to combine the two trees in a system that isn't too fragile.

2) Importing non-master branches

I know this is being discussed elsewhere right now, but this is another
area where this came up as being useful/a blocker. I don't want to split
the discussion, but just wanted to register another vote for being
able to do this.

We may also want to do some interesting things with SVN imports, depending
on how they are laid out. I haven't looked into it, but I imagine that
switching to bzr-svn could change what we can do.

3) Importing a lot more branches

We want to import a lot more branches this cycle, all of those used
for maintaining packages in Debian. I don't have a definite number
that we want to import, but

  http://upsilon.cc/~zack/stuff/vcs-usage/

declares that there are 6881 source packages using a VCS. Therefore,
what would happen if tomorrow we increased the number of vcs-imports
by 5000? (What is the current number?)

It may be that the answer here is just “deal with the failures,” but
maybe there needs to be infrastructure work done before this. jml
says that it may just be a case of throwing more machines at it,
as the system is already built to be scalable.

4) API for creating code imports

I don't want to set up those 5000 new imports by hand. I also don't
want to have to maintain it as the locations change and the like.
Launchpad has an API to allow scripts to be written to manipulate it.
It would be great if that could be used to avoid doing it all by
hand. Is this just a case where we need to expose something, or
is there more involved than that?

Also, I don't think the CHR would be happy if we created 5000 import
requests tomorrow. Can the review step be removed, or at least
waived here?

I imagine that the lifecycle of these things will require locations
to be changed sometimes. Is just requesting a new import the best
thing there?

5) API for requesting a code import be tried ASAP

Do Branch.requestMirror() and Branch.last_mirror_attempt apply to the
code import itself when the branch is a vcs-imports one?

If not, can we get an API similar to the above for vcs-imports?

We would want to say “try now,” and then spend a while waiting
for an indication it tried to import, so that we could be reasonably
sure the import was up to date.
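
To make the “try now, then wait” idea concrete, here is a rough sketch
of the polling loop I have in mind. It assumes the answer to the
question above is “yes”, i.e. that requestMirror() and
last_mirror_attempt do the right thing for a vcs-imports branch, which
is exactly what I don't know. The function name, the refresh callable
and the timeout values are made up for illustration.

```python
import time


def wait_for_import_attempt(branch, refresh, timeout=600.0, interval=15.0,
                            sleep=time.sleep, clock=time.monotonic):
    """Ask for an import attempt, then poll until one is recorded.

    `branch` is any object with a requestMirror() method and a
    last_mirror_attempt attribute (ASSUMPTION: these apply to code
    imports).  `refresh` is a hypothetical callable that reloads the
    branch's attributes from the server.

    Returns True if a new attempt was seen before `timeout` seconds
    elapsed, False otherwise.
    """
    before = branch.last_mirror_attempt
    branch.requestMirror()
    deadline = clock() + timeout
    while clock() < deadline:
        sleep(interval)
        refresh(branch)
        # A changed timestamp means the importer has tried since we asked.
        if branch.last_mirror_attempt != before:
            return True
    return False
```

A False return only says that no attempt was observed in time; it
doesn't distinguish a busy importer from a failing one.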

6) Guessing parent relationships

We currently infer parent relationships from debian/changelog: if an
upload includes the changelog entries of another upload, we presume
that it merged those changes.

We will need to start guessing parent relationships heuristically in
some cases though, as some workflows mean that the code that was
uploaded is never committed as exactly one revision. (For example,
never committing the revision that changes the target from UNRELEASED
to unstable, or files being modified by the clean target.)

The heuristics shouldn't have to be too fuzzy, but any fuzziness
makes me a little nervous. Do the bzr developers agree? Do you
have any advice on how to do it well, so that it doesn't cause
mis-merges and the like?

7) Migration over branch history rewrites

In order to include new history in the branches we need to
rewrite their history. This means changing revision ids.

In order to make the new branches mergeable with existing other
branches we need to change file ids.

We can do this fine for all the branches we control, but it
will instantly make developers' local branches unrelated.

Therefore we need to provide a way to change a local branch
to make it mergeable again.

This means rewriting all the revision ids using a map that
we create when we import the new branches, and generating
new ones for revisions not in the map.
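
A minimal sketch of the revision id half of that, in plain Python with
no bzrlib involved. The data shapes and the function name are
assumptions for illustration: revisions are given as (revision id,
parent ids) pairs with parents listed before children, and new ids are
just random strings.

```python
import uuid


def _random_id(old_id):
    # Hypothetical id scheme; real revision ids would look different.
    return "rewritten-" + uuid.uuid4().hex


def rewrite_revision_ids(revisions, id_map, new_id=_random_id):
    """Rewrite a local branch's revision ids using a published map.

    `revisions` is a list of (revision_id, parent_ids) pairs in
    topological order (parents first).  Ids found in `id_map` are
    translated; any other revision gets a freshly generated id.

    Returns the rewritten list and the completed map.
    """
    full_map = dict(id_map)  # leave the caller's map untouched
    rewritten = []
    for rev_id, parent_ids in revisions:
        if rev_id not in full_map:
            full_map[rev_id] = new_id(rev_id)
        # Parents we know nothing about are left as they are.
        new_parents = [full_map.get(p, p) for p in parent_ids]
        rewritten.append((full_map[rev_id], new_parents))
    return rewritten, full_map
```

The completed map is returned because anything built on top of the
local branch would need the same translation later.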

It also means rewriting file ids for all revisions not in the
map, and in any working trees.

There's obviously a lot of risk in this, and also a whole
lot of code needed to do this well. Jelmer said that
bzr-rewrite(?) already contains some code to do something like
this, so we will be able to start from there.

Here I'm looking for advice on how to do this well, and also
things like how to distribute the maps and where to put the code
to do the work. I'm also very keen on any suggestions you
may have for doing this in a way that avoids these issues.

Also, we have a terrible user experience on the flag day:

  # day before
  $ bzr pull
  ...
  # flag day
  $ bzr pull
  bzr: ERROR: there is no common ancestor.

Any suggestions on how to improve on that would be gratefully
received.

That's all I have for now. Thanks to anyone who read this far;
your input will be valued.

Thanks,

James