what should be done to merge nested trees patch to bzr-core?

Alexander Belchenko bialix at ukr.net
Wed Jun 15 22:08:58 UTC 2011


15.06.2011 22:12, Martin Pool wrote:
> What did you learn from scmproj etc?

Scmproj taught me a lot about what is important and what is wrong. I
think both bzr-externals and scmproj were great experiments and are
usable most of the time (if you don't need feature branches, see
below).

The most important lesson I've learned is very hard and very simple at
the same time: all components of nested trees should live inside the
root of the project tree, regardless of whether it's a local checkout
or a published copy on a remote server. Always all-in-one. No
implementation I know of supports that.

For me, the current implementations of scmproj, bzr-externals,
submodules, subrepositories and svn:externals have all proved wrong.
If you want to be able to easily track the history of the entire
project and re-create its past state -- all your nested branches
should be kept together. Otherwise the main idea of a DVCS as a
decentralized system is compromised. If you need nested trees only as
a handy built-in "config-manager" feature -- then you don't really
need nested trees the way I desperately need them.

That may sound too extreme or harsh, but it's based on about 2 years
of experience with scmproj and nested trees emulation in my working
projects.

You may say that my conclusion is directly influenced by the reasons
why we're using scmproj and how we're using it. And most likely I
will agree. But some of the problems I have had so far are not new,
and are at least known to the implementers of other nested-tree-like
systems (git/hg). But I have to point out that some of the problems
are much more visible, and therefore more difficult, because of the
branch model in bzr.

Some background.

I can imagine several reasons why somebody would want to use nested
trees. I will name only 3 use cases, ranking them from simplest to
hardest, in my opinion.

1) A developer wants to track external dependencies of his project, so
he adds some third-party library as a nested component. (In my
opinion this is the same use case that is addressed by svn:externals.)
In the absence of nested trees the developer can use any other
home-brew system, starting from shell scripts, to buildout, and going
all the way up to something like Maven or some other BIG FAMOUS
SOFTWARE CONFIGURATION TOOL, I don't know. This is the simplest usage;
if you don't have nested trees in your DVCS, that's no big deal.

2) A developer shares some library (code, media, or other files)
between several projects. Such a library might even be published as an
open source project, so it may eventually have its own forks and
history. Working on the main projects that rely on such a library
without nested trees is not very comfortable; sometimes it's so
uncomfortable that one might even be forced to merge the library into
the main project just to have a consistent project tree. There is one
serious problem though: once the library is merged into another
project, it's not possible to extract it back into its independent
state. So if you want to be able to share changes in that library you
definitely don't want to merge it, because bzr won't let you get it
back, and you'll be forced to cherrypick all the time.

3) At my work we have a combination of both cases above: we need
scmproj as our dependency tracking and configuration tool, but we also
need it to track the state of the project itself, to track
dependencies between different components.

So far the story hasn't been really scary, has it? The problem for us
is that we have A LOT of different branches multiplied by A LOT of
components that compose the final project from which we build our
releases. Now add feature branches to that and you'll have a small
nightmare. I need to track the history of releases: which components
have been used for which release and for which customer (we have
enough customers, and for some reasons we can't provide only one
release for all of them; we have custom builds based on specific
requirements). Now think about feature branches: to properly test a
new feature we need to create a testing build, therefore we have to
create a temporary copy of the project for it. But feature branches
should not live too long on the server, so once a feature is merged to
trunk, I prefer to clean merged branches out of the main pool (often
into intermediate limbo directories like _MERGED). So when a component
branch disappears we have no way to restore the old state of the
project. That hurts. To work around this problem we might support
local changes to the project config; this should be pretty easy, but
that's only a workaround, not a real solution.

Some of those problems could be reduced by merging some components
into each other, but that's not always possible, sometimes just
because the main developer of a component insists that the component
should live as a separate entity, and only that, and "don't even talk
to me about it". Sometimes we just haven't figured out yet how to
change our situation, but anyway -- scmproj is the only way for us to
handle this madness.

Anyway, we have 2 problems with feature branches for nested components:

a) the problem of disappearing component branches is more painful in
bzr than in git/hg, because bzr operates with a single branch per URL
rather than a single repository per URL, and a repository allows
fetching the arbitrary revisions required for recreating a state of
nested trees.

b) automatic creation of the remote counterpart of a feature branch
for a nested component. If I want to create a feature branch for my
root component, my actions are straightforward. But if I want to be
sure I won't accidentally push changes introduced by this feature from
nested components to their trunk branches, I have to be very, very
careful. In git/hg you can just push to the main repository (see "a"
above), but with bzr I have to manually create feature branches on the
server for all affected nested components, and not forget to update
the project config.

But even in git/hg you'll have the problem if the repository of your
nested component has disappeared or is unreachable right now. It
breaks the DVCS model IMO, and forces a centralized model instead.
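
To make problems (a) and the git/hg caveat concrete, here is a toy
model in Python. It is only a sketch: plain dicts stand in for
servers, no bzr, git or hg API is used, and all URLs, revision ids and
function names are made up for illustration. A project config pins
each component to a (URL, revision) pair, and recreating an old
project state only works while that revision is still reachable at
the recorded URL.

```python
# Toy model only: dicts stand in for servers; nothing here is a real VCS API.

def can_recreate(pins, server):
    """True if every pinned (url, revision) is still reachable on the server."""
    return all(rev in server.get(url, set()) for url, rev in pins.values())

# bzr-like model: each URL is one branch and exposes that branch's revisions.
branch_server = {
    "srv/lib/trunk": {"lib-r1", "lib-r2"},
    "srv/lib/feature-x": {"lib-r1", "lib-r2", "lib-x1"},
}
# git/hg-like model: one URL is a repository holding revisions of all branches.
repo_server = {"srv/lib": {"lib-r1", "lib-r2", "lib-x1"}}

# Hypothetical project configs recorded for a test build of feature-x.
pins_branch = {"lib": ("srv/lib/feature-x", "lib-x1")}
pins_repo = {"lib": ("srv/lib", "lib-x1")}

before_cleanup = can_recreate(pins_branch, branch_server)

# Feature merged to trunk, then the branch is moved to a limbo directory.
branch_server["srv/lib/trunk"].add("lib-x1")
branch_server["srv/_MERGED/lib/feature-x"] = branch_server.pop("srv/lib/feature-x")

# The revision still exists, but not at the URL the old config recorded.
after_cleanup_branch = can_recreate(pins_branch, branch_server)
# In the repository model the URL did not change, so the pin still resolves.
after_cleanup_repo = can_recreate(pins_repo, repo_server)

# If the whole component repository goes away, the repository model fails too.
del repo_server["srv/lib"]
after_repo_gone = can_recreate(pins_repo, repo_server)

print(before_cleanup, after_cleanup_branch, after_cleanup_repo, after_repo_gone)
```

In this model the old state stops being recreatable in the
branch-per-URL case as soon as the branch is moved, stays recreatable
in the repository-per-URL case, and fails in both once the component's
location is gone entirely.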

I'm sure this can be solved effectively only if we put all component
branches of a project into the project root branch itself.
According to the NestedTreesDesign document from wiki.b.c.com, the
local disk layout of the project stores nested component branches in
.bzr/subbranches, but allows them to be accessible by different URLs
on the remote side. I think this is *wrong*, and should not be the
default behavior for *bzr*. Instead the project should store all
components in .bzr/subbranches or something like that on the server
too. That way it always keeps all its eggs in one basket, so you can
work with your complex project as if it were a single tree, not a
nested or composite tree. But you will still be able to point to a
single history of any component and share it between different
projects, different customer builds and so on.
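
A minimal sketch of what "all-in-one" would mean for locating
components, assuming the layout is simply
<project root>/.bzr/subbranches/<component>. The directory name comes
from the design document mentioned above; the function names, the
component names and both project locations are hypothetical, and this
is not how bzr resolves anything today.

```python
import posixpath

# Directory name taken from the text; the exact layout is an assumption.
SUBBRANCHES = ".bzr/subbranches"

def component_location(project_root, component):
    """Where a component branch lives inside the project root (assumed layout)."""
    return posixpath.join(project_root, SUBBRANCHES, component)

def resolve_all(project_root, components):
    """Map every component name to its location under the given project root."""
    return {name: component_location(project_root, name) for name in components}

components = ["libfoo", "media"]  # hypothetical component names

# The same rule applies to a local checkout and to a published copy.
local = resolve_all("/home/dev/project", components)
remote = resolve_all("bzr+ssh://server/project", components)

for name in components:
    print(name, local[name], remote[name])
```

The point of the sketch is that the project config would not need to
record any per-component URL: knowing where the project root is would
be enough to find every component, locally or on the server.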

I have more comments about recursive/non-recursive behavior of nested
trees, composite trees and composite log, but I'd better put them
into a separate mail; this one is already big.


