What I've learned from scmproj (Re: what should be done to merge nested trees patch to bzr-core?)

Thu Jun 16 10:41:16 UTC 2011

Martin Pool wrote:
> What did you learn from scmproj etc?

I will try to list as many of the important things about working with
nested-trees-like projects as I can. I also want to note that scmproj
and bzr-externals have a lot in common, but I think bzr-externals is
easier to use, simply because it plugs into existing commands via
various hooks, while scmproj gives you a set of special commands for
working with a project. I think bzr-externals is better suited to a
centralized (lock-step) development model, and it integrates seamlessly
into GUIs. But it has its own flaw: it's hard to disable its recursive
operations when you want to. With scmproj you have much more
fine-grained control over what you do.

1) Recursive operations and composite checkout.

a) Commit in the project root should be recursive most of the time, to
make sure you have committed a consistent project state and can
restore that same state later. That's the most important point; if you
don't need the ability to restore the exact past state of your
project, you can live without it.

Both scmproj and bzr-externals save the revision-ids of nested
components in the root tree as a plain text file, so for a recursive
commit they have to invoke the commits in backward order -- from the
deepest components up to the root component. The same applies to the
nested trees design too: to be able to store a link to a revision of a
nested branch, you have to commit in that branch first.
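
To illustrate the ordering, here is a minimal sketch in Python. It is
not code from scmproj or bzr-externals; the `children` mapping, the
component names and the `commit_order` function are made up for the
example. It only shows the "deepest first, root last" (post-order)
traversal.

```python
def commit_order(children, root):
    """Return component names in post-order: nested ones first, root last.

    `children` is a hypothetical mapping: component name -> list of the
    components nested directly inside it.
    """
    order = []

    def visit(name):
        # Commit everything nested below this component first...
        for child in children.get(name, []):
            visit(child)
        # ...then the component itself, so it can record the new
        # revision-ids of its children.
        order.append(name)

    visit(root)
    return order


layout = {"root": ["libA", "libB"], "libA": ["libA/vendor"]}
print(commit_order(layout, "root"))
```

With this layout the root comes last, after libA/vendor, libA and libB.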

Scmproj uses a separate command to perform a project-wide commit. It's
important to use it, to make sure the snapshot of the project is
updated properly.

b) Status/diff -- it would be nice to have them recursive, but with
too many nested components (we have one project here with this
problem) they become noticeably slow. So we use project-wide
status/diff only before a project-wide commit, and can live with
non-recursive behavior most of the time.

Also, for status it is very important to print a note when nested
components have been committed without committing the root component,
so that you know you have to perform a project-wide commit in order to
record the new state of the components (update the snapshot). That's
very, very important, and I think it should be pretty fast to check and
print to the user. Currently I've managed to emulate it in scmproj
trunk with a pnew alias, so I can see a missing-like report for the
components relative to their snapshotted state. Based on this info I
can inspect which components are out-of-snapshot. (But I can't hook
this into the project-status alias yet.)
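
The check itself can be cheap. Below is a minimal sketch, assuming the
snapshot is available as a mapping from component name to recorded
revision-id and that the current tip of each component is known. Both
mappings and the `out_of_snapshot` name are hypothetical; this is not
how the pnew alias is implemented.

```python
def out_of_snapshot(snapshot, current_tips):
    """List components whose tip differs from the snapshotted revision-id.

    Both arguments are hypothetical mappings: component name -> revision-id.
    A component missing from the snapshot is reported as well.
    """
    stale = []
    for name in sorted(current_tips):
        if snapshot.get(name) != current_tips[name]:
            stale.append(name)
    return stale


snapshot = {"libA": "rev-a1", "libB": "rev-b1"}
tips = {"libA": "rev-a2", "libB": "rev-b1", "libC": "rev-c1"}
for name in out_of_snapshot(snapshot, tips):
    print("%s: committed since last project-wide commit" % name)
```

It only compares ids, so it does not need to walk the history of any
component.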

c) Export -- this really should be recursive; we don't have that yet,
at least not for tar/gz/zip. I think it should work for exporting to a
plain directory though.

d) Push/pull/update/merge/missing -- most of the time you want them
recursive, but beware of feature branches!!! You have to be careful to
push to the proper location; that's the pain point right now.

e) Uncommit -- a very hard thing to do properly. You don't want it to
be blindly recursive, but you'd better have it recursive and uncommit
exactly what has been committed. So, if you committed root and
component A, but component B was not changed, then on uncommit of root
you also want to uncommit A, but not B, for obvious reasons. The
reverse is also true: on uncommit in A you will also want to uncommit
in root, because root has recorded the revision-id of the uncommitted
revision in component A.
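
The simple case (uncommit of root) can be sketched like this, assuming
you can read the snapshot recorded by the last root revision and the
one recorded by its parent. The mappings and the function name are
hypothetical, only for illustration.

```python
def components_to_uncommit(previous_snapshot, last_snapshot):
    """Pick the components that the last root commit actually advanced.

    Both arguments are hypothetical mappings: component name -> revision-id,
    as recorded by the parent root revision and by the last root revision.
    """
    return sorted(
        name
        for name, revid in last_snapshot.items()
        if previous_snapshot.get(name) != revid
    )


before = {"A": "rev-a1", "B": "rev-b1"}
after = {"A": "rev-a2", "B": "rev-b1"}
print(components_to_uncommit(before, after))
```

Here only A is selected; B is left alone because its recorded
revision-id did not change.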

Things quickly become too tricky to handle if you want to uncommit B.
In this case you may want to uncommit root too, back to the point
where that revision of B was recorded. And that may trigger uncommits
in other components back to the same level. It will be too
destructive, IMO. But otherwise you'll have an inconsistent tree in any
case. I think uncommit itself should be prohibited in nested trees,
just because it might destroy too much. Where, in general, would you
find an uncommitted revision that the project root still points to?
That's very, very tricky.

I don't have any answer to this problem. But I think it is somewhat
similar to the problem with tags pointing to a revision outside the
branch history.
If you can't use a tag you won't lose too much. But if you're unable
to check out a past state of the project because the revision stored
in the snapshot no longer exists, what will you do? That's a complete,
unrecoverable failure.

Currently, when I need to uncommit something in the project locally, I
have to tread very, very carefully. And I think in the general case
it's much better to just do a reverse merge.

f) Another case similar to uncommit: `pull --overwrite` in the
component tree may turn the history backward, thus effectively
destroying it. The consequences will be the same as with uncommit --
you simply won't be able to restore the past state of your project.
Therefore you should never ever do that.
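
A tool could at least detect the damage. A minimal sketch, assuming
you can collect every revision-id the root has ever recorded for a
component and the revision-ids currently present in that component's
history (the names below are hypothetical, and this does not use any
real bzrlib API):

```python
def lost_snapshot_revisions(recorded_revids, component_history):
    """Return recorded revision-ids that are no longer in the history.

    `recorded_revids` is a hypothetical list of ids the root snapshots
    point to; `component_history` is the list of ids the component
    branch still contains.
    """
    present = set(component_history)
    return [revid for revid in recorded_revids if revid not in present]


recorded = ["rev-a1", "rev-a2", "rev-a3"]
# History after an uncommit or `pull --overwrite` dropped rev-a3:
history = ["rev-a1", "rev-a2", "rev-x9"]
print(lost_snapshot_revisions(recorded, history))
```

A non-empty result means some past project state can no longer be
restored from this branch.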

2) Composite history.

Currently I don't see (or don't understand) how you can get a
composite log for the entire project. I can imagine that the history
of components could be shown as virtual merge points wherever the root
component has recorded the revision ids of nested components. That
would be really good, but I'm not sure about the performance of such
an approach. Maybe it will be good enough to be usable.

In general I'd like to have recursive qlog by default.

But beware of uncommits and `pull --overwrite` -- they may destroy the 
history and therefore you won't be able to see composite log.

3) Other things

.bzrignore from nested components should be used in recursive
status/diff/add operations. Most of the time they are used, at least in
scmproj, and that should not regress with real nested trees.

And a last note.

As I wrote in my other mail about feature branches and their problems,
I think bzr should store everything in the root component branch. After
all, we have the nice shared repository mechanics, so duplicates of the
components won't necessarily bloat your server storage. The separate
component branches and the branches integrated into projects may live
in the same shared repository.

But as I think further, I come to the conclusion that the best nested
trees can be obtained if we just merge the components into the root
component. This will automatically solve a lot of the problems
highlighted above. But it will create a new problem: how do you extract
a component then? How do you share changes between projects using the
same component?

I have some ideas about that, but as far as I can see they cannot be
implemented within the bzr model.

I hope my story will be useful.

-- 
All the dude wanted was his rug back