Storage requirements of tree transforms

Martin Pool mbp at sourcefrog.net
Wed Jan 18 01:30:57 GMT 2006


On 16 Jan 2006, Aaron Bentley <aaron.bentley at utoronto.ca> wrote:
> Hi all,
> 
> I'm trying to satisfy three conditions, and I think I can only manage two:
> 1. tree transforms never take up extra filesystem space while they are
> pending.
> 2. tree transform results are correct when the results of creating a
> file depend on the contents of the working tree.
> 3. when the caller is a merge, it doesn't have to stash away extra
> copies of the working tree file.
> 
> My original proposal satisfied 1.  I'm now proposing to satisfy 2 and 3.
> The extra size of a tree undergoing a transform will be the size of all
> new contents that are pending.  This will usually be a fraction of the
> size of the working tree, but could, in unusual circumstances, exceed it.

I think relaxing requirement #1 is completely fine.  I think the extra
space used while a transform is pending matters less than how it behaves
if, e.g., the operation is interrupted.  I don't think all the updates
need to be journalled, but leftover files should at least get cleaned up
by some operation like revert or unlock.
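
To make the cleanup idea concrete, here is a minimal sketch of what such
a step might look like.  It assumes the pending contents live in a
directory under ".bzr" (the name "limbo" comes from the proposal quoted
later in this mail); the function name cleanup_limbo and its signature
are hypothetical and are not bzrlib's API.

```python
import os
import shutil
import tempfile


def cleanup_limbo(tree_root, limbo_name="limbo"):
    """Remove a leftover limbo directory; return True if one was removed.

    Hypothetical helper: something like revert or unlock could call it so
    that an interrupted transform does not leave pending files behind.
    """
    limbo = os.path.join(tree_root, ".bzr", limbo_name)
    if not os.path.isdir(limbo):
        return False
    shutil.rmtree(limbo)
    return True


def demo():
    """Simulate an interrupted transform, then clean up twice."""
    root = tempfile.mkdtemp()
    try:
        limbo = os.path.join(root, ".bzr", "limbo")
        os.makedirs(limbo)
        # Pretend the transform was interrupted with one pending file.
        with open(os.path.join(limbo, "new-1"), "w") as f:
            f.write("pending contents\n")
        first = cleanup_limbo(root)    # removes the leftover directory
        second = cleanup_limbo(root)   # nothing left to remove
        return first, second, os.path.exists(limbo)
    finally:
        shutil.rmtree(root)
```

The second call being a harmless no-op is the point: the cleanup can be
run unconditionally without any journal of what was pending.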

> Explaining 2
> ============
> A good example of 2 would be the current revert* scheme, in which we do
> a three way merge with THIS = working_tree, BASE = working_tree, and
> OTHER = target_revision_tree.  By three-way logic, that collapses into
> 'turn THIS into OTHER'.

I don't quite understand how that gives condition #2.  Are you talking
about, for example, renaming directories which contain unversioned files?
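
For reference, the three-way collapse described in the quoted text can be
sketched as below.  This is only the generic per-value three-way rule,
not bzr's merge code; the function name three_way and the use of
ValueError for a conflict are assumptions made for illustration.

```python
def three_way(this, base, other):
    """Generic three-way rule for a single value.

    If only one side changed relative to BASE, take that side's value.
    If both sides made the same change, take it.  Otherwise it is a
    conflict.
    """
    if other == base:
        return this        # OTHER did not change; keep THIS
    if this == base:
        return other       # THIS did not change; take OTHER
    if this == other:
        return this        # both sides made the same change
    raise ValueError("conflict: both sides changed differently")


def revert_value(working, target):
    """Revert expressed as a merge with THIS = BASE = working tree."""
    # Because THIS == BASE, the rule can never conflict and always
    # yields OTHER, i.e. 'turn THIS into OTHER'.
    return three_way(working, working, target)
```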

> But there are more complex examples in which it is not apparent that two
> pathnames refer to the same file.  For example, when a tree is visible
> on the local FS and also over NFS.

How would you handle that?

> The new plan
> ============
> The transform application uses a temp directory, called ".bzr/limbo", to
> hold files undergoing moves/renames.  I propose to stick the new contents
> there.  Further, I plan to evaluate the iterators when the new contents
> are added.

Perhaps call it limbo.tmp to indicate it can be discarded?

> That way, we can do:
>     ...
>     merged = Merge3Iter(this, base, other)
>     tree_transform.create_file(merged, trans_id)
>     if merged.conflicts:
>         tree_transform.new_file(name+".THIS", parent, this,
>                                 executable=executable)
>     ...
> 
> Does this sound like a reasonable trade off?  Does anyone actually care
> about a small size increase during transform creation/application?

So if I understand properly, creation of the text of the merged file is
delayed until the tree transform is applied.  That sounds good.  (I
wonder if it should be an object interface rather than an iterator but I
can't think of anything else that would go in that interface.)  But
perhaps you won't know whether there are textual conflicts until the
text merge is attempted.  It would seem a shame to run the text merge
twice.
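
One way to avoid running the text merge twice is to evaluate it exactly
once and keep both the resulting lines and the conflict flag.  The sketch
below is only an illustration of that idea: EvaluatedMerge and toy_merge
are hypothetical names, and toy_merge is a deliberately naive positional
merge that assumes all three inputs have the same number of lines.  It is
not Merge3Iter and not a real diff3.

```python
class EvaluatedMerge:
    """Run a merge once; keep its output lines and its conflict flag."""

    def __init__(self, merge_func, this, base, other):
        self.lines = []
        self.conflicts = False
        # The merge function is consumed here, exactly once.
        for line, conflicted in merge_func(this, base, other):
            self.lines.append(line)
            self.conflicts = self.conflicts or conflicted

    def __iter__(self):
        # Later iterations replay the cached lines; no re-merge happens.
        return iter(self.lines)


def toy_merge(this, base, other):
    """Toy stand-in for a text merge (assumes equal-length line lists).

    Yields (line, conflicted) pairs.  On a conflict it keeps the THIS
    line and flags it.
    """
    for t, b, o in zip(this, base, other):
        if o == b or t == o:
            yield t, False      # only THIS changed, or both agree
        elif t == b:
            yield o, False      # only OTHER changed
        else:
            yield t, True       # both changed differently


clean = EvaluatedMerge(toy_merge, ["a", "b", "c"], ["a", "b", "c"],
                       ["a", "B", "c"])
clash = EvaluatedMerge(toy_merge, ["a", "x"], ["a", "b"], ["a", "y"])
```

With something like this, the caller could check the conflicts attribute
before deciding whether to create the ".THIS" file, and the same object
could still be handed over as the file contents.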

-- 
Martin


More information about the bazaar mailing list