Storage requirements of tree transforms
John A Meinel
john at arbash-meinel.com
Tue Jan 17 04:25:51 GMT 2006
Aaron Bentley wrote:
> Hi all,
>
> I'm trying to satisfy three conditions, and I think I can only manage two:
> 1. tree transforms never take up extra filesystem space while they are
> pending.
> 2. tree transform results are correct when the results of creating a
> file depend on the contents of the working tree.
> 3. when the caller is a merge, it doesn't have to stash away extra
> copies of the working tree file.
>
> My original proposal satisfied 1. I'm now proposing to satisfy 2 and 3.
> The extra size of a tree undergoing a transform will be the size of all
> new contents that are pending. This will usually be a fraction of the
> size of the working tree, but could, in unusual circumstances, exceed it.
>
> Explaining 2
> ============
> A good example of 2 would be the current revert* scheme, in which we do
> a three way merge with THIS = working_tree, BASE = working_tree, and
> OTHER = target_revision_tree. By three-way logic, that collapses into
> 'turn THIS into OTHER'.
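To illustrate the collapse described above, here is a minimal sketch of the usual three-way rule applied to whole values. The function name `three_way` and the sample strings are hypothetical and are not taken from the bzr code; the point is only that when THIS and BASE are the same object, the result is always OTHER and no conflict is possible.

```python
def three_way(this, base, other):
    """Whole-value three-way merge rule; returns (result, conflict)."""
    if this == base:
        # THIS made no change relative to BASE, so take OTHER as-is.
        return other, False
    if other == base or this == other:
        # OTHER made no change, or both sides made the same change.
        return this, False
    # Both sides changed, and differently: a real conflict.
    return this, True


# Revert as described: THIS = working tree, BASE = working tree.
working = "edited locally\n"
target = "committed text\n"
result, conflict = three_way(this=working, base=working, other=target)
```

With THIS equal to BASE the first branch always fires, so `result` is the target revision's content.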
>
> But there are more complex examples in which it is not apparent that two
> pathnames refer to the same file. For example, when a tree is visible
> on the local FS and also on nfs.
>
> * The tree transform revert does not use merge, and thus does not delete
> newly-added files.
>
> Explaining 3
> ============
> I had originally planned that the iterators for new file contents would
> be held in memory until halfway through the transform application, and
> then applied.
>
> In the case of merging, then, it would not be known whether the
> merge produced textual conflicts until after the transform had been
> applied. After the transform had been applied, the original contents of
> the file, needed to produce foo.THIS, would be gone. So in order to be
> able to produce foo.THIS, the merge code would need to stash a copy of
> the unaltered file somewhere.
>
> The new plan
> ============
> The transform application uses a temp directory to hold files
> undergoing moves/renames, called ".bzr/limbo". I propose to stick the
> new contents there. Further, I plan to evaluate the iterators when the
> new contents are added.
>
> That way, we can do:
>     ...
>     merged = Merge3Iter(this, base, other)
>     tree_transform.create_file(merged, trans_id)
>     if merged.conflicts:
>         tree_transform.new_file(name+".THIS", parent, this,
>                                 executable=executable)
>     ...
>
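A rough sketch of why evaluating the iterator at creation time helps, under several assumptions: `LineMerge` is a hypothetical stand-in for a lazy merge iterator (it is not `Merge3Iter`, and it naively compares line N with line N), `stage_in_limbo` is an invented helper, and a temporary directory stands in for ".bzr/limbo". It only shows that the conflict flag becomes known as soon as the contents are written out, while the THIS lines are still available.

```python
import os
import tempfile


class LineMerge:
    """Hypothetical lazy merge iterator; assumes equal line counts."""

    def __init__(self, this, base, other):
        self.this, self.base, self.other = this, base, other
        self.conflicts = False  # only meaningful after iteration finishes

    def __iter__(self):
        for t, b, o in zip(self.this, self.base, self.other):
            if t == b:
                yield o
            elif o == b or t == o:
                yield t
            else:
                self.conflicts = True
                yield "<<<<<<<\n" + t + "=======\n" + o + ">>>>>>>\n"


def stage_in_limbo(limbo_dir, trans_id, lines):
    """Consume `lines` immediately and write them under limbo_dir."""
    path = os.path.join(limbo_dir, trans_id)
    with open(path, "w") as f:
        f.writelines(lines)  # this is where the iterator is evaluated
    return path


limbo = tempfile.mkdtemp(prefix="limbo-")  # stand-in for .bzr/limbo
this = ["a\n", "local\n"]
base = ["a\n", "b\n"]
other = ["a\n", "remote\n"]

merged = LineMerge(this, base, other)
conflicts_before = merged.conflicts   # not yet known: nothing evaluated
staged = stage_in_limbo(limbo, "new-0", merged)
conflicts_after = merged.conflicts    # known now, before anything is applied
```

The cost is the one named earlier in the message: the staged file occupies extra space until the transform is applied.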
> Does this sound like a reasonable trade off? Does anyone actually care
> about a small size increase during transform creation/application?
>
> Aaron
We've lived with the current code, which I don't believe tries to
minimize space usage.
I think the biggest thing that people care about is speed. Taking space
does take up some time (since you are constrained by the read/write speed
of a hard drive).
While it would be nice to be stingy with space, it is better to be fast.
John
=:->