[RFC] Tracking file copies

Jelmer Vernooij jelmer at samba.org
Mon Jun 19 18:11:56 BST 2006


On Mon, 2006-06-19 at 10:46 -0500, John Arbash Meinel wrote:
> Jelmer Vernooij wrote:
> > On Sun, 2006-06-18 at 17:51 -0500, John Arbash Meinel wrote:
> >> I think a better thing to spend your time on is figuring out what you
> >> actually want to be able to do with copies. And work out how to make
> >> that happen. I could see having a file copy over its annotations from
> >> another knit. So that you still see the documentation about why/who
> >> changed a line.
> > I think it would make sense to be able to record that one file was
> > originally copied from another - not necessarily say the two are the
> > same. That way, only one file (the original one) is candidate for the
> > merge.
> > 
> > I am mainly interested in this information as metadata - so it can be
> > imported from and exported to other version control systems. It might
> > also help in terms of storage for some storage backends (in the case of
> > weaves, keep both files in one weave?).
> Doing it as metadata is not really a problem. The question is what kind
> of performance you want from this, e.g. are you willing to wait for a
> complete search through history to find the point where the file_id was
> marked as a copy from the other id? Do we only allow creating a copy
> record at the time of creation, or can you do it at any time?
Perhaps it's best to look at the use cases for this information. I can
see this being used for:

1) annotate
2) log
3) copying data back into foreign branches
4) not having to look back into history to figure out file ids in
Subversion. This actually requires more than simply tracking copies; it
also requires that more than one InventoryEntry can exist with the same
file id, which makes it all significantly more complicated. Probably not
a good idea.

I'm more and more convinced that tracking file ids for Subversion should
be done using file id aliases, but I need to give that some more thought.
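
A rough sketch of what a file id alias table could look like, purely for
illustration. Nothing here is an existing bzrlib API: the class name
FileIdAliases, its methods, and the id strings in the usage note are all
made up, and the sketch assumes an alias simply maps one file id onto
another, canonical one.

```python
class FileIdAliases:
    """Hypothetical alias table: maps alias file ids to a canonical file id."""

    def __init__(self):
        # alias file id -> file id it stands for
        self._canonical = {}

    def resolve(self, file_id):
        """Follow aliases until reaching an id that is not itself an alias."""
        while file_id in self._canonical:
            file_id = self._canonical[file_id]
        return file_id

    def add_alias(self, alias, file_id):
        """Record that `alias` refers to the same file as `file_id`."""
        target = self.resolve(file_id)
        if target == alias:
            # Pointing an id at itself (directly or through a chain)
            # would make resolve() loop forever.
            raise ValueError("alias would refer to itself: %r" % (alias,))
        self._canonical[alias] = target
```

With this, a lookup such as aliases.resolve("svn-path-id") returns the
canonical id if an alias was registered, and the id itself otherwise, so
no walk through history is needed at lookup time.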

> > The reason I'm looking at this for Subversion is that it might help me
> > to get rid of the revision cache I'm currently keeping. I need to
> > traverse history in order to find the proper file id, even for files in
> > the latest revision. File id aliases might also be of help here,
> > instead of copies.
> > 
> >> I think we might be able to do some neat things with a 'file_id X,
> >> copied from file_id Y'. But adding the ability to copy files can
> >> *really* complicate the model.
> > Just tracking the information wouldn't really hurt, though? It would
> > come in handy for roundtripping.
> Just recording the information is fine. But if you want it for
> 'roundtripping' you need a way of accessing it. And different recording
> locations have different performance implications. If you are willing to
> settle for any performance as long as the info is recorded, I'm sure we
> can find a place to record it. Right now you could record it as a
> Revision Property. Not really the right place, and requires a bunch of
> indirections to find it, but you could make it work without any changes
> to our current storage model.
That's sufficient for just the roundtripping: when I'm pushing a
revision to Subversion, I need to fetch the revision info for the
revision I'm pushing anyway.
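
To make the trade-off concrete, here is a minimal sketch of recording
copies as a revision property. It assumes revision properties behave
like a plain dict of strings, and models history as a newest-first list
of (revision_id, properties) pairs. The property name "file-copies", the
JSON encoding, and all function names are assumptions for illustration,
not anything bzr defines.

```python
import json

# Hypothetical property name; not an existing bzr convention.
COPY_PROPERTY = "file-copies"


def record_copies(properties, copies):
    """Return a copy of `properties` with {new_file_id: source_file_id} stored.

    The mapping is serialised to a string, since revision properties are
    assumed to hold string values only.
    """
    props = dict(properties)
    props[COPY_PROPERTY] = json.dumps(copies, sort_keys=True)
    return props


def read_copies(properties):
    """Decode the copy mapping from a properties dict (empty if absent)."""
    raw = properties.get(COPY_PROPERTY)
    return json.loads(raw) if raw else {}


def find_copy_source(history, file_id):
    """Search history, newest first, for the revision that marked
    `file_id` as a copy. Returns (revision_id, source_file_id) or None.

    This is the complete search through history mentioned above: the
    cost grows with the number of revisions examined.
    """
    for revision_id, properties in history:
        source = read_copies(properties).get(file_id)
        if source is not None:
            return revision_id, source
    return None
```

For the push case only read_copies() on the revision being pushed is
needed, which is cheap. find_copy_source() is the slow path you would
hit for annotate or log.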

Cheers,

Jelmer

-- 
Jelmer Vernooij <jelmer at samba.org> - http://samba.org/~jelmer/