Versioned metadata

Fri Feb 10 04:02:47 GMT 2006

On  9 Feb 2006, Aaron Bentley <aaron.bentley at utoronto.ca> wrote:
> Jan Hudec wrote:
> > Besides, someone might come up with a (possibly problem-specific)
> > request and should be able to implement it with a plugin.
> > 
> > The above metadata can be separated into three categories:
> > 
> >   * Independently versioned metadata:
> >     This is metadata that is versioned (stored in a weave/knit), but
> >     not related to particular revisions. The latest version of this
> >     metadata is in effect unless a particular revision is specifically
> >     requested. The following belong to this category:
> >     + Tags
> >     + Subtree URL list
> 
> There's also been some talk about annotating revisions.  This could
> allow people to safely adjust commit messages, or to note that, say, a
> particular revision introduced a bug, received a +1 from Robert Collins,
> etc.

This still has to be stored somewhere though, and that will determine
how it behaves.

The recent refactoring of storage into branch, repo, and workingtree
compartments possibly helps clarify this: we need to decide which of
those it should go into, or if there should be another compartment.

Specifically, these can be stored in weaves/knits, but those containers
are indexed by revision-ids, and so we need a revision-id that tells
which is "latest", or ought to be in effect.  (Or we need to index them
by some other parallel-universe revision-id.)
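
As a rough sketch of the first option (the class and method names here are
hypothetical, not an existing bzrlib API): texts are keyed by revision-id,
as in a weave, and a separate head pointer records which one is in effect.

```python
class MetadataStore:
    """Hypothetical container: versions keyed by revision-id, plus a head pointer."""

    def __init__(self):
        self._texts = {}   # revision-id -> metadata text
        self._head = None  # revision-id that is currently "in effect"

    def add_version(self, revision_id, text):
        # Store the new text and make it the one in effect.
        self._texts[revision_id] = text
        self._head = revision_id

    def get(self, revision_id=None):
        # The latest version is used unless a particular revision is requested.
        if revision_id is None:
            revision_id = self._head
        return self._texts[revision_id]


store = MetadataStore()
store.add_version("tags-1", "release-0.7 = rev-a")
store.add_version("tags-2", "release-0.7 = rev-a\nrelease-0.8 = rev-b")
latest = store.get()
older = store.get("tags-1")
```

Where the head pointer lives (branch, repo, or workingtree) is exactly the
open question.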

> >   * InventoryEntry metadata:
> 
> Perhaps also a binary flag or MIME type.  For example, no jpeg image
> should ever have a text merge performed on it, even if its contents are
> all alphanumeric.

Occam's razor suggests that we should perhaps just detect at the time of
doing the merge that the contents are not plain text.

The main use of such a flag would seem to be files that look just like
text, but that should never be auto-merged or displayed as text diffs.
Since we never implicitly commit the result of a merge it seems a bit
unimportant.
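
A minimal sketch of merge-time detection, assuming a simple NUL-byte
heuristic (an assumption for illustration, not necessarily what bzr does;
the function names are made up):

```python
def looks_like_text(data, sample_size=8192):
    """Heuristic only: treat content as text if the first chunk has no NUL byte."""
    return b"\x00" not in data[:sample_size]


def choose_merge(data):
    # Fall back to a whole-file merge when the content does not look like text.
    return "text-merge" if looks_like_text(data) else "whole-file"


plain = choose_merge(b"hello world\n")
binary = choose_merge(b"\xff\xd8\xff\xe0\x00\x10JFIF")
```

A heuristic like this cannot catch a file that looks just like text, which
is the only case where an explicit flag would add anything.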

> > The third category is quite populated, so I'd like to propose a bit of
> > common infrastructure for it:
> > 
> >  - InventoryEntry would get a properties list, in a way similar to how
> >    it now has the executable attribute.
> >  - Each element of that list would be an object, descendant of Property.
> >    These objects would have common interface providing:
> >    - serialize/deserialize for storing it in inventory.
> >    - method to find the value from working dir (the default
> >      implementation would just get it from the inventory, for cases
> >      where it's set explicitly; for now, all except the executable flag)
> >    - method to apply the value to working dir (would be a no-op except
> >      for the executable flag)
> >    - method to merge that property.
> >  - Changeset/TreeTransform/merge core and other code that needs to deal
> >    with it would know to use that interface.

This sounds fairly reasonable: I think perhaps the default should be for
them to just be treated as multiline strings with no semantic meaning.
It'd be good if plugins can attach properties that they give special
meaning, but that just hang around if they're not specially treated.
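
Something like the following is what I have in mind. It is only a sketch:
the class and method names are invented for illustration, and the
executable handling assumes a POSIX filesystem.

```python
import os
import stat


class Property:
    """Hypothetical base class: by default the value is an opaque multiline string."""

    def __init__(self, name, value=""):
        self.name = name
        self.value = value

    def serialize(self):
        # Stored in the inventory as-is.
        return self.value

    @classmethod
    def deserialize(cls, name, text):
        return cls(name, text)

    def from_working_dir(self, path):
        # Default: the value was set explicitly, so keep what the inventory has.
        return self.value

    def apply_to_working_dir(self, path):
        # Default: nothing to apply.
        pass


class ExecutableProperty(Property):
    """The one property that is read from and applied to the working dir."""

    def from_working_dir(self, path):
        return "yes" if os.stat(path).st_mode & stat.S_IXUSR else "no"

    def apply_to_working_dir(self, path):
        mode = stat.S_IMODE(os.stat(path).st_mode)
        if self.value == "yes":
            os.chmod(path, mode | stat.S_IXUSR)
        else:
            os.chmod(path, mode & ~(stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH))


# A property no code knows about just round-trips unchanged.
note = Property.deserialize("x-plugin-note", "line one\nline two")
round_tripped = note.serialize()
```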

> Hmm.  I'm not sure about this.  It would be nice to be able to do scalar
> merges on all properties, regardless of type.  But scalar three-way and
> scalar weave merge are very different.
> 
> >  - There would be an 'UnknownProperty', that would serialize/deserialize
> >    by simply keeping the XML chunk, compare with the inventory content,
> >    not apply to working dir and abort a merge if not equal in both
> >    revisions. 
> 
> I would prefer if we handled unknown properties using scalar three-way
> or scalar weave merge, as appropriate.  That would handle the simple cases.

That sounds OK - but what will happen if there is a conflict?  Should we
ask the user to resolve it?  Ideally there would be some way for the
code managing the property to intervene.
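
For illustration, a scalar three-way merge with an optional hook might look
like this. The `resolve` callback and the exception name are assumptions,
not existing code:

```python
class PropertyConflict(Exception):
    """Raised when both sides changed a property differently and nobody resolved it."""


def three_way_merge(base, this, other, resolve=None):
    """Scalar three-way merge; `resolve` lets the property's own code intervene."""
    if this == other:
        return this          # both sides agree
    if this == base:
        return other         # only OTHER changed
    if other == base:
        return this          # only THIS changed
    if resolve is not None:
        return resolve(base, this, other)
    raise PropertyConflict((base, this, other))


def union_of_lines(base, this, other):
    # Example hook: a plugin that knows the value is a set of lines.
    return "\n".join(sorted(set(this.split("\n")) | set(other.split("\n"))))


simple = three_way_merge("text/plain", "text/plain", "image/jpeg")
hooked = three_way_merge("*.o", "*.o\n*.lo", "*.o\n*.a", resolve=union_of_lines)
```

Without a hook, the conflicting case raises and it is up to the caller to
ask the user.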

> > I'd like to start hacking on this and use it for the final part of the
> > new ignore system. However I'd like to hear some opinions on it first.
> 
> Personally, I like having all the ignore data in one place, rather than
> scattered throughout the inventory.  I recognize that putting it in the
> inventory means moves perform a bit more nicely, but it also restricts
> flexibility, since you can't refer to directories that don't currently
> exist.
> 
> Say you have a directory for each architecture, and you want to ignore
> some of the files.  I think this is impossible using directory properties:
> ./arch/*/*.lo

Yes, those are definitely disadvantages of having it spread around.
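
To illustrate with a central pattern list: the sketch below matches
component by component with Python's fnmatch, so the pattern applies to any
architecture directory, including ones that don't exist yet. This is not
how bzr's ignore matching is implemented; the helper names are made up.

```python
from fnmatch import fnmatchcase


def _parts(path):
    # Split on "/" and drop empty and "." components, so "./arch" == "arch".
    return [c for c in path.split("/") if c not in ("", ".")]


def is_ignored(path, patterns):
    """True if `path` matches any pattern, comparing one path component at a time."""
    parts = _parts(path)
    for pattern in patterns:
        globs = _parts(pattern)
        if len(globs) == len(parts) and all(
            fnmatchcase(component, glob) for component, glob in zip(parts, globs)
        ):
            return True
    return False


patterns = ["./arch/*/*.lo"]
ignored = is_ignored("arch/i386/foo.lo", patterns)
kept = is_ignored("arch/i386/foo.c", patterns)
```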

-- 
Martin