[MERGE] Rule-based preferences (EOL part 1 of 3)

Martin Pool mbp at canonical.com
Wed May 21 07:09:25 BST 2008


The way we tackle this ought to be consistent with what we do in other
places in Bazaar, and informed by experiences on other features or in
other systems.

I think we should add something to the developer documentation about
the general way we approach these issues, and have made a start on it.
 If we merge this in we can edit it after settling this issue.

(I can think of quite a few more things that we have learned that we
could usefully add, but for now I'll stick to this particular issue.)

>> I think one reason to be careful with a new file in .bzr/branch is that
>> we have so far *largely* kept from having user editable files in control
>> directories; if the expectation is that there is a bzr UI for managing
>> these files that would be nice.

.. and the converse is that everything outside of .bzr is a user file,
propagated and committed in the usual way, and the user can make
arbitrary edits to it.  This doesn't mean bzr itself will never
consult or modify those files, but it does so only in a fairly
restricted way.  They are basically data, not metadata.

> I agree that principle has served us well. branch.conf is the only
> exception I know of. Shell hooks might live under there one day but
> don't yet.

To me the question is really whether we want to have a third category
of file that is neither a control file nor a user file, and how such a
category would behave.  If we take branch.conf as an example then I
believe it is never rewritten by bzr, never propagated, and not
changed by upgrades.  So it is very close to what is done by
locations.conf, but actually stored in the tree.  I can see how people
might also want some rules to be configured that way but it would be
unusual.

If we are going to do this more widely maybe we should make a
directory clearly just for this use, say .bzr/etc/.

I would suggest that generally hook scripts should be stored in the
tree because they're part of the project and should be consistent
between contributors.  But for security perhaps the configuration to
turn them on should be in .bzr/etc/ or ~/.bazaar/.

>>> Personally, I'd be happy putting a .bzrrules file in the tree because
>>> we have a file management architecture (status + diff + merge + commit
>>> + update/pull/push + ...) for files in that location. But, as you know,
>>> Robert (and others) will reject that because it locks us into a "format"
>>> which is more difficult to upgrade. I don't agree it's as big an issue
>>> as others, but I also don't have their experience and we all need to
>>> support this.
>>
>> I won't reject it; I think it would be a mistake so I certainly wouldn't
>> approve it, but perhaps we simply have to learn this lesson again, if
>> Aaron and I are failing to express the issues satisfactorily.
>

> I hope I understand your concerns, namely:
>
> 1. Putting data about the tree in the tree is evil. This causes
>   ugliness like special hacks in export to skip over it. It also
>   makes it next to impossible to abstract how we deliver functionality
>   in the future or in custom applications.

To separate these out:

1a - The working tree is the user's namespace.  Some people would say
even the .bzr directory is an intrusion, and they like the way that
Perforce stores no metadata in the tree at all, or that svn has hidden
properties (and dislike the way there is a .svn in every directory).
But, we've already decided it's acceptable to have at least .bzr and
.bzrignore there, and users generally seem to think that's reasonable.
This has the advantage of drawing a bright line between files they
should modify and files they should not, and it means plain diffs of
the tree include changes to this file.

1b - The question of abstraction: are we storing the particular bytes
the user committed, or the general meaning of them which could be
given different representations?  Again at the moment it's pretty
straightforward: things in the tree we store the bytes; things in .bzr
we can change the representation in upgrades.

> 2. Exposing metadata like this locks in the format reducing
>   upgrade options and forcing us to always carry supporting code,
>   however obsolete that format is in the future.

That issue exists in either case, though it is slightly more serious
if we store it as bytes: we might eventually remove support for the
weave format, but we still need to be able to read the .bzrignore
files committed then, because they will be carried over by upgrades.

> 3. The future consequences are unclear but are likely to include
>   lower performance.

I think the specific performance concern was that having one list of
globs which is matched against every file (or every file in scope for
a particular operation) may be slow.  Secondarily we are using an
ini-style format, which is apparently pretty quick to parse but may not
be the ultimate.  I think Robert's real concern though is that if we
treat it as a user file, we'll be promising to always support that
representation in future.
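
To make the performance concern concrete, here is a rough sketch of
"one list of globs matched against every file" over an ini-style
file.  The rule syntax (glob patterns as section names, an "eol"
setting) and the function names are assumptions for illustration, not
the format proposed in the patch.

```python
import configparser
import fnmatch

# Hypothetical rules text: section names are globs, values are settings.
RULES_TEXT = """
[*.txt]
eol = native

[*.png]
eol = exact
"""


def load_rules(text):
    """Parse ini-style text into an ordered list of (glob, settings)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return [(name, dict(parser[name])) for name in parser.sections()]


def settings_for(path, rules):
    """Return the settings of the first glob that matches path, else {}."""
    for pattern, settings in rules:
        # fnmatchcase avoids platform-dependent case folding.
        if fnmatch.fnmatchcase(path, pattern):
            return settings
    return {}


def settings_for_tree(paths, rules):
    """Look up every path; the cost grows as len(paths) * len(rules)."""
    return {path: settings_for(path, rules) for path in paths}
```

Every path is tested against every glob until one matches, so the
work is proportional to files times rules; that product is the part
that might get slow on large trees.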

However, we're not totally locked in: if we do discover we really want
to use some other format, it's possible to add that as an alternative
or to suggest people switch.

The svn properties approach is far more problematic because it both
promises to store arbitrary user data and multiplies it out by the
number of files.

I believe Robert had one further significant concern which is that if
this is introduced without a format bump, people may use old clients
on such a tree and those clients will ignore the rules, therefore
potentially committing changes with the wrong line endings.  If we did have a
format bump, people would get an error that they could not access the
tree at all.  In fact, we would also need a repository format bump to
make sure people didn't pull such a tree into an existing checkout.

I think most of our users would not see that as a good cost-benefit tradeoff.

Even if we did this, because new attributes can be handled by plugins
it would not really give total protection.  That's why I suggested to
Ian that we might in future let branches or repositories have
user-configured requirements for bzr versions and plugins, in case
projects want to set that policy.
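
As a very rough sketch of what such a policy check could look like
(nothing like this exists in bzr today; the "min_version" and
"plugins" keys and the function name are invented for illustration):

```python
def unmet_requirements(required, client_version, client_plugins):
    """List the ways a client fails a branch's stated requirements.

    required: hypothetical dict such as
        {"min_version": (1, 6), "plugins": ["eol"]}
    client_version: version tuple of the running client, e.g. (1, 5)
    client_plugins: names of the plugins the client has loaded
    """
    problems = []
    min_version = required.get("min_version")
    if min_version is not None and client_version < min_version:
        wanted = ".".join(str(n) for n in min_version)
        problems.append("client older than %s" % wanted)
    missing = set(required.get("plugins", ())) - set(client_plugins)
    for name in sorted(missing):
        problems.append("missing plugin: %s" % name)
    return problems
```

An empty list would mean the client may proceed; anything else could
be shown as a warning or a hard error, depending on the policy the
project wants.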

> On the other hand, we gain rules that are stored per revision, we
> get a UI for diff/merge/etc., we don't force a format upgrade on
> everyone and users of most popular systems are already comfortable
> with managing .xxxignore files this same way, however imperfect it is.

And not only do we get a UI for merge/diff/conflict, etc for free, but
that UI is understandable: it's in the tree, so it clearly just works
the same as everything else in there.

> It's a question of trade-offs at some point. I fully agree with
> the benefits of typed data. I also think our policy of new semantics
> means new format is good. But I can live with limited magic files
> at the top of a tree, all marked as ".bzryyyyyy". To give us breathing
> room for format bumps, we could introduce an optional format marker
> either as a file extension (.bzrrules.v2) or top line of file, e.g.
>
>  # format 1.6

I think those options give a reasonable escape route if we want to
change in the future.  I do think the format marker should be optional
because needing to get it just right will feel bureaucratic.
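
For example, reading an optional top-line marker could be as small as
the sketch below.  The default of "1" for unmarked files and the
function name are my assumptions, not something we have agreed.

```python
import re

# Assumed format for files that carry no marker at all.
DEFAULT_FORMAT = "1"

# Matches a first line such as "# format 1.6".
_MARKER = re.compile(r"^#\s*format\s+(\S+)\s*$")


def split_format_marker(text):
    """Return (format, body), treating the marker line as optional."""
    lines = text.splitlines()
    if lines:
        match = _MARKER.match(lines[0])
        if match:
            # Marker present: strip it and hand back the rest.
            return match.group(1), "\n".join(lines[1:])
    # No marker: fall back to the default and leave the text untouched.
    return DEFAULT_FORMAT, text
```

A file with no marker is read as the default format, so nobody has to
get the header exactly right just to write a couple of rules.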

>
> My point is that there are solutions (of varying quality) either way.
>
> IMO, "mistake" is a harsh word - I see it as being less black and
> white than that. At some point, our guidelines conflict or imply
> lots of work. Take these:
>
> 1. metadata in the tree is evil.
> 2. Users shouldn't edit things under .bzr.
>
> For complex metadata which is best expressed as a config file
> complete with embedded user comments, editing this directly rather
> than providing a command-line UI for a hidden file is a good choice.
> I'd prefer it wasn't in the tree but we don't have an architecture
> for many out-of-tree operations on files like this. So, putting some
> (potentially format-marked) files only at the top of the tree now
> is a trade-off I'm ok with. We'll forever carry code for that format
> but I think many of the other problems are oversold at times.

I agree.

>>> 1. Is it really true that everything under .bzr is determined by
>>>    the repo/branch/tree "formats"? Is it illegal for plugins to
>>>    add files there and if so, where should they put them? It really
>>>    frustrates me that we have a beautiful open code architecture
>>>    (plugins) yet our data architecture is so closed, requiring
>>>    multiple upgrades per year for many users. I struggle to see,
>>>    and to explain to Joe Average users, why adding some optional
>>>    preferences to a branch is worthy of a branch upgrade, at least
>>>    until such time as those things are versioned.
>>
>> Plugins that need user settings can use the get_user_option api to get
>> settings (side-note: why doesn't that suffice?). Yes it is illegal to
>> put files there - consider 'bzr-svn' which provides branch and tree and
>> repository objects - there is no .bzr directory for them, and it's having
>> a clear abstraction barrier there that allows it to work as seamlessly
>> as it does. Typed data for the win. There isn't even a guarantee that a
>> branch at a url actually *has* a directory attached to it - it can just
>> as reasonably be a postgresql database or some other system. Making it
>> expose a read-write file system is not very clean.

(Having just been cleaning up some of the code that touches or
implements control_files and seeing some code depends on it being
present, I am a little surprised this has not caused failures with
bzr-svn.  I suppose the particular cases have been addressed in either
bzr-svn or bzr.  Even now, RemoteBranch needs to provide a fake
LockableFiles to satisfy this.... but I'll get it.)

>> I've argued strongly *for* landing the core components of this with no
>> other dependencies, and I certainly don't think you need to
>> touch .bzrignore.
>
> I have no complaints and don't feel you've been blocking me in any way.
> I did feel there was no point putting up something you felt
> passionately against though. Please continue to argue for what you
> feel is right, while understanding that my views on trade-offs
> will differ from time to time.

I think landing the per-user-configured part would be worthwhile as it
is, and would reduce the size of the outstanding diff.  But having
been through this thread I do think having it in the tree is
reasonable.

-- 
Martin <http://launchpad.net/~mbp/>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: principles.txt
Url: https://lists.ubuntu.com/archives/bazaar/attachments/20080521/0e0cb9bf/attachment-0001.txt 