[rfc] Windows symlink support

John Yates jyates at netezza.com
Sat Jan 14 00:55:57 GMT 2006


James,

I think that there are two very different uses for revision
control.  Both your comments and those of Wouter van Heyst
conflate versioning of directories for system management or
backup with the needs of distributed software development.

I do not dispute that a tool that can tackle the latter case
may have some (even many) of the attributes of a tool to
implement the former.  But that does not mean that what is
good for one use case is perforce good for the other.

In the backup use case one's tool should be able to accept
and regurgitate arbitrary directory trees.  This should
include every idiosyncrasy of the host file system: time
stamps, permissions, acls, extended attributes, arbitrary
hard links, wild file names, etc.

But if you want to support distributed software development
with trees and repositories hosted on an uncontrolled set
of file systems (including networked systems) then I claim
a "good" tool should foster practices that will trigger as
few unpleasant surprises as possible.  This implies a limited
abstraction of a directory tree of files such that one can
reasonably expect to be able to instantiate that tree
successfully on nearly any plausible platform.


This discussion so far has been about symlinks.  I think
that your general philosophy is embodied in this statement:

  This would leave the responsibility of deciding
  whether or not to use symlinks in the hand of
  individual developers, where it belongs.

As you say, Aaron has proposed a way for bzr to "track
symlinks in a branch so that one can still do a get on
win32 and not have the system break down completely".

So the get succeeds and does not tell the "getter" that
he is hosed.  Remember that the symlinks are in a source
tree, not the result of a build step.  Presumably that
build depends on the semantics of the symlink.  It is
possible that, had the get simply cloned the symlink's
target, the build might have succeeded.  But with Aaron's
proposal it is nearly guaranteed that the build will fail.
So you call it success even though he leaves behind a broken
tree that cannot be built.  I call it dysfunction.

Now what about filename case or Unicode canonicalization
collisions?  A Linux developer who is not attentive to, or
knowledgeable about, his "responsibility of deciding whether
or not to use" such features can just as readily create
a tree that cannot be "gotten" on a mac or win32 platform.
Would you then advocate an equally dysfunctional solution?


I want to catch errors early.  Aaron's proposal pushes
detection ever later.  Instead of discovering a cross-
platform problem at the point of doing the get he would
wait until attempting the build.  I want to catch the
error at the time the developer attempts to place an
object under version control.


Philosophy:

I generally like to build a system whose default semantics
are as uncontroversial as I am able to make them.  When I
encounter a situation / use case over which reasonable folk
might differ in guessing / predicting what my code should /
will do, I simply declare that situation / use case to be
illegal.  This leads to a design in which ALL unobvious /
controversial semantics / behaviors must be requested
explicitly.  (I will readily admit that this style is at
variance with the *nix attitude of "the semantics are
whatever the first release did".  But it has stood me in
very good stead with my documentation and customer support
organizations.)


Applying such an approach to bzr, by default I would reject

 - symlinks and multiple hard links
 - names that collide under case folding and/or Unicode
   canonicalization
 - anything else that violates the common intersection of
   the various file systems that I would hope to support
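
To make those default checks concrete, here is a minimal
Python sketch of what an add-time test might look like.  The
names collision_key and portability_problems are hypothetical
and are not part of bzr; versioned_names is assumed to hold
the names already versioned in the same directory.

```python
import os
import stat
import unicodedata


def collision_key(name):
    """Fold a name so that case and Unicode-form variants compare equal."""
    # Normalize, case-fold, then normalize again, because case folding
    # can itself produce a non-normalized string.
    folded = unicodedata.normalize("NFC", name).casefold()
    return unicodedata.normalize("NFC", folded)


def portability_problems(path, versioned_names):
    """Return the reasons (if any) to reject `path` at add time."""
    problems = []
    st = os.lstat(path)  # lstat examines a symlink itself, not its target
    if stat.S_ISLNK(st.st_mode):
        problems.append("symlink")
    elif stat.S_ISREG(st.st_mode) and st.st_nlink > 1:
        problems.append("multiple hard links")

    name = os.path.basename(path)
    key = collision_key(name)
    for other in versioned_names:
        # An identical name is the same object, not a collision.
        if other != name and collision_key(other) == key:
            problems.append("name collides with %r" % other)
    return problems
```

An empty result means the object fits the common
intersection checked here; anything else would be reported
to the developer at the moment of the add, before the tree
ever reaches another platform.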

I would then create a means of relaxing those defaults via
project policies.  Thus one could record within a project
that it is okay to add symlinks as versioned objects.  But
a freshly minted project would not come into existence with
that capability.  You would have to actively and consciously
enable it.
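
As a rough illustration of such a policy, consider the
following Python sketch.  ProjectPolicy, may_add and the
kind strings are all hypothetical names invented for this
example; nothing here reflects an existing bzr interface.

```python
from dataclasses import dataclass


@dataclass
class ProjectPolicy:
    """Hypothetical per-project record of permitted non-portable features."""
    # A freshly minted project starts with everything disabled.
    allow_symlinks: bool = False
    allow_hard_links: bool = False


def may_add(kind, policy):
    """Return True if an object of this kind may be placed under version control."""
    permitted = {
        "file": True,
        "directory": True,
        "symlink": policy.allow_symlinks,
        "hard-link": policy.allow_hard_links,
    }
    # Kinds the policy does not know about are rejected by default.
    return permitted.get(kind, False)


fresh = ProjectPolicy()
relaxed = ProjectPolicy(allow_symlinks=True)  # an explicit, recorded choice
print(may_add("symlink", fresh), may_add("symlink", relaxed))
```

The point of the design is that the permissive behavior is
reachable, but only through a deliberate change recorded
with the project.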

Ultimately I think that I am not precluding the functionality
that you desire.  But I am a very strong proponent of the
principle of least astonishment.  And perhaps I may be
more willing than you to advocate trading off ease of use in
the system management / backup use case so as to make bzr a
better, less error-prone tool for distributed software
development across disparate platforms.

/john




