[rfc] Windows symlink support

Aaron Bentley aaron.bentley at utoronto.ca
Sun Jan 15 22:21:43 GMT 2006


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi John,

I think we're starting from two different places.  I wasn't considering
the possibility of not supporting symlinks in bzr.  We have them
already, they were added deliberately, and they're needed for Arch
compatibility, which is an explicit goal of the project.

I think you're right that we're not a structure-first project.  The goal
is to make a revision control system that's a joy to use, and we
strongly feel that the architecture should be designed to serve the UI,
not the other way around.  Sometimes, we can't come up with a sane
architecture to support a UI goal, and then we don't do anything.  But
when we can, we do.

Symlink support on Unix is a nice thing to have.  I wouldn't defend it
with my dying breath, but from my perspective, it makes users happier,
so my default position would be to include it.  You could certainly make
a case against it, e.g. that it harms Windows compatibility, or
introduces unacceptable complication to the codebase.

So, given that we have symlinks, I think it's unacceptable that using
symlinks in a bzr tree would make it impossible to check out on Windows.
Barriers are bad.  And it's especially bad when that barrier would
prevent a Windows user from making a *nix-oriented project work better
on Windows.

So I think:
1. At minimum, checkouts must succeed.
2. It would be much better if win32 commits didn't automatically destroy
symlinks.
3. It would be nicer still if there were a placeholder, so win32 users
didn't accidentally create new files with the same names as symlinks,
and so they could move/rename/delete symlinks.
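
To make the three points concrete, here is a minimal sketch in Python.
It is only an illustration under stated assumptions: the function names,
the "placeholder" kind, and the idea of storing the link target as the
placeholder's contents are all hypothetical, not bzr's actual code or
on-disk format.

```python
import os


def checkout_entry(root, relpath, target, can_symlink):
    """Materialize one versioned symlink; return the kind actually created.

    Hypothetical helper, not part of bzr.
    """
    path = os.path.join(root, relpath)
    if can_symlink:
        os.symlink(target, path)
        return "symlink"
    # No symlink support: occupy the name with a plain file that holds the
    # target, so the checkout succeeds and the name cannot be reused by
    # accident (points 1 and 3).
    with open(path, "w", encoding="utf-8") as f:
        f.write(target)
    return "placeholder"


def entry_for_commit(root, relpath, recorded_target, kind):
    """Return what to record at commit time, or None if the entry is gone."""
    path = os.path.join(root, relpath)
    if not os.path.lexists(path):
        return None  # the user deleted the link (or its placeholder)
    if kind == "placeholder":
        # Point 2: a placeholder still commits as the original symlink.
        return ("symlink", recorded_target)
    return ("symlink", os.readlink(path))
```

The point of the second function is that the working tree remembers the
entry was a link, so an ordinary commit on win32 does not turn it into a
regular file.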

John Yates wrote:
| The git project has formulated a clean abstraction
| of a tree-structured namespace:

Thanks for the link.  It looks like it would be fairly easy for git to
support symlinks.  Just add another blob type.

| From the very top of Aaron's wiki page:
|
|   Add support for symlinks to bzr-on-windows, so that
|   windows users can collaborate on projects which use
|   symlinks in their source trees.
|
| This seems an ill-thought-out use case.

It's true that I didn't spend a lot of time on it, because I felt the
need was obvious.

| If symlinks
| are used in any general way in the source tree such
| that any tooling used to build the project needs to
| follow those symlinks then in reality that project
| has written off developers on Windows boxes being
| real participants.

I think that's a very absolute way to look at it.  It's very easy for
projects that are *nix-focused to have win32-incompatible properties,
due to accident or ignorance (e.g. bzr), or for projects that
historically didn't care about Windows to start caring (e.g. Arch).
However, being unable to get the project makes it very hard for
Windows-based developers to start fixing these problems.

| ...if bzr
| really must version links when hosted on a file system
| supporting them then I REALLY want a way to tell bzr that
| my project has forbidden links of any ilk.

That sounds reasonable.
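
A per-project "no links" policy could be as simple as the following
sketch.  The forbid_symlinks flag and both function names are
hypothetical; bzr has no such option as far as this thread establishes.

```python
import os


def find_symlinks(root):
    """Return sorted relative paths of every symlink under root."""
    found = []
    # os.walk does not descend into symlinked directories by default,
    # but it still lists them in dirnames, so they are checked here too.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full):
                found.append(os.path.relpath(full, root))
    return sorted(found)


def check_no_symlinks(root, forbid_symlinks):
    """Raise ValueError if the (hypothetical) policy forbids links and any exist."""
    if not forbid_symlinks:
        return
    links = find_symlinks(root)
    if links:
        raise ValueError(
            "symlinks are forbidden in this project: " + ", ".join(links)
        )
```

Run before a commit, a check like this would reject the link on the
platform where it was created, rather than surprising someone on win32
later.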

| PS: Is it not the case that if bzr admits link semantics
|     into its tree abstraction then it is more or less
|     closing the door on leveraging git technology and/or
|     any future git convergence?

I wasn't discussing adding link semantics to our tree abstraction,
because they are already there.  I don't believe there's been much
discussion of leveraging git technology.  They seem to have different
goals from ours.  But as I noted above, it wouldn't be hard for git to
support links, if they desired.

From your other message:
| In the backup use case one's tool should be able to accept
| and regurgitate arbitrary directory trees.  This should
| include every idiosyncrasy of the host file system: time
| stamps, permissions, acls, extended attributes, arbitrary
| hard links, wild file names, etc.
|
| But if you want to support distributed software development
| with trees and repositories hosted on an uncontrolled set
| of file systems (including networked systems) then I claim
| a "good" tool should foster practices that will trigger as
| few unpleasant surprises as possible.

I agree that we should stay focused on creating a tool to support
distributed software development.  If that happens to also be a good
tool for backups, all the better.

| This implies a limited
| abstraction of a directory tree of files such that one can
| reasonably expect to be able to instantiate that tree
| successfully on nearly any plausible platform.

We aren't going strictly for a lowest-common-denominator approach.  We
don't version all POSIX file permissions, because so far, no one has
come up with a plausible reason why it would be useful for software
development.  However, we do version the execute bit, even though it's
not supported on Windows, because it's very useful.
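
As a rough illustration of versioning only the execute bit, the sketch
below reduces a file's mode to a single boolean and applies it again.
The function names are hypothetical and this is not bzr's code.  On
Windows, os.chmod can only toggle the read-only flag, so a real
implementation would have to carry the recorded value along unchanged
there.

```python
import os
import stat


def is_executable(path):
    """Reduce the full permission set to the one bit we would version."""
    return bool(os.stat(path).st_mode & stat.S_IXUSR)


def set_executable(path, executable):
    """Apply a versioned execute bit, leaving other permissions alone."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if executable:
        # Grant execute to exactly the classes that can already read.
        mode |= (mode & 0o444) >> 2
    else:
        mode &= ~0o111
    os.chmod(path, mode)
```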

| As you say Aaron has proposed a way for bzr to "track
| symlinks in a branch so that one can still do a get on
| win32 and not  have the system break down completely".
|
| So the get succeeds and does not tell the "getter" that
| he is hosed.

I don't think anyone said we couldn't warn the getter that he is hosed.
We certainly can.

| But with Aaron's
| proposal it is nearly guaranteed that the build will fail.
| So you call it success even though he leaves behind a broken
| tree that cannot be built.  I call it dysfunction.

I see it as the first step on the road to recovery.

| Now what about filename case or Unicode canonicalization
| collisions?  A Linux developer who is not attentive /
| knowledgeable of his "responsibility of deciding whether
| or not to use" such features can just as readily create
| a tree that cannot be "gotten" on a mac or win32 platform.
| Would you then advocate an equally dysfunctional solution?

While the case-collision issue is a long-known problem, Unicode
canonicalization issues have only recently been raised.  I would prefer
to insist that data be in a specified canonical form, but that may be
problematic on the Mac, which enforces its own canonicalization regime.
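
Both kinds of collision can at least be detected ahead of time.  The
sketch below groups paths by a key that is NFC-normalized and
case-folded.  That key is an assumption chosen for illustration: real
filesystems apply their own case and normalization rules, so this only
approximates what win32 or a Mac would actually do.

```python
import unicodedata
from collections import defaultdict


def collision_key(path):
    """Approximate how a case-insensitive, normalizing filesystem sees a path."""
    return unicodedata.normalize("NFC", path).casefold()


def find_collisions(paths):
    """Return groups of distinct paths that map to the same key."""
    groups = defaultdict(list)
    for p in paths:
        groups[collision_key(p)].append(p)
    return [sorted(set(g)) for g in groups.values() if len(set(g)) > 1]
```

For example, "README" and "readme" end up in one group, and so do the
precomposed and decomposed spellings of the same accented name.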

| Philosophy:
|
| I generally like to build a system such that the default
| semantics are chosen as to be as uncontroversial as I am
| able to make them.  When I encounter a situation / use
| case over which reasonable folk might differ in guessing /
| predicting what my code should / will do I simply declare
| that situation / use case to be illegal.

Yes, we definitely do differ there.  I want to make the best decision,
not the least controversial.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFDysr30F+nu1YWqI0RAuNNAJ9QPQ2A3Ne/qlAltZTbB5EVtCWecgCfd1Op
RgTLyZHrSA5qbB1CGF6qLTM=
=27vn
-----END PGP SIGNATURE-----



