[PREVIEW] line-endings support

John Arbash Meinel john at arbash-meinel.com
Wed Apr 16 00:07:38 BST 2008


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jelmer Vernooij wrote:
| On Tuesday, 15.04.2008, at 10:11 +0300, Alexander Belchenko wrote:
|> Ian Clatworthy wrote:
|>> Talden wrote:
|>>> I agree that
|>>> - EOL conversion shouldn't be applied by default (behaviour won't
|>>> change with upgrades unless requested).
|>>> - Flags controlling EOL conversion should be version controlled
|>>> (needed for history stability).
|>> So if our initial cut of eol support stored its settings in user
|>> configuration files (like bazaar.conf and locations.conf), that would be
|>> a problem? Or just suboptimal?
|> As I understand it, using bazaar.conf to store these properties means
|> following the Mercurial way. And that is a road to nowhere.
| I think it's an acceptable way to start adding eol support because it
| doesn't add any burdens on format upgrades in the future.
|
| It's always possible to add tree-bound eol policy support later.
|
| Cheers,
|
| Jelmer

So... my opinions on the good and bad of something like this.

1) consistency over time. If the properties are not versioned then checking out
an old revision will give different results depending on your current state.
This could range from odd to really annoying.
Specifically, if you check the code in as CRLF and then check it out as Native,
will it know that the file in memory is CRLF or will it change it to CRCRLF
because it thought that everything marked "Native" would be stored in memory as LF?
Or as another example, if you check in a mixed-line-ending file as Exact, and
then check it out as Native/CRLF/LF does it have to scan every line of the file
to figure out what the line ending is here, and convert it to the "correct" form?
If I understand correctly, doing:

*.c: exact

bzr commit -m "some mixed line-ending-files"

*.c: native

bzr checkout

bzr status
# This should show that files are modified, because a new checkin will generate
# different line endings than what is already stored.

At least I think that is true. And doing a "pristine" checkout and having "bzr
status" report that contents have changed is a bit odd.
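
To make the two failure modes above concrete, here is a minimal sketch. The
helper names are hypothetical (they are not bzrlib API), and it assumes that
"native" content is stored canonically as LF:

```python
# Hypothetical helpers for illustration only; not bzrlib API.

def naive_to_crlf(data: bytes) -> bytes:
    """Assume the stored text is LF-only and expand every LF to CRLF."""
    return data.replace(b"\n", b"\r\n")

def to_canonical(data: bytes) -> bytes:
    """Normalize CRLF and lone CR to LF (the assumed stored form)."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")

def safe_to_crlf(data: bytes) -> bytes:
    """Normalize first, then expand, so an existing CR is never doubled."""
    return to_canonical(data).replace(b"\n", b"\r\n")

# Case 1: content was committed as CRLF, then checked out with a CRLF rule.
stored = b"one\r\ntwo\r\n"
print(naive_to_crlf(stored))   # b'one\r\r\ntwo\r\r\n'  (CR doubled)
print(safe_to_crlf(stored))    # b'one\r\ntwo\r\n'

# Case 2: a mixed-ending file committed as 'exact', checked out after the
# rule changed. A new commit would store something different from what is
# already in the repository, so a pristine checkout looks modified.
mixed = b"a\r\nb\nc\r\n"
working = safe_to_crlf(mixed)
would_commit = to_canonical(working)
print(would_commit != mixed)   # True
```

The safe variant has to scan the whole file, which is the cost mentioned
above for converting an "exact" file to some other form.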

This makes more sense if the file is versioned in the tree, because then if you
*modify* the line-ending file, it makes sense that it would change how bzr
considers all of the files in that project, since 'bzr status' would also show
that file as modified.


2) User-specific rather than project specific.

A given user may use a different text-editor/toolchain than the others. This
would allow them to override the default settings of the project, without
committing that change.


3) Inconsistency at commit time

If I change the '.bzrprop' file, and then do a:

bzr commit -m "just foo" foo

How should 'foo' be handled? I think it should technically use the '.bzrprop'
that has already been committed, ignoring the one in the working tree. Why?
Because when you check it out, that is the one that is going to be read.

Having the file somewhere else makes it clear that you just use that file.

Having versioned properties associates the data with the file you are
committing. So it isn't possible to commit the data without the meta-data. (Now,
you can get creative with a .bzrprop file and commit the portions of the file
that relate to what you are changing, but that gets *really* tricky.)

Note that I think the way to solve this is to introduce a
SelectedFilesWorkingTree object, which can grab the contents of files based on
whether they were

4) Merge conflicts

If the .bzrprop file conflicts at merge time, how do you finish checking out the
working tree? Because you may have conflicting info for the files that need to
be modified.

Further, *merge* needs to know the "canonical" form for the file, so it can take
both canonical forms, do the merge, and then spit out a "munged" form.
(Otherwise you run into problems where the tree form has CRLF even though you
said "native" as the preferred format.)
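
A small sketch of why merge wants the canonical form. The function names are
hypothetical, and the position-wise comparison is only a stand-in for a real
merge algorithm; it is enough to show the effect on files of equal length:

```python
# Hypothetical helpers for illustration only; not bzrlib API.

def canonical_lines(data: bytes) -> list:
    """Split into lines after normalizing CRLF to LF."""
    return data.replace(b"\r\n", b"\n").split(b"\n")

def changed_line_count(a_lines: list, b_lines: list) -> int:
    """Count positions where the two line lists differ."""
    return sum(1 for a, b in zip(a_lines, b_lines) if a != b)

def emit(lines: list, eol: bytes) -> bytes:
    """Write canonical lines back out with the requested line ending."""
    return eol.join(lines)

base = b"alpha\nbeta\ngamma\n"           # stored with LF
other = b"alpha\r\nBETA\r\ngamma\r\n"    # CRLF tree form, one real change

# Comparing the raw bytes makes every line look changed.
print(changed_line_count(base.split(b"\n"), other.split(b"\n")))            # 3

# Comparing canonical forms leaves only the real change.
print(changed_line_count(canonical_lines(base), canonical_lines(other)))    # 1

# After merging in canonical form, produce the "munged" tree form.
print(emit(canonical_lines(other), b"\r\n"))   # b'alpha\r\nBETA\r\ngamma\r\n'
```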

Again, Versioned Properties tend to handle this a little bit better. When the
property changes, you can know that it used to be LF but *now* it is Native, etc.


5) Recursive properties

One thing I really *don't* like about Subversion is that "svn:ignore" is not a
recursive property. Which means that every time I create a new Visual Studio
project, it tries to add all of the executable files and .obj files. And there
isn't a good way around it. *Because* until the directory is added, you don't
have a way to *set* the "svn:ignore" property. So you have to:

  1) add just the directory, without recursion
  2) open up the parent directory in explorer, and go to properties
  3) edit the svn:ignore property without modifying it, but check the "apply this
     recursively" button. This will cause the newly added directory to have the
     correct ignores.
  4) add everything else.

Note that newer Visual Studio (2008?) creates 2 levels of these directories. You
have the Solution which gets a 'debug' with the .exe/.dll files, and you have a
Project subdirectory which gets a 'Debug' with the .obj files. (And yes, one of
them is capitalized and the other lowercase by default, I could be wrong about
which is which.)
So you have to do it twice.

I believe one solution was to set a global ignore setting for the user. I can't
use that, though, because co-workers don't have it set and aren't very savvy.
That means that if I create a new project and commit it, and they then work on
it, add a new file, and commit, suddenly all of the exes and object files get
versioned (and, more annoyingly, so does the 'this is the workspace view right
now' .ncb file, which is both huge and conflicts based on your local settings).

(Admittedly, for a new Solution you are likely to 'bzr init' a new branch, but
you can easily copy over your .bzrignore, and it will stay correct every time
you add a new sub-project.)

I think our .bzrignore is much better than svn:ignore; it would be nice to do
similarly well with our line-ending support.
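
As a sketch of what ".bzrignore-style" line-ending rules could look like: one
project-wide list of patterns, matched against the file name at any depth, so
a newly added directory needs no setup of its own. The rule list, the policy
names and the function are all assumptions for illustration, not an existing
bzr feature:

```python
import fnmatch
import posixpath

# Hypothetical project-wide rules; the first matching pattern wins.
EOL_RULES = [
    ("*.sln", "crlf"),
    ("*.c", "native"),
    ("*", "exact"),
]

def eol_policy(path: str, rules=EOL_RULES) -> str:
    """Return the policy for a path by matching its basename against rules."""
    name = posixpath.basename(path)
    for pattern, policy in rules:
        if fnmatch.fnmatchcase(name, pattern):
            return policy
    return "exact"  # fallback when no rule matches

print(eol_policy("main.c"))                        # native
print(eol_policy("newproject/Debug/deep/util.c"))  # native
print(eol_policy("newproject/app.sln"))            # crlf
print(eol_policy("README"))                        # exact
```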


6) Patterns versus explicit file-id

This goes along with (5), in that if you were versioning an explicit property
for every file, how do you make sure they get set correctly? I think the
solution is to have a versioned file that has the project "defaults" (this might
not be versioned, but it probably needs to be project specific). I know this has
been suggested, and I certainly support it. (IIRC svn does have this for
svn:mime-type and svn:eol-style.)

There are a few UI issues, though. When a property conflicts, it should be
presented to the user by path, not by file-id. Generally users have no idea what
file ids are. This gets a little bit tricky with renames, etc, but we generally
have that sort of thing solved.

The other good/bad of using file-id is what happens if someone does "rename
foo.txt foo.rst". Are they intending the line-ending of the file to change?
Maybe yes, maybe no; I would generally say no unless they explicitly request
it, which is where file-ids work better.
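
A sketch of the rename case. The pattern table, the file-id string and the two
dictionaries are made up for illustration; they are not how bzr stores
anything:

```python
import fnmatch

# Hypothetical pattern-based rules, keyed on the current file name.
PATTERN_RULES = [("*.txt", "native"), ("*.rst", "lf")]

def policy_by_pattern(path: str) -> str:
    """Look the policy up from the path; a rename can change the answer."""
    for pattern, policy in PATTERN_RULES:
        if fnmatch.fnmatchcase(path, pattern):
            return policy
    return "exact"

# Hypothetical per-file store, keyed on a made-up file-id.
FILE_ID = "foo-20080415-example"
paths = {FILE_ID: "foo.txt"}          # file-id -> current path
policy_by_id = {FILE_ID: "native"}    # file-id -> explicit property

def rename(file_id: str, new_path: str) -> None:
    """Renaming only updates the path; the property stays with the file-id."""
    paths[file_id] = new_path

before = policy_by_pattern(paths[FILE_ID])
rename(FILE_ID, "foo.rst")
after = policy_by_pattern(paths[FILE_ID])

print(before, after)            # native lf
print(policy_by_id[FILE_ID])    # native
```

With patterns the line-ending policy silently follows the new name; with a
property attached to the file-id it only changes when someone asks for it.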


That's all I have for now, but it might give Ian some more context about
line-ending support.

At this point, I'm okay with a 95% solution, because, as always, the 100%
solution is going to be really hard, and never really attainable. (The key is to
just make the 5% not terribly broken.)

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkgFNToACgkQJdeBCYSNAAOEFgCffZtvwWh3yAoOeov0cAr89P2X
oXoAn0QVVKY79f7hsvPyTXsInNznQyaY
=4H9t
-----END PGP SIGNATURE-----


