line endings redux

Alexander Belchenko bialix at
Tue Mar 6 20:48:02 GMT 2007

John Whitley wrote:
> Some open questions to spark discussion and/or further spec writing:
> * What command(s) are needed to manipulate the EOL property on existing
>   files?
> * What values of the EOL property are supported?  What are the semantics
>   of each?  (Jan Hudec gives a nice enumeration of the possibilities in
>   the current spec.)

At a minimum we should support LF, CRLF, CR, 'native' (platform-dependent) and
'exact' (no conversion at all).
But there are two more types:

1) completely binary files (similar to 'exact', but *never* producing a textual diff)
2) auto -- this is mostly how bzr handles files now: it auto-detects text vs. binary and then
uses 'exact' for text files.
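
To make the list above concrete, here is a minimal Python sketch of the proposed property values. The names `EolStyle`, `looks_binary` and `wants_text_diff` are hypothetical and not part of bzr, and the NUL-byte check is only a common heuristic assumed for illustration, not necessarily how bzr detects binary files.

```python
from enum import Enum


class EolStyle(Enum):
    """Hypothetical names for the EOL property values discussed above."""
    LF = "lf"
    CRLF = "crlf"
    CR = "cr"
    NATIVE = "native"   # platform-dependent
    EXACT = "exact"     # no conversion
    BINARY = "binary"   # no conversion, never a textual diff
    AUTO = "auto"       # detect text/binary, treat text as 'exact'


def looks_binary(data: bytes, probe_size: int = 8192) -> bool:
    # Assumed heuristic: a NUL byte near the start suggests binary content.
    return b"\x00" in data[:probe_size]


def wants_text_diff(style: EolStyle, data: bytes) -> bool:
    """Return True if a textual diff should be produced for this file."""
    if style is EolStyle.BINARY:
        return False            # explicitly marked binary: never diff as text
    if style is EolStyle.AUTO:
        return not looks_binary(data)
    return True                 # every other style is a text file
```

Under this sketch, only 'binary' and 'auto' affect whether a textual diff is attempted; the remaining values differ only in how line endings are written.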

> * Are there any issues to consider when upgrading an existing branch?

I think it will be pretty safe. All committed revisions will automatically get the 'auto' property for all
files. I think we need to provide some tool (a plugin) to change the repository, i.e. to let
people set the eol properties of already-committed files. Of course this means that revids will change,
and it will probably be hard to implement. But I think it is an important part of the upgrade.
Alternatively, the user could leave the 'auto' property of committed revisions unchanged and track eol
only in new revisions. This may break diffs between old and new revisions (i.e. the diff shows every
line deleted and then re-added), but that is a reasonable tradeoff.
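
The "all deleted, then all added" effect can be reproduced with Python's standard difflib. This is only an illustration of the problem, not bzr's diff code; the sample text and the helper names `similarity` and `normalize` are made up for the example.

```python
from difflib import SequenceMatcher

# Old revision stored as-is with CRLF; new revision stored LF-only.
old_lines = "alpha\r\nbeta\r\ngamma\r\n".splitlines(keepends=True)
new_lines = "alpha\nbeta\ngamma\n".splitlines(keepends=True)


def similarity(a, b):
    """Fraction of matching lines between two line lists (0.0 to 1.0)."""
    return SequenceMatcher(None, a, b).ratio()


def normalize(lines):
    """Force every line to end in LF before comparing."""
    return [line.rstrip("\r\n") + "\n" for line in lines]


# No line matches byte-for-byte, so a diff would show everything replaced.
raw_similarity = similarity(old_lines, new_lines)

# After normalizing line endings the two revisions are identical.
normalized_similarity = similarity(normalize(old_lines), normalize(new_lines))
```

A diff tool that normalized line endings on the fly for old revisions would avoid the noise, at the cost of extra work per comparison.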

> * When a user upgrades a branch to the new working tree format, how do
>   they set the appropriate eol properties on existing files in the tree?

Manually, via some new command (prop-set ?)

> * Binary file detection/handling.  EOL conversion can corrupt files, so
>   care is needed here.

I'm convinced that the user should explicitly mark binary files.

> * How/where to manifest EOL settings in the UI (status, etc.)?  This
>   dovetails with the above item -- e.g. can an alert user see when an
>   added file doesn't have the correct EOL setting?

Why is this actually needed?

In my understanding:
1) Text files (excluding 'exact' eol) are *always* saved in the repository with LF-only line endings.
This simplifies cat, checkout, and diff between revisions, even if the file has
different eol settings in those revisions.
2) EOL settings are used only when checking files out from the repository to disk.
When bzr reads text files from disk for commit, status or diff, line endings are automatically
stripped and forced to LF internally (without affecting the actual file on disk). Using LF internally
simplifies many operations and therefore gives good performance.
3) 'bzr update' should update the files and apply the current EOL setting to each file.
This way the user can change the EOL property of some files, run 'bzr update', and get those files rewritten.
4) Roughly the same applies to the encoding property of text files: in the repository, text files should be
stored as utf-8. If a file contains characters that cannot be decoded to Unicode with the current
encoding property value, then the commit should fail.
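
The four points above can be sketched as a pair of conversion functions. This is a rough illustration under stated assumptions, not bzr code: the function names, the `EOL_CHARS` table and the string property values are hypothetical, and treating 'exact' as skipping only the EOL step (while still transcoding) is a choice made for this sketch.

```python
import os

# Hypothetical mapping from an EOL property value to the on-disk line ending.
EOL_CHARS = {
    "lf": "\n",
    "crlf": "\r\n",
    "cr": "\r",
    "native": os.linesep,
}


def to_repository(data: bytes, eol: str, encoding: str = "utf-8") -> bytes:
    """Convert working-file bytes to the form stored in the repository."""
    # Raises UnicodeDecodeError when the bytes do not match the declared
    # encoding; this is the point where a commit would be refused.
    text = data.decode(encoding)
    if eol != "exact":
        # Fold CRLF first so it does not turn into two newlines.
        text = text.replace("\r\n", "\n").replace("\r", "\n")
    return text.encode("utf-8")


def to_working_file(stored: bytes, eol: str, encoding: str = "utf-8") -> bytes:
    """Convert repository bytes to the form written on checkout or update."""
    text = stored.decode("utf-8")
    if eol != "exact":
        text = text.replace("\n", EOL_CHARS[eol])
    return text.encode(encoding)
```

For example, a latin-1 file with CRLF endings would be stored as utf-8 with LF endings, and changing its EOL property to 'cr' followed by an update would rewrite it with CR endings without touching the stored text.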

> I'm up for helping drive the spec to closure.  Depending on the
> timeframe, I may also be able to spend some cycles on implementation.

For me the main blocker is the new inventory format.
I'm working on a new format; my goal is not only to implement support for the new properties, but also to provide
better performance for inventory operations. My progress is very slow,
and I have now stopped work until the dirstate changes have stabilized and landed.
