[MERGE] integrated EOL conversion support
Stephen J. Turnbull
stephen at xemacs.org
Mon Mar 30 08:03:32 BST 2009
Ian Clatworthy writes:
> Thanks for the feedback Jonathan. I think you're confirming what John is
> saying: people mentally think about the working tree format as their
> priority even if the collective "project" cares more about what's stored
> in the VCS repository.
I still don't understand this claim. The only project that ever cares
about the repo format per se is Bazaar. Nobody else ever reads or
writes repo storage, except for a few thrill seekers with hex editors.
Us users would like to have repo formats just plain go away, eol
format, rich root, locks, stock and barrel, thank-you-very-much! They
are a major source of complexity, not to mention confusion, in Bazaar.
The "project" cares about two things: it wants its developers to see
no extraneous control characters or formatting bogosity in their
language translators, editors and viewers, and it doesn't want
non-text files changed at all except by an explicit act of a
developer.
So IMHO what you really want here is the following:
(1) A user-visible flag that says whether the file is text or binary,
probably defaulting to binary. bzr should never munge binary
files, on pain of being dpkg --purge'd with extreme prejudice.
(It might be reasonable to have a pre-commit check warning "if you
commit this file, all the line endings will change!") It should
be moderately annoying to change this flag to text, i.e., the user is
asked to confirm "If you set this to text, on checkout and commit
bzr will silently change the content of the affected files so that
characters that look like line endings conform to the platform
default or user settings. It is safe and convenient for most text
files. BUT THIS CAN RESULT IN IRRETRIEVABLE DATA CORRUPTION! Are
you sure you want to do this?" This setting is propagated to
branches when pulled, pushed, or cherry-picked.
(2) If the file is text, a user-visible option that sets the
checked-out file's line ending format. This option may get its
value from several places: an explicit flag to the checkout
command (syntax will be painful since this should be a per-file
setting, not an "everything in this command" setting); an explicit
configuration in the workspace; an explicit user preference; or a
default for the platform where bzr is being executed. If a value
is stored in the part of the branch that gets communicated to
other branches, it should be very low priority (higher than
platform default, but lower than any user setting). It might be
useful to warn if a repo default is set, and the user sets EOL
otherwise (eg, in case of MSVC project files).
(3) An internal (not user-visible) per-file flag that determines the
storage format of non-binary files. (By "not user-visible" I
suppose I mean that there is exactly one command that can change
the internal representation of EOLs for a given file, and it does
absolutely nothing else, and it scolds you for even dreaming of
using it.) This is necessary so that changing from binary to
text, or changing the internal representation, doesn't affect
annotate results. You'd probably have to maintain a history of
such property changes, but I haven't thought carefully about that.
This is propagated from branch to branch.
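To make the precedence in (2) and the binary rule in (1) concrete, here is a minimal sketch in Python. It is not bzr code: the function names, argument names and the "lf"/"crlf" strings are all hypothetical, and the relative order of the command flag, workspace configuration and user preference is my assumption (taken from the order they are listed in). The only ordering stated above is that a branch-stored default ranks below any user setting and above the platform default.

```python
import os

# Hypothetical names for the two line-ending styles discussed in this thread.
EOL_BYTES = {"lf": b"\n", "crlf": b"\r\n"}


def resolve_eol(command_flag=None, workspace=None, user_pref=None,
                branch_default=None, platform_default=None):
    """Pick the working-tree EOL style for one text file.

    Assumed order: command flag, workspace config, user preference,
    branch-stored default, platform default.
    """
    if platform_default is None:
        platform_default = "crlf" if os.linesep == "\r\n" else "lf"
    for candidate in (command_flag, workspace, user_pref, branch_default):
        if candidate is not None:
            return candidate
    return platform_default


def checkout_content(stored, is_text, eol):
    """Return the bytes to write into the working tree.

    Binary files are passed through untouched. Text files are assumed to
    contain only LF or CRLF endings; lone CRs are not handled here.
    """
    if not is_text:
        return stored
    normalized = stored.replace(b"\r\n", b"\n")
    return normalized.replace(b"\n", EOL_BYTES[eol])
```

With this sketch, a branch default of "crlf" wins over a platform default of "lf", but any user setting overrides it, and a file flagged binary comes back byte for byte whatever the EOL setting is.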
> Several people, John included from memory, have suggested using compound
> names like native-lf, crlf-crlf, etc. where one part is the WT format and
> another part is the repo format. If we do this and put the WT format first,
> we get:
>
> * native-lf (instead of lf)
> * native-crlf (instead of crlf)
> * lf-lf (instead of lf-always)
> * crlf-crlf (instead of crlf-always)
> * exact.
>
> Would those names be better and reduce the potential for confusion?
I think they present an unnecessary potential for confusion. The user
only cares about the file he edits and compiles (or whatever), not
what's in the repository. Admins care about what's in the repository
because of the diff/annotate problem, but only to the extent that they
don't want it to change by accident.
Isn't exact another name for 'binary'?
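For reference, the compound names quoted above just encode a pair (working tree format, repo format). A rough sketch of that reading, in Python, with a hypothetical helper name and with 'exact' treated as "no conversion at all" (which is my reading, not something the proposal spells out):

```python
def split_rule(name):
    """Split a compound EOL rule name into (working_tree, repository).

    'exact' is assumed to mean no conversion, so it has no pair.
    """
    if name == "exact":
        return None
    working_tree, repository = name.split("-", 1)
    return working_tree, repository
```

So 'native-lf' reads as "platform-native in the working tree, LF in the repo", and a user who only cares about the first element still has to read past the second.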