line endings

Alexander Belchenko bialix at ukr.net
Thu Jan 31 09:47:53 GMT 2008


Stuart McGraw wrote:
> Marcus Sundman wrote:
>> Stuart McGraw <smcg4191 at frii.com> wrote:
>>> So my question is, how does one use Bazaar so that
>>> branches created on a Windows box will have the right
>>> line endings?  Is there some process I overlooked?
>>> A plugin?  A command option I missed?  Some third party
>>> add-in somewhere?  Does Bazaar development have any
>>> plans regarding this?
>>
>> Well, before changing line breaks one has to know the encoding of the
>> file (or otherwise it might break completely), 
> 
> How many encodings exist where this is actually a
> problem?  And how often are they used?  How often
> does the confluence of using such an encoding, and
> requiring auto l.e. conversion actually occur in
> the real world?  It's not a problem in the encodings
> I have to deal with (the common 1-byte iso encodings,
> utf8, iso-2022-jp, sjis).  I have not heard any
> screams of outrage from the TortoiseCVS folks.
> How about from the Mercurial folks who also have
> this?

If you're using ASCII-compatible encodings with 1-byte
code units (including UTF-8), the problem with line endings
is pretty simple. It's always \r\n or \n (CRLF or LF).
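
To illustrate the simple case, here is a minimal sketch in Python. The
function names are hypothetical and this is not something Bazaar provides;
it only assumes an encoding in which the bytes 0x0D and 0x0A never mean
anything other than CR and LF, as is the case for UTF-8.

```python
def crlf_to_lf(data: bytes) -> bytes:
    """Convert CRLF line endings to LF (hypothetical helper)."""
    return data.replace(b"\r\n", b"\n")


def lf_to_crlf(data: bytes) -> bytes:
    """Convert LF line endings to CRLF (hypothetical helper).

    Existing CRLF pairs are normalized to LF first, so that they
    do not end up as CR CR LF.
    """
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")
```

Both functions work on raw bytes, so no decoding is needed for such
encodings.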

But for 2-byte Unicode encodings like UTF-16
(and I think it's true for 4-byte UTF-32 as well)
line endings become more complex, i.e. for UTF-16LE:

\r\0\n\0 and \n\0 (CRLF or LF).

For UTF-16BE, of course, the byte order will be different.

So for UTF-16 the CRLF <-> LF conversion has to be slightly
different from the one for 1-byte texts.
And this conversion should be blazingly fast, of course.
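
A rough sketch of what "slightly different" could mean for UTF-16, again
in Python and again with a hypothetical function name. A plain byte-level
replace is not safe here, because the 4-byte CRLF pattern can also show up
straddling two unrelated 16-bit code units, so the sketch only accepts
matches that start on an even byte offset. It assumes the caller already
knows the byte order, and it makes no claim about being fast enough.

```python
def utf16_crlf_to_lf(data: bytes, byteorder: str = "little") -> bytes:
    """Convert CRLF to LF in UTF-16 bytes (hypothetical helper).

    byteorder is "little" for UTF-16LE or "big" for UTF-16BE.
    A 2-byte BOM at the start does not disturb the alignment check.
    """
    cr = (0x0D).to_bytes(2, byteorder)
    lf = (0x0A).to_bytes(2, byteorder)
    crlf = cr + lf

    out = bytearray()
    copied = 0   # everything before this offset is already in out
    start = 0    # where the next search begins
    while True:
        idx = data.find(crlf, start)
        if idx == -1:
            break
        if idx % 2 == 0:
            # Real CRLF: aligned on a 16-bit code unit boundary.
            out += data[copied:idx]
            out += lf
            copied = idx + len(crlf)
            start = copied
        else:
            # False match across two code units: skip past it.
            start = idx + 1
    out += data[copied:]
    return bytes(out)
```

For example, utf16_crlf_to_lf("a\r\nb".encode("utf-16-le")) gives the same
bytes as "a\nb".encode("utf-16-le"). The LF -> CRLF direction would need
the same alignment check.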

> And how would those numbers compare to the numbers
> of files handled by people like me, who develop multi-
> platform code?  I submit that the latter number is
> many, many orders of magnitude greater than the former.
> Given that such processing can be made completely
> optional, encoding breakage does not seem like a
> strong objection to me.
> 
> Binary file breakage might be a little stronger but
> looking at existing experience with other systems
> does the (small) risk outweigh the IMO much greater
> problems created for the common case of multi-platform
> development?  (Again we're talking about something
> purely opt-in.)
> 
>> doesn't yet support that I don't think we'll see line break changing
>> implemented anytime soon.
> 
> That's unfortunate.  It's a show-stopper for my use
> of Bazaar.

I know. It hurts me too.


