my strategy on implementing line-endings (eol) support

Stefan Monnier monnier at iro.umontreal.ca
Fri Apr 4 16:20:44 BST 2008


Just a plea here: please don't assume Unicode = utf-16.
E.g., in the unix world, utf-16 basically doesn't exist and most uses of
Unicode are with the utf-8 encoding instead.

The problems you're talking about have nothing to do with Unicode, but
with the utf-16 encoding instead (tho it probably affects utf-32 as
well).  I wouldn't be surprised if other (non-Unicode) encodings suffer
from similar problems.  So please say "utf-16" rather than "Unicode".
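
To make the distinction concrete, here is a minimal Python sketch (not
Bazaar code; the helper names naive_lf_to_crlf and
survives_naive_conversion are made up for illustration).  It assumes the
problem under discussion is a byte-level LF -> CRLF replacement, and
checks which encodings of the same Unicode text survive it:

```python
def naive_lf_to_crlf(data: bytes) -> bytes:
    # Hypothetical byte-level EOL conversion: it assumes a newline is
    # always the single byte 0x0A, which only holds for ASCII-compatible
    # encodings such as utf-8.
    return data.replace(b"\n", b"\r\n")


def survives_naive_conversion(text: str, encoding: str) -> bool:
    # Encode, convert at the byte level, decode again, and compare with
    # the conversion done on the characters themselves.
    converted = naive_lf_to_crlf(text.encode(encoding))
    try:
        return converted.decode(encoding) == text.replace("\n", "\r\n")
    except UnicodeDecodeError:
        # The inserted 0x0D byte broke the 2- or 4-byte code unit alignment.
        return False


if __name__ == "__main__":
    for enc in ("utf-8", "utf-16-le", "utf-32-le"):
        print(enc, survives_naive_conversion("a\nb", enc))
```

The same text is fine as utf-8 and breaks as utf-16 or utf-32, because in
those encodings a newline is 0x0A plus padding bytes.  Worse, the byte
0x0A can also show up inside unrelated characters: U+010A encoded as
utf-16-le is the two bytes 0x0A 0x01.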

Also I wouldn't worry too much about this problem: utf-16 being almost
exclusively used under Windows, I suspect that all the tools that can
handle UTF-16 can also perfectly deal with CRLF line endings, so there's
probably never any need for any form of EOL conversion on those files.


        Stefan "who's been confused for the Nth time by this mixup in
                this thread."



