my strategy on implementing line-endings (eol) support
Alexander Belchenko
bialix at ukr.net
Wed Apr 2 18:15:37 BST 2008
Nicholas Allen wrote:
>
> |
> | In my view there are 4 types of files:
> |
> | 1) binary files
> | 2) text files with exact line-endings
> | 3) text files with native/LF/CRLF/CR line-endings
> | 4) unicode text files similar to 3.
> Aren't there just 2 types of files (binary and text)? Type 4 above is just a
> text file with encoding set to unicode. So I think file encoding needs
> to be another property (UTF8, ASCII, unicode etc).
From the eol-conversion point of view it's not. In UTF-16 a newline is a
two-byte sequence whose byte order depends on the encoding variant:
In [1]: u'\n'.encode('utf-16-le')
Out[1]: '\n\x00'
In [2]: u'\n'.encode('utf-16-be')
Out[2]: '\x00\n'
In [3]: u'\n'.encode('utf-16')
Out[3]: '\xff\xfe\n\x00'
By 'unicode text files' I actually mean 'utf-16'-encoded files.
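
To illustrate why this matters, here is a minimal sketch (Python 3, not taken from any bzr code; the function names naive_lf_to_crlf and utf16_lf_to_crlf are hypothetical) comparing a byte-level LF -> CRLF conversion with one that decodes the text first. It assumes the encoding of the file is already known and that the input contains only LF line endings.

```python
def naive_lf_to_crlf(data: bytes) -> bytes:
    """Byte-level conversion: fine for ASCII/UTF-8, wrong for UTF-16."""
    return data.replace(b"\n", b"\r\n")


def utf16_lf_to_crlf(data: bytes, encoding: str = "utf-16-le") -> bytes:
    """Decode first, convert on characters, then re-encode.

    Assumes `data` uses LF-only line endings; existing CRLF pairs
    would be turned into CR CR LF by this simple replace.
    """
    text = data.decode(encoding)
    return text.replace("\n", "\r\n").encode(encoding)


original = "a\nb".encode("utf-16-le")

# The naive version inserts a single 0x0D byte, which shifts every
# following 16-bit code unit by one byte and leaves an odd byte count.
broken = naive_lf_to_crlf(original)

# The encoding-aware version inserts a full two-byte '\r' code unit.
fixed = utf16_lf_to_crlf(original)

try:
    broken.decode("utf-16-le")
    naive_result_is_valid = True
except UnicodeDecodeError:
    naive_result_is_valid = False

print(repr(original))
print(repr(broken), "valid UTF-16:", naive_result_is_valid)
print(repr(fixed))
```

So for eol conversion the 'utf-16' case needs its own handling, separate from ordinary text files where the newline is a single 0x0A byte.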