[BUG] Re: bug: files with non-ascii chars?

Ramon Diaz-Uriarte rdiaz02 at gmail.com
Sat Dec 30 23:36:31 GMT 2006


On 12/30/06, Jan Hudec <bulb at ucw.cz> wrote:
> On Fri, Dec 29, 2006 at 05:31:05PM +0100, Ramon Diaz-Uriarte wrote:
> > Dear All,
> >
> > This isn't really serious, but it might seem worrisome. In a directory
> > where I accidentally named a file
> >
> > f3.pngç (the weird char is the "ç").
> >
> >
> > I get bzr to crash (problems disappear when I delete that file). I
> > think it'd be better to say explicitly that a file has a weird name.
>
> Well, bzr is supposed to handle all filenames, that can be decoded to
> unicode using user's locale setting.
>

Actually, I think I created/renamed the file under an emacs shell
(which allowed me to type "ç" without noticing). I cannot type that
character in my usual xterm.



> Can you please tell us what is your locale setting? The problem seems to
> be, that the filename is invalid in your LC_CTYPE locale. Does

Output from locale is:
LANG=
LC_CTYPE="POSIX"
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_COLLATE="POSIX"
LC_MONETARY="POSIX"
LC_MESSAGES="POSIX"
LC_PAPER="POSIX"
LC_NAME="POSIX"
LC_ADDRESS="POSIX"
LC_TELEPHONE="POSIX"
LC_MEASUREMENT="POSIX"
LC_IDENTIFICATION="POSIX"
LC_ALL=


> 'echo f3.png*' show that filename correctly?
>
>

Under an xterm it works; under the emacs shell it shows
f3.png\347
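
For what it's worth, the \347 above is just the octal escape for byte
0xe7, which is "ç" in Latin-1 (ISO-8859-1) but is not valid ASCII. A
quick check, written for a current Python 3 rather than the Python 2.4
in the traceback (the variable names are only for illustration):

```python
# Octal escape \347 and hex escape \xe7 denote the same byte value.
raw_name = b"f3.png\347"
assert raw_name == b"f3.png\xe7"

# Latin-1 maps byte 0xe7 to U+00E7 (c with cedilla).
latin1_name = raw_name.decode("latin-1")
assert latin1_name == "f3.png\u00e7"

# The ASCII codec cannot decode that byte; keep the exception for inspection.
try:
    raw_name.decode("ascii")
    ascii_error = None
except UnicodeDecodeError as exc:
    ascii_error = exc

# The error reports byte 0xe7 at position 6, as in the bzr traceback.
print(ascii_error)
```

So the name is fine as Latin-1 bytes, but a POSIX (ASCII) locale has no
way to decode it.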


> Anyway, bzr should not give a backtrace even in such a case. Would you
> be so kind and fill in a bugreport on launchpad?
> https://bugs.launchpad.net/products/bzr/+bugs
>

Done.



> To everybody: The message output after the traceback still asks users to
> send the backtrace to this list. Shouldn't it ask them to file a bug on
> launchpad instead?
>

(I wasn't sure whether to file a bug report, or whether the message to
the list was intended to prevent unneeded/irrelevant/inappropriate bug
reports).


R.

>
> More detailed analysis:
> sort method of a collection can only give UnicodeDecodeError inside
> comparison. The comparison will raise it if a non-ascii string is
> compared to a unicode. Now os.listdir with unicode argument only yields
> string objects if the filename is not valid according to current locale.
>
> It is probably OK to complain when such non-decodeable file is
> encountered, but it should at least be with an understandable message.
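
A minimal sketch of what such a check could look like, assuming the
file names arrive as raw bytes (as os.listdir does when given a bytes
path in Python 3). This is not bzrlib code: split_by_decodability and
describe_undecodable are hypothetical names, and it is written for
Python 3, where sorting a mix of bytes and str raises TypeError rather
than the UnicodeDecodeError seen under Python 2.4.

```python
def split_by_decodability(raw_names, encoding):
    """Split raw byte file names into (decoded names, undecodable raw names)."""
    decoded, undecodable = [], []
    for raw in raw_names:
        try:
            decoded.append(raw.decode(encoding))
        except UnicodeDecodeError:
            # Keep the raw bytes aside instead of mixing them into the sort.
            undecodable.append(raw)
    return sorted(decoded), sorted(undecodable)


def describe_undecodable(raw_names, encoding):
    """Build one readable message per undecodable file name."""
    return [
        "file name %r cannot be decoded using the %r encoding" % (raw, encoding)
        for raw in raw_names
    ]


# Illustrative input only; a real caller might use os.listdir(b".").
names, bad = split_by_decodability([b"README", b"f3.png\xe7", b"a.txt"], "ascii")
messages = describe_undecodable(bad, "ascii")
print(names)
print(messages)
```

Separating the two groups before sorting avoids the mixed comparison
altogether, and the undecodable names can then be reported one by one
instead of ending in a traceback.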
>
> > $ bzr status
> > bzr: ERROR: exceptions.UnicodeDecodeError: 'ascii' codec can't decode
> > byte 0xe7 in position 6: ordinal not in range(128)
> >
> > Traceback (most recent call last):
> >  File "/usr/lib/python2.4/site-packages/bzrlib/commands.py", line
> > 626, in run_bzr_catch_errors
> >    return run_bzr(argv)
> >  File "/usr/lib/python2.4/site-packages/bzrlib/commands.py", line
> > 588, in run_bzr
> >    ret = run(*run_argv)
> >  File "/usr/lib/python2.4/site-packages/bzrlib/commands.py", line
> > 292, in run_argv_aliases
> >    return self.run(**all_cmd_args)
> >  File "/usr/lib/python2.4/site-packages/bzrlib/commands.py", line
> > 598, in ignore_pipe
> >    result = func(*args, **kwargs)
> >  File "/usr/lib/python2.4/site-packages/bzrlib/builtins.py", line 177, in
> >  run
> >    to_file=self.outf)
> >  File "/usr/lib/python2.4/site-packages/bzrlib/status.py", line 137,
> > in show_tree_status
> >    specific_files=specific_files)
> >  File "/usr/lib/python2.4/site-packages/bzrlib/tree.py", line 86, in
> > changes_from
> >    include_root=include_root
> >  File "/usr/lib/python2.4/site-packages/bzrlib/decorators.py", line
> > 38, in read_locked
> >    return unbound(self, *args, **kwargs)
> >  File "/usr/lib/python2.4/site-packages/bzrlib/tree.py", line 428, in
> >  compare
> >    specific_file_ids, include_root)
> >  File "/usr/lib/python2.4/site-packages/bzrlib/delta.py", line 186,
> > in _compare_trees
> >    new_path, new_class, new_kind, new_file_id, new_entry =
> >    get_next(new_files)
> >  File "/usr/lib/python2.4/site-packages/bzrlib/delta.py", line 182, in
> >  get_next
> >    return iter.next()
> >  File "/usr/lib/python2.4/site-packages/bzrlib/workingtree.py", line
> > 875, in list_files
> >    children.sort()
> > UnicodeDecodeError: 'ascii' codec can't decode byte 0xe7 in position
> > 6: ordinal not in range(128)
> >
> > bzr 0.13.0 on python 2.4.4.candidate.0 (linux2)
> > arguments: ['/usr/bin/bzr', 'status']
> >
> >
> > Best,
> >
> > R.
> >
> > --
> > Ramon Diaz-Uriarte
> > Statistical Computing Team
> > Structural Biology and Biocomputing Programme
> > Spanish National Cancer Centre (CNIO)
> > http://ligarto.org/rdiaz
> --------------------------------------------------------------------------------
>                                                 - Jan Hudec `Bulb' <bulb at ucw.cz>
>


-- 
Ramon Diaz-Uriarte
Statistical Computing Team
Structural Biology and Biocomputing Programme
Spanish National Cancer Centre (CNIO)
http://ligarto.org/rdiaz



