[MERGE] UTF-8 encoding in binary diffs

Thu Jul 12 07:50:50 BST 2007

On 7/12/07, Robert Collins <robertc at robertcollins.net> wrote:
> On Wed, 2007-07-11 at 10:35 -0500, John Arbash Meinel wrote:
> >
> >
> > try:
> >   self.build_tree(u'unicode\u03a9')
> > except UnicodeError:
> >   raise TestSkipped('Platform cannot represent unicode characters')
> >
> > It isn't what I would prefer, but that is what we have been using.
> > John
>
> I'd prefer to see a test helper:
>
> path = self.get_nasty_supported_filename(prefix='foo')
>
> -> full unicode on unicode platforms
> -> just ascii in unicode form on other platforms or when LC_ALL=C
>
> This means that the test can run always, and will test to the limit of
> the platform.
>
> Separately we should have a test that tests the behaviour of filename
> encoding only, e.g. of two historical revisions - and that should work
> regardless of the filesystem unicode support, all it needs is a terminal
> that can show it.

I agree it should be factored out.  I'm not so sure the helper should
return an ASCII filename when the test is running in the C locale,
though.  jml's test would then not really be exercising what it's
meant to, and I'd rather have it visibly skipped than pass without
really testing anything.
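
A skip-based version of the helper could look roughly like this.  It is
only a sketch: the name require_unicode_filename and its fs_encoding
parameter are made up for illustration, and it uses the standard
library's unittest.SkipTest as a stand-in for our TestSkipped.

```python
import sys
import unittest


def require_unicode_filename(prefix, fs_encoding=None):
    """Return prefix plus a non-ASCII character, or skip the test.

    Hypothetical helper: instead of falling back to an ASCII name, it
    raises unittest.SkipTest when the filesystem encoding cannot
    represent the name, so the skip shows up in the test results.
    """
    if fs_encoding is None:
        fs_encoding = sys.getfilesystemencoding()
    name = prefix + '\u03a9'  # GREEK CAPITAL LETTER OMEGA
    try:
        # Only checks that the name is encodable; nothing is created on disk.
        name.encode(fs_encoding)
    except UnicodeError:
        raise unittest.SkipTest(
            'Platform cannot represent unicode characters')
    return name
```

A test would call path = require_unicode_filename('foo') before building
the tree.  Passing fs_encoding='ascii' approximates what happens under
LC_ALL=C.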

A platform-dependent name might have value in other cases though, for
example when we want to allow for platforms that normalize unicode
filenames.  But even then it's probably better for tests that don't
need to exercise normalization to use names that normalization won't
change.
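
One way to check whether a candidate test name is safe in that sense is
to compare it against its NFC and NFD forms.  This is a sketch only:
is_normalization_stable is a made-up name, and it assumes the platforms
we care about normalize to one of those two forms.

```python
import unicodedata


def is_normalization_stable(name):
    """Return True if name is unchanged by both NFC and NFD normalization."""
    return all(unicodedata.normalize(form, name) == name
               for form in ('NFC', 'NFD'))


# U+03A9 (Greek capital omega) has no decomposition, so it is stable.
print(is_normalization_stable('unicode\u03a9'))
# U+00E9 (e with acute) decomposes under NFD, so it is not.
print(is_normalization_stable('caf\u00e9'))
```

Names that pass this check should round-trip unchanged through a
filesystem that applies either form.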

-- 
Martin