is utf-8 the standard filename encoding?

Steve Langasek steve.langasek at
Wed Dec 21 19:28:06 UTC 2011

On Wed, Dec 21, 2011 at 02:18:11PM -0500, Rodney Dawes wrote:
> On Wed, 2011-12-21 at 09:42 -0800, Steve Langasek wrote:
> > It's possible I'm mistaken about the default behavior on Ubuntu Server,
> > though - someone please correct me if I'm wrong.  Maybe this is another
> > reason why we need to get the C.UTF-8 locale going everywhere.

> It is definitely not using C.UTF-8 everywhere.

No, I'm saying we *need* to get a C.UTF-8 locale.  We currently don't have
one for installed systems.

> And just C is not UTF-8.


> Is it even valid to specify a charset for C locale?  Doesn't POSIX define
> it as always being ASCII?

It's not valid for the C locale.  C.UTF-8 locale would be a distinct locale.

> > Notwithstanding the above (which indeed also explains why using the locale's
> > charset value is a poor heuristic for interpreting filenames on the Linux
> > filesystem), it's my understanding that the GNOME vfs stack has refused for
> > several years now to work with any filenames that aren't UTF-8.  So desktop
> > users with non-utf8 filenames are going to have a hard time of it.

> This isn't quite true. There is a complicated set of environment
> variables, and checks in the code, to ensure that display is always
> UTF-8, but it generally handles non-UTF-8 filenames gracefully.

It's graceful compared to a Python backtrace, but AFAIK it doesn't actually
get you access to the files with wrong names?
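
To illustrate what "access to the files with wrong names" could look like,
here is a minimal Python sketch. It is an illustration only and says nothing
about how the GNOME vfs stack behaves. It assumes filenames are handled as raw
bytes and decoded as UTF-8 for display, with the `surrogateescape` error
handler preserving any invalid bytes so the original name can be recovered.
The helper names `decode_filename` and `encode_filename` and the sample
Latin-1 name are hypothetical.

```python
from typing import Tuple


def decode_filename(raw: bytes) -> Tuple[str, bool]:
    """Decode a raw filename; return (text, was_valid_utf8)."""
    try:
        return raw.decode("utf-8"), True
    except UnicodeDecodeError:
        # Invalid bytes become lone surrogates (U+DC80..U+DCFF), so no
        # information is lost and the name can still be round-tripped.
        return raw.decode("utf-8", errors="surrogateescape"), False


def encode_filename(name: str) -> bytes:
    """Recover the exact on-disk bytes from a decoded filename."""
    return name.encode("utf-8", errors="surrogateescape")


# A name written by a Latin-1 application: 0xE9 is not valid UTF-8 here.
legacy = b"caf\xe9.txt"
text, valid = decode_filename(legacy)

# The decoded form is not displayable as-is, but the original bytes survive,
# so the file can still be opened with encode_filename(text).
round_tripped = encode_filename(text)
```

Python's own `os.fsdecode` and `os.fsencode` apply the same error handler on
POSIX systems, but with the filesystem encoding derived from the environment
rather than a hard-coded UTF-8.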

Steve Langasek                   Give me a lever long enough and a Free OS
Debian Developer                   to set it on, and I can move the world.
Ubuntu Developer                          
slangasek at                                     vorlon at

More information about the ubuntu-devel mailing list