Last day to vote/reject on proposed EOL names

Stephen J. Turnbull stephen at xemacs.org
Wed Apr 1 06:23:02 BST 2009


Paul Schauble writes:

 > Unfortunately I think the only way to be sure of the file format is to
 > have the person sitting at the computer tell you. You can mechanically
 > guess at the format and ask "Is this file UTF-16LE?" but ultimately only
 > the one creating it knows.

Actually, this isn't true, in two ways.  First off, the whole problem
here is that the ones creating the files don't know and don't care.
As far as they're concerned, the world should conform to whatever
format their editors spit out.  Asking the creator is often not useful.

OTOH, if the file is valid XML, you will know.  If the file is
conformant MIME mail, you will know.  If the file is conforming
Python, you will know.  And so on.  The project can have standards
about such things, and if the file gets broken by an autotransform,
then fire the author (or whatever, presumably lesser, punishment is
appropriate in your organization).  And everything else should be
considered binary until proven text, IMO.
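
To make the "binary until proven text" rule concrete, here is a minimal
Python sketch.  It assumes a project standard that ties a file
extension to a format the file must conform to; the VALIDATORS table
and the proven_text() name are hypothetical, invented for illustration.
Only XML and Python are covered, since the standard library can check
both.

```python
import os
import xml.etree.ElementTree as ET


def _is_xml(data: bytes) -> bool:
    """True if the bytes parse as a well-formed XML document."""
    try:
        ET.fromstring(data)
    except ET.ParseError:
        return False
    return True


def _is_python(data: bytes) -> bool:
    """True if the bytes compile as Python source (nothing is executed)."""
    try:
        compile(data, "<candidate>", "exec")
    except (SyntaxError, ValueError):
        # ValueError also covers UnicodeDecodeError and NUL bytes on
        # older interpreters.
        return False
    return True


# Hypothetical project standard: extension -> conformance check.
VALIDATORS = {".xml": _is_xml, ".py": _is_python}


def proven_text(filename: str, data: bytes) -> bool:
    """Treat a file as text only if it passes the check for its claimed
    format; anything without a check stays binary."""
    ext = os.path.splitext(filename)[1].lower()
    check = VALIDATORS.get(ext)
    return check is not None and check(data)
```

Note that a file with no registered check is never converted, even if
it "looks like" text.  That is the conservative default argued for
above.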

But because of the attitude problem alluded to above, you can't do
that if you want your software to be broadly acceptable.  I've been
contributing to Emacs/Mule for almost 20 years now, and I assure you
the problem is intractable.  Thing is, even a totally naive user can
detect text when he sees it.  (And when he's wrong, as for Makefiles
and MSVC project files, he'll exclaim, "That's cheating!"  "Man is the
measure of all things," you know.)  So in practice, to placate the
majority of your market, you have to autodetect format, and accept
breakage when that fails.
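
For illustration only, autodetection of this kind usually comes down to
a few cheap heuristics.  The sketch below is one possible ordering (BOM
sniffing, then a NUL-byte test, then a trial UTF-8 decode).  The
guess_encoding() name is made up, and the heuristic is not taken from
any particular tool.

```python
import codecs
from typing import Optional

# UTF-32 LE must be tested before UTF-16 LE: its BOM begins with the
# same two bytes.
BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding(data: bytes) -> Optional[str]:
    """Return a codec name if the bytes look like text, else None.

    This is a guess, not a proof: it can be wrong in both directions.
    """
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    if b"\x00" in data:
        return None  # NUL bytes without a BOM: assume binary
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return None  # e.g. Latin-1 text is rejected here
    return "utf-8"
```

The last case is the breakage in miniature: a perfectly readable
Latin-1 file comes back as None, while a binary blob that happens to
be valid UTF-8 would be accepted as text.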

The questions here are (a) how to reduce the breakage to a minimum,
(b) how to document the inevitability of breakage, and (c) how to
enable responsible, clued-in admins like Alexander to set up
easy-to-follow, mostly automated policies in their shop that will
*eliminate* breakage for conforming usage.  I hope (c) is given
precedence!
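
As a rough idea of what (c) could look like, here is a sketch of an
explicit per-pattern policy in Python.  Everything in it is
hypothetical: the POLICY table, the rule names "lf", "crlf" and
"exact", and both function names are illustrative and are not meant
to reflect whatever names the proposal settles on.

```python
import fnmatch

# Hypothetical shop policy, first match wins.  The final catch-all
# means unlisted files are never touched.
POLICY = [
    ("*.py", "lf"),
    ("*.bat", "crlf"),
    ("*", "exact"),
]


def rule_for(filename: str) -> str:
    """Look up the line-ending rule for a file name."""
    for pattern, rule in POLICY:
        if fnmatch.fnmatchcase(filename, pattern):
            return rule
    return "exact"


def apply_rule(data: bytes, rule: str) -> bytes:
    """Normalize line endings according to the rule; 'exact' is a no-op."""
    if rule == "exact":
        return data
    as_lf = data.replace(b"\r\n", b"\n")
    if rule == "lf":
        return as_lf
    return as_lf.replace(b"\n", b"\r\n")
```

With a table like this no guessing happens at all: a file is converted
only because the admin said so, which is what makes breakage avoidable
for conforming usage.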



