Making Wrong Code Look Wrong

Thu Apr 21 19:03:47 UTC 2011

On Thu, 2011-04-21 at 09:26 -0400, Barry Warsaw wrote:
> Being absolutely, rock-solid
> clear about what are bytes and what are strings is typically the most
> difficult part of a Python 3 conversion.

For a coding convention that helps with that sort of thing, see
Joel Spolsky's "Making Wrong Code Look Wrong":
http://www.joelonsoftware.com/articles/Wrong.html

The gist: The version of Hungarian Notation that escaped into the
wild (largely via the Windows APIs and their recommended coding
conventions) got a deservedly bad reputation, but it was a
botched misinterpretation of the concept.  The original
idea (from Microsoft's Applications team -- the Excel and Word
people) was much more useful.

The idea is to add a prefix denoting not a variable's data
type, but rather its *semantic* type.  So instead of prefixing
every string variable with "s" for "string", which is pointless
and annoying, variables might be given prefixes to denote whether
a given string contains Unicode text, UTF-8, the locale's default
encoding, straight ASCII, or binary data.

That way, one can spot encoding mismatches at a glance, rather
than having to trace back to find out what flavour was assigned
to each variable in a given statement.  Which of these has an
encoding-mismatch bug?
    this = that + other
    uc_this = uc_that + uc_other
    uc_this = uc_that + utf_other
    uc_this = uc_that + uc_from_utf(utf_other)
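
(The third line is the one whose prefixes disagree; the first
gives no clue either way.)  As a minimal sketch of the last two
lines, assuming Python 3 and assuming that uc_from_utf, which is
only named above and not defined anywhere, simply decodes UTF-8
bytes:

```python
def uc_from_utf(utf_value: bytes) -> str:
    """Hypothetical helper: decode UTF-8 bytes into Unicode text."""
    return utf_value.decode("utf-8")


uc_that = "na\u00efve "                    # uc_  = decoded Unicode text
utf_other = "caf\u00e9".encode("utf-8")    # utf_ = UTF-8 encoded bytes

# Prefixes agree on both sides of the "+", so this looks right.
uc_this = uc_that + uc_from_utf(utf_other)

# Prefixes disagree, so this looks wrong.  Python 3 happens to
# reject it as well, because it mixes str and bytes.
try:
    uc_bad = uc_that + utf_other
except TypeError:
    uc_bad = None
```

The sample values are made up for illustration; only the prefixes
and the helper's name come from the example above.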

I infer that Python 3's bytes-vs-str distinction would help
somewhat, but presumably it's not nearly as fine-grained: it
separates text from bytes, but not one byte encoding from another.
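
A rough illustration of that gap, assuming Python 3 (the lat_
prefix for Latin-1 bytes is a hypothetical addition, not part of
any established convention):

```python
text = "caf\u00e9"
utf_name = text.encode("utf-8")      # UTF-8 bytes:   b'caf\xc3\xa9'
lat_name = text.encode("latin-1")    # Latin-1 bytes: b'caf\xe9'

# Both values are plain bytes, so Python 3 accepts the mix without
# complaint; only the differing prefixes hint at a problem.
mixed = utf_name + lat_name

# The mistake surfaces later, when something tries to decode it.
try:
    mixed.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False
```

Here the type system is satisfied, and the mismatch shows up only
in the names or at decode time.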

That's the theory.  I've only just discovered this, so haven't
had a chance yet to use it "in anger", which means I can't report
on how well it works in practice.  Does anyone else have
experience with it?  Does Bazaar already have any such
conventions?

  - Eric