File-types/encodings

Marcus Sundman sundman at iki.fi
Mon Jan 14 15:41:29 GMT 2008


> > How can I tell bzr what types my files are (including, but not
> > limited to, what encodings my text files use)? I mean, obviously
> > diff tools etc. must know how to interpret the files they operate
> > on (e.g., I don't want 'ä' to be shown as '}' or cause some error
> > because its iso-8859-1 encoded byte is invalid when the diff tool
> > reads it as utf-8, or whatever), but I can't seem to find where to
> > enter this information. Also, does bzr have something like
> > subversion's auto-props (although perhaps something that might
> > actually work with basic things like "text/html;charset=utf-8")?
> 
> We don't track character sets explicitly (at the moment).  bzr's
> builtin diff should not care whether the file is iso-8859-1 or utf-8
> or similar character sets.

I see. Still, even if a binary diff gives the same end result no matter
which encoding is used, there's a difference in how that result is
presented to the user. Also, interactive merge utilities can't work
very well if they don't know the encodings of the files. Some textual
file types embed the encoding (e.g., XML files). However, many do not,
and without knowing the encoding it's impossible to be sure what the
file content is. Therefore it would be nice if the version control
system would store this crucial piece of information.
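
To illustrate the ambiguity, here is a minimal Python sketch (not
anything bzr does itself; the variable names are just for
illustration). It shows the same character stored under two encodings,
and what happens when a tool guesses the wrong one:

```python
# The same character, stored under two different encodings.
latin1_bytes = "ä".encode("iso-8859-1")  # a single byte: b'\xe4'
utf8_bytes = "ä".encode("utf-8")         # two bytes: b'\xc3\xa4'

# Reading the iso-8859-1 byte with the right encoding round-trips.
as_latin1 = latin1_bytes.decode("iso-8859-1")

# Reading the iso-8859-1 byte as utf-8 fails: 0xE4 announces a
# multi-byte sequence that never arrives.
try:
    latin1_bytes.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# The reverse mistake never raises an error, because every byte is
# valid iso-8859-1. It silently displays the wrong characters instead.
mojibake = utf8_bytes.decode("iso-8859-1")

print(repr(as_latin1), utf8_ok, repr(mojibake))
```

The bytes alone don't say which reading is correct, so a diff or merge
tool has to be told, or it has to guess.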


- Marcus


