newformat format change
Martin Pool
martinpool at gmail.com
Thu Oct 6 11:16:21 BST 2005
On 05/10/05, John A Meinel <john at arbash-meinel.com> wrote:
> One problem with the current trapping (where you use a regular
> expression to substitute everything that isn't a word character)
> name = re.sub(r'[^\w.]', '', name)
>
> Which I believe will catch newlines and tabs. But it also seems to catch
> too much in the way of international characters.
>
> Back when I was testing with Arabic characters, it was essentially
> generating file-ids with just the last portion (no filename part).
> Now, maybe you feel that your unique identifier is sufficient (it could be).
I thought \w would match all Unicode word characters.
We should fix that; it seems reasonable to treat the id as Unicode,
since filenames can be.
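
To illustrate the difference, here is a minimal sketch, not the actual bzr code. It assumes Python 3's re module, where re.ASCII restricts \w to [a-zA-Z0-9_] and so stands in for the non-Unicode matching described above. The function names and the sample filename are made up for illustration.

```python
import re

def sanitize_ascii(name):
    # Hypothetical helper: with re.ASCII, \w matches only [a-zA-Z0-9_],
    # so non-ASCII letters are stripped along with whitespace.
    return re.sub(r'[^\w.]', '', name, flags=re.ASCII)

def sanitize_unicode(name):
    # Hypothetical helper: with re.UNICODE (the default for str patterns
    # in Python 3), \w also matches letters from other scripts.
    return re.sub(r'[^\w.]', '', name, flags=re.UNICODE)

arabic_name = "\u0645\u0644\u0641.txt"  # three Arabic letters plus ".txt"

print(repr(sanitize_ascii(arabic_name)))    # only the ".txt" part survives
print(repr(sanitize_unicode(arabic_name)))  # the Arabic letters are kept
print(repr(sanitize_unicode("a\tb\n.txt")))  # tabs and newlines still removed
```

Either way the tabs and newlines are dropped; only the Unicode-aware version keeps the filename part for non-Latin names.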
--
Martin