[MERGE] push handles file-ids containing quotes correctly

Aaron Bentley aaron.bentley at utoronto.ca
Mon Jul 10 17:42:13 BST 2006


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

John Arbash Meinel wrote:
> +1 on having our own _unescape that does the right thing. 


> +0 on having
> the tests all the way up at 'push'.

There are two tests.  One to make sure the push command does the right
thing for evermore, and one to make sure that fetch does the right thing
with all InterRepository implementations.

 also +0 for having to pass over the
> string N times for N replacements.

I'm just fixing a broken function, not rewriting it the way I think it
should have been written in the first place.

> I think a better implementation would look like:
> 
> _unescape_map = {
> 	'apos':"'",
> 	'quot':'"',
> 	'amp':'&',
> ...
> }
> _unescape_re = None
> 
> def _unescaper(match, _map=_unescape_map):
>     return _map[match.group(0)]
> 
> def _unescape_xml(data):
>     global _unescape_re
>     if _unescape_re is None:
> 	_unescape_re = re.compile('\&([^;]*);')
>     return _unescape_re.sub(_unescaper, data)

Yes, that's what I would do also, if I were rewriting it instead of
fixing it.  Or I might try to use a real entity decoder that can handle
numeric entity references.

> This means that we will raise a
> KeyError if a pattern isn't found, but it is better than an assert error
> (which won't be run in python -O). 

If there is a codepath that can produce this, there is a programming
error somewhere, so I think an assert is reasonable.

> It also means a single pass over the
> data rather than multiple passes.

This seems nicer to me, but I have no benchmarks to prove that the
difference is significant.

> Also, this should be tested directly (by calling _unescape_xml directly
> with several different inputs). If there is a problem higher up, then
> maybe we should also have a test for 'push'. But to start with, we
> should have direct tests of '_unescape_xml'.

Strongly disagree.  Private methods need not be tested, but their
clients must be tested.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFEsoNl0F+nu1YWqI0RAjo3AJ9Dj5asEBHaYN+JhL+Q7XQzFmYQngCeMQcE
IWTL0IOzzq7QckizozLe6Nw=
=bzab
-----END PGP SIGNATURE-----




More information about the bazaar mailing list