[MERGE][1.0] update C PatienceDiff to allow for non-strings
John Arbash Meinel
john at arbash-meinel.com
Wed Dec 5 16:26:10 GMT 2007
Aaron Bentley wrote:
> John Arbash Meinel wrote:
>
>> So anyway, I have an updated patch, but it is certainly less critical now. It
>> will allow arbitrary objects, but the performance is pretty equal. (I do wonder
>> if PyObject_Hash isn't the right way to go though. As there will be times when
>> the hash can already be computed, or help in future comparisons.)
>
> If you want to benchmark an actual command, annotate is probably your
> best bet. "diff" only spends a small fraction of its time doing text
> comparisons.
>
>
> Aaron
Well, I'm doing (pseudocode):

  texts = w.get_line_list(w.versions())
  old_text = []
  for text in texts:
      # l[:10] + l[10:] copies each line into a new string object
      text = [l[:10] + l[10:] for l in text]
      time(PatienceMatcher(None, old_text, text).get_matching_blocks())
      old_text = [l[:10] + l[10:] for l in text]

So it is running ~1,600 diffs for the whole ancestry of builtins.py.
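
A minimal runnable sketch of that timing loop, for anyone who wants to try the same shape of benchmark. It uses difflib.SequenceMatcher from the standard library purely as a stand-in for PatienceMatcher, and the helper name time_consecutive_diffs is hypothetical, not something in bzrlib:

```python
import difflib
import time


def time_consecutive_diffs(texts, matcher_cls=difflib.SequenceMatcher):
    """Diff each text against its predecessor; return (diff_count, seconds).

    matcher_cls is an assumption: any class with the SequenceMatcher
    constructor signature and a get_matching_blocks() method will do.
    """
    old_text = []
    count = 0
    start = time.perf_counter()
    for text in texts:
        matcher_cls(None, old_text, text).get_matching_blocks()
        old_text = text
        count += 1
    return count, time.perf_counter() - start
```

Each text here is a list of lines, so one diff is run per version, the first of them against an empty list.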
I think that code like annotate would show larger gains for using PyObject_Hash.
And, in fact, when I was doing:

  old_text = None
  for text in texts:
      if old_text is None:
          old_text = text
          continue
      time(PatienceMatcher(None, old_text, text).get_matching_blocks())
      old_text = None

Which means that it only compares every other pair of texts. In this case
PyObject_Hash is a noticeable win. Because of how "get_line_list()" works, it
re-uses line objects that have not changed, so the hashes for all of those
lines are already cached.
I would imagine that annotate would also give similar benefits to PyObject_Hash
because it is also using get_line_list() to get the texts.
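
To illustrate the difference between the two loops, here is a small sketch in plain Python. It assumes CPython, where a str caches its hash on the object after the first hash() call, and where slicing and re-joining a line longer than 10 characters yields a new object. The helper names fresh_copy and shared_line_count are hypothetical:

```python
def fresh_copy(line):
    # For lines longer than 10 characters this builds a new str object
    # with equal content, so any hash cached on the original is not reused.
    return line[:10] + line[10:]


def shared_line_count(old_lines, new_lines):
    """Count lines in new_lines that are the very same objects as in old_lines."""
    old_ids = {id(line) for line in old_lines}
    return sum(1 for line in new_lines if id(line) in old_ids)


# Two "versions" that share unchanged line objects, the way re-used lines would.
base = ["def builtin_command_%d():\n" % i for i in range(5)]
rev1 = list(base)
rev2 = base[:3] + ["    return something_else\n"]

shared = shared_line_count(rev1, rev2)
copied = [fresh_copy(line) for line in rev2]
shared_after_copy = shared_line_count(rev1, copied)
```

Lines that are shared objects can reuse a hash that was already computed for an earlier diff, while the copied lines compare equal but have to be hashed again.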
From what I measured, using PyObject_Hash costs 25% of the time if the hashes
need to be computed, but saves about 25% of the time if they don't (roughly
4s -> 5s one way, 4s -> 3s the other).
John
=:->