[MERGE][1.0] update C PatienceDiff to allow for non-strings

John Arbash Meinel john at arbash-meinel.com
Wed Dec 5 16:26:10 GMT 2007


Aaron Bentley wrote:
> John Arbash Meinel wrote:
> 
>> So anyway, I have an updated patch, but it is certainly less critical now. It
>> will allow arbitrary objects, but the performance is pretty equal. (I do wonder
>> if PyObject_Hash isn't the right way to go though. As there will be times when
>> the hash can already be computed, or help in future comparisons.)
> 
> If you want to benchmark an actual command, annotate is probably your
> best bet.  "diff" only spends a small fraction of its time doing text
> comparisons.
> 
> 
> Aaron

Well, I'm doing (pseudocode):

texts = w.get_line_list(w.versions())

old_text = []
for text in texts:
  # rebuild every line so no string object (or its cached hash) is re-used
  text = [l[:10] + l[10:] for l in text]
  time(PatienceMatcher(None, old_text, text).get_matching_blocks())
  old_text = [l[:10] + l[10:] for l in text]

So it is running ~1,600 diffs over the whole ancestry of builtins.py.
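
A runnable sketch of that timing loop, in case anyone wants to reproduce it.
This is only an approximation: it is written for Python 3, it uses
difflib.SequenceMatcher as a stand-in for PatienceMatcher (which has the same
constructor shape), and the function name and the "texts" argument are made up
for the example. "texts" is assumed to be a list of versions, each a list of
line strings.

```python
import time
from difflib import SequenceMatcher  # stand-in for PatienceMatcher (assumption)


def time_consecutive_diffs(texts, matcher_class=SequenceMatcher):
    """Diff every version against its predecessor.

    Returns (number_of_diffs, total_seconds_spent_matching).
    """
    total = 0.0
    count = 0
    old_text = []
    for text in texts:
        # Rebuild every line so that no string object is shared with the
        # version store or with the previous iteration.
        text = [l[:10] + l[10:] for l in text]
        start = time.perf_counter()
        matcher_class(None, old_text, text).get_matching_blocks()
        total += time.perf_counter() - start
        count += 1
        # Copy again so the next "old" side also starts from fresh objects.
        old_text = [l[:10] + l[10:] for l in text]
    return count, total
```

Passing a different matcher_class should let the same loop compare two
implementations, provided they accept (None, a, b) and offer
get_matching_blocks().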

I think that code like annotate would show larger gains for using PyObject_Hash.

And, in fact, I saw that when I was doing:

old_text = None
for text in texts:
  if old_text is None:
    old_text = text
    continue
  time(PatienceMatcher(None, old_text, text).get_matching_blocks())
  old_text = None

This only compares every other pair of texts. In this case
PyObject_Hash is a noticeable win. Because of how "get_line_list()" works, it
will re-use the line objects that have not changed, so all of those hashes are
already cached.
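
To make the re-use point concrete, here is a small Python sketch. The helper
names are hypothetical and nothing here comes from bzrlib. It relies on
CPython behaviour: a string object stores its hash after the first hash()
call, and slicing plus concatenating a line longer than 10 characters builds
a new object that has to be hashed again.

```python
def fresh_copy(line):
    # Equal to `line`, but (in CPython, for lines longer than 10 characters)
    # a different object, so any hash cached on `line` is not carried over.
    return line[:10] + line[10:]


def shared_line_count(old_text, new_text):
    """Count lines in new_text that are the very same objects as in old_text."""
    old_ids = {id(l) for l in old_text}
    return sum(1 for l in new_text if id(l) in old_ids)


old = ["def first_function():\n", "    return 'unchanged'\n"]
# A new version that re-uses the first line object and adds a new line,
# roughly what get_line_list() is described as doing for unchanged lines.
new = [old[0], "    return 'changed'\n"]

reused = shared_line_count(old, new)
copies = [fresh_copy(l) for l in old]
reused_after_copy = shared_line_count(old, copies)
```

Lines counted by shared_line_count() are the ones whose hashes would already
be cached; after fresh_copy() the lines still compare equal, but the sharing
(and with it the cached hash) is gone.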

I would imagine that annotate would also give similar benefits to PyObject_Hash
because it is also using get_line_list() to get the texts.

From what I measured, using PyObject_Hash costs about 25% more time if the
hashes need to be computed, but saves about 25% of the time if they don't
(4s -> 5s one way, 4s -> 3s the other).

John
=:->




