KnitSequenceMatcher a net performance loss
Robert Collins
robertc at robertcollins.net
Mon May 29 07:53:06 BST 2006
On Sun, 2006-05-28 at 21:32 -0500, John Arbash Meinel wrote:
> I'm not sure why it is, but in my work on performance testing
> PatienceDiff, I include a test of the KnitSequenceMatcher. And what I
> found was that KnitSequenceMatcher is actually slower than difflib's
> plain sequence matcher.
>
> I turned the patience-test.py script into a bzr plugin, available here:
> http://bzr.arbash-meinel.com/plugins/patience_test
>
> And this is what I found:
>
> for 20 knits:
> pdiff time: 2.82s 1553511 bytes
> cpdiff time: 2.39s 1553511 bytes
> kdiff time: 3.45s 1550155 bytes
> diff time: 2.75s 1550155 bytes
>
> for 224 knits:
> pdiff time: 47.61s 106.2% (relative to difflib time)
> cpdiff time: 39.92s 89.0%
> kdiff time: 54.12s 122.7%
> diff time: 44.85s 100.0%
>
> I'm running a complete test, but it hasn't finished yet. So far this shows
> that my modified Python PatienceSequenceMatcher runs within a few percent
> of difflib's SequenceMatcher (48s vs 45s). The compiled matcher is about
> 11% faster than difflib, but the knit sequence matcher is considerably slower.
>
> So I would recommend that we go ahead and switch to
> PatienceSequenceMatcher for knits. It isn't as fast as difflib, but at
> least we get some better line annotations out of it.
>
> Now, I don't know how the sequence matcher was performance tested, so I
> might be doing something weird. But I'm testing at a pretty high level,
> so I think the comparison is valid.
Which difflib are you using? Perhaps the implementation I copied was a
much slower one.
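
One way to pin down both the difflib in use and a baseline figure is a
small timing script. This is only a sketch, not the patience_test plugin:
the helper names time_matcher and relative_percent and the synthetic
texts are assumptions for illustration, and real knit contents would
replace the generated lines.

```python
import difflib
import time


def time_matcher(matcher_class, text_pairs):
    """Return (seconds, opcode_count) for diffing every (old, new) pair.

    matcher_class must accept (isjunk, a, b) like difflib.SequenceMatcher.
    """
    start = time.perf_counter()
    opcode_count = 0
    for old_lines, new_lines in text_pairs:
        matcher = matcher_class(None, old_lines, new_lines)
        opcode_count += len(matcher.get_opcodes())
    return time.perf_counter() - start, opcode_count


def relative_percent(seconds, baseline_seconds):
    """Express a timing as a percentage of the baseline timing."""
    return 100.0 * seconds / baseline_seconds


# Synthetic stand-in for knit texts (hypothetical data): each pair is a
# 200-line text and a copy with every 25th line changed.
old = ["line %d\n" % i for i in range(200)]
new = [("changed %d\n" % i) if i % 25 == 0 else line
       for i, line in enumerate(old)]
pairs = [(old, new)] * 20

# Shows which difflib module is actually being imported.
print("difflib loaded from:", difflib.__file__)

seconds, opcodes = time_matcher(difflib.SequenceMatcher, pairs)
print("diff time: %.2fs (%d opcodes)" % (seconds, opcodes))
```

Any other matcher class with the same constructor and get_opcodes()
could be passed to time_matcher, and relative_percent then gives the
"relative to difflib time" column from the two timings.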
Rob
--
GPG key available at: <http://www.robertcollins.net/keys.txt>.