[Merge] performance increases for 'bzr status'
Andrew Bennetts
andrew at canonical.com
Tue May 30 16:26:27 BST 2006
On Tue, May 30, 2006 at 08:28:59AM -0500, John Arbash Meinel wrote:
> Andrew Bennetts wrote:
[...]
> >
> > The simplest way is "self.assertEqual(sorted(some_list), some_list)" I think.
> >
> > -Andrew.
>
> Actually, I though 'sorted()' sorts the list and just returns it. ie the
> definition is:
>
> def sorted(lst):
> lst.sort()
> return lst
No, it's more like:
def sorted(sequence):
lst = list(sequence)
lst.sort()
return lst
For example:
>>> l = [3, 2, 1]
>>> sorted(l) == l
False
>>> l == sorted(l)
False
(playing with stuff like this interactively is very easy, I highly recommend
getting into the habit of doing it.)
See also "What's New in Python 2.4", which says:
There is a new built-in function sorted(iterable) that works like the
in-place list.sort() method but can be used in expressions. The differences
are:
* the input may be any iterable;
* a newly formed copy is sorted, leaving the original intact; and
* the expression returns the new sorted copy
(file:///usr/share/doc/python2.4/html/whatsnew/node12.html on my system)
> Which means that the above would always return true.
As you can see, that's not right :)
> I was hoping to have something inexpensive that wouldn't require copying
> the list, but the best I could think of was:
>
> assert sorted(some_list[:]) == some_list, "lists are not sorted".
>
> (This was runtime code, not test code)
If you want the theoretically fast way to check that a sequence is sorted, i.e.
without copying the whole sequence and sorting it (which is potentially O(n log
n), although I believe CPython's implementation is O(n) in the case the list is
already sorted), check that the invariant "seq[x] < seq[x+1]" holds for the
whole sequence. i.e.:
iterator = iter(seq)
previous_element = iterator.next()
for element in iterator:
assert previous_element < element
previous_element = element
(note that for simplicity this abuses the fact that iter(iter(l)) is iter(l)
when l is a list object -- this isn't necessarily true for all iterators.)
I think you'll find that in many cases that "sorted(l) == l" will be much faster
than a for loop that has to execute python bytecodes every iteration. Even
though it does more work, it does it all in C -- and it's more readable anyway.
-Andrew.
More information about the bazaar
mailing list