[RFC] optimizing bzr-grep

Wed Mar 17 13:27:21 GMT 2010

On Wed, Mar 17, 2010 at 2:29 AM, John Arbash Meinel
<john at arbash-meinel.com> wrote:
> Parth Malwankar wrote:
>> Hello,
>>
>> I am working on optimizing bzr-grep searching specific revs[1].
>> I managed to get the time down from ~33s to ~23s for specific
>> rev search (e.g. -r last:1, not revision range). To get this down
>> further my experiments show that majority of the time is now
>> spent in:
>>     file_text = tree.get_file_text(id)
>>
>
>  tree.iter_files_bytes()
>
> This was designed as a way to favor extraction speed. Specifically, it

Cheers for iter_files_bytes. The grep time for a specific rev is now
down to 8.5s for the emacs tree (from 33s initially and 23s earlier today).
It's ~4s for the working copy. This also includes the optimization for
-F/--fixed-string from earlier today.
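
A minimal sketch of the per-file versus batched extraction pattern, for
readers who have not used iter_files_bytes. FakeTree is a hypothetical
stand-in so the example runs without bzr installed; the request shape
(a list of (file_id, identifier) pairs) and the result shape (pairs of
identifier and an iterable of byte chunks) follow my understanding of the
bzrlib call, and grep_tree is an illustrative name, not bzr-grep's code.

```python
class FakeTree(object):
    """Hypothetical stand-in for a bzrlib tree (assumption, not real bzrlib)."""

    def __init__(self, texts):
        self._texts = texts  # maps file_id -> file contents as bytes

    def get_file_text(self, file_id):
        # One call per file: the pattern that dominated the profile.
        return self._texts[file_id]

    def iter_files_bytes(self, desired_files):
        # Batched form: takes (file_id, identifier) pairs and yields
        # (identifier, iterable of byte chunks) pairs.
        for file_id, identifier in desired_files:
            yield identifier, iter([self._texts[file_id]])


def grep_tree(tree, wanted, pattern):
    """Return (path, line_number, line) for lines containing pattern.

    wanted is a list of (file_id, path) pairs; pattern is a fixed byte
    string, so a plain substring test is enough (as with -F).
    """
    hits = []
    for path, chunks in tree.iter_files_bytes(wanted):
        text = b"".join(chunks)
        for number, line in enumerate(text.splitlines(), 1):
            if pattern in line:
                hits.append((path, number, line))
    return hits
```

The identifier is passed back with each result, so the caller should key
on it rather than assume results arrive in request order.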

[emacs-bzr]% time bzr grep -r last:10 ffo  > /dev/null
bzr grep -r last:10 ffo > /dev/null  7.60s user 0.92s system 99% cpu 8.576 total
[emacs-bzr]% time bzr grep -r last:10..last:9 ffo  > /dev/null
bzr grep -r last:10..last:9 ffo > /dev/null  20.57s user 1.53s system 94% cpu 23.318 total
[emacs-bzr]%

I plan to keep working on this to see if it can be optimized further.
Range grep (e.g. -r last:10..last:9, ~23s in the timings above) is still
a little slow.

I think the binary optimization can't be done at this point as
iter_files_bytes returns the full text.
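
A small sketch of why this is so, assuming "binary optimization" means
skipping files that look binary. The NUL-byte probe and both function
names are assumptions for illustration, not taken from bzr-grep. The
check can only run once the bytes are in hand, so it saves the line
scan but not the extraction cost.

```python
def looks_binary(text, probe=1024):
    # Assumed heuristic (not from bzr-grep): a NUL byte within the first
    # `probe` bytes marks the file as binary.
    return b"\x00" in text[:probe]


def grep_skipping_binary(files, pattern):
    """files yields (path, chunks) pairs, the shape iter_files_bytes returns."""
    hits = []
    for path, chunks in files:
        text = b"".join(chunks)  # the full text has already been extracted
        if looks_binary(text):
            continue  # only the line scan is skipped, not the extraction
        for number, line in enumerate(text.splitlines(), 1):
            if pattern in line:
                hits.append((path, number, line))
    return hits
```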

Thanks Martin, John.

Regards,
Parth