[RFC] optimizing bzr-grep
Parth Malwankar
parth.malwankar at gmail.com
Wed Mar 17 02:21:57 GMT 2010
On Wed, Mar 17, 2010 at 5:29 AM, Martin Pool <mbp at canonical.com> wrote:
> On 17 March 2010 01:27, Parth Malwankar <parth.malwankar at gmail.com> wrote:
>> Is there anything I can do to speed up getting the full text of
>> a revision?
>
> Well, if by commenting this line out you're grepping a 0-byte string
> it wouldn't be surprising if it's fast :-)
>
So I took four measurements with a hot cache on the emacs tree:

  grep last:1                                              : ~26.4s
  last:1 but return just after get_file_text (no grepping) : ~22.5s
  return just before get_file_text (no grepping)           : ~1.1s
  grep working copy                                        : ~4.3s

The numbers seem consistent: the grepping itself accounts for about 4s
of the last:1 run, in line with the ~4.3s working copy grep, so most of
the time is going into get_file_text.
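
As a quick sanity check, the subtraction can be spelled out. This is only
arithmetic on the four timings above; the variable names are made up for
illustration and are not part of bzr-grep.

```python
# Timings reported above, in seconds (hot cache, emacs tree).
grep_last1 = 26.4            # full grep of last:1
after_get_file_text = 22.5   # return just after get_file_text, no grepping
before_get_file_text = 1.1   # return just before get_file_text, no grepping
grep_working_copy = 4.3      # grep of the working copy

# Time attributable to fetching the file texts from the repository.
text_extraction = after_get_file_text - before_get_file_text

# Time attributable to the grepping itself on last:1.
grep_only = grep_last1 - after_get_file_text

print("get_file_text: ~%.1fs" % text_extraction)
print("grepping:      ~%.1fs" % grep_only)
print("working copy:  ~%.1fs" % grep_working_copy)
```

So roughly 21.4s of the 26.4s is text extraction and about 3.9s is
grepping, which is close to the working copy figure.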
> You should make sure you're holding a read lock on the whole
> repository for the whole time, so that things can be cached. -Drelock
> may help.
>
> Using log+file://.... for the repository may indicate inefficient IO.
>
> Using iter_file_bytes may be faster, or even better iter_files_bytes
> will let the repository choose a more efficient order. This will also
> let you check for binaries inline with grepping.
>
> It may be faster to grep the whole thing as a string before splitting
> it into lines.
>
> Use --lsprof.
>
> Compare the time to grep a revision to the time to export it.
>
'bzr export' is surprisingly fast compared to the grep implementation.
It takes just ~7.8s to export the entire tree (-r last:10). I will look
into how it's doing this.
Thanks for all the pointers. I will experiment with them.
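
For reference, the "grep the whole thing as a string before splitting it
into lines" pointer might look roughly like the sketch below. It is plain
Python using only the re module, not the bzr-grep code; the function names
are hypothetical. The whole-text search is used only as a prefilter, and it
assumes simple patterns (no \A, \Z or lookarounds), for which a match on
any line implies a match on the whole text in MULTILINE mode.

```python
import re


def grep_by_lines(text, pattern):
    """Split into lines first, then test every line."""
    regex = re.compile(pattern)
    return [(number, line)
            for number, line in enumerate(text.split("\n"), 1)
            if regex.search(line)]


def grep_whole_text(text, pattern):
    """Search the whole text once; split into lines only on a hit."""
    # MULTILINE makes ^ and $ match at every "\n" boundary, so the
    # prefilter agrees with the per-line search for simple patterns.
    prefilter = re.compile(pattern, re.MULTILINE)
    if prefilter.search(text) is None:
        return []  # no match anywhere, skip the per-line work
    return grep_by_lines(text, pattern)


sample = "alpha\nbeta\ngamma beta\n"
print(grep_whole_text(sample, "beta"))
print(grep_whole_text(sample, "delta"))
```

The saving, if any, would come from files with no match at all, where the
text never gets split into lines. Whether that helps here depends on how
many files in the tree actually contain the pattern.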
Regards,
Parth