fulltext searching of bzr repositories
Robert Collins
robertc at robertcollins.net
Mon Jun 9 14:08:22 BST 2008
https://edge.launchpad.net/bzr-search
Over the weekend, I felt this was an interesting topic to have a stab at.
The result is a plugin which creates a search index. Currently it only
indexes the revision commit messages, but it's pretty straightforward to
index additional components: all it needs is logic to generate a
posting list from them, and a Hit instance to provide reporting when the
results are found.
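To illustrate the idea of a posting list plus a Hit object, here is a minimal
sketch in plain Python. It is not the plugin's actual code or API: the names
RevisionHit, terms_of, build_postings and search are hypothetical, the index
lives in memory rather than on disk, and the revision ids in the usage note
are made up.

```python
import re
from collections import defaultdict


class RevisionHit:
    """Hypothetical hit type that reports one matching revision."""

    def __init__(self, revision_id, summary):
        self.revision_id = revision_id
        self.summary = summary

    def describe(self):
        return "Revision id %r. Summary: %r" % (self.revision_id, self.summary)


def terms_of(text):
    """Split text into a set of lowercase word terms."""
    return set(re.findall(r"\w+", text.lower()))


def build_postings(messages):
    """Map each term to the set of revision ids whose message contains it."""
    postings = defaultdict(set)
    for revision_id, message in messages.items():
        for term in terms_of(message):
            postings[term].add(revision_id)
    return postings


def search(postings, messages, query):
    """Return hits for the revisions that contain every term in the query."""
    terms = terms_of(query)
    if not terms:
        return []
    # Intersect the posting lists of all query terms.
    matching = set.intersection(*(postings.get(t, set()) for t in terms))
    # Use the first line of the commit message as the summary.
    return [RevisionHit(r, messages[r].splitlines()[0]) for r in sorted(matching)]
```

With messages = {"rev-1": "Fix workaround for bug in http ra backend."},
calling search(build_postings(messages), messages, "workaround") returns a
single RevisionHit for "rev-1". Indexing another component would, under the
same assumptions, mean supplying a different posting generator and a matching
Hit type.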
There are currently two caveats. First, the disk format is not finalised, so
users will need to remove the indices and recreate them as I tweak it further.
Second, and this is the reason the disk format is not finalised: when
a posting list is more than 2K long, the bzrlib index bisection logic
fails to parse the output index. This means that running it on bzr
itself fails :(. But it works fine on bzr-svn, for instance:
plugins/svn/trunk$ time bzr index .
real 0m1.518s
user 0m1.140s
sys 0m0.100s
plugins/svn/trunk$ time bzr search workaround
Revision id 'jelmer at samba.org-20080511215646-kxxs86xvurf96nuq'. Summary:
'Fix workaround for bug in http ra backend.'
real 0m0.411s
user 0m0.324s
sys 0m0.080s
Cheers,
Rob
--
GPG key available at: <http://www.robertcollins.net/keys.txt>.