[MERGE] Repository.get_revision_graph improvements
Robert Collins
robertc at robertcollins.net
Tue Feb 26 22:34:22 GMT 2008
On Tue, 2008-02-26 at 16:16 -0600, John Arbash Meinel wrote:
...
Our current scaling performance is due to python internals not disk
layout. Thats what I was getting at. Anyhow re disk format see below for
my current code :).
> By having the hash table entries be fixed size, then you know how much
> space to give it at the beginning of the file. You could also compress
> it, and just reserve enough space to hold it uncompressed. You would
> still have a fixed max size, and just have to read less data when you
> actually go to read the file.
+class SortedGraphIndexBuilder(_GraphIndexBuilder):
+ """A builder that can build a SortedGraphIndex.
+
+ The resulting index has the structure:
+ _SIGNATURE OPTIONS KEYHASH COLLISIONLIST NODES NEWLINE
+ _SIGNATURE := 'Bazaar Graph Index 2' NEWLINE
+ OPTIONS := 'node_ref_lists=' DIGITS NEWLINE
+ 'key_elements=' DIGITS NEWLINE
+ 'len=' DIGITS NEWLINE
+ 'hash_bytes=' DIGITS NEWLINE
+ 'offset_bytes=' DIGITS NEWLINE
+ 'node_offset=' DIGITS NEWLINE
+ KEYHASH := An array of offset_byte length integers in network byte
+ order. Each entry in the array is the location of the key
+ that had that hash prefix in the file, and the array is
+ complete (that is, if hash_bytes=1, there are 256 entries
+ in the array). In pathological cases we may need to
+ handle collisions (so that we don't have many gigabytes
+ of index when two entries hash closely: in those cases
+ the location pointed at will be in the collision-handling
+ list. NEWLINE follows the keyhash for debugging ease.
+ COLLISIONLIST := COLLISION * (in lexographic order)
+ COLLISION := KEY NULL REFERENCE NEWLINE
+ NODES := NODE*
+ NODE := KEY NULL ABSENT? NULL REFERENCES NULL VALUE NEWLINE
+ KEY := Not-whitespace-utf8
+ ABSENT := 'a'
+ REFERENCES := REFERENCE_LIST (TAB REFERENCE_LIST){node_ref_lists - 1}
+ REFERENCE_LIST := (REFERENCE (CR REFERENCE)*)?
+ REFERENCE := DIGITS ; digits is the byte offset in the index of the
+ ; referenced key.
+ VALUE := no-newline-no-null-bytes
+ """
+
+ _SIGNATURE = _SIGNATURE_2
You should check the branch out sometime :)
-Rob
--
GPG key available at: <http://www.robertcollins.net/keys.txt>.
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: This is a digitally signed message part
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20080227/eb79d895/attachment-0001.pgp
More information about the bazaar
mailing list