[MERGE] Packs. Kthxbye.

Ian Clatworthy ian.clatworthy at internode.on.net
Fri Oct 19 03:53:06 BST 2007


Robert Collins wrote:
> I think the moment has come to get the pack repository format reviewed.

Here are some review comments for the code in pack_repo.py before
RepositoryPackCollection.

The __init__ method for Pack should take a name and a pack_transport
parameter. These are used by methods at that level (e.g. access tuple)
so I feel they ought to exist at that level. If you don't want them as
constructor params, then set them to None in __init__ anyway please. I
know we have differing views on this but I think base classes need to be
semantically complete on their own, i.e. if there are abstract
attributes or methods, they ought to be declared and commented to
explain their reason for being, e.g.

  "set this in derived classes to ..." or
  "override in derived classes to ...".

BTW, NewPack subclasses Pack yet doesn't call the Pack constructor in
its own. That smells bad to me.

Along the same lines, index_name(), index_offset() and indices ought to
migrate from NewPack to Pack. Those 2 methods can be static. Likewise,
inventory_index_name(), revision_index_name(), signature_index_name()
and text_index_name() all take name as a mandatory parameter rather than
using self.name, so I'd suggest renaming them as *_for_name() and/or
making them static.

In the docstring for ExistingPack, s/exisiting/existing/.

In the __init__ method for NewPack, you do a good job of making it very
clear about the constraint required between upload_transport and
pack_transport, namely

  pack_transport = upload_transport.clone("../pack")

Given that, there appears to be a strong case for not having the
pack_transport passed in as a parameter. (I think jam raised something
like this as well IIRC.)

In this comment:

  # synchronised with reads, because its not in the transport layer, so

s/its/it's/

In the comment "1MB buffer cap" in that method, remove the size and
simply say "cap the buffer size". If the buffer size is left as zero,
every write gets flushed. I gather that's by design.

In NewPack.finish(), the naming logic scares me. If 2 people working
from the same revision of a branch/checkout make the same change/fix and
commit it, will the pack contents be the same and therefore the names?
I'm yet to check whether this is managed by the fetch logic but choosing
pack names like this sounds potentially dangerous. OTOH, it *could* be a
feature? In that case, we need to make it so and explicitly handle it as
an optimisation throughout the code, yes? "Ah yes, I have that content."
Not sure about the index implications though.

In NewPack.make_index docstring, s/packs/packs's/.

In NewPack._write_index docstring, remove the param descriptions for
index_offset and name_getter. They aren't params anymore/yet.

In AggregateIndex docstrings,

  s/An index from the pack parameter./An index for the pack./

Now for the fun bit: RepositoryPackCollection. :-)

Ian C.



More information about the bazaar mailing list