Rev 5121: Do some documentation about how the classes are supposed to work together. in http://bazaar.launchpad.net/~jameinel/bzr/2.2.0b2-pack-collection

Wed Jun 16 18:43:47 BST 2010

At http://bazaar.launchpad.net/~jameinel/bzr/2.2.0b2-pack-collection

------------------------------------------------------------
revno: 5121
revision-id: john at arbash-meinel.com-20100616174337-2o56z0n1tqnn4mm5
parent: john at arbash-meinel.com-20100616161659-s8luo9fm6h5x9c2v
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: 2.2.0b2-pack-collection
timestamp: Wed 2010-06-16 12:43:37 -0500
message:
  Do some documentation about how the classes are supposed to work together.
-------------- next part --------------
=== modified file 'bzrlib/pack_collection.py'

--- a/bzrlib/pack_collection.py	2010-06-16 16:16:59 +0000
+++ b/bzrlib/pack_collection.py	2010-06-16 17:43:37 +0000
@@ -14,7 +14,118 @@
 # along with this program; if not, write to the Free Software
 # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
 
-"""Code for managing a collection of pack files."""
+"""Code for managing a collection of pack files.
+
+Rough design constraints.
+
+ 1) We must be able to support the existing Repository layout and
+    functionality.
+
+ 2) We would like to be able to support a layout that is 'simpler'. For
+    example, using a single directory, rather than
+    upload/indices/obsolete_packs/packs.
+
+ 3) We would like to be able to support 'all-in-one' pack files (like
+    bzr-search), that contain the index information as part of the .pack file
+    (at the end).
+
+    a) We like the 'pack-names' meta file, as a single place where you can read
+       what you need to know about the basic structure of all the pack files.
+
+ 4) We want to support flexibility in what kinds of indexes exist, and what
+    kinds of 'full content' are stored.
+
+    a) bzr-svn wants a key-value storage to hold the svn<=>bzr mapping
+       information. This might just be indexes with no actual data content.
+
+    b) an Annotation cache would probably contain data of the annotation
+       mapping, and indexes to make it easy to find the annotation basis keyed
+       by the flags given to the annotation function. (a memo)
+
+ 5) We would like a bit more unit-testing of the structures. Currently most
+    PackCollection testing is actually done indirectly via Repository tests.
+
+
+Current Design:
+
+ PackCollection
+    top-level structure. Somewhat analogous to a Repository. It manages a bunch
+    of pack files, combining them as necessary, allowing you to create new
+    ones, etc. It provides a way to get a VersionedFiles view on some of the
+    content, based on a given index name.
+    It should also provide a way to get access to a simple Index across
+    multiple pack files, and be able to insert new values into that Index.
+    It has a MemoTracker and a PackPolicy.
+
+ MemoTracker
+    A 'memo' is the short information (one-line, storable in an index) about a
+    pack file. This should allow you to know what indexes are available for a
+    given pack file, where they are on disk, and how big they are, etc.
+    These 'memos' are aggregated into a collection (eg 'pack-names'), so that
+    an overview of all known pack files is obtainable.
+
+    This is also the class which is responsible for handling concurrent updates
+    to the memo file. (So if a new pack is added by another process, we can see
+    and update appropriately.)
+
+    This class uses several policy objects, to allow it to support both
+    different memo formats and different content layouts.
+
+    LockingPolicy
+        Defines 'ensure_safe_to_cache', 'lock_write', and 'unlock'. For a
+        Repository, this is equivalent to 'is_locked()', 'lock_names()', and
+        '_unlock_names()'.
+        The idea is that we want a safety check that it is okay to hold on to
+        memory structures, and a way to ensure that concurrent updates to the
+        meta-file are properly serialized.
+        This is a Policy object because something like an Annotation cache
+        won't be directly tied to a Repository, and will need a different
+        locking policy.
+
+    IndexPolicy
+        Used to allow different formats for the actual memo content. Decouples
+        the actual location and format of the memo file, from the logical
+        content that is stored. Older repo formats need a GraphIndex, newer
+        need a BTreeGraphIndex, potential memo storage could be in something
+        that is a simple text file, etc. We don't really need an 'index' since
+        the file is always read in its entirety.
+
+    UpdatePolicy
+        Collection of callbacks, to keep other code aware of changes to the
+        memo. For example, when syncing to the disk content, we fire
+        UpdatePolicy.memo_added to let other code know that there is a new
+        memo that needs to be addressed.
+
+        This decouples MemoTracker from PackCollection, so that MemoTracker
+        doesn't need to know the details of PackCollection, but can inform it
+        when it finds someone else added new packs to the record.
+
+ Pack
+    This is what PackCollection is aggregating. Essentially meant to describe
+    the '.pack' file. Also tracks the indexes associated with this pack file.
+
+    NewPack
+        An instance of Pack, where the indexes and content can be added to.
+        Has the functionality to 'finish' the content, and describe how to
+        access it in a new 'memo'. The memo should be added to the MemoTracker
+        to make the new content visible.
+
+    ExistingPack
+        The read-only version of Pack.
+
+    ResumedPack
+        If a NewPack was suspended, its content gets written to disk somewhere.
+        ResumedPack is responsible for knowing where that content is, and how to
+        finish inserting that content into the collection.
+
+ PackPolicy
+    Used by PackCollection to define how to open existing pack files, restore
+    suspended files, create new files, etc. It also understands what a
+    'memo' means, and how to turn a given memo back into a Pack object.
+
+    This is where the policy exists as to whether we put the new content into
+    'upload' and then rename it into 'packs/' when finished.
+"""
 
 import cStringIO
 import struct
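
A rough sketch of how the classes described in the docstring might fit together, placed after the diff so the hunk itself is untouched. It is not the code from this branch. Only the names PackCollection, MemoTracker, LockingPolicy, UpdatePolicy, 'ensure_safe_to_cache', 'lock_write', 'unlock' and 'memo_added' come from the docstring; everything else (the constructors, 'sync_from_disk', 'refresh', the dict standing in for 'pack-names', and the placeholder memo strings) is a hypothetical stand-in for illustration.

```python
class InMemoryLockingPolicy(object):
    """Hypothetical LockingPolicy: tracks the lock state in memory only."""

    def __init__(self):
        self.locked = False

    def ensure_safe_to_cache(self):
        # Safety check: only hold on to in-memory structures while locked.
        if not self.locked:
            raise RuntimeError('not safe to cache memos while unlocked')

    def lock_write(self):
        self.locked = True

    def unlock(self):
        self.locked = False


class CallbackUpdatePolicy(object):
    """Hypothetical UpdatePolicy: forwards memo_added to a single callable."""

    def __init__(self, on_added):
        self._on_added = on_added

    def memo_added(self, name, memo):
        self._on_added(name, memo)


class MemoTracker(object):
    """Tracks memos; 'disk' is a plain dict standing in for 'pack-names'."""

    def __init__(self, disk, locking_policy, update_policy):
        self._disk = disk
        self._locking = locking_policy
        self._update = update_policy
        self._memos = {}

    def sync_from_disk(self):
        # Hypothetical method: notice memos that another process added.
        self._locking.ensure_safe_to_cache()
        for name in sorted(self._disk):
            if name not in self._memos:
                self._memos[name] = self._disk[name]
                self._update.memo_added(name, self._disk[name])


class PackCollection(object):
    """Owns a MemoTracker and learns about new packs through callbacks."""

    def __init__(self, disk):
        self.packs = {}
        self.locking = InMemoryLockingPolicy()
        self.tracker = MemoTracker(
            disk, self.locking, CallbackUpdatePolicy(self._memo_added))

    def _memo_added(self, name, memo):
        # A real PackPolicy would turn the memo back into a Pack object;
        # this sketch just stores the memo itself.
        self.packs[name] = memo

    def refresh(self):
        self.locking.lock_write()
        try:
            self.tracker.sync_from_disk()
        finally:
            self.locking.unlock()


# Usage: a second "process" adds a memo, and refresh() picks it up.
disk = {'pack-a': 'placeholder memo a'}
collection = PackCollection(disk)
collection.refresh()
disk['pack-b'] = 'placeholder memo b'
collection.refresh()
```

The point of the callback indirection is the one the docstring gives for UpdatePolicy: MemoTracker never imports or inspects PackCollection, it only calls memo_added on whatever policy it was handed.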