Rev 2822: Minor tree & dirstate code cleanups (Ian Clatworthy) in file:///home/pqm/archives/thelove/bzr/%2Btrunk/

Canonical.com Patch Queue Manager pqm at pqm.ubuntu.com
Fri Sep 14 05:31:47 BST 2007


At file:///home/pqm/archives/thelove/bzr/%2Btrunk/

------------------------------------------------------------
revno: 2822
revision-id: pqm at pqm.ubuntu.com-20070914043128-s7ck70bf5bzefbiq
parent: pqm at pqm.ubuntu.com-20070914031609-ccdhl0ebzrohpa1t
parent: ian.clatworthy at internode.on.net-20070914025553-63arf9n0j8kisqdi
committer: Canonical.com Patch Queue Manager <pqm at pqm.ubuntu.com>
branch nick: +trunk
timestamp: Fri 2007-09-14 05:31:28 +0100
message:
  Minor tree & dirstate code cleanups (Ian Clatworthy)
modified:
  bzrlib/dirstate.py             dirstate.py-20060728012006-d6mvoihjb3je9peu-1
  bzrlib/tree.py                 tree.py-20050309040759-9d5f2496be663e77
  bzrlib/workingtree_4.py        workingtree_4.py-20070208044105-5fgpc5j3ljlh5q6c-1
    ------------------------------------------------------------
    revno: 2820.1.1
    merged: ian.clatworthy at internode.on.net-20070914025553-63arf9n0j8kisqdi
    parent: pqm at pqm.ubuntu.com-20070914020622-8ebz7llponlts1na
    parent: ian.clatworthy at internode.on.net-20070914022532-72l99l9pxew366ct
    committer: Ian Clatworthy <ian.clatworthy at internode.on.net>
    branch nick: ianc-integration
    timestamp: Fri 2007-09-14 12:55:53 +1000
    message:
      Minor tree & dirstate code cleanups (Ian Clatworthy)
    ------------------------------------------------------------
    revno: 2818.2.2
    merged: ian.clatworthy at internode.on.net-20070914022532-72l99l9pxew366ct
    parent: ian.clatworthy at internode.on.net-20070914011426-wdtanslfmknxs1g5
    committer: Ian Clatworthy <ian.clatworthy at internode.on.net>
    branch nick: bzr.code-cleanup
    timestamp: Fri 2007-09-14 12:25:32 +1000
    message:
      review feedback
    ------------------------------------------------------------
    revno: 2818.2.1
    merged: ian.clatworthy at internode.on.net-20070914011426-wdtanslfmknxs1g5
    parent: pqm at pqm.ubuntu.com-20070913193317-hi3rhwxhbrviw7hz
    committer: Ian Clatworthy <ian.clatworthy at internode.on.net>
    branch nick: bzr.code-cleanup
    timestamp: Fri 2007-09-14 11:14:26 +1000
    message:
      minor tree & dirstate code cleanups
=== modified file 'bzrlib/dirstate.py'
--- a/bzrlib/dirstate.py	2007-07-21 18:27:19 +0000
+++ b/bzrlib/dirstate.py	2007-09-14 02:25:32 +0000
@@ -29,9 +29,9 @@
 
 dirstate format = header line, full checksum, row count, parent details,
  ghost_details, entries;
-header line = "#bazaar dirstate flat format 2", NL;
+header line = "#bazaar dirstate flat format 3", NL;
 full checksum = "crc32: ", ["-"], WHOLE_NUMBER, NL;
-row count = "num_entries: ", digit, NL;
+row count = "num_entries: ", WHOLE_NUMBER, NL;
 parent_details = WHOLE NUMBER, {REVISION_ID}* NL;
 ghost_details = WHOLE NUMBER, {REVISION_ID}*, NL;
 entries = {entry};
@@ -118,7 +118,7 @@
 where we need id->path mapping; we also usually read the whole file, so
 I'm going to skip that for the moment, as we have the ability to locate
 via bisect any path in any tree, and if we lookup things by path, we can
-accumulate a id->path mapping as we go, which will tend to match what we
+accumulate an id->path mapping as we go, which will tend to match what we
 looked for.
 
 I plan to implement this asap, so please speak up now to alter/tweak the
@@ -143,7 +143,7 @@
 Locking:
  Eventually reuse dirstate objects across locks IFF the dirstate file has not
  been modified, but will require that we flush/ignore cached stat-hit data
- because we wont want to restat all files on disk just because a lock was
+ because we won't want to restat all files on disk just because a lock was
  acquired, yet we cannot trust the data after the previous lock was released.
 
 Memory representation:
@@ -162,13 +162,14 @@
       manageable number. Will scale badly on trees with 10K entries in a 
       single directory. compare with Inventory.InventoryDirectory which has
       a dictionary for the children. No bisect capability, can only probe for
-      exact matches, or grab all elements and sorta.
-    - Whats the risk of error here? Once we have the base format being processed
+      exact matches, or grab all elements and sort.
+    - What's the risk of error here? Once we have the base format being processed
       we should have a net win regardless of optimality. So we are going to 
-      go with what seems reasonably.
+      go with what seems reasonable.
 open questions:
 
-maybe we should do a test profile of these core structure - 10K simulated searches/lookups/etc?
+Maybe we should do a test profile of the core structure - 10K simulated
+searches/lookups/etc?
 
 Objects for each row?
 The lifetime of Dirstate objects is current per lock, but see above for
@@ -224,7 +225,7 @@
     # jam 20060614 it isn't really worth removing more entries if we
     # are going to leave it in packed form.
     # With only st_mtime and st_mode filesize is 5.5M and read time is 275ms
-    # With all entries filesize is 5.9M and read time is mabye 280ms
+    # With all entries, filesize is 5.9M and read time is maybe 280ms
     # well within the noise margin
 
     # base64 encoding always adds a final newline, so strip it off
@@ -651,7 +652,7 @@
         _bisect_dirblocks is meant to find the contents of directories, which
         differs from _bisect, which only finds individual entries.
 
-        :param dir_list: An sorted list of directory names ['', 'dir', 'foo'].
+        :param dir_list: A sorted list of directory names ['', 'dir', 'foo'].
         :return: A map from dir => entries_for_dir
         """
         # TODO: jam 20070223 A lot of the bisecting logic could be shared
@@ -1330,8 +1331,8 @@
             be attempted.
         :return: A tuple describing where the path is located, or should be
             inserted. The tuple contains four fields: the block index, the row
-            index, anda two booleans are True when the directory is present, and
-            when the entire path is present.  There is no guarantee that either
+            index, the directory is present (boolean), the entire path is
+            present (boolean).  There is no guarantee that either
             coordinate is currently reachable unless the found field for it is
             True. For instance, a directory not present in the searched tree
             may be returned with a value one greater than the current highest
@@ -1359,7 +1360,7 @@
         return block_index, entry_index, True, False
 
     def _get_entry(self, tree_index, fileid_utf8=None, path_utf8=None):
-        """Get the dirstate entry for path in tree tree_index
+        """Get the dirstate entry for path in tree tree_index.
 
         If either file_id or path is supplied, it is used as the key to lookup.
         If both are supplied, the fastest lookup is used, and an error is
@@ -1404,7 +1405,7 @@
                     continue
                 # WARNING: DO not change this code to use _get_block_entry_index
                 # as that function is not suitable: it does not use the key
-                # to lookup, and thus the wront coordinates are returned.
+                # to lookup, and thus the wrong coordinates are returned.
                 block = self._dirblocks[block_index][1]
                 entry_index, present = self._find_entry_index(key, block)
                 if present:
@@ -1511,9 +1512,9 @@
         return self._id_index
 
     def _get_output_lines(self, lines):
-        """format lines for final output.
+        """Format lines for final output.
 
-        :param lines: A sequece of lines containing the parents list and the
+        :param lines: A sequence of lines containing the parents list and the
             path lines.
         """
         output_lines = [DirState.HEADER_FORMAT_3]
@@ -1527,7 +1528,7 @@
         return output_lines
 
     def _make_deleted_row(self, fileid_utf8, parents):
-        """Return a deleted for for fileid_utf8."""
+        """Return a deleted row for fileid_utf8."""
         return ('/', 'RECYCLED.BIN', 'file', fileid_utf8, 0, DirState.NULLSTAT,
             ''), parents
 
@@ -1587,7 +1588,7 @@
             self._read_header()
 
     def _read_prelude(self):
-        """Read in the prelude header of the dirstate file
+        """Read in the prelude header of the dirstate file.
 
         This only reads in the stuff that is not connected to the crc
         checksum. The position will be correct to read in the rest of
@@ -1610,9 +1611,9 @@
 
         We reuse the existing file, because that prevents race conditions with
         file creation, and use oslocks on it to prevent concurrent modification
-        and reads - because dirstates incremental data aggretation is not
+        and reads - because dirstate's incremental data aggregation is not
         compatible with reading a modified file, and replacing a file in use by
-        another process is impossible on windows.
+        another process is impossible on Windows.
 
         A dirstate in read only mode should be smart enough though to validate
         that the file has not changed, and otherwise discard its cache and
@@ -1679,7 +1680,7 @@
             "path_id %r is not a plain string" % (new_id,)
         self._read_dirblocks_if_needed()
         if len(path):
-            # logic not written
+            # TODO: logic not written
             raise NotImplementedError(self.set_path_id)
         # TODO: check new id is unique
         entry = self._get_entry(0, path_utf8=path)
@@ -1787,7 +1788,7 @@
                         # this file id is at a different path in one of the
                         # other trees, so put absent pointers there
                         # This is the vertical axis in the matrix, all pointing
-                        # tot he real path.
+                        # to the real path.
                         by_path[entry_key][tree_index] = ('r', path_utf8, 0, False, '')
                 # by path consistency: Insert into an existing path record (trivial), or 
                 # add a new one with relocation pointers for the other tree indexes.
@@ -1862,7 +1863,7 @@
         new_iterator = new_inv.iter_entries_by_dir()
         # we will be modifying the dirstate, so we need a stable iterator. In
         # future we might write one, for now we just clone the state into a
-        # list - which is a shallow copy, so each 
+        # list - which is a shallow copy.
         old_iterator = iter(list(self._iter_entries()))
         # both must have roots so this is safe:
         current_new = new_iterator.next()
@@ -1936,14 +1937,14 @@
     def _make_absent(self, current_old):
         """Mark current_old - an entry - as absent for tree 0.
 
-        :return: True if this was the last details entry for they entry key:
+        :return: True if this was the last details entry for the entry key:
             that is, if the underlying block has had the entry removed, thus
             shrinking in length.
         """
         # build up paths that this id will be left at after the change is made,
         # so we can update their cross references in tree 0
         all_remaining_keys = set()
-        # Dont check the working tree, because its going.
+        # Dont check the working tree, because it's going.
         for details in current_old[1][1:]:
             if details[0] not in ('a', 'r'): # absent, relocated
                 all_remaining_keys.add(current_old[0])
@@ -2238,7 +2239,7 @@
         self._split_path_cache = {}
 
     def lock_read(self):
-        """Acquire a read lock on the dirstate"""
+        """Acquire a read lock on the dirstate."""
         if self._lock_token is not None:
             raise errors.LockContention(self._lock_token)
         # TODO: jam 20070301 Rather than wiping completely, if the blocks are
@@ -2251,7 +2252,7 @@
         self._wipe_state()
 
     def lock_write(self):
-        """Acquire a write lock on the dirstate"""
+        """Acquire a write lock on the dirstate."""
         if self._lock_token is not None:
             raise errors.LockContention(self._lock_token)
         # TODO: jam 20070301 Rather than wiping completely, if the blocks are
@@ -2264,7 +2265,7 @@
         self._wipe_state()
 
     def unlock(self):
-        """Drop any locks held on the dirstate"""
+        """Drop any locks held on the dirstate."""
         if self._lock_token is None:
             raise errors.LockNotHeld(self)
         # TODO: jam 20070301 Rather than wiping completely, if the blocks are
@@ -2278,7 +2279,7 @@
         self._split_path_cache = {}
 
     def _requires_lock(self):
-        """Checks that a lock is currently held by someone on the dirstate"""
+        """Check that a lock is currently held by someone on the dirstate."""
         if not self._lock_token:
             raise errors.ObjectNotLocked(self)
 

=== modified file 'bzrlib/tree.py'
--- a/bzrlib/tree.py	2007-09-05 03:20:26 +0000
+++ b/bzrlib/tree.py	2007-09-14 01:14:26 +0000
@@ -122,7 +122,7 @@
     
     def has_filename(self, filename):
         """True if the tree has given filename."""
-        raise NotImplementedError()
+        raise NotImplementedError(self.has_filename)
 
     def has_id(self, file_id):
         file_id = osutils.safe_file_id(file_id)
@@ -228,7 +228,7 @@
         raise NotImplementedError(self.get_file_mtime)
 
     def get_file_by_path(self, path):
-        return self.get_file(self._inventory.path2id(path))
+        return self.get_file(self._inventory.path2id(path), path)
 
     def iter_files_bytes(self, desired_files):
         """Iterate through file contents.
@@ -267,7 +267,7 @@
         raise NotImplementedError(self.get_symlink_target)
 
     def annotate_iter(self, file_id):
-        """Return an iterator of revision_id, line tuples
+        """Return an iterator of revision_id, line tuples.
 
         For working trees (and mutable trees in general), the special
         revision_id 'current:' will be used for lines that are new in this
@@ -277,7 +277,7 @@
         raise NotImplementedError(self.annotate_iter)
 
     def plan_file_merge(self, file_id, other):
-        """Generate a merge plan based on annotations
+        """Generate a merge plan based on annotations.
 
         If the file contains uncommitted changes in this tree, they will be
         attributed to the 'current:' pseudo-revision.  If the file contains
@@ -323,10 +323,10 @@
     def paths2ids(self, paths, trees=[], require_versioned=True):
         """Return all the ids that can be reached by walking from paths.
         
-        Each path is looked up in each this tree and any extras provided in
+        Each path is looked up in this tree and any extras provided in
         trees, and this is repeated recursively: the children in an extra tree
         of a directory that has been renamed under a provided path in this tree
-        are all returned, even if none exist until a provided path in this
+        are all returned, even if none exist under a provided path in this
         tree, and vice versa.
 
         :param paths: An iterable of paths to start converting to ids from.
@@ -405,7 +405,7 @@
            versioned_kind.
          - lstat is the stat data *if* the file was statted.
          - path_from_tree_root is the path from the root of the tree.
-         - file_id is the file_id is the entry is versioned.
+         - file_id is the file_id if the entry is versioned.
          - versioned_kind is the kind of the file as last recorded in the 
            versioning system. If 'unknown' the file is not versioned.
         One of 'kind' and 'versioned_kind' must not be 'unknown'.
@@ -545,7 +545,7 @@
     :param trees: The trees to find file_ids within
     :param require_versioned: if true, all specified filenames must occur in
         at least one tree.
-    :return: a set of (path, file ids) for the specified filenames
+    :return: a set of file ids for the specified filenames
     """
     not_versioned = []
     interesting_ids = set()
@@ -564,7 +564,7 @@
 
 
 def _find_children_across_trees(specified_ids, trees):
-    """Return a set including specified ids and their children
+    """Return a set including specified ids and their children.
     
     All matches in all trees will be used.
 
@@ -596,7 +596,7 @@
     Its instances have methods like 'compare' and contain references to the
     source and target trees these operations are to be carried out on.
 
-    clients of bzrlib should not need to use InterTree directly, rather they
+    Clients of bzrlib should not need to use InterTree directly, rather they
     should use the convenience methods on Tree such as 'Tree.compare()' which
     will pass through to InterTree as appropriate.
     """

=== modified file 'bzrlib/workingtree_4.py'
--- a/bzrlib/workingtree_4.py	2007-09-05 03:20:26 +0000
+++ b/bzrlib/workingtree_4.py	2007-09-14 02:25:32 +0000
@@ -132,7 +132,6 @@
         """
         self._format = _format
         self.bzrdir = _bzrdir
-        from bzrlib.trace import note, mutter
         assert isinstance(basedir, basestring), \
             "base directory %r is not a string" % basedir
         basedir = safe_unicode(basedir)




More information about the bazaar-commits mailing list