[MERGE] Unique roots for bzr

Fri Oct 13 07:07:26 BST 2006

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

John Arbash Meinel wrote:
> 
> v- What are these changes for?

They force the inventory to start off with no entries.  Otherwise,
they'll get a root entry by default, to match the old API.

> It seems like we should have only 1 api
> for creating an inventory. And if anything this would have come from
> before when bundles were a plugin, and so they had to conform to
> multiple apis.
> 
> It would be better to just clean this up.

Okay.

> === modified file 'bzrlib/bzrdir.py'
> --- bzrlib/bzrdir.py	2006-09-29 13:09:33 +0000
> +++ bzrlib/bzrdir.py	2006-10-02 12:58:41 +0000
> @@ -680,7 +680,12 @@
>          # TODO: jam 20060426 we probably need a test in here in the
>          #       case that the newly sprouted branch is a remote one
>          if result_repo is None or result_repo.make_working_trees():
> -            result.create_workingtree()
> +            wt = result.create_workingtree()
> +            if wt.inventory.root is None:
> +                try:
> +                    wt.set_root_id(self.open_workingtree.get_root_id())
> +                except errors.NoWorkingTree:
> +                    pass
>          return result
> 
> ^- What is this doing?

It is making sure that when you branch a tree with no commits, they
still have the same root id.  This is expected by our tests.

> It certainly seems like it is doing something we
> want to avoid. (Namely, creating a new disk file, and then reading it
> back in). If this is necessary, then I think we have a problem elsewhere.

> v- How can you get a root entry after you just skipped the first entry?

This is due to bzr.dev and unique-roots converging on the same behavior
for iter_entries, but implementing it differently.  I have removed my
redundant is_root.

> @@ -1650,6 +1655,8 @@
>              entries = inv.iter_entries()
>              entries.next()
>              for path, ie in entries:
> +                if inv.is_root(ie.file_id):
> +                    continue
>                  assert getattr(ie, 'revision', None) is not None, \
>                      'no revision on {%s} in {%s}' % \
>                      (file_id, rev.revision_id)

> v- And why here are you not skipping the first one, but you skip the
> first one in other contexts?

I am doing is_root, which is equivalent to skipping the first one.  This
code is from before Robert started using entries.next() to skip the
root, and I think it's clearer.  But I'll change it just so we can avoid
further debate on this point.

> I know the overhead of a function call for
> each file is more than a single skip, but I think you mentioned
> something about 'is_root()' being able to return True for nested trees,
> or something like that.

I didn't mean to imply that.  is_root should only return true for the
current tree's root.

> 
> -        entries = inv.iter_entries()
> -        entries.next()
> -        for path, ie in entries:
> +        for file_id in inv:
> +            if inv.is_root(file_id):
> +                continue
> +            ie = inv[file_id]
>              self._convert_file_version(rev, ie, parent_invs)
> 
>      def _convert_file_version(self, rev, ie, parent_invs):

> v- This looks like a change that was done for debugging that should be
> reverted.

Yup.  Gone.

> ...
> v- Why are you updating a deprecated function?

It wasn't deprecated when I initially did this work.

> I can see updating the
> call to changes_from, but it doesn't seem like you want to update the
> compare_trees api.

Okay, gone.

> === modified file 'bzrlib/errors.py'
> --- bzrlib/errors.py	2006-09-27 20:27:46 +0000
> +++ bzrlib/errors.py	2006-10-02 12:58:02 +0000
> @@ -964,6 +964,20 @@
>      """Tree transform is malformed %(conflicts)r"""
> 
> v- I think this is better done as:
> class NoFinalPath(BzrNewError):
>     """No final name for trans_id %(trans_id)r
> file-id: %(file_id)r
> root trans-id: %(root_trans_id)r
> """

Okay.  That's a PEP8 violation, though.

> === modified file 'bzrlib/info.py'
> --- bzrlib/info.py	2006-09-08 18:52:17 +0000
> +++ bzrlib/info.py	2006-10-02 12:58:03 +0000
> @@ -210,9 +210,10 @@
>      print '  %8d ignored' % ignore_cnt
> 
> 
> v- Any reason you changed this to be a standard loop rather than just doing:
> dir_cnt = sum(1 for path, ie in entries
>               if ie.kind == 'directory' and not
> work_inv.is_root(ie.file_id))

I don't remember why I resolved this conflict in favor of my original
changes. But I think generator expressions longer than two lines are bad
style.

>      dir_cnt = 0
> -    entries = work_inv.iter_entries()
> -    entries.next()
> -    dir_cnt = sum(1 for path, ie in entries if ie.kind == 'directory')
> +    for file_id in work_inv:
> +        if (work_inv.get_file_kind(file_id) == 'directory' and
> +            not work_inv.is_root(file_id)):
> +            dir_cnt += 1
>      print '  %8d versioned %s' \
>            % (dir_cnt,
>               plural(dir_cnt, 'subdirectory', 'subdirectories'))
> 
> 
> ...
> 
> v- Is it possible for us to not default to an id with capital letters?

Sure.  But that doctest isn't testing the default.

> It means that we default to having a path which must be escaped in the
> repository, which causes problems with older versions of Apache, etc.

No, the tree root has always had capital letters in its file id, so this
doesn't change that.

> One of the nice things about bzr-0.9 is that it sanitizes new file ids
> to avoid these sorts of problems.

Unfortunately, that doesn't hold true for knit2 repositories, because
they have a knit for TREE_ROOT.

> It means that people might eventually
> have problems with their setups, but we avoid provoking it from the start.

If you really want to fix that, I'd suggest changing the way filenames
are escaped by repositories.

Anyhow, why does a doctest matter?  Especially given that it's said
'TREE_ROOT-12345678-12345678' since July 2005?

>      >>> inv = Inventory('TREE_ROOT-12345678-12345678')
>      >>> inv.add(InventoryFile('123-123', 'hello.c', ROOT_ID))
> +    Traceback (most recent call last):
> +    BzrError: parent_id {TREE_ROOT} not in inventory
> +    >>> inv.add(InventoryFile('123-123', 'hello.c',
> 'TREE_ROOT-12345678-12345678'))
>      InventoryFile('123-123', 'hello.c',
> parent_id='TREE_ROOT-12345678-12345678', sha1=None, len=None)
>      """
>      def __init__(self, root_id=ROOT_ID, revision_id=None):
> 
> ...

> v- This seems pretty weird, what are you trying to do here?

Avoid test failures from tests that assumed one blank tree was
equivalent to another blank tree.

I can't really say it better than the comment:
            # If we happen to have a tree, we'll guarantee everything
            # except for the tree root is the same.

> +            self.assertContainsRe(inventory_f.read(),
> +                                  '<inventory file_id="TREE_ROOT[^"]*"'
> +                                  ' format="5">\n</inventory>\n')
> +            inventory_f.close()
> +        except IOError, e:
> +            if e.errno != errno.ENOENT:
> +                raise
> 
> ^- I thought Knit2 would have an inventory format="6".

Knit2, sure.  This is testing working trees, which still use format 5.

> +++ bzrlib/tests/repository_implementations/test_fileid_involved.py
> 2006-10-02 12:58:13 +0000
> @@ -176,9 +176,10 @@
>      def fileids_altered_by_revision_ids(self, revision_ids):
>          """This is a wrapper to strip TREE_ROOT if it occurs"""
>          repo = self.branch.repository
> +        root_id = self.branch.basis_tree().inventory.root.file_id
>          result = repo.fileids_altered_by_revision_ids(revision_ids)
> -        if 'TREE_ROOT' in result:
> -            del result['TREE_ROOT']
> +        if root_id in result:
> +            del result[root_id]
>          return result
> 
> ^- Isn't this bogus? I would think that you need root_id to be in the
> result of 'fileids_involved_by', considering you need to merge the root
> inventory knit, or you lose information.

The result is not the same on all repositories.  On a knit1 repository,
the tree root will never be listed as altered.  On a knit2 repository,
it will sometimes be listed as altered.  This wrapper allows the tests
to work for all repository implementations.

> v- If a WorkingTree gets a root id at creation time, shouldn't a
> MemoryTree do the same? In fact, the only tree that should be allowed to
> not have a root id is the "empty" tree.

Unfortunately, MemoryTree doesn't have a persistent inventory-- it's
zapped on unlock.  So every time we locked, we'd get a different root
id.  I don't understand why it's written this way.

> So a few comments, but I think this is mergable.

I've attached a diff of my changes due to your review.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFLy0d0F+nu1YWqI0RAkHTAJ0QwjelyvJYR349kb4sPe/O8VivLgCfV9MK
uAoAKtpMuMLmnCyk3AkCDF0=
=V8HH
-----END PGP SIGNATURE-----
-------------- next part --------------
A non-text attachment was scrubbed...
Name: unique-update.patch
Type: text/x-patch
Size: 4077 bytes
Desc: not available
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20061013/ad3ea2f7/attachment.bin