[MERGE] add performance tweaks.

Aaron Bentley aaron.bentley at utoronto.ca
Mon Jun 12 20:55:10 BST 2006


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

This looks basically good, but I have some questions.

Robert Collins wrote:
> My replacement xml serialiser halves the serialisation cost, but is not
> compatible with 0.8 [unless the bug 49224 bugfix is applied], and I've
> not yet profiled the cost of doing entity replacement to make a
> ascii-safe bytestream.

This sounds like something we should do before deciding whether we need
to bugfix 0.8.  In theory, doing entity replacement instead of utf-8
encoding shouldn't be a big loss-- might even be faster.

> +    def test_make_10824_inv_entries(self):
> +        """Making 10824 inv entries should be quick."""
> +        entries = []
> +        def make_10824_entries():
> +            for counter in xrange(10000):
> +                bzrlib.inventory.make_entry('file', 'foo',
> +                    bzrlib.inventory.ROOT_ID)

This set off my unique_root_id alarm.  Even though it's okay, I'd prefer
using 'bar' instead of bzrlib.inventory.ROOT_ID.  We should discourage
use of ROOT_ID.

>      prepared_list = _prepare_file_list(file_list)
>      mutter("smart add of %r, originally %r", prepared_list, file_list)
>      inv = tree.read_working_inventory()
>      added = []
>      ignored = {}
> -    user_files = set()
> -    files_to_add = []
> +    dirs_to_add = []
> +    user_dirs = set()

Where are you distinguishing between user-specified files and files
discovered by listing directories?  I can't see anywhere where you're
raising BadFileKind.

> +        added = __add_one_and_parent(tree, inv, None, FastPath(dirname(path.raw_path)), 'directory', action)
> +        parent_id = inv.path2id(dirname(path.raw_path))
> +        if parent_id is None:
> +            parent_ie = inv[inv.path2id(dirname(path.raw_path))]

Huh? If inv.path2id(dirname(path.raw_path)) returns None, why do you do
it again?

>  class RootEntry(InventoryEntry):
>  
> +    __slots__ = ['text_sha1', 'text_size', 'file_id', 'name', 'kind',
> +                 'text_id', 'parent_id', 'children', 'executable', 
> +                 'revision', 'symlink_target']
> +
>      def _check(self, checker, rev_id, tree):
>          """See InventoryEntry._check"""

Drat.  This will be a conflict for the nested-trees work.  Necessary, I
guess.


>  def file_kind(f, _lstat=os.lstat, _mapper=file_kind_from_stat_mode):
> +    try:
> +        return _mapper(_lstat(f).st_mode)
> +    except OSError, e:
> +        if getattr(e, 'errno', None) == errno.ENOENT:
> +            raise bzrlib.errors.NoSuchFile(f)
> +        raise
> +
>      return _mapper(_lstat(f).st_mode)

The last line looks like it will never execute.

> -def walkdirs(top):
> +def walkdirs(top, prefix=""):
>      """Yield data about all the directories in a tree.
>      
>      This yields all the data about the contents of a directory at a time.
> @@ -813,13 +820,14 @@
>      The data yielded is of the form:
>      [(relpath, basename, kind, lstat, path_from_top), ...]
>  
> +    :param prefix: Treat prefix as the initial root of the tree, rather than ""
>      :return: an iterator over the dirs.
>      """

I don't understand the difference between the top and the prefix.  Why
would I specify the top and then specify the prefix?  Is it so that the
path is still relative to top?


> === modified file 'bzrlib/tests/test_smart_add.py'
> --- bzrlib/tests/test_smart_add.py	2006-06-06 12:06:20 +0000
> +++ bzrlib/tests/test_smart_add.py	2006-06-11 12:04:22 +0000
> @@ -59,7 +59,7 @@
>          wt = self.make_branch_and_tree('.')
>          branch = wt.branch
>          child_tree = self.make_branch_and_tree('original/child')
> -        smart_add_tree(wt, (u".",))
> +        smart_add_tree(wt, (".",))

What motivates this?
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFEjcae0F+nu1YWqI0RArj0AJ4kcRPBpV99ZQhfwBpjxMtE8raTvgCdFrqx
n3f/A+BsJayPCOfbIYTgy7Y=
=nNgc
-----END PGP SIGNATURE-----




More information about the bazaar mailing list