[MERGE] Various hackery to reduce startup time.

John Arbash Meinel john at arbash-meinel.com
Fri Sep 12 16:40:23 BST 2008


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Andrew Bennetts wrote:
> This is that “faster-startup” branch I've been talking about.
> 
> Here's a summary of what it does... not all of it is necessarily a good idea, or
> well-executed:
> 
>   - earlier installation of lazy_regex
>   - many dead imports removed
>   - many imports made lazy (probably more than is usefuL)
>   - some imports made *not* lazy! (because it's a waste of effort if they are
>     going to be always used)
>   - bzrlib.user_encoding completely removed (bzrlib.osutils.get_user_encoding
>     was already preferred), which makes a bare "import bzrlib" lighter-weight.
>   - avoid importing modules we only use for win32 (e.g. ctypes, win32utils) on
>     non-win32 platforms
>   - a few contortions to avoid necessarily importing sha, ConfigObj
>   - add lazy_registry to RegistryOption
>   - use properties for _serializer attributes of various RepositoryFormats so
>     that xml5/6/7/8 modules aren't always loaded
>   - moved a private function from bzrlib.diff to bzrlib.status so that the
>     latter doesn't need to import the former.
>   - prevent configobj from importing compiler, we don't need the 'unrepr'
>     feature of configobj that uses that module.
> 
> See the diff for the gory details.
> 
> -Andrew.
> 
> 

To start with, this is the sort of cleanup that might be best when
broken up into smaller chunks. As it makes it easier to review 1 file at
a time. Also, a lot of what you've done won't actually help, because
we'll still end up loading osutils.get_user_encoding() on every 'bzr'
invocation (because we use it to process the command line.) But I think
it might put us in a direction to actually get there.

For now, I'd rather leave "bzrlib.user_encoding" alone, as we haven't
deprecated it, and I think 3rd-party tools will use it. I *really* wish
we had a way to issue a proper deprecation warning for accessing a
module member. If it was a class we can use a property. Any ideas of
what we could do here? (I worked on the ones for a DeprecatedList, etc.
The problem is we don't use an *attribute* of user_encoding, we just use
it as a string.)


Lazy importing are changes that break things which the test suite cannot
find. Because it imports *everything*. So I'd like to get something like
this in bzr.dev as early in the next cycle as possible, so we can have
real-world testing.

+
+have_pywin32 = False
+if sys.platform == 'win32':
+    try:
+        import win32con, win32file, pywintypes, winerror, msvcrt
+        have_pywin32 = True
+    except ImportError:
+        pass

...

- -if have_ctypes and sys.platform == 'win32':
+have_ctypes_win32 = False
+if sys.platform == 'win32':
+    try:
+        import ctypes, msvcrt
+        have_ctypes_win32 = True
+    except ImportError:
+        pass
+

I think it would be clearer to combine these. Also I notice that
'msvcrt' is repeated, which isn't really necessary. So instead:

have_pywin32 = False
have_ctypes_win32 = False

if sys.platform == 'win32':
  import msvcrt
  try:
    import win32con...
    have_pywin32 = True
  ...

  try:
    import ctypes
    have_ctypes_win32 = True
  ...


That gets us away from having imports spread throughout the code, and
also always imports msvcrt since we need that anyway.

...
=== modified file 'bzrlib/msgeditor.py'
- --- bzrlib/msgeditor.py	2008-04-22 09:58:15 +0000
+++ bzrlib/msgeditor.py	2008-09-08 12:59:00 +0000
@@ -23,7 +23,6 @@
 from subprocess import call
 import sys

- -import bzrlib
 import bzrlib.config as config
 from bzrlib import osutils

^- while we are here lets change it to:

from bzrlib import (
  config,
  osutils,
  )

...

=== modified file 'bzrlib/option.py'
...
+    @property
+    def registry(self):
+        if self._registry is not None:
+            return self._registry
+        else:
+            return self._lazy_registry.get_obj()
+

^- Shouldn't this do:

if self._registry is None:
  self._registry = self._lazy_registry.get_obj()
return self._registry


...

 def _win32_abspath(path):
+    global _win32_abspath
+    if win32utils.winver == 'Windows 98':
+        _win32_abspath = _win98_abspath
+    else:
+        _win32_abspath = _real_win32_abspath
+    return _win32_abspath(path)
+
+

self modifying functions seems a bit distasteful. I won't block it, as
it may be a route we really want to go. It just seems a bit icky.

Oh, and in this particular case, it is wrong. In that we do:

if sys.platform == 'win32':
  abspath = _win32_abspath

And your trickery will basically get invoked all the time anyway. (Since
we access it via the 'abspath' variable, not the _win32_abspath.


However, I feel it is better than what you did here:

+    global sha_strings
+    def sha_strings(strings, _factory=sha.new):


^- I would feel better about doing

global sha_strings
def _sha_strings...

sha_strings = _sha_strings

I realize this works, but it seems brittle that 'def foo' will work with
a 'global foo' statement.


...

=== modified file 'bzrlib/util/configobj/configobj.py'
- --- bzrlib/util/configobj/configobj.py	2008-02-27 21:36:16 +0000
+++ bzrlib/util/configobj/configobj.py	2008-03-14 17:07:55 +0000
@@ -25,11 +25,6 @@

 import os, re
 compiler = None
- -try:
- -    import compiler
- -except ImportError:
- -    # for IronPython
- -    pass

^- As we are modifying a 3rd-party library, I think we should actually
comment it out, and include a clear comment as to why we got rid of it.

That way if we upgrade to a newer version, we will have good
documentation about why that was removed.

...

=== modified file 'bzrlib/weave.py'
- --- bzrlib/weave.py	2008-07-16 18:14:23 +0000
+++ bzrlib/weave.py	2008-09-08 12:59:00 +0000
@@ -75,6 +75,10 @@
 import time
 import warnings

+from bzrlib.lazy_import import lazy_import
+lazy_import(globals(), """
+from bzrlib import tsort
+""")
 from bzrlib import (
     progress,
     )

^- You added a lazy "import tsort" but you didn't remove the "from
bzrlib.tsort import topo_sort".


So the biggest thing is bzrlib.user_encoding. I don't think we can get
rid of it yet, and in general there isn't a win here for startup time,
because we use it for command line processing. So I'd rather defer on
that for now.


BB:tweak

John
=:->

PS> Any chance that you have some nice "bzr --no-plugin st" times to
show the benefits of all this work?
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkjKjWYACgkQJdeBCYSNAANi4QCgqbpLOKvbOQvY+S83S/HXhqcw
tq8AnjVcrRCN+rzRPlb7jxhyxz9vL2Qu
=0Gck
-----END PGP SIGNATURE-----



More information about the bazaar mailing list