[MERGE][BUG 43731][BUG 112173][BUG 111702] Transport connection sharing

Vincent Ladeuil v.ladeuil+lp at free.fr
Sun Jul 22 18:24:31 BST 2007


>>>>> "john" == John Arbash Meinel <john at arbash-meinel.com> writes:

    john> Vincent Ladeuil wrote:
    >>>>>>> "john" == John Arbash Meinel <john at arbash-meinel.com> writes:

    john> ...

    >> 
    >> Above comment added in the code as a TODO.
    >> 
    >> Well comment removed. I changed from ftp to sftp. The only
    >> problem is that commands.test_cat.TestCat.test_cat now fails.
    >> The good news are that the 25 command tests pass in 2s instead of
    >> 25s, woot !
    >> 
    >> I had to update most of the tests to properly use
    >> self.get_url(relpath) instead of self.get_url() + '[/]relpath'.
    >> 
    >> I put a FIXME explaining why I skip the test now. 'cat' was
    >> issuing only one connection so there is no urgency is fixing that
    >> now, but I think that reveals a deeper problem in the sftp
    >> transport: detailless exceptions from paramiko will have to be
    >> dealt with if we want to use it as our test server. In that
    >> specific case, I think SFTPTransport.get() should be split in two
    >> try blocks, one that will raise NoSuchFile and the other that
    >> will raise ReadError.

    john> I'll have to look closer at that. It does seem to
    john> indicate a problem with our SFTPTransport code.

Ok, let's say we'll discussed it again when this patch have landed then.

    >> 
    john> I would prefer it if you gave a custom protocol, like
    john> hooked+ftp://, rather than re-using the ftp:// scheme,
    john> and assuming that your handler will win (because it was
    john> the last one registered). It *is* likely, I don't know
    john> if it will be always guaranteed. But it also means that
    john> you can end up with weird failures. Where something
    john> changes, and it causes seemingly unrelated tests to
    john> start failing.
    >> 
    >> I can't get it working right now, can you give me a +1
    >> conditional if I do that this week-end ?

    john> Well, it depends if your work is +1 conditional worthy :).

Hehe :)

    john> But yeah, I won't strictly block on this.

Good, because I tried again but failed because decorators are
meant to proxy some methods to the decorated transport. But here,
we want to intercept some calls done by the decorated transport
itself. I didn't find a good solution to do that (injecting
wrapped methods in the decorated transport does look fragile to
say the least).

But having the ability to run a bzr command with a hooked+ prefix
(bzr branch hooked+http:/host/branch) is worth investigating...

Instead I used a 'hooked' scheme to make it clear that the
underlying implementation is not the primary concern and avoid
polluting the sftp name space.

I added an _unregister_urlparse_netloc_protocol to be clean.

    john> ...

    john> These don't really seem like 'bzrlib/tests/commands' maybe
    john> 'bzrlib/tests/command_connections'. Though it is getting a bit long.
    >> 
    >> The idea was that 'tests/commands' will contain all tests about
    >> command behavior of which the connection related ones are the
    >> first instance.
    >> 
    >> So I leave it as is.

    john> I do understand your point, I was just thinking that it
    john> is something that we should discuss as a project. But
    john> who knows if any other devs will be reading the middle
    john> of such a long conversation.

Well, at least Robert is aware of it as we chose the name
together at the London sprint.

    >> 
    john> For ftp and sftp connection caches, do we need to
    john> deprecate them first?
    >> 
    >> I don't think so, they were private and even a bug have been recorded.

    john> I wouldn't be surprised if there is code out there that
    john> will now start connecting multiple times because they
    john> though the connection would be cached.  But, as long as
    john> we make it clear how to do it 'right', I'm okay with
    john> it.

Ok. Given that bzr itself should now issue *less* connections,
that should average :)

<snip/>

    john> === modified file 'bzrlib/bzrdir.py'
    john> -        return format.initialize(safe_unicode(base))
    john> +        return format.initialize(base, possible_transports)
    >> 
    john> ^- A parameter like possible_transports really seems
    john> like it should be passed in as a keyword, not as a
    john> positional argument.

<snip/>


    >> In one of my first attempts, I even made it a mandatory
    >> positional parameter where [] was a legal value, but then I
    >> realized that offering a default value of None was more
    >> developer-friendly.
    >> 
    >> <snip/>
    >> 

    john> You can't make it mandatory without breaking api
    john> compatibility. So yeah, to start with you have to make
    john> it optional.

    john> It feels like a keyword argument / hint to me. But I
    john> can understand your desire to make it more mandatory,
    john> since not using it has some (potentially strong)
    john> repercussions.

    john> One thing you could consider doing is to allow
    john> possible_transports to be None, but make that
    john> deprecated. Something like:

    john> def get_transport(base, possible_transports=None):
    john>   if possible_transports is None:
    john>     symbol_versioning.warn(
    john> 	'get_transport requires possible_transports as of 0.19, defaulting'
    john> 	' to the empty list')
    john>     possible_transports = []

    john> (don't forget the appropriate tests that use this in a deprecated manner, etc.)

Looking into it it appears that there is a lot of code that needs
to be updated. Moreover, it looks like some more transport reuse
can be achieved (but the intent of this patch is address the
multiple connections already identified) but I need more time to
understand if the transports involved are always local ones or
not. I will propose such an update in another patch.

    >> 
    john> -        # TODO: jam 20060426 we probably need a test in here in the
    john> -        #       case that the newly sprouted branch is a remote one
    john> -        if result_repo is None or result_repo.make_working_trees():
    john> +        if isinstance(target_transport, LocalTransport) and (
    john> +            result_repo is None or result_repo.make_working_trees()):
    john> wt = result.create_workingtree()
    john> wt.lock_write()
    john> try:
    >> 
    john> ^- This doesn't seem like the time/place to be fixing
    john> this.
    >> 
    >> Why ? I added failing tests for remote branch creation and
    >> diagnosing both of them leads me here, where I find (with a
    >> smile) your comment.
    >> 
    >> May be sprout tries to do too much, but it is used for branch
    >> creation, so that seems the perfect place to fix it.

    john> This place is a correct fix.  Tangling it up with your
    john> patch to remove connection caches is not. It should be
    john> a separate clean patch to apply to bzr.dev.

    john> I'm happy to see it, you seem to provide tests for it,
    john> etc. Just that it should be 2 separate submissions.

Ok, I get the point.

But the patch itself meant to address the multiple connections
problems identified by the tests.commands tests. Writing these
tests exhibit the bug 112173 and I can't submit the patch with
failing tests. Hmmm, I can submit the patch while skipping these
tests and wait for it to be merged before submitting an
additional patch, but then I will be in vacation :)

    >> 
    john> But did you at least add tests that 'bzr branch local
    john> sftp://remote' doesn't try to create a working tree?
    >> 
    >> I added tests that ensures that 'bzr branch remote remote' and
    >> 'bzr branch local remote' issues 2 and 1 connections only.

    john> If 'bzr branch remote remote' is on the same remote
    john> machine, shouldn't it only issue 1 connection?

May be. To be safe I didn't try to share the connection between
the two remote branches in case some operations use file-like
objects linked to the connection socket (now or in a close
future).

I prefer to address that with a later patch.

    >> 
    >> I will add blackbox tests to ensure that no working trees is
    >> created. (Done).

    john> Well, this seems like it should be done at the
    john> 'Branch.sprout()' testing layer, it doesn't really have
    john> to be tested at the blackbox layer.

I modified bzrdir.sprout not Branch.sprout and I had, indeed to
modify TestBzrDir.sproutOrSkip, so I think *that* address your
point.

<snip/>

    >> No tests fail *today* if this code is commented out.
    >> 
    john> It looks like something Robrert wrote, so I don't know
    john> what is going on myself.
    >> 
    john> ...
    >> 

I'll re-raise the issue in another patch then.

    john> self.assertEquals(t.base, 'file://HOST/')
    >> 
    john> +class TestConnectedTransport(TestCase):
    john> +    """Tests for connected to remote server transports"""
    john> +
    >> 
    john> ^- 2 spaces please
    >> 

<snip/>

Arf, sorry I'm dense. Done.

   >> I'll make from_transport a private parameter instead. Thoughts ?

    john> making it private seems better.

Ok.

    >> 
    john> +        if from_transport is not None:
    john> +            # Copy the password as it does not appear in base and will be lost
    john> +            # otherwise. It can appear in the _initial_split_url above if the
    john> +            # user provided it on the command line. Otherwise, daughter classes
    john> +            # will prompt the user for one when appropriate.
    john> +            self._password = from_transport._password
    john> +
    john> +        base = self._unsplit_url(self._scheme,
    john> +                                 self._user, self._password,
    john> +                                 self._host, self._port,
    john> +                                 self._path)
    >> 
    john> ^- I also thought we tended to remove 'self._password'
    john> from the official URL, so that errors/etc won't expose
    john> passwords to the terminal or to the log.
    >> 
    >> It was not done everywhere, this patch makes it for all transports.

    john> Someone mentioned on IRC (bug report?) that this causes:

    john> bzr push ftp://user:password@host/path/on/host

    john> To forget the password as part of the saved
    john> location. So you have to enter your password. (editing
    john> ~/.bazaar/locations.conf can overcome this).

    john> So there is a bit of a tension. I guess I could argue
    john> that we should have 'self._display_base' which is used
    john> when reporting errors, and another one that is used for
    john> saving parent and push locations.

I'm currently working at a different way to address the issue :

https://blueprints.launchpad.net/bzr/+spec/authentication-ring

I think what you had begin with sftp (filtering out the password
from base) and that I have extended to all connected transports,
will result in a clearer and safer code base than trying to
review all the code to make sure we use the right url at each
point.

That means: saving urls or reporting errors should never display
a password. So filtering it out of all used urls and rely on some
password manager to provide the credentials should work.


<snip/>

    john> -# TODO: The should be marked as deprecated
    john> +# TODO: They should be marked as deprecated
    john> +# vila-20070606: How should we do that ?
    john> urlescape = urlutils.escape
    john> urlunescape = urlutils.unescape
    >> 
    john> ^- Probably:
    >> 
    john> urlescape = deprecated_function(zero_XXXX, urlutils.escape)
    >> 
    >> No. deprecated_function is a decorator. Can I postponed that ?

    john> Specifically, if you do:

    john> @foo
    john> def bar():
    john>   return X

    john> That is the same as doing:

    john> def bar():
    john>   return X
    john> bar = foo(bar)


    john> You have to do it a bit differently in this case,
    john> because 'deprecated_function' is a function which
    john> returns a wrapper which then gets applied. So it is
    john> actually:

    john> urlescape = deprecated_function(zero_nineteen)(urlutils.escape)

    john> It follows straightforward from:

    john> @deprecated_function(zero_nineteen)
    john> def urlescape
    john> ...


    john> Though the syntax isn't quite as nice.

Yup. So I did that with the usual syntax, defining the functions
as wrappers instead of copying them. Since they are deprecated
anyway, there is little interest in optimizing their use.

    john> ^- 'ensure that no working tree was created remotely'

Done.

    john>      def test_cat(self):
    john> +        # FIXME: sftp raises ReadError instead of NoSuchFile when probing for
    john> +        # branch/foo/.bzr/branch-format when used with the paramiko test
    john> +        # server.
    john> +        from bzrlib.tests import TestSkipped
    john> +        raise TestSkipped

    john> You should pass a string to TestSkipped. Perhaps

    john> raise TestSkipped('SFTPTransport raises incorrect exception'
    john> 		  ' when reading from paramiko server')

Done.

    john> ...

    john> -        :param from_transport: optional transport to build from. The built
    john> +        :param _from_transport: optional transport to build from. The built
    john>              transport will share the connection with this transport.
    john>          """
    john> ^- Maybe add the comment:
    john> The caller is required to ensure that _from_transport points at the same host
    john> as the new base.

Done.

    john> -    def __init__(self, url, from_transport=None, medium=None, _client=None):
    john> +    def __init__(self, url, _from_transport=None, medium=None, _client=None):
    john>          """Constructor.

    john> ^- Having 2 private parameters and a public one inbetween seems a little weird.
    john> Probably we can't change it. It just doesn't seem correct.

The plan is to remove medium and _client from the RemoteTransport
constructor and offers a class dedicated to the tests.


    john> I believe you've satisfied the bulk of my concerns. (+1 Cond)


I'll merge it with the points mentioned above and will address
the others when I'm back.

    Vincent



More information about the bazaar mailing list