[MERGE][BUG 43731][BUG 112173][BUG 111702] Transport connection sharing
Vincent Ladeuil
v.ladeuil+lp at free.fr
Sun Jul 22 18:24:31 BST 2007
>>>>> "john" == John Arbash Meinel <john at arbash-meinel.com> writes:
john> Vincent Ladeuil wrote:
>>>>>>> "john" == John Arbash Meinel <john at arbash-meinel.com> writes:
john> ...
>>
>> Above comment added in the code as a TODO.
>>
>> Well comment removed. I changed from ftp to sftp. The only
>> problem is that commands.test_cat.TestCat.test_cat now fails.
>> The good news are that the 25 command tests pass in 2s instead of
>> 25s, woot !
>>
>> I had to update most of the tests to properly use
>> self.get_url(relpath) instead of self.get_url() + '[/]relpath'.
>>
>> I put a FIXME explaining why I skip the test now. 'cat' was
>> issuing only one connection so there is no urgency is fixing that
>> now, but I think that reveals a deeper problem in the sftp
>> transport: detailless exceptions from paramiko will have to be
>> dealt with if we want to use it as our test server. In that
>> specific case, I think SFTPTransport.get() should be split in two
>> try blocks, one that will raise NoSuchFile and the other that
>> will raise ReadError.
john> I'll have to look closer at that. It does seem to
john> indicate a problem with our SFTPTransport code.
Ok, let's say we'll discussed it again when this patch have landed then.
>>
john> I would prefer it if you gave a custom protocol, like
john> hooked+ftp://, rather than re-using the ftp:// scheme,
john> and assuming that your handler will win (because it was
john> the last one registered). It *is* likely, I don't know
john> if it will be always guaranteed. But it also means that
john> you can end up with weird failures. Where something
john> changes, and it causes seemingly unrelated tests to
john> start failing.
>>
>> I can't get it working right now, can you give me a +1
>> conditional if I do that this week-end ?
john> Well, it depends if your work is +1 conditional worthy :).
Hehe :)
john> But yeah, I won't strictly block on this.
Good, because I tried again but failed because decorators are
meant to proxy some methods to the decorated transport. But here,
we want to intercept some calls done by the decorated transport
itself. I didn't find a good solution to do that (injecting
wrapped methods in the decorated transport does look fragile to
say the least).
But having the ability to run a bzr command with a hooked+ prefix
(bzr branch hooked+http:/host/branch) is worth investigating...
Instead I used a 'hooked' scheme to make it clear that the
underlying implementation is not the primary concern and avoid
polluting the sftp name space.
I added an _unregister_urlparse_netloc_protocol to be clean.
john> ...
john> These don't really seem like 'bzrlib/tests/commands' maybe
john> 'bzrlib/tests/command_connections'. Though it is getting a bit long.
>>
>> The idea was that 'tests/commands' will contain all tests about
>> command behavior of which the connection related ones are the
>> first instance.
>>
>> So I leave it as is.
john> I do understand your point, I was just thinking that it
john> is something that we should discuss as a project. But
john> who knows if any other devs will be reading the middle
john> of such a long conversation.
Well, at least Robert is aware of it as we chose the name
together at the London sprint.
>>
john> For ftp and sftp connection caches, do we need to
john> deprecate them first?
>>
>> I don't think so, they were private and even a bug have been recorded.
john> I wouldn't be surprised if there is code out there that
john> will now start connecting multiple times because they
john> though the connection would be cached. But, as long as
john> we make it clear how to do it 'right', I'm okay with
john> it.
Ok. Given that bzr itself should now issue *less* connections,
that should average :)
<snip/>
john> === modified file 'bzrlib/bzrdir.py'
john> - return format.initialize(safe_unicode(base))
john> + return format.initialize(base, possible_transports)
>>
john> ^- A parameter like possible_transports really seems
john> like it should be passed in as a keyword, not as a
john> positional argument.
<snip/>
>> In one of my first attempts, I even made it a mandatory
>> positional parameter where [] was a legal value, but then I
>> realized that offering a default value of None was more
>> developer-friendly.
>>
>> <snip/>
>>
john> You can't make it mandatory without breaking api
john> compatibility. So yeah, to start with you have to make
john> it optional.
john> It feels like a keyword argument / hint to me. But I
john> can understand your desire to make it more mandatory,
john> since not using it has some (potentially strong)
john> repercussions.
john> One thing you could consider doing is to allow
john> possible_transports to be None, but make that
john> deprecated. Something like:
john> def get_transport(base, possible_transports=None):
john> if possible_transports is None:
john> symbol_versioning.warn(
john> 'get_transport requires possible_transports as of 0.19, defaulting'
john> ' to the empty list')
john> possible_transports = []
john> (don't forget the appropriate tests that use this in a deprecated manner, etc.)
Looking into it it appears that there is a lot of code that needs
to be updated. Moreover, it looks like some more transport reuse
can be achieved (but the intent of this patch is address the
multiple connections already identified) but I need more time to
understand if the transports involved are always local ones or
not. I will propose such an update in another patch.
>>
john> - # TODO: jam 20060426 we probably need a test in here in the
john> - # case that the newly sprouted branch is a remote one
john> - if result_repo is None or result_repo.make_working_trees():
john> + if isinstance(target_transport, LocalTransport) and (
john> + result_repo is None or result_repo.make_working_trees()):
john> wt = result.create_workingtree()
john> wt.lock_write()
john> try:
>>
john> ^- This doesn't seem like the time/place to be fixing
john> this.
>>
>> Why ? I added failing tests for remote branch creation and
>> diagnosing both of them leads me here, where I find (with a
>> smile) your comment.
>>
>> May be sprout tries to do too much, but it is used for branch
>> creation, so that seems the perfect place to fix it.
john> This place is a correct fix. Tangling it up with your
john> patch to remove connection caches is not. It should be
john> a separate clean patch to apply to bzr.dev.
john> I'm happy to see it, you seem to provide tests for it,
john> etc. Just that it should be 2 separate submissions.
Ok, I get the point.
But the patch itself meant to address the multiple connections
problems identified by the tests.commands tests. Writing these
tests exhibit the bug 112173 and I can't submit the patch with
failing tests. Hmmm, I can submit the patch while skipping these
tests and wait for it to be merged before submitting an
additional patch, but then I will be in vacation :)
>>
john> But did you at least add tests that 'bzr branch local
john> sftp://remote' doesn't try to create a working tree?
>>
>> I added tests that ensures that 'bzr branch remote remote' and
>> 'bzr branch local remote' issues 2 and 1 connections only.
john> If 'bzr branch remote remote' is on the same remote
john> machine, shouldn't it only issue 1 connection?
May be. To be safe I didn't try to share the connection between
the two remote branches in case some operations use file-like
objects linked to the connection socket (now or in a close
future).
I prefer to address that with a later patch.
>>
>> I will add blackbox tests to ensure that no working trees is
>> created. (Done).
john> Well, this seems like it should be done at the
john> 'Branch.sprout()' testing layer, it doesn't really have
john> to be tested at the blackbox layer.
I modified bzrdir.sprout not Branch.sprout and I had, indeed to
modify TestBzrDir.sproutOrSkip, so I think *that* address your
point.
<snip/>
>> No tests fail *today* if this code is commented out.
>>
john> It looks like something Robrert wrote, so I don't know
john> what is going on myself.
>>
john> ...
>>
I'll re-raise the issue in another patch then.
john> self.assertEquals(t.base, 'file://HOST/')
>>
john> +class TestConnectedTransport(TestCase):
john> + """Tests for connected to remote server transports"""
john> +
>>
john> ^- 2 spaces please
>>
<snip/>
Arf, sorry I'm dense. Done.
>> I'll make from_transport a private parameter instead. Thoughts ?
john> making it private seems better.
Ok.
>>
john> + if from_transport is not None:
john> + # Copy the password as it does not appear in base and will be lost
john> + # otherwise. It can appear in the _initial_split_url above if the
john> + # user provided it on the command line. Otherwise, daughter classes
john> + # will prompt the user for one when appropriate.
john> + self._password = from_transport._password
john> +
john> + base = self._unsplit_url(self._scheme,
john> + self._user, self._password,
john> + self._host, self._port,
john> + self._path)
>>
john> ^- I also thought we tended to remove 'self._password'
john> from the official URL, so that errors/etc won't expose
john> passwords to the terminal or to the log.
>>
>> It was not done everywhere, this patch makes it for all transports.
john> Someone mentioned on IRC (bug report?) that this causes:
john> bzr push ftp://user:password@host/path/on/host
john> To forget the password as part of the saved
john> location. So you have to enter your password. (editing
john> ~/.bazaar/locations.conf can overcome this).
john> So there is a bit of a tension. I guess I could argue
john> that we should have 'self._display_base' which is used
john> when reporting errors, and another one that is used for
john> saving parent and push locations.
I'm currently working at a different way to address the issue :
https://blueprints.launchpad.net/bzr/+spec/authentication-ring
I think what you had begin with sftp (filtering out the password
from base) and that I have extended to all connected transports,
will result in a clearer and safer code base than trying to
review all the code to make sure we use the right url at each
point.
That means: saving urls or reporting errors should never display
a password. So filtering it out of all used urls and rely on some
password manager to provide the credentials should work.
<snip/>
john> -# TODO: The should be marked as deprecated
john> +# TODO: They should be marked as deprecated
john> +# vila-20070606: How should we do that ?
john> urlescape = urlutils.escape
john> urlunescape = urlutils.unescape
>>
john> ^- Probably:
>>
john> urlescape = deprecated_function(zero_XXXX, urlutils.escape)
>>
>> No. deprecated_function is a decorator. Can I postponed that ?
john> Specifically, if you do:
john> @foo
john> def bar():
john> return X
john> That is the same as doing:
john> def bar():
john> return X
john> bar = foo(bar)
john> You have to do it a bit differently in this case,
john> because 'deprecated_function' is a function which
john> returns a wrapper which then gets applied. So it is
john> actually:
john> urlescape = deprecated_function(zero_nineteen)(urlutils.escape)
john> It follows straightforward from:
john> @deprecated_function(zero_nineteen)
john> def urlescape
john> ...
john> Though the syntax isn't quite as nice.
Yup. So I did that with the usual syntax, defining the functions
as wrappers instead of copying them. Since they are deprecated
anyway, there is little interest in optimizing their use.
john> ^- 'ensure that no working tree was created remotely'
Done.
john> def test_cat(self):
john> + # FIXME: sftp raises ReadError instead of NoSuchFile when probing for
john> + # branch/foo/.bzr/branch-format when used with the paramiko test
john> + # server.
john> + from bzrlib.tests import TestSkipped
john> + raise TestSkipped
john> You should pass a string to TestSkipped. Perhaps
john> raise TestSkipped('SFTPTransport raises incorrect exception'
john> ' when reading from paramiko server')
Done.
john> ...
john> - :param from_transport: optional transport to build from. The built
john> + :param _from_transport: optional transport to build from. The built
john> transport will share the connection with this transport.
john> """
john> ^- Maybe add the comment:
john> The caller is required to ensure that _from_transport points at the same host
john> as the new base.
Done.
john> - def __init__(self, url, from_transport=None, medium=None, _client=None):
john> + def __init__(self, url, _from_transport=None, medium=None, _client=None):
john> """Constructor.
john> ^- Having 2 private parameters and a public one inbetween seems a little weird.
john> Probably we can't change it. It just doesn't seem correct.
The plan is to remove medium and _client from the RemoteTransport
constructor and offers a class dedicated to the tests.
john> I believe you've satisfied the bulk of my concerns. (+1 Cond)
I'll merge it with the points mentioned above and will address
the others when I'm back.
Vincent
More information about the bazaar
mailing list