socket error on Windows with bzr:// and large projects

Wed Feb 7 23:05:53 GMT 2007

We managed to do a conversion of one of the Mozilla trees, and J. Paul
Reed was experimenting with doing a 'bzr branch' over the bzr:// protocol.

This is the traceback he got:
Traceback (most recent call last):
  File "bzrlib\commands.pyc", line 650, in run_bzr_catch_errors
  File "bzrlib\commands.pyc", line 612, in run_bzr
  File "bzrlib\commands.pyc", line 304, in run_argv_aliases
  File "bzrlib\builtins.pyc", line 750, in run
  File "bzrlib\bzrdir.pyc", line 685, in sprout
  File "bzrlib\repository.pyc", line 276, in fetch
  File "bzrlib\decorators.pyc", line 51, in write_locked
  File "bzrlib\repository.pyc", line 2080, in fetch
  File "bzrlib\fetch.pyc", line 110, in __init__
  File "bzrlib\fetch.pyc", line 137, in __fetch
  File "bzrlib\fetch.pyc", line 169, in _fetch_weave_texts
  File "bzrlib\repository.pyc", line 459, in fileids_altered_by_revision_ids
  File "bzrlib\knit.pyc", line 883, in iter_lines_added_or_present_in_versions
  File "bzrlib\knit.pyc", line 1568, in read_records_iter
  File "bzrlib\transport\smart.pyc", line 1155, in readv
  File "bzrlib\transport\smart.pyc", line 1507, in read_body_bytes
  File "bzrlib\transport\smart.pyc", line 1376, in read_bytes
  File "bzrlib\transport\smart.pyc", line 1432, in _read_bytes
  File "bzrlib\transport\smart.pyc", line 1719, in _read_bytes
error: (10055, 'No buffer space available')

The last frame (smart.pyc, line 1719) happens to be:
return self._socket.recv(count)

So my best guess is that we are trying to download all of
inventory.knit, which is approx 150MB at this point, in a single read,
and that 'recv()' on Windows tries to set aside a receive buffer for
the whole request.

There are 2 ways to handle this. One is to change the simple
self._socket.recv(count) into a loop over the requested size, so that
it asks for smaller chunks (32KB, 64KB, whatever) on each call.
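
A rough sketch of what that loop could look like. This is not the
actual bzrlib code: the helper name recv_in_chunks and the 64KB default
are made up for illustration, and it assumes the caller wants exactly
'count' bytes unless the peer closes the connection first.

```python
import socket

# Assumed chunk size; the right value would need testing on Windows.
_MAX_RECV_SIZE = 64 * 1024


def recv_in_chunks(sock, count, chunk_size=_MAX_RECV_SIZE):
    """Read up to `count` bytes from `sock` without one huge recv().

    Hypothetical helper: each recv() asks for at most `chunk_size`
    bytes. Returns fewer than `count` bytes only if the peer closes
    the connection early.
    """
    parts = []
    remaining = count
    while remaining > 0:
        data = sock.recv(min(remaining, chunk_size))
        if not data:
            # Empty result means the other end closed the socket.
            break
        parts.append(data)
        remaining -= len(data)
    return b"".join(parts)
```

Note this still builds the whole result in memory before returning; it
only keeps each individual recv() request small.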

Another possibility would be to improve the caller a little bit, so that
it doesn't try to read_bytes() for very large values.

We probably need the explicit chunking anyway, but it would be nice to
have the smart transport not need to buffer too much before it can start
returning data.

Specifically it would be nice to have something finer-grained than
"data = self.read_body_bytes()" for readv to use, so that as data is
coming back, it can yield data up the stack.
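
Something along these lines, perhaps. Again just a sketch under
assumptions: iter_body_chunks is a hypothetical name, 'read_bytes'
stands in for whatever callable returns up to N bytes of the body, and
it assumes the total body length is known up front.

```python
import io


def iter_body_chunks(read_bytes, total, chunk_size=64 * 1024):
    """Yield the body in pieces instead of returning it all at once.

    Hypothetical sketch: `read_bytes(n)` must return up to n bytes,
    and b"" once no more data is available.
    """
    remaining = total
    while remaining > 0:
        data = read_bytes(min(remaining, chunk_size))
        if not data:
            raise EOFError("body ended with %d bytes missing" % remaining)
        remaining -= len(data)
        yield data


# Usage with an in-memory stand-in for the network medium:
source = io.BytesIO(b"x" * 150000)
for chunk in iter_body_chunks(source.read, 150000):
    pass  # readv could hand each chunk up the stack here
```

With a generator like this, readv would never hold more than one chunk
of unread body at a time, at the cost of the caller having to deal with
record boundaries that fall across chunks.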

John
=:->