[MERGE] Updated sftp_readv

John Arbash Meinel john at arbash-meinel.com
Thu Dec 20 17:49:13 GMT 2007


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Vincent Ladeuil wrote:
>>>>>> "john" == John Arbash Meinel <john at arbash-meinel.com> writes:
> 
>     john> Vincent Ladeuil wrote:
>     john> ...
> 
>     >> I think we should make a decision on that point, and also
>     >> on whether to include or exclude the ability to specify
>     >> duplicate offsets to readv.
>     >> 
>     >> I can't imagine use cases for either functionality, so I'd vote
>     >> to get rid of them. If needed, the caller can handle such cases,
>     >> but at the transport level and given the complexity of the
>     >> actual implementations, I'd prefer that we simply exclude them.
> 
>     john> Well, I did set the default to allow_overlap=False. And
>     john> it turns out that the overlap check will also disallow
>     john> duplicate ranges.
> 
> True. So what about getting rid of that? If the need arises,
> either the caller may find a simple way to do it or we'll have a
> cleaner base to implement it.
> 
>     john> ...
> 
>     john> last_end = None
>     john> cur = _CoalescedOffset(None, None, [])
>     john> +        coalesced_offsets = []
>     >> 
>     >> Why did you delete the iterator behavior ? You still iterate the
>     >> result in the caller.
> 
>     john> Because all callers actually use a list. If you look at
>     john> the sftp code, it actually casts it into a list, and
>     john> wraps a different iterator around it. We need the data
>     john> 2 times, so we have to use a list. The second iterator
>     john> is just there to avoid keeping an integer index of
>     john> which offset we are on.
> 
> Ok. I was just surprised, no problem with that.
> 
> <snip/>
> 
>     john> Are you talking about _coalesce_offsets or _sftp_readv?
> 
> _sftp_readv.
> 
> _coalesce_offsets is a bit surprising at first read, but after
> that, there is one big condition which is easy to read.
> 
> <snip/>
> 
>     >> Your complexity daemon warns you too I see :)
> 
>     john> Actually, I had started to try and implement it that
>     john> way. It was going to take me too long to get all the
>     john> bits right, and doing "data = buffer[start:end]" is a
>     john> whole lot simpler than trying to do the same thing from
>     john> a list.
> 
> Sure. But l.append() / s = ''.join(l) is mentioned in several
> places as the most efficient way to handle such things, but until
> the method becomes far simpler I will not require that change.
> 

The problem is that the requested range doesn't break at block boundaries.
So what I actually need is:

data = ''.join([buffer[10][12:]] + buffer[11:15] + [buffer[15][:18]])

And the hard part is figuring out what all of those numbers should be. It might
be something like:

start_block = start_offset = None
end_block = end_offset = None
bytes_so_far = 0
for block_idx, block in enumerate(buffer):
  next_bytes_so_far = bytes_so_far + len(block)
  if start_block is None:
    if next_bytes_so_far > start:
      start_block = block_idx
      start_offset = start - bytes_so_far
  if end_block is None:
    # end is exclusive, so a range ending exactly at a block boundary
    # still ends inside this block
    if next_bytes_so_far >= end:
      end_block = block_idx
      end_offset = end - bytes_so_far
      break # We know we are done
  bytes_so_far = next_bytes_so_far

if end_block == start_block:
  data = buffer[start_block][start_offset:end_offset]
else:
  data = ''.join([buffer[start_block][start_offset:]]
                 + buffer[start_block+1:end_block]
                 + [buffer[end_block][:end_offset]])


Which I think is correct, but it certainly doesn't fall under the "simple"
definition.
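
For comparison, here is a rough sketch of the same lookup done with bisect over
cumulative block ends instead of a manual loop. The helper name slice_blocks
is made up for illustration, it is not part of bzrlib, and it assumes Python 3
byte strings and a plain list of blocks. It is only meant to show where the
numbers come from, not to be the actual _sftp_readv code.

```python
from bisect import bisect_right
from itertools import accumulate


def slice_blocks(blocks, start, end):
    """Return bytes [start, end) of the concatenated blocks.

    Hypothetical helper: avoids joining the whole list first.
    """
    if start < 0:
        raise ValueError('start must be non-negative')
    if start >= end:
        return b''
    # ends[i] is the absolute offset just past the last byte of blocks[i].
    ends = list(accumulate(len(b) for b in blocks))
    if not ends or end > ends[-1]:
        raise ValueError('range extends past the buffered data')
    # First block whose end lies strictly beyond start.
    start_block = bisect_right(ends, start)
    # Block holding the last requested byte, at offset end - 1.
    end_block = bisect_right(ends, end - 1)
    # Convert absolute offsets into offsets within each block.
    start_offset = start - (ends[start_block] - len(blocks[start_block]))
    end_offset = end - (ends[end_block] - len(blocks[end_block]))
    if start_block == end_block:
        return blocks[start_block][start_offset:end_offset]
    return b''.join([blocks[start_block][start_offset:]]
                    + blocks[start_block + 1:end_block]
                    + [blocks[end_block][:end_offset]])
```

The result should always match b''.join(blocks)[start:end], which is an easy
property to test against. Whether this counts as "simple" is still debatable.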

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHaqsZJdeBCYSNAAMRAi8sAJ9O2whpb3H74o+uIVPj6UC2kckHJQCgyA2Y
Mpfl+X+SF2OKLSWXlREm4Yk=
=X4Vy
-----END PGP SIGNATURE-----
