[MERGE] Updated sftp_readv
John Arbash Meinel
john at arbash-meinel.com
Thu Dec 20 17:49:13 GMT 2007
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Vincent Ladeuil wrote:
>>>>>> "john" == John Arbash Meinel <john at arbash-meinel.com> writes:
>
> john> Vincent Ladeuil wrote:
> john> ...
>
> >> I think we should make a decision on that point, and also on
> >> whether to include or exclude the ability to specify duplicate
> >> offsets to readv.
> >>
> >> I can't imagine use cases for either functionality, so I'd vote
> >> to get rid of them. If needed, the caller can handle such cases,
> >> but at the transport level, and given the complexity of the
> >> actual implementations, I'd prefer that we simply exclude them.
>
> john> Well, I did set the default to allow_overlap=False. And
> john> it turns out that the overlap check will also disallow
> john> duplicate ranges.
>
> True. So what about getting rid of that? If the need arises,
> either the caller may find a simple way to do it or we'll have a
> cleaner base to implement it.
>
> john> ...
>
> john> last_end = None
> john> cur = _CoalescedOffset(None, None, [])
> john> + coalesced_offsets = []
> >>
> >> Why did you delete the iterator behavior ? You still iterate the
> >> result in the caller.
>
> john> Because all callers actually use a list. If you look at
> john> the sftp code, it actually casts it into a list, and
> john> wraps a different iterator around it. We need the data
> john> 2 times, so we have to use a list. The second iterator
> john> is just there instead of keeping an integer index of
> john> which offset we are on.
>
> Ok. I was just surprised, no problem with that.
>
> <snip/>
>
> john> Are you talking about _coalesce_offsets or _sftp_readv?
>
> _sftp_readv.
>
> _coalesce_offsets is a bit surprising at first read, but after
> that, there is one big condition which is easy to read.
>
> <snip/>
>
> >> Your complexity daemon warns you too I see :)
>
> john> Actually, I had started to try and implement it that
> john> way. It was going to take me too long to get all the
> john> bits right, and doing "data = buffer[start:end]" is a
> john> whole lot simpler than trying to do the same thing from
> john> a list.
>
> Sure. But l.append() / s = ''.join(l) is mentioned in several
> places as the most efficient way to handle such things. Until
> the method becomes far simpler, though, I will not require that
> change.
>
The problem is that it doesn't break at boundaries. So what I actually need is:
    data = ''.join([buffer[10][12:]] + buffer[11:15] + [buffer[15][:18]])
And the hard part is figuring out what all of those numbers should be. It might
be something like:
    # Assumes start < end and that buffer is a list of string blocks.
    start_block = start_offset = None
    end_block = end_offset = None
    bytes_so_far = 0
    for block_idx, block in enumerate(buffer):
        next_bytes_so_far = bytes_so_far + len(block)
        if start_block is None:
            if next_bytes_so_far > start:
                start_block = block_idx
                start_offset = start - bytes_so_far
        if end_block is None:
            # >= so that a range ending exactly on a block boundary
            # finishes in that block
            if next_bytes_so_far >= end:
                end_block = block_idx
                end_offset = end - bytes_so_far
                break # We know we are done
        # Advance the running total before looking at the next block
        bytes_so_far = next_bytes_so_far
    if end_block == start_block:
        data = buffer[start_block][start_offset:end_offset]
    else:
        # The last piece must be wrapped in a list to concatenate
        data = ''.join([buffer[start_block][start_offset:]]
                       + buffer[start_block+1:end_block]
                       + [buffer[end_block][:end_offset]])
Which I think is correct, but it certainly doesn't fall under the "simple"
definition.
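
For comparison, here is a minimal self-contained sketch of the same idea
written as a function. It is not the code in bzrlib; the name extract_range
and its signature are made up for illustration, and it assumes the buffer is
a list of bytes blocks and that the range is half-open, [start, end). Rather
than computing block indices first, it collects the slices as it walks the
blocks, which sidesteps the special case for a range inside a single block.

```python
def extract_range(blocks, start, end):
    """Return bytes [start, end) of the concatenated blocks.

    Hypothetical helper for illustration only. The blocks are never
    joined as a whole; only the slices that overlap the range are.
    """
    if not 0 <= start <= end:
        raise ValueError("need 0 <= start <= end")
    pieces = []
    bytes_so_far = 0  # total length of the blocks before the current one
    for block in blocks:
        next_bytes_so_far = bytes_so_far + len(block)
        if next_bytes_so_far > start:
            # This block overlaps the range; clamp the slice to the block.
            lo = max(start - bytes_so_far, 0)
            hi = min(end - bytes_so_far, len(block))
            pieces.append(block[lo:hi])
        if next_bytes_so_far >= end:
            break  # the range ends in this block, so we are done
        bytes_so_far = next_bytes_so_far
    else:
        # Ran out of blocks; here bytes_so_far is the total length.
        if end > bytes_so_far:
            raise ValueError("range extends past the buffered data")
    return b''.join(pieces)


blocks = [b'abcd', b'efgh', b'ijkl']
data = extract_range(blocks, 2, 10)
```

With those three blocks, the range 2..10 takes the tail of the first block,
all of the second, and the head of the third. It is still not what I would
call simple next to "data = buffer[start:end]".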
John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iD8DBQFHaqsZJdeBCYSNAAMRAi8sAJ9O2whpb3H74o+uIVPj6UC2kckHJQCgyA2Y
Mpfl+X+SF2OKLSWXlREm4Yk=
=X4Vy
-----END PGP SIGNATURE-----