[RFC] TreeTransform and 'iter_files_bytes()'
John Arbash Meinel
john at arbash-meinel.com
Tue Mar 24 22:12:12 GMT 2009
Robert Collins wrote:
> On Tue, 2009-03-24 at 16:30 -0500, John Arbash Meinel wrote:
>
>
>> Either that, or TT.create_file() should do:
>>
>> if isinstance(content, str):
>>     out.write(content)
>> else:
>>     out.writelines(content)
>>
>> This is a rather huge difference for 'bzr co' times....
>>
>> At least, I think that is the easier fix, rather than changing the api
>> of "iter_files_bytes()" to return a 'chunked' format. Mostly because it
>> is a more significant API change.
>>
>> Thoughts?
>
> I'd be inclined to have TT do 'type(content) is str', rather than require
> chunked; OTOH, if the API docs for iter_files_bytes say that chunked is
> valid, chunked may avoid some double handling with gc.
>
> Rob
So "Repository.iter_files_bytes()" says it returns an iterator of
bytestrings. A plain string unfortunately satisfies that, but is awfully
slow: iterating over it yields 1 character at a time, so writelines()
ends up writing character by character.
Even further:
x = 'a string'
''.join([x]) is x # True
''.join(x) is x # False :(
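To make the cost concrete, here is a minimal sketch of the create_file()
dispatch proposed above. CountingWriter and write_content are hypothetical
names used only for illustration, not bzrlib code, and the sketch uses
Python 3 text strings in place of the Python 2 str discussed here.

```python
import io


class CountingWriter(object):
    """Hypothetical file-like object that counts write() calls."""

    def __init__(self):
        self._buf = io.StringIO()
        self.write_calls = 0

    def write(self, data):
        self.write_calls += 1
        self._buf.write(data)

    def writelines(self, chunks):
        # Like a real file, this simply iterates whatever it is given.
        for chunk in chunks:
            self.write(chunk)

    def getvalue(self):
        return self._buf.getvalue()


def write_content(out, content):
    """Write either a single string or an iterable of chunks to out."""
    if isinstance(content, str):
        out.write(content)
    else:
        out.writelines(content)


text = 'a string'

slow = CountingWriter()
slow.writelines(text)          # a str iterates one character per step

fast = CountingWriter()
write_content(fast, text)      # the str is written in a single call

chunked = CountingWriter()
write_content(chunked, ['a ', 'string'])  # one call per chunk
```

All three writers end up with the same content; only the number of
write() calls differs.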
As for GC, gc actually creates fulltext records now that we have the C
extension. So get_bytes_as('chunked') returns [fulltext].
Now, RevisionTree assumes that iter_files_bytes() is returning a simple
string, since get_file_text() returns the exact object returned from
iter_files_bytes.
(And even passes that value to a StringIO() in get_file() so we know it
thinks it is exactly a string.)
I can track these down, I'm just concerned about who else is using
iter_files_bytes. I'm actually tempted to force it to chunked, just to
find any other callers that are actually broken wrt what an "iterator of
bytestrings" is.
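Forcing chunked could look roughly like the sketch below. It assumes the
iterator yields (identifier, content) pairs; iter_chunked is a hypothetical
wrapper, not an existing bzrlib function, and it uses Python 3 text strings
for simplicity.

```python
def iter_chunked(files_bytes):
    """Yield (identifier, chunks) pairs, rejecting bare strings.

    files_bytes is assumed to be an iterable of (identifier, content)
    pairs, where content should be an iterable of string chunks.
    """
    for identifier, content in files_bytes:
        if isinstance(content, str):
            # A bare string would silently iterate per character,
            # so fail loudly to expose the broken caller instead.
            raise TypeError(
                'content for %r is a plain string, expected chunks'
                % (identifier,))
        yield identifier, list(content)


def chunks_to_text(chunks):
    """Join a list of chunks back into a single string."""
    return ''.join(chunks)
```

A caller that really wants a single string, such as get_file_text(),
would then join the chunks explicitly rather than rely on the producer
returning a string.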
John
=:->