[RFC] TreeTransform and 'iter_files_bytes()'

John Arbash Meinel john at arbash-meinel.com
Tue Mar 24 22:12:12 GMT 2009



Robert Collins wrote:
> On Tue, 2009-03-24 at 16:30 -0500, John Arbash Meinel wrote:
> 
> 
>> Either that, or TT.create_file() should do:
>>
>> if isinstance(content, str):
>>   out.write(content)
>> else:
>>   out.writelines(content)
>>
>> This is a rather huge difference for 'bzr co' times....
>>
>> At least, I think that is the easier fix, rather than changing the api
>> of "iter_files_bytes()" to return a 'chunked' format. Mostly because it
>> is a more significant API change.
>>
>> Thoughts?
> 
> I'd be inclined to have TT check "type(content) is str" rather than
> require chunked; OTOH if the API docs for iter_files_bytes say that
> chunked is valid, chunked may avoid some double handling with gc.
> 
> Rob

So the docs for "Repository.iter_files_bytes()" say it returns an
iterator of bytestrings. A plain string unfortunately satisfies that, but
is awfully slow: iterating over it yields 1 character at a time, so
writelines() ends up writing 1 character at a time.
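
To make the cost concrete, here is a minimal sketch of the guard discussed
above. It is not bzrlib code: CountingWriter, write_naive and write_guarded
are hypothetical names, and a Python 3 str stands in for the Python 2
bytestring, since both iterate one character at a time.

```python
class CountingWriter:
    """Hypothetical file-like object that records every write() call."""

    def __init__(self):
        self.chunks = []

    def write(self, data):
        self.chunks.append(data)

    def writelines(self, lines):
        # Same contract as file.writelines(): one write per item.
        for line in lines:
            self.write(line)

    def getvalue(self):
        return ''.join(self.chunks)


def write_naive(out, content):
    """Treat content as an iterable of chunks, whatever it really is."""
    out.writelines(content)


def write_guarded(out, content):
    """Write a plain string in one call, anything else chunk by chunk."""
    if isinstance(content, str):
        out.write(content)
    else:
        out.writelines(content)
```

With a 12-character string, write_naive() issues 12 writes, while
write_guarded() issues 1. Both produce the same output, so the difference
is purely in per-call overhead.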

Even further:
  x = 'a string'
  ''.join([x]) is x	# True
  ''.join(x) is x	# False :(

As for GC (groupcompress), it actually creates fulltext records now that
we have the C extension, so get_bytes_as('chunked') returns [fulltext].

Now, RevisionTree assumes that iter_files_bytes() is returning a simple
string, since get_file_text() returns the exact object returned from
iter_files_bytes.

(And even passes that value to a StringIO() in get_file() so we know it
thinks it is exactly a string.)
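
A consumer that wants a single string could accept either shape. The sketch
below is only an illustration under assumptions: to_fulltext and
open_as_file are hypothetical helpers, not RevisionTree methods, and a
Python 3 str again stands in for a bytestring.

```python
import io


def to_fulltext(content):
    """Return one string from either a string or an iterable of chunks."""
    if isinstance(content, str):
        # Already a fulltext; joining it would only copy it char by char.
        return content
    return ''.join(content)


def open_as_file(content):
    """Wrap the content in a file-like object, as get_file() would need."""
    return io.StringIO(to_fulltext(content))
```

The isinstance() check matters for the same reason as on the write side:
''.join() over a string iterates it one character at a time.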

I can track these down, I'm just concerned about who else is using
iter_files_bytes. I'm actually tempted to force it to chunked, just to
find any other callers that are actually broken wrt what an "iterator of
bytestrings" is.
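
Forcing the chunked form could look roughly like the sketch below. It
assumes the iterator yields (identifier, content) pairs; force_chunked is a
hypothetical wrapper, not part of bzrlib, and a Python 3 str stands in for
a bytestring.

```python
import io


def force_chunked(pairs):
    """Yield (identifier, list_of_chunks) for every (identifier, content)."""
    for identifier, content in pairs:
        if isinstance(content, str):
            # Wrap a fulltext so callers never see a bare string.
            content = [content]
        yield identifier, list(content)


def caller_assumes_string(content):
    """Stand-in for a caller that wrongly expects a plain string."""
    return io.StringIO(content).read()
```

A caller that passes the result straight to StringIO, as
caller_assumes_string() does, then fails with a TypeError instead of
silently working on strings only, which is the point of forcing the format.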

John
=:->
