[MERGE] Faster 'build_tree'
Aaron Bentley
aaron.bentley at utoronto.ca
Thu Jul 26 18:04:33 BST 2007
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
John Arbash Meinel wrote:
>> If it was essentially doing this already, it seems questionable to add
>> extra code. Saving a function call isn't a very big win for these
>> operations.
> Well, compare:
>
> lines = rt.get_file(file_id).readlines()
>
> to
>
> lines = rt.get_file_lines(file_id)
Sure, I can see how avoiding double-handling is helpful there.
But for rt.get_file_text(), you're just choosing where you're doing
''.join().
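
To make the comparison concrete, here is a minimal sketch. ToyTree is a hypothetical stand-in, not bzrlib's real tree class, and it assumes the storage layer holds each file as a list of lines. Under that assumption, going through get_file() joins the lines only for readlines() to split them again, while get_file_text() merely moves the join from the caller into the tree.

```python
import io


class ToyTree:
    """Hypothetical stand-in for a revision tree (not bzrlib's API).

    Assumes the backing store keeps each file as a list of byte lines.
    """

    def __init__(self, files):
        self._files = files  # maps file_id -> list of bytes lines

    def get_file_lines(self, file_id):
        # Cheapest path here: hand back the stored lines directly.
        return list(self._files[file_id])

    def get_file(self, file_id):
        # Has to join the lines just to build a file-like object.
        return io.BytesIO(b"".join(self._files[file_id]))

    def get_file_text(self, file_id):
        # The same join a caller could do on get_file_lines().
        return b"".join(self._files[file_id])


rt = ToyTree({"f1": [b"first\n", b"second\n"]})

# Double handling: join inside get_file(), then split again in readlines().
via_file = rt.get_file("f1").readlines()
# Single handling: the lines come straight from storage.
direct = rt.get_file_lines("f1")

# For text, the only difference is where the join happens.
text_in_callee = rt.get_file_text("f1")
text_in_caller = b"".join(rt.get_file_lines("f1"))
```

Both pairs give identical results; only the amount of intermediate copying differs.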
> So propagating the caller's needs down the stack means that the low-level
> implementation can do what it needs to do to return it, rather than having all the
> higher-level APIs massaging the data repeatedly.
That's as may be. To me, it seems pretty silly to have three different
methods to get the content of a file.
And frankly, all of them are wrong:
- get_file() requires you to return a file-like object, which is
frequently pointless overhead.
- get_file_lines() can exhaust memory when dealing with large binary
files, because the files may not contain \n.
- get_file_text() can exhaust memory when dealing with large files.
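
A small sketch of the memory concern, using only the standard library. The iter_chunks helper and the 64 KiB chunk size are illustrative assumptions, not anything from bzrlib. A binary blob with no \n comes back from readlines() as a single line the size of the whole file, whereas fixed-size reads keep each piece bounded.

```python
import io


def iter_chunks(fileobj, chunk_size=65536):
    """Yield successive reads of at most chunk_size bytes (hypothetical helper)."""
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        yield chunk


# A "binary file" with no newline bytes at all.
blob = b"\x00" * 200_000

# Line-oriented access: the whole file arrives as one giant line.
lines = io.BytesIO(blob).readlines()

# Chunk-oriented access: no piece is larger than chunk_size.
chunks = list(iter_chunks(io.BytesIO(blob), chunk_size=65536))
largest_chunk = max(len(c) for c in chunks)
```

The list() call here is only for inspection; a real consumer would process each chunk as it is yielded so that at most one chunk is held at a time.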
> I don't know any current users of get_file_text() offhand. I'm fine with not
> messing with the function at this point.
I had no idea it existed.
> I believe that it can ultimately be better for us to work in file blocks,
> rather than as strictly lines, except for in places where we actually care
> about lines (like in annotations).
Full ACK.
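
As a rough illustration of how block-based storage can still serve callers that need lines (annotate, say), here is a sketch of an adapter. iter_lines is a hypothetical helper, not an existing bzrlib function, and it treats only \n as a line terminator.

```python
def iter_lines(blocks):
    """Re-split an iterable of byte blocks into \\n-terminated lines.

    Hypothetical helper: block boundaries may fall anywhere, so a
    partial line is carried over until its newline arrives.
    """
    pending = b""
    for block in blocks:
        pending += block
        # Everything before the last \n is complete; the tail is carried over.
        *complete, pending = pending.split(b"\n")
        for line in complete:
            yield line + b"\n"
    if pending:
        # Final line with no trailing newline.
        yield pending


blocks = [b"one\ntw", b"o\nthr", b"ee"]
lines_from_blocks = list(iter_lines(blocks))
```

Callers that do not care about lines (hashing, copying to disk) would consume the blocks directly and never pay for the split.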
Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iD8DBQFGqNQh0F+nu1YWqI0RAmLeAJ9WBoixWCH9/J5w14e9goXSyhk03gCfQj6M
dpt5UcSGemiU5IbAph44//E=
=Uwo6
-----END PGP SIGNATURE-----