BigString (reducing peak memory)

Aaron Bentley aaron at aaronbentley.com
Wed Nov 16 16:49:06 UTC 2011


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 11-11-16 11:26 AM, Gordon Tyler wrote:
> This is just a wild idea, but would using Python generator
> expressions help with keeping only a small part of a large data
> file in memory?

Not really.  The point is that the data needs to be dealt with in
smaller chunks, whether they're read() from a file directly or
iterated through from a generator.  Dealing with the interface
difference between files and iterables is the easy bit.  It's avoiding
reading entire files at once that seems to be problematic.
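(A minimal sketch of the kind of chunk-reading generator being discussed; the function name and chunk size here are illustrative, not bzr code:)

```python
def iter_chunks(f, chunk_size=64 * 1024):
    """Yield successive fixed-size chunks from an open file object.

    Only one chunk is held in memory at a time, so peak memory stays
    bounded regardless of the file's total size.
    """
    while True:
        chunk = f.read(chunk_size)
        if not chunk:
            return
        yield chunk
```

Downstream code then iterates over the chunks instead of calling read() on the whole file at once.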

Now, if you want to avoid reading files into memory, you could mmap
them instead, but that is still subject to 32-bit address-space
limitations.
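(As a sketch of the mmap alternative, using Python's standard mmap module; the helper name and header size are illustrative:)

```python
import mmap

def read_header(path, size=16):
    """Return the first `size` bytes of a file via mmap.

    The OS pages data in on demand, so only the pages actually
    touched are brought into memory -- but the whole file must still
    fit in the process's address space, which is the 32-bit limit
    mentioned above.
    """
    f = open(path, "rb")
    try:
        m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        try:
            return m[:size]  # slicing faults in only the needed pages
        finally:
            m.close()
    finally:
        f.close()
```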

> It would have the advantage of not having to make code aware of the
> fact that the data is being chunked.

Generators produce iterables.  Chunked data is already in an
iterable, frequently one created by a generator.  Code that consumes
iterables will be the same either way.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk7D6YIACgkQ0F+nu1YWqI3b0QCeImt1fM5emgymU3m8WZxe/Lgc
GZYAn3viWbyvC9n8/OE7DJnu/gCUTay3
=0idH
-----END PGP SIGNATURE-----



More information about the bazaar mailing list