[MERGE/RFC] Working tree content filtering
Aaron Bentley
aaron at aaronbentley.com
Fri Apr 18 13:39:30 BST 2008
Andrew Bennetts wrote:
> Ian Clatworthy wrote:
>> John Arbash Meinel wrote:
> [...]
>> Having spent time today trying to write a filter, I now think ...
>>
>> text = f.read()
>>
>> is definitely better. Asking filter writers to process a sequence of
>> chunks makes their life *much* harder. With the exception of filters
>> that match a *single* character with no other context, the filters
>> basically need to always do
>>
>> ''.join(chunks)
>
> I think just dealing with the full file all at once is the right approach, at
> least for the first implementation of this feature. We already buffer entire
> file texts in memory.
We do have some problems in this area, but I don't think we should be
actively making them worse. Right now, creating a working tree *is* a
memory-efficient operation, and if filters are whole-file operations, it
will cease to be so.
> We'd like to move away from that, but we're not there
> yet.
Adding new APIs that don't support doing this efficiently will make it
harder to move away. I have been making an effort to ensure all my new
APIs can support memory-efficient operation.
> When we do start making a serious effort to reduce our memory footprint
> when dealing with large files, I expect that would be a good time to consider a
> more efficient chunking/streaming API for content filters.
We don't have to do it as a flag day change. I have been taking the
approach of gradually improving our memory use. The pack repo work has
helped a lot, as will my work on making file text retrieval fast.
> Presumably the
> general work would have an impact on how best to do it here. Doing it now is
> premature IMO.
It is not premature to adopt an API that is efficient now.
It is pointless to make an inefficient API, because it is trivial to
convert chunked text into a fulltext, and vice versa.
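
To illustrate how small that conversion is, here is a minimal sketch in Python. All of the names (fulltext_from_chunks, chunks_from_fulltext, as_chunk_filter) and the 64 KiB default chunk size are hypothetical and chosen for illustration only; none of them are part of bzrlib or the proposed filter API.

```python
def fulltext_from_chunks(chunks):
    """Join an iterable of byte chunks into a single fulltext."""
    return b"".join(chunks)


def chunks_from_fulltext(text, chunk_size=65536):
    """Yield successive slices of a fulltext (chunk_size is an assumed default)."""
    for start in range(0, len(text), chunk_size):
        yield text[start:start + chunk_size]


def as_chunk_filter(fulltext_filter):
    """Adapt a whole-file filter (bytes -> bytes) to a chunked interface.

    Hypothetical adapter: it buffers the entire input, so it saves no
    memory itself. It only shows that a filter author who wants the
    whole text can still sit behind a chunk-based API.
    """
    def chunk_filter(chunks):
        yield fulltext_filter(fulltext_from_chunks(chunks))
    return chunk_filter
```

With an adapter like this, a filter that needs the whole text pays the buffering cost only for itself, while filters that can work chunk by chunk keep memory use low.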
Aaron