twisted much?

Fredrik Lundh fredrik at pythonware.com
Tue Jun 14 20:26:01 BST 2005


Aaron Bentley wrote:

> Fredrik Lundh wrote:
> | if anyone's interested, I have an enhanced version of the HTTP client and
> | download manager used in this article:
> |
> |     http://effbot.org/zone/effnews-1.htm
>
> That looks like it might be just the ticket.  If you're interested in
> adding support yourself, the RemoteBranch and RemoteStore classes in
> remotebranch.py would be the place to look.  Just be warned, they're
> designed for a synchronous API that produces file-like objects.

I'm afraid I have zero time to spend on this before mid-July or so :-(

a few assorted notes, after a very quick look at the code:

- a simple solution that could work would be to create a manager class
with a get_url() method which adds the URI to the manager queue, and
returns a file-like object.  when someone tries to read from this object,
let it poll the manager until (at least) the stream for that URL has been
fetched.

by default, each file-like object would have a corresponding consumer
object that adds data to an internal buffer (which is then wrapped in
a (c)StringIO object).  by making the consumer "configurable after the
fact", you can use optimized paths for code that needs to parse XML
data or copy data to local storage.
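
a minimal sketch of that manager/file-like-object arrangement, to make the polling idea concrete.  everything here is hypothetical: the names Manager, RemoteFile, BufferConsumer and poll() are made up for illustration, the fetch_chunks callable stands in for the real network layer, and io.BytesIO plays the role of (c)StringIO on current Python versions:

```python
import io
from collections import deque


class BufferConsumer:
    """Default consumer: collects incoming data in an internal buffer."""

    def __init__(self):
        self.chunks = []

    def feed(self, data):
        self.chunks.append(data)

    def close(self):
        # wrap the buffered data in a file-like object
        return io.BytesIO(b"".join(self.chunks))


class RemoteFile:
    """File-like object that polls the manager when someone reads from it."""

    def __init__(self, manager, url):
        self.manager = manager
        self.url = url
        self.consumer = BufferConsumer()
        self.done = False
        self._stream = None

    def sync(self):
        # poll the manager until the stream for this URL is complete
        while not self.done:
            self.manager.poll()

    def read(self, size=-1):
        self.sync()
        if self._stream is None:
            self._stream = self.consumer.close()
        return self._stream.read(size)


class Manager:
    def __init__(self, fetch_chunks):
        # fetch_chunks(url) -> iterable of byte strings; a hypothetical
        # stand-in for the real HTTP client
        self.fetch_chunks = fetch_chunks
        self.queue = deque()

    def get_url(self, url):
        # add the URL to the queue and hand back a file-like object
        f = RemoteFile(self, url)
        self.queue.append((f, iter(self.fetch_chunks(url))))
        return f

    def poll(self):
        # give each pending transfer at most one chunk per call
        for _ in range(len(self.queue)):
            f, chunks = self.queue.popleft()
            try:
                f.consumer.feed(next(chunks))
            except StopIteration:
                f.done = True
            else:
                self.queue.append((f, chunks))


# usage: two queued URLs are fetched in an interleaved fashion
data = {"a": [b"he", b"llo"], "b": [b"wor", b"ld"]}
manager = Manager(lambda url: data[url])
file_a = manager.get_url("a")
file_b = manager.get_url("b")
content_b = file_b.read()
```

note that reading from one file drives all queued transfers forward, so later reads on other files tend to find their data already buffered.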

- you can parse incoming XML incrementally, via (c)ElementTree's
XMLParser class.  instead of using read_xml(), create a consumer that holds
an XMLParser instance, feed data to it as it arrives, and pass the resulting
element to from_element().  or put that logic in the file-like object
described above, and teach read_xml() to deal with such objects:

    def read_xml(cls, f):
        if isinstance(f, RemoteFile):
            f.activate_xml_parser()
            f.sync() # poll network, parse data as it arrives
            tree = f.get_xml_tree()
        else:
            tree = ElementTree().parse(f)
        return cls.from_element(tree)

where the activate_xml_parser method creates an XMLParser instance
and attaches it to the consumer object (feeding all buffered data through
the parser first, of course), and get_xml_tree calls the parser's "close"
method to get the element root.
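
the incremental parsing step can be sketched as follows, using the XMLParser class from the standard library's xml.etree.ElementTree (feed() and close() are real parser methods).  the XMLConsumer class, its get_xml_tree() method, and the sample "inventory" document are assumptions made up for this example:

```python
from xml.etree.ElementTree import XMLParser


class XMLConsumer:
    """Hypothetical consumer that parses data as it arrives."""

    def __init__(self, buffered=()):
        self.parser = XMLParser()
        # feed anything that was buffered before the parser was attached
        for chunk in buffered:
            self.parser.feed(chunk)

    def feed(self, data):
        self.parser.feed(data)

    def get_xml_tree(self):
        # close() finishes parsing and returns the root element
        return self.parser.close()


# chunks may be split anywhere, even in the middle of a tag
consumer = XMLConsumer(buffered=[b"<inventory><en"])
for chunk in [b"try name='a'/>", b"<entry name='b'/></inventory>"]:
    consumer.feed(chunk)
root = consumer.get_xml_tree()
names = [entry.get("name") for entry in root]
```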

- you can add similar methods for downloading -- or, more flexibly, make
it possible to plug in arbitrary "consumer sinks":


    def read_xml(cls, f):
        if isinstance(f, RemoteFile):
            f.set_sink(XMLParser())
            f.sync() # poll network, parse data as it arrives
            tree = f.get_sink_result()
        else:
            tree = ElementTree().parse(f)
        return cls.from_element(tree)

where XMLParser is an ElementTree.XMLParser, and a hypothetical data
sink might look something like:

    class DataSink:
        def __init__(self, localfile):
            self.file = open(localfile, "wb")
        def feed(self, data):
            self.file.write(data)
        def close(self):
            # remote file wrapper saves the return value as the
            # "sink result"
            self.file.close()
            return open(self.file.name, "rb") # reopen for reading
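
to show how the two sinks share one feed/close protocol, here is a rough sketch of the wrapper side.  SinkFile is a hypothetical stand-in for the sink handling in RemoteFile (it takes a ready-made list of chunks instead of polling a network), DataSink is the class from the sketch above, and XMLParser is the standard library's xml.etree.ElementTree.XMLParser:

```python
import os
import tempfile
from xml.etree.ElementTree import XMLParser


class DataSink:
    def __init__(self, localfile):
        self.file = open(localfile, "wb")

    def feed(self, data):
        self.file.write(data)

    def close(self):
        self.file.close()
        return open(self.file.name, "rb")  # reopen for reading


class SinkFile:
    """Hypothetical wrapper: forwards chunks to whatever sink is set."""

    def __init__(self, chunks):
        self._chunks = iter(chunks)
        self._sink = None
        self._result = None

    def set_sink(self, sink):
        self._sink = sink

    def sync(self):
        # pass the data through the sink, then save the return value
        # of close() as the "sink result"
        for chunk in self._chunks:
            self._sink.feed(chunk)
        self._result = self._sink.close()

    def get_sink_result(self):
        return self._result


# an XML parser as the sink: the result is the root element
xml_file = SinkFile([b"<a><b/>", b"</a>"])
xml_file.set_sink(XMLParser())
xml_file.sync()
root = xml_file.get_sink_result()

# a DataSink as the sink: the result is the reopened local file
path = os.path.join(tempfile.mkdtemp(), "download.bin")
data_file = SinkFile([b"abc", b"def"])
data_file.set_sink(DataSink(path))
data_file.sync()
with data_file.get_sink_result() as local:
    content = local.read()
```

the wrapper never needs to know what kind of sink it has; anything with feed() and close() will do.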

</F>






