twisted much?
Fredrik Lundh
fredrik at pythonware.com
Tue Jun 14 20:26:01 BST 2005
Aaron Bentley wrote:
> Fredrik Lundh wrote:
> | if anyone's interested, I have an enhanced version of the HTTP client and
> | download manager used in this article:
> |
> | http://effbot.org/zone/effnews-1.htm
>
> That looks like it might be just the ticket. If you're interested in
> adding support yourself, the RemoteBranch and RemoteStore classes in
> remotebranch.py would be the place to look. Just be warned, they're
> designed for a synchronous API that produces file-like objects.

I'm afraid I have zero time to spend on this before mid-July or so :-(

a few assorted notes, after a very quick look at the code:

- a simple solution that could work would be to create a manager class
with a get_url() method which adds the URI to the manager queue, and
returns a file-like object. when someone tries to read from this object,
let it poll the manager until (at least) the stream for that URL has been
fetched.
by default, each file-like object would have a corresponding consumer
object that adds data to an internal buffer (which is then wrapped in
a (c)StringIO object). by making the consumer "configurable after the
fact", you can use optimized paths for code that needs to parse XML
data or copy data to local storage.
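
A minimal sketch of the manager idea above, not the actual effnews code. The names DownloadManager, RemoteFile and poll(), and the fetch_chunks callable that stands in for the asynchronous HTTP client, are all assumptions made for illustration. Reading from the file-like object polls the manager until that stream is complete:

```python
import io
from collections import deque


class RemoteFile:
    """Hypothetical file-like object; data arrives via the manager."""

    def __init__(self, manager, url):
        self.manager = manager
        self.url = url
        self.chunks = []       # default consumer: a plain in-memory buffer
        self.done = False
        self._stream = None

    def feed(self, data):
        self.chunks.append(data)

    def close(self):
        self.done = True

    def read(self, size=-1):
        # poll the manager until (at least) this stream has been fetched
        while not self.done:
            self.manager.poll()
        if self._stream is None:
            self._stream = io.BytesIO(b"".join(self.chunks))
        return self._stream.read(size)


class DownloadManager:
    """Hypothetical manager; fetch_chunks(url) must return an iterable of bytes."""

    def __init__(self, fetch_chunks):
        self.fetch_chunks = fetch_chunks
        self.queue = deque()

    def get_url(self, url):
        # queue the request and hand back a file-like object right away
        f = RemoteFile(self, url)
        self.queue.append((f, iter(self.fetch_chunks(url))))
        return f

    def poll(self):
        # deliver one chunk to the request at the head of the queue,
        # then move that request to the back (simple round robin)
        if not self.queue:
            return
        f, chunks = self.queue.popleft()
        try:
            f.feed(next(chunks))
        except StopIteration:
            f.close()
        else:
            self.queue.append((f, chunks))
```

Because get_url() returns immediately, a caller can queue several URLs first and only then start reading; every poll() advances all pending downloads, not just the one being read.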
- you can parse incoming XML incrementally, via (c)ElementTree's XMLParser
class. instead of using read_xml(), create a consumer that holds
an XMLParser instance, feed data to it as it arrives, and pass the result
to from_element(). or put that logic in the file-like object described
above, and teach read_xml() to deal with such objects:

    def read_xml(cls, f):
        if isinstance(f, RemoteFile):
            f.activate_xml_parser()
            f.sync() # poll network, parse data as it arrives
            tree = f.get_xml_tree()
        else:
            tree = ElementTree().parse(f)
        return cls.from_element(tree)

where the activate_xml_parser method creates an XMLParser instance
and attaches it to the consumer object (feeding all buffered data through
the parser first, of course), and get_xml_tree calls the parser's "close"
method to get the element root.
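
A rough sketch of such a consumer, assuming the XMLParser from today's standard library (xml.etree.ElementTree) in place of the (c)ElementTree packages. The class name XMLConsumer and its "buffered" argument are made up for illustration:

```python
from xml.etree.ElementTree import XMLParser


class XMLConsumer:
    """Hypothetical consumer that parses data as it arrives."""

    def __init__(self, buffered=()):
        self.parser = XMLParser()
        # feed all data buffered so far through the parser first
        for data in buffered:
            self.parser.feed(data)

    def feed(self, data):
        # called for each chunk delivered by the network layer
        self.parser.feed(data)

    def close(self):
        # XMLParser.close() returns the root element of the parsed tree
        return self.parser.close()
```

Chunk boundaries may fall anywhere, including in the middle of a tag; the parser keeps whatever state it needs between feed() calls.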
- you can add similar methods for downloading -- or, more flexibly, make
it possible to plug in arbitrary "consumer sinks":

    def read_xml(cls, f):
        if isinstance(f, RemoteFile):
            f.set_sink(XMLParser())
            f.sync() # poll network, parse data as it arrives
            tree = f.get_sink_result()
        else:
            tree = ElementTree().parse(f)
        return cls.from_element(tree)

where XMLParser is an ElementTree.XMLParser, and a hypothetical data
sink might look something like:

    class DataSink:
        def __init__(self, localfile):
            self.file = open(localfile, "wb")
        def feed(self, data):
            self.file.write(data)
        def close(self):
            # remote file wrapper saves the return value as the
            # "sink result"
            self.file.close()
            return open(self.file.name, "rb") # reopen for reading

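
For illustration, the remote file wrapper side of the sink protocol might look roughly like this. It is only a sketch: the "chunks" argument stands in for data arriving from the network, and everything apart from the set_sink/sync/get_sink_result names used above is an assumption. XMLParser is taken from the standard library's xml.etree.ElementTree:

```python
from xml.etree.ElementTree import XMLParser


class RemoteFile:
    """Hypothetical wrapper; 'chunks' stands in for the network stream."""

    def __init__(self, chunks):
        self._pending = iter(chunks)
        self._buffer = []      # data received before a sink was attached
        self._sink = None
        self._result = None
        self._closed = False

    def set_sink(self, sink):
        # replay anything already buffered, then route new data to the sink
        for data in self._buffer:
            sink.feed(data)
        self._buffer = []
        self._sink = sink

    def sync(self):
        # drain the stream; a real version would poll the download manager
        for data in self._pending:
            if self._sink is None:
                self._buffer.append(data)
            else:
                self._sink.feed(data)
        if self._sink is not None and not self._closed:
            # the sink's close() return value is saved as the "sink result"
            self._result = self._sink.close()
            self._closed = True

    def get_sink_result(self):
        return self._result


# usage: any object with feed() and close() can act as a sink
f = RemoteFile([b"<branch><rev", b"ision id='1'/></branch>"])
f.set_sink(XMLParser())
f.sync()
root = f.get_sink_result()
```

The wrapper only relies on feed() and close(), so an XMLParser and a file-writing sink like DataSink can be swapped without touching the download code.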
</F>