storage branch - remaining issues?

Thu Jan 19 01:19:12 GMT 2006

Aaron Bentley wrote:
> Robert Collins wrote:
> |>Ultimately, we should not be doing: 't = get_transport(unicode(base))',
> |>but I think that can wait for my encoding branch, since part of my work
> |>is to make sure that transports really do require and return urls. Which
> |>they will then internally translate into unicode as needed.
> |
> |
> | We should remove the unicode cast minimally though.
> 
> I think what we need is a UI function that interprets a path or URL and
> produces a URL.
> 

That is a possibility. But having LocalTransport translate paths, while
none of the other transports do, has the same effect.
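
For reference, the kind of UI helper Aaron describes could look roughly
like the sketch below. It is written for Python 3, and the name
location_to_url is made up for illustration; it is not an existing bzrlib
function, and real code would need to decide how to treat things like
Windows drive letters.

```python
import os
import re
from pathlib import Path

# Matches a leading URL scheme such as "http://" or "sftp://".
_SCHEME_RE = re.compile(r'^[A-Za-z][A-Za-z0-9+.-]*://')


def location_to_url(location):
    """Hypothetical helper: turn a user-supplied path or URL into a URL."""
    if _SCHEME_RE.match(location):
        # Already a URL, so hand it back untouched.
        return location
    # Otherwise treat it as a local path and build an escaped file:// URL.
    return Path(os.path.abspath(location)).as_uri()
```

With something like this at the UI layer, every transport would only ever
see URLs, and none of them would need to guess whether it was handed a path.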

> 
> |>Inside 'put_utf8' we are iterating over the input file object, and
> |>encoding it into the output.
> |>However, file objects should only be bytestreams. They should not be a
> |>unicode string.
> 
> Yes, it's messy that some files produce bytestrings and some produce
> unicodestrings, but I don't know where you get "should".

Without using a codecs.getreader() adapter, all files produce
bytestreams (because all files *are* bytestreams). The fact that
StringIO.StringIO() can hold unicode internally is usually thought of as
a misfeature in our codebase.
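
A small illustration of that point, written for Python 3 (so io.BytesIO
stands in for a file on disk, which is my substitution and not anything
from the branch):

```python
import codecs
import io

# A raw file-like object hands back bytes, whatever text it happens to hold.
raw = io.BytesIO(u"caf\xe9\n".encode("utf-8"))
raw_bytes = raw.read()

# Only an explicit codecs adapter turns that bytestream into unicode.
raw.seek(0)
reader = codecs.getreader("utf-8")(raw)
text = reader.read()
```

Here raw_bytes is a bytestring and text is unicode; the decoding only
happens because the reader was asked for.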

> 
> |>I would say that put_utf8 should *only* accept strings not files. In
> |>fact, it should only accept unicode strings. But I would allow ascii
> |>regular strings as well.
> |
> |
> | That works for me
> 
> It does mean that in order to put unicode into a file, you have to have
> the whole thing in memory.  I'd prefer accepting an iterable (which a
> string is), but I can live with this.

But where did you get unicode without decoding it from somewhere else?
And if you decoded it from somewhere else, why not skip the decode and
re-encode steps, and just pass the original file to put()?

put_utf8() is really only meant for the case where I have some stuff in
memory which is already unicode, and I want you to encode it and put it
into this control file.
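
As a rough sketch of what I mean (Python 3; MemoryTransport is a
hypothetical stand-in written for this mail, not the real bzrlib transport
class, and the exact error handling is an assumption):

```python
import io


class MemoryTransport(object):
    """Hypothetical in-memory transport, for illustration only."""

    def __init__(self):
        self.files = {}

    def put(self, relpath, f):
        # put() takes a file-like object that yields bytes and stores them.
        self.files[relpath] = f.read()

    def put_utf8(self, relpath, text):
        # Plain ascii bytestrings are tolerated; anything non-ascii raises.
        if isinstance(text, bytes):
            text = text.decode("ascii")
        # File objects and other iterables are rejected outright.
        if not isinstance(text, str):
            raise TypeError("put_utf8 expects a string, got %r" % (type(text),))
        self.put(relpath, io.BytesIO(text.encode("utf-8")))
```

Anything that is already a file of bytes goes straight to put(), so the
decode and re-encode round trip never happens.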

> 
> |>Which also means that we don't need 'file_iterator' since this is the
> |>only place it is used. We might still need IterableFile for other
> |>things, though.
> 
> I planned on using file_iterator in the TreeTransform branch.
> IterableFile isn't used elsewhere, so we can nuke it now, and revert it
> back when we need it.
> 
> Aaron

I was going to use it in my changeset branch as well (if that ever gets
anywhere). So I don't mind if it stays around unused for a little while.

John
=:->