storage branch - remaining issues?

John Arbash Meinel john at arbash-meinel.com
Wed Jan 18 23:11:32 GMT 2006


Robert Collins wrote:
> On Wed, 2006-01-18 at 11:35 -0600, John Arbash Meinel wrote:
> 
>>Robert Collins wrote:
>>
>>>the following things in the diff look like things we should address
>>>before merging:
>>>
>>>+        elif mode == 'r':
>>>+            # XXX: Do we really want errors='replace'?   Perhaps it
>>>should be
>>>+            # an error, or at least reported, if there's
>>>incorrectly-encoded
>>>+            # data inside a file.
>>>+            # <https://launchpad.net/products/bzr/+bug/3823>
>>>+            return
>>>codecs.getreader('utf-8')(self._transport.get(relpath),
>>>errors='replace')
>>>
>>>There were a couple of warts left, that I have fixed up: the repository
>>>was not taking out any read or write locks on most of its methods, and
>>>LockableFiles did not take a write lock out for put[_utf8].
>>>
>>>Other than that, I think this is good to go. I'm pushing my tweaks now.
>>>
>>>Rob
>>
>>I would like to see us get rid of errors='replace'. It just covers up
>>bugs, which should be fixed elsewhere.
> 
> 
> Yup. I think dying is appropriate - if someone has altered a text utf8
> file to make it smoething else - it should be repaired, not ignored.
> 
> 
>>Also, rather than using controlfile() I really think we should use get()
>>and get_utf8(), to be consistent with put() and put_utf8().
> 
> 
> I agree.
> 

Plus it makes things shorter and less redundant, rather than having
branch.control_files.controlfile()

> 
>>I don't mind switching to the lftp spec so that we have:
>>sftp://user:pass@host:port/~/relative/path
>>and
>>sftp://user:pass@host:port/absolute/path
>>
>>But I would like to keep:
>>sftp://user:pass@host:port//absolute/path
>>
>>I know that I've started using it, and it would be really nice if I
>>didn't have every branch complaining on me all the time.
> 
> 
> I've done that already on my integration branch after martins request
> last week. I think that keeping ..port//abs.. working is unneeded - we
> have not done a release with the double slash syntax.
> 

Well, 0.7rc* supports double slash paths. And bzr.dev has been around
quite a while supporting them.

We could deprecate it, or automatically switch that path to one without
the double slash. But it is unambiguous. You wouldn't use 2 slashes to
indicate you wanted a local path. So I am tempted to leave it in.

(Plus I just got a co-worker up and running with the '//' form, and I
don't want to confuse him since he just started.)

> 
>>It seems that 'find_branch' has been deprecated long enough (especially
>>since it raises a full exception). It should just be removed.
> 
> 
> Ack.
> 
> 
>>needs_read_lock and needs_write_lock should get the same treatment that
>>the 'deprecated' code has (where we set func.__name__, etc)
> 
> 
> Ack.
> 
> 
>>Ultimately, we should not be doing: 't = get_transport(unicode(base))',
>>but I think that can wait for my encoding branch, since part of my work
>>is to make sure that transports really do require and return urls. Which
>>they will then internally translate into unicode as needed.
> 
> 
> We should remove the unicode cast minimally though.
> 

I'm not sure what you mean by 'minimally'. Are you just saying it
shouldn't exist in storage? I don't mind keeping it in, because
otherwise until my encoding branch is merged, we have to go around and
change Branch.open_containing(u'.') all over the place, rather than
open_containing('.'), which is more natural.

I think Branch.open_containing() should just take a regular path. Which
may be a url, or may be local. Which makes me think that only
LocalTransport should be the one doing the translating. Since it is the
only one that might be getting an invalid url. In fact, I would almost
state that LocalTransport *shouldn't* get a url as its base parameter.
Since http:// and sftp:// are required to be urls, we don't want to have
Branch translate if given one of those, but we do want to translate for
'/' paths (maybe/maybe not for file:/// style paths).
Local is the only time we want to translate from a path into a url, so
it should be LocalTransport's responsibility. Is this crazy, or reasonable?

> 
>>We have the same bug in:
>>+    def peek_lock_mode(self):
>>+        """Return lock mode for the Branch: 'r', 'w' or None"""
>>+        raise NotImplementedError(self.is_locked)
>>+
>>
>>This needs to raise NotImplementedError(self.peek_lock_mode).
>>
>>It looks like there is a PEP8 violation in branch.py before 'def
>>update_revisions' an extra space was added.
> 
> 
> Thanks.
> 
> 
>>If LockableFiles is not going to know what files are going to be created
>>there (and it can't since Repository has a different set than Branch),
>>we need a better api for creating stuff than:
>>self.controlfiles._transport.mkdir_multi()
>>
>>Also, BzrBranch is creating the 'revision-store' and 'weaves'
>>directories, but really these should be part of the Repository creation.
>>So probably Repository needs some initialization functions as well.
> 
> 
> I agree but would like to leave that for the branch-formats work.
> 

I'm fine with that. I would request a TODO: so we don't forget.

> 
>>Branch.print_file() which calls Repository.print_file() which calls
>>Tree.print_file() is a bad api.
>>It writes directly to sys.stdout, which doesn't give us any opportunity
>>to write to a new handle from the command line.
>>It ties deeply nested code into assuming that sys.stdout is valid.
> 
> 
> Yup, however that is just moved - we should add a TODO to fix it but not
> a BRANCH.TODO.
> 

I agree, plain TODO in the code is fine.

> 
>>Inside 'put_utf8' we are iterating over the input file object, and
>>encoding it into the output.
>>However, file objects should only be bytestreams. They should not be a
>>unicode string.
>>I would say that put_utf8 should *only* accept strings not files. In
>>fact, it should only accept unicode strings. But I would allow ascii
>>regular strings as well.
> 
> 
> That works for me
> 
> 
> 
>>Which also means that we don't need 'file_iterator' since this is the
>>only place it is used. We might still need IterableFile for other
>>things, though.
>>
>>There are the FIXME in bzrlib/store/__init__.py which say that allowing
>>gz files should be part of TextStore, not a generic part of the store
>>interface. Though right now only TextStore inherits and actually
>>conforms to the Store interface. WeaveStore is a very different beast.
>>(And it would be good if WeaveStore allowed gzipped files anyway).
>>
>>Not all of these are critical, but I would at least like them made into
>>a TODO if they aren't changed.
>>
>>Overall, good job on factoring out Storage. Just some cleanup work left.
> 
> 
> Sweet.
> Rob
> 

Probably a bit of changes still to come if we change controlfile() to
get(), but mostly that is just search & replace.

John
=:->
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 256 bytes
Desc: OpenPGP digital signature
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20060118/c36a4778/attachment.pgp 


More information about the bazaar mailing list