storage branch - remaining issues?

Robert Collins robertc at robertcollins.net
Wed Jan 18 22:56:46 GMT 2006


On Wed, 2006-01-18 at 11:35 -0600, John Arbash Meinel wrote:
> Robert Collins wrote:
> > the following things in the diff look like things we should address
> > before merging:
> > 
> > +        elif mode == 'r':
> > +            # XXX: Do we really want errors='replace'?   Perhaps it
> > should be
> > +            # an error, or at least reported, if there's
> > incorrectly-encoded
> > +            # data inside a file.
> > +            # <https://launchpad.net/products/bzr/+bug/3823>
> > +            return
> > codecs.getreader('utf-8')(self._transport.get(relpath),
> > errors='replace')
> > 
> > There were a couple of warts left, that I have fixed up: the repository
> > was not taking out any read or write locks on most of its methods, and
> > LockableFiles did not take a write lock out for put[_utf8].
> > 
> > Other than that, I think this is good to go. I'm pushing my tweaks now.
> > 
> > Rob
> 
> I would like to see us get rid of errors='replace'. It just covers up
> bugs, which should be fixed elsewhere.

Yup. I think dying is appropriate - if someone has altered a text utf8
file to make it smoething else - it should be repaired, not ignored.

> Also, rather than using controlfile() I really think we should use get()
> and get_utf8(), to be consistent with put() and put_utf8().

I agree.

> I don't mind switching to the lftp spec so that we have:
> sftp://user:pass@host:port/~/relative/path
> and
> sftp://user:pass@host:port/absolute/path
> 
> But I would like to keep:
> sftp://user:pass@host:port//absolute/path
> 
> I know that I've started using it, and it would be really nice if I
> didn't have every branch complaining on me all the time.

I've done that already on my integration branch after martins request
last week. I think that keeping ..port//abs.. working is unneeded - we
have not done a release with the double slash syntax.

> It seems that 'find_branch' has been deprecated long enough (especially
> since it raises a full exception). It should just be removed.

Ack.

> needs_read_lock and needs_write_lock should get the same treatment that
> the 'deprecated' code has (where we set func.__name__, etc)

Ack.

> Ultimately, we should not be doing: 't = get_transport(unicode(base))',
> but I think that can wait for my encoding branch, since part of my work
> is to make sure that transports really do require and return urls. Which
> they will then internally translate into unicode as needed.

We should remove the unicode cast minimally though.

> We have the same bug in:
> +    def peek_lock_mode(self):
> +        """Return lock mode for the Branch: 'r', 'w' or None"""
> +        raise NotImplementedError(self.is_locked)
> +
> 
> This needs to raise NotImplementedError(self.peek_lock_mode).
> 
> It looks like there is a PEP8 violation in branch.py before 'def
> update_revisions' an extra space was added.

Thanks.

> If LockableFiles is not going to know what files are going to be created
> there (and it can't since Repository has a different set than Branch),
> we need a better api for creating stuff than:
> self.controlfiles._transport.mkdir_multi()
> 
> Also, BzrBranch is creating the 'revision-store' and 'weaves'
> directories, but really these should be part of the Repository creation.
> So probably Repository needs some initialization functions as well.

I agree but would like to leave that for the branch-formats work.

> Branch.print_file() which calls Repository.print_file() which calls
> Tree.print_file() is a bad api.
> It writes directly to sys.stdout, which doesn't give us any opportunity
> to write to a new handle from the command line.
> It ties deeply nested code into assuming that sys.stdout is valid.

Yup, however that is just moved - we should add a TODO to fix it but not
a BRANCH.TODO.

> Inside 'put_utf8' we are iterating over the input file object, and
> encoding it into the output.
> However, file objects should only be bytestreams. They should not be a
> unicode string.
> I would say that put_utf8 should *only* accept strings not files. In
> fact, it should only accept unicode strings. But I would allow ascii
> regular strings as well.

That works for me


> Which also means that we don't need 'file_iterator' since this is the
> only place it is used. We might still need IterableFile for other
> things, though.
> 
> There are the FIXME in bzrlib/store/__init__.py which say that allowing
> gz files should be part of TextStore, not a generic part of the store
> interface. Though right now only TextStore inherits and actually
> conforms to the Store interface. WeaveStore is a very different beast.
> (And it would be good if WeaveStore allowed gzipped files anyway).
> 
> Not all of these are critical, but I would at least like them made into
> a TODO if they aren't changed.
> 
> Overall, good job on factoring out Storage. Just some cleanup work left.

Sweet.
Rob

-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: This is a digitally signed message part
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20060119/6cd4227a/attachment.pgp 


More information about the bazaar mailing list