New format checklist

Mon Jan 2 23:28:57 GMT 2006

On Mon, 2006-01-02 at 10:15 -0600, John Arbash Meinel wrote:
> I'm trying to put together all of the things which require a new branch
> format. So please add on if I have missed anything. Remembering the pain
> of the last format upgrade, it would be nice to have a single upgrade,
> or perhaps more support for older formats.

http://bazaar.canonical.com/SuperMirrorFormats

Martin has agreed to ensure that we meet that guideline - which is to
say full support for older formats for 2 months *after* obsolescence,
and that's a minimum, not an actual :). So I think it's safe to aim for
'upgrades are easy and small, and we support lots of formats'.

> 1) knits
>    These obviously require a new format. I'm not sure what the timeline
>    for them being completed is.

Storage needs to land, then the knit code is basically ready AFAICT.

> 2) bound branches
>    A new format is not strictly required, but it prevents old clients
>    from committing to a bound branch, and not respecting its bound state

Should be a format bump.

> 3) Encoding revision & file ids
>    The final decision seemed to be that we should allow most unicode
>    characters in revision and file ids. Which means we need a mapping
>    between ids and valid filesystem names.
>    I'm not sure how to map unicode into filenames. I know there is
>    urlencoding which should handle a lot of the bad characters. But
>    does it map from unicode? A lot of filesystems have their own
>    encoding (on windows it is UTF-16/mbcs, a lot of other systems use
>    utf-8). Do we want to do something like:
>       path = urlencode(revision_id.encode('utf-8'))

To get from unicode to URL formats, a URL producer SHOULD do:
urlencode(unicode.encode('utf8'))

And URL consumers should only decode for UI purposes, as URLs are a
round trip mechanism. 

In this case, we're trying to ensure that URLs for our files are
predictable by trimming the input character set to ASCII. So yes,
following your line of code there is both reasonable and reliable.
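
For illustration, here is a minimal sketch of that line in present-day
Python. It assumes urllib.parse.quote stands in for the urlencode() call
above; id_to_filename and filename_to_id are hypothetical names, not
existing bzr functions.

```python
from urllib.parse import quote, unquote


def id_to_filename(revision_id: str) -> str:
    """Map a unicode id to an ASCII-only name (hypothetical helper)."""
    # Encode to UTF-8 first, then percent-escape the bytes.
    # safe='' also escapes '/', so an id can never add a path separator.
    return quote(revision_id.encode('utf-8'), safe='')


def filename_to_id(name: str) -> str:
    """Reverse the mapping; only needed when showing the id to a user."""
    return unquote(name, encoding='utf-8', errors='strict')
```

Because the escaped name is pure ASCII, it does not depend on whether
the filesystem stores names as UTF-16, mbcs or UTF-8.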

> 4) Refactoring of metadata. We would like to split things under .bzr
>    into:
> 	.bzr/checkout/
> 	.bzr/branch/
> 	.bzr/repository/
>    Which will help ease the transition when we start separating where
>    these objects reside.
> 
> Anything else?

offhand - tags, working tree format introduction (subset of 4?)

I'd like us to get into the habit of small, frequent format changes
rather than big painful experiences. I figure the first one will be
traumatic *code wise* as we put in place the needed facilities to select
formats for new branches, convert remote branches etc. And for that
reason I think the first format change should be a No-op change - that
is, a format number bump with no actual changes to the disk format - to
let us get the infrastructure right.
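
To make the no-op idea concrete, here is a rough sketch under stated
assumptions: a branch is modelled as a dict of file name to contents,
and the marker strings, the format numbers 5 and 6, and every function
name are invented for illustration. None of this is bzr's actual code.

```python
# Hypothetical format markers; the real marker text is not assumed here.
FORMAT_MARKERS = {
    5: "Example branch format 5\n",
    6: "Example branch format 6\n",
}
CURRENT_FORMAT = 6


def detect_format(branch_files: dict) -> int:
    """Return the format number recorded in the branch's marker file."""
    marker = branch_files["branch-format"]
    for number, text in FORMAT_MARKERS.items():
        if text == marker:
            return number
    raise ValueError("unknown branch format: %r" % marker)


def convert_5_to_6(branch_files: dict) -> dict:
    """No-op conversion: only the marker changes, the data is untouched."""
    upgraded = dict(branch_files)  # copy, so the old branch stays usable
    upgraded["branch-format"] = FORMAT_MARKERS[6]
    return upgraded


# One converter per step, so later format bumps just add an entry.
CONVERTERS = {5: convert_5_to_6}


def upgrade(branch_files: dict) -> dict:
    """Apply converters one step at a time until the format is current."""
    fmt = detect_format(branch_files)
    while fmt != CURRENT_FORMAT:
        branch_files = CONVERTERS[fmt](branch_files)
        fmt = detect_format(branch_files)
    return branch_files
```

The point of the sketch is that detection, selection and step-wise
conversion can all be exercised before any real disk format change
depends on them.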

Rob

-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.