[merge] cache encoding

Mon Aug 14 14:55:17 BST 2006

Lele Gaifax wrote:
> John Arbash Meinel wrote:
>> The problem, though is code like 'Tailor', et al. I know we are safe for
>> baz=>bzr, because Arch never supported anything other than ASCII (at
>> least not officially).
> 
> 
> Hi John,
> 
> care to expand a little on that ref to Tailor? AFAICT it uses almost the
> same algorithm to build a bzr revid (and thus suffers from the same
> problem with potentially non-ASCII email addresses).
> 
> thank you,
> bye, lele.

I don't know the conversion algorithms of other systems, and I don't know
whether Tailor uses deterministic ids. It sounds like it does not, and
just lets the target system create new ids as it goes. That makes me
wonder how it manages to restart from a given point (I assume it keeps a
pointer somewhere).

Other converters (bzr-svn and baz2bzr) use deterministic file-ids and
revision-ids, based on the source revisions. bzr-svn uses the uuid of
the repository + revnum + path to the branch; baz2bzr uses an id like
Arch-1:gnu-arch@foobar--2005%category--branch--0.1, which is just the
revision id from Arch.
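
To make "deterministic" concrete, here is a rough sketch in Python. The
function names and the exact id layouts are made up for illustration; they
are not what bzr-svn or baz2bzr actually emit. The point is only that the
id is computed purely from source-side data, so converting the same source
revision twice gives the same id.

```python
def svn_style_revision_id(repo_uuid, revnum, branch_path):
    """Build an id from repository uuid + revnum + branch path.

    Hypothetical layout, for illustration only.
    """
    # No timestamps or random parts: the same inputs always give the same id.
    return "svn-example:%s:%d:%s" % (repo_uuid, revnum, branch_path.strip("/"))


def arch_style_revision_id(arch_revision):
    """Reuse the source revision name as the id (the prefix is an assumption)."""
    return "Arch-1:" + arch_revision


first = svn_style_revision_id("1234-abcd", 42, "/trunk/")
second = svn_style_revision_id("1234-abcd", 42, "/trunk/")
arch_id = arch_style_revision_id("gnu-arch@foobar--2005%category--branch--0.1")
```

Note that nothing here forces the result to be ASCII: if the branch path
contains non-ASCII characters, so does the id.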

Anyway, I don't know the conversion algorithms for all sources, or what
each source allows. I assume SVN supports Unicode filenames, which means
bzr-svn can create Unicode revision ids. I know Arch never supported
non-ASCII, so we don't have any problems with baz2bzr branches.

If Tailor is generating new revision ids, regardless of the source, then
they should all be mostly ASCII-safe (modulo the email address stuff).
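
For what it's worth, checking whether a given id is ASCII-safe is cheap.
A minimal sketch, assuming ids are handled as Python text strings; the
helper name and the sample ids are hypothetical, loosely shaped like an
email-plus-timestamp id:

```python
def is_ascii_safe(revision_id):
    """Return True if the id can be encoded as plain ASCII."""
    try:
        revision_id.encode("ascii")
    except UnicodeEncodeError:
        return False
    return True


# Hypothetical ids: one with a plain address, one with a non-ASCII address.
plain_id = "joe@example.com-20060814135517-abc123"
accented_id = "j\u00f6rg@example.com-20060814135517-abc123"

plain_ok = is_ascii_safe(plain_id)
accented_ok = is_ascii_safe(accented_id)
```

Here plain_ok comes out True and accented_ok False, which is exactly the
email address case mentioned above.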

John
=:->
