Entry IDs for Atom feeds generated from Bazaar branches

John Arbash Meinel john at arbash-meinel.com
Mon Apr 2 14:13:31 BST 2007


James Henstridge wrote:
> On 30/03/07, James Henstridge <james at jamesh.id.au> wrote:
>> > 2) I would also mention that they should be url escaped. (@ => %40,
>> > etc). Just so that people don't forget about it.
>>
>> Yep.  The Atom spec recommends only percent encoding characters that
>> need to be encoded, so I'll take a look at what characters these are
>> for fragment identifiers (I am not sure that @ needs to be, for
>> instance).
> 
> Looking at RFC 3986, we have:
> 
> fragment      = *( pchar / "/" / "?" )
> pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"
> unreserved    = ALPHA / DIGIT / "-" / "." / "_" / "~"
> sub-delims    = "!" / "$" / "&" / "'" / "(" / ")"
>               / "*" / "+" / "," / ";" / "="
> pct-encoded   = "%" HEXDIG HEXDIG
> 
> So "@" is allowed in a fragment identifier without percent-escaping
> (and hence should not be escaped in an Atom entry ID).
> 
> James.
> 

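RFC 3986 also does not seem to define the character encoding of the data
being escaped... I assume UTF-8 is sufficient. (Since pct-encoded allows
only 2 HEXDIG, each escape stands for a single octet, so the text
obviously has to be an 8-bit string before it is escaped.)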

The only characters that I can definitely see needing to be escaped are
'+' and ',', both of which could occur in a revision id.
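
To make the escaping question concrete, here is a minimal sketch in Python
of percent-encoding a revision id for use as a fragment. It assumes the id
is encoded as UTF-8 first, and it builds its safe set from the RFC 3986
grammar quoted above. The function name, the constant, and the sample ids
are made up for illustration and are not part of bzr. Going by that
grammar, '+' and ',' are sub-delims and are passed through by default; the
optional argument shows how to escape them anyway.

```python
from urllib.parse import quote

# Characters the quoted grammar allows unescaped in a fragment, beyond
# ALPHA / DIGIT / "-" / "." / "_" / "~" (which quote() leaves alone on
# Python 3.7+): the sub-delims plus ":", "@", "/" and "?".
FRAGMENT_SAFE = "!$&'()*+,;=" + ":@" + "/?"


def escape_fragment(revision_id, also_escape=""):
    """Percent-encode a revision id for use as a URI fragment.

    The id is encoded to UTF-8 and every octet outside the fragment
    grammar becomes "%XX". Characters listed in also_escape are escaped
    even though the grammar would allow them (for example "+,").
    """
    safe = "".join(c for c in FRAGMENT_SAFE if c not in also_escape)
    return quote(revision_id, safe=safe, encoding="utf-8")


if __name__ == "__main__":
    # Hypothetical revision ids, for illustration only.
    print(escape_fragment("someone@example.com-20070402131331-abcdef"))
    print(escape_fragment("rev+with,delims", also_escape="+,"))
    print(escape_fragment("caf\u00e9 branch"))
```

Keeping the safe set as an explicit constant means the "which characters
do we escape" decision lives in one place, so the same function works
whether or not '+' and ',' end up on the list.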

Another possibility is to try to use tag URIs (RFC 4151), and fall back
in a predictable way. (Maybe as a recommendation for Arch, and for bzr-svn?)

John
=:->



