Entry IDs for Atom feeds generated from Bazaar branches
John Arbash Meinel
john at arbash-meinel.com
Mon Apr 2 14:13:31 BST 2007
James Henstridge wrote:
> On 30/03/07, James Henstridge <james at jamesh.id.au> wrote:
>> > 2) I would also mention that they should be url escaped. (@ => %40,
>> > etc). Just so that people don't forget about it.
>>
>> Yep. The Atom spec recommends only percent encoding characters that
>> need to be encoded, so I'll take a look at what characters these are
>> for fragment identifiers (I am not sure that @ needs to be, for
>> instance).
>
> Looking at RFC 3986, we have:
>
> fragment = *( pchar / "/" / "?" )
> pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
> unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
> sub-delims = "!" / "$" / "&" / "'" / "(" / ")"
> / "*" / "+" / "," / ";" / "="
> pct-encoded = "%" HEXDIG HEXDIG
>
> So "@" is allowed in a fragment identifier without percent-escaping
> (and hence should not be escaped in an Atom entry ID).
>
> James.
>
The RFC also does not seem to define the character encoding. I assume UTF-8
is sufficient. (Since pct-encoded allows only 2 HEXDIG, each escape is a
single octet, so you obviously have to start from an 8-bit string.)
The only characters I definitely see that need to be escaped are '+' and
',', which could occur in a revision id.
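A rough sketch of that escaping, assuming Python 3 and its standard
urllib.parse.quote. The name fragment_escape, the also_escape parameter and
the sample ids are made up for illustration and are not part of bzr. Note
that the grammar quoted above lists '+' and ',' as sub-delims, so escaping
them is a conservative choice rather than something RFC 3986 requires; pass
also_escape="" to leave them alone.

```python
from urllib.parse import quote

# Characters the RFC 3986 grammar allows unescaped in a fragment, on top
# of the unreserved set (letters, digits, "-", ".", "_", "~"), which
# quote() never escapes anyway.
FRAGMENT_SAFE = "!$&'()*+,;=:@/?"


def fragment_escape(text, also_escape="+,"):
    """Percent-encode text for use as a URI fragment (hypothetical helper).

    Anything outside the fragment grammar is encoded as UTF-8 octets.
    Characters in also_escape are escaped even though the grammar
    would permit them.
    """
    safe = "".join(c for c in FRAGMENT_SAFE if c not in also_escape)
    return quote(text, safe=safe, encoding="utf-8")


if __name__ == "__main__":
    # Made-up id; '@' stays as-is, '+' and ',' are escaped by default.
    print(fragment_escape("a@b.com-2007+x,y"))
    # Strictly per the grammar, nothing here needs escaping.
    print(fragment_escape("a@b.com-2007+x,y", also_escape=""))
```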
Another possibility is to try to use tag URIs, and fall back in a
predictable way. (Maybe a recommendation for Arch, and for bzr-svn?)
John
=:->