what if tracker url changes?

Stephen J. Turnbull stephen at xemacs.org
Thu Nov 18 05:16:40 GMT 2010


Martin Pool writes:

 > There is a ton of theory about URLs, but basically they are a string
 > that specifies where a given resource can be found.  One could argue
 > we should instead store a URN

In fact, URLs (Uniform Resource Locators) and URNs (Uniform Resource
Names) are syntactically both instances of URIs (Uniform Resource
Identifiers).  So Bazaar can't tell the difference just by looking at
them.
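
To illustrate the syntactic point, here is a minimal Python sketch
using the standard library's urllib.parse.  The two URIs are made up
for illustration.  The parser splits each one into the same kinds of
parts and has nothing to say about whether the string is meant as a
locator or as a name.

```python
from urllib.parse import urlsplit

# Hypothetical example URIs, chosen only for illustration.
candidates = [
    "http://bugs.example.com/issue/1234",  # usually thought of as a URL
    "urn:isbn:0451450523",                 # usually thought of as a URN
]

def describe(uri):
    """Return the purely syntactic parts of a URI as a tuple."""
    parts = urlsplit(uri)
    return (parts.scheme, parts.netloc, parts.path)

for uri in candidates:
    # Both strings parse as URIs; no field says "locator" or "name".
    print(describe(uri))
```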

Semantically, as you note, the basic differences are that (1) there is
a promise that a URN will not change as long as the resource it refers
to exists (whatever that might mean for growing collections and other
dynamic content), and (2) a URL refers to a specific physical location
for a resource (whatever that might mean for a distributed resource
such as netnews), whereas a URN does not.  There is no reason in
principle why a URI can't be used as both a URL and a URN for a
particular resource, until the physical location changes.  In many
cases, even after the physical location changes, the original URL==URN
continues to be used as the URN (by definition!), and the server
automatically translates to the new URL identifying the new physical
location.

In other words, whether some URI is a URN, a URL, or both is
determined by the authority.  A software client like Bazaar cannot
reliably tell the difference, syntactically or semantically.  It must
ask the authority, or be told by the user.

It is often the case that URNs do not use the standard
"scheme://authority/path?query#fragment" syntax, and of course this
may help a human recognize that "this URI is a URN".  But omitting
the authority is also typical usage for distributed-by-design
resources like netnews (and a "news:" URI is a URL!).
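
A rough sketch of why a client can't lean on that visual cue.  The
function looks_like_name is a hypothetical heuristic, not anything
Bazaar does, and the URIs are invented examples.  It flags any URI
without a "//authority" part as a name, and so it gets netnews wrong.

```python
from urllib.parse import urlsplit

def looks_like_name(uri):
    """Naive heuristic: treat a URI with no "//authority" part as a URN."""
    return urlsplit(uri).netloc == ""

# A name, and flagged as one.
print(looks_like_name("urn:isbn:0451450523"))
# A locator, yet flagged as a name as well.
print(looks_like_name("news:comp.lang.python"))
# A URI with an authority part is not flagged.
print(looks_like_name("http://example.com/bugs/1"))
```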

Because of

 > It's a classic case of data that changes over time, but not
 > necessarily on the same timeline as the versioned data.  If you look
 > at an old tree, and the bugtracker has moved, you probably still want
 > the new url rules.

it's not clear to me how URNs can be transparently implemented here,
except by fixing a URN->URL mapping at first, then ensuring that the
URN remains valid by having the server translate the original URL to
the new URL forever after, whenever the physical storage moves, a
single-host server is reimplemented as a distributed server, or
whatever.  IMO, you may as well use the initial URL as the URN in this
case, but some users may prefer something whose syntax suggests URN
more strongly, and there may be use cases where starting from URN==URL
is ill-advised.
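
A minimal sketch of the "initial URL as URN" approach, assuming the
translation is a simple prefix rewrite.  The table REWRITES, the
function resolve, and both host names are hypothetical.  The table is
kept outside the versioned history, so an old tree still resolves to
wherever the tracker lives now.

```python
# Hypothetical rewrite table: original URL prefix -> current URL prefix.
# It is not versioned with the tree, so it can change when the tracker moves.
REWRITES = {
    "http://bugs.example.com/": "https://tracker.example.org/bugs/",
}

def resolve(stored_uri, rewrites=REWRITES, max_hops=10):
    """Translate a stored URI (used as a permanent name) to its current URL."""
    current = stored_uri
    for _ in range(max_hops):
        for old, new in rewrites.items():
            if current.startswith(old):
                # Follow one move, then look for further moves.
                current = new + current[len(old):]
                break
        else:
            # No rule applies, so this is the current location.
            return current
    raise ValueError("too many rewrites for %r" % stored_uri)

print(resolve("http://bugs.example.com/1234"))
```

Each later move adds one more entry to the table, and the name stored
in old revisions never has to change.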

Using URNs has the disadvantage that once you've decided on a semantic
mapping for your URN syntax, you're stuck with it forever.  In many
cases I've seen, people experiment with various URLs, converge on a
"canonical URL", and *only then* define that canonical URL as the URN
for the resource.  Dealing with that scenario is hard, but necessary I
think.


