Storage internals: UUID
Daniel Carrera
dcarrera at hush.com
Mon Jun 4 22:18:49 UTC 2012
On Monday, June 04, 2012 at 10:01 PM, Max Bowsher <_ at maxb.eu> wrote:
> Sorry, but it is incorrect. Bazaar uses IDs, and those IDs are
> constructed such that there's good justification to believe
> them to be universally unique, but they're not UUIDs in the sense of the
> Internet Draft.
Ok. Thanks. Can you give me details? Links to docs are welcome.
> I can't speak to the initial design decision, which was many years
> before my time, but a couple of advantages that come to mind:
>
> * the ID is not inextricably tied to the binary format of the revision
Why is that an advantage?
> * and plugins which integrate with foreign VCSes have used this
> to map foreign revisions into Bazaar in various interesting ways
>
> * the IDs are more recognizable as such, and in the common case of
> non-foreign revisions, provide some minimal information (author,
> date) about the commit by mere inspection
...
> There's no hashing in Bazaar's IDs, so this isn't particularly
> relevant.
How can bzr guarantee that a commit has not been tampered with if the ID doesn't contain a hash? This discussion would be easier if I had documentation to read, but I assume you have some sort of index file that maps IDs to hashes and to the locations where the revision data can be found. If so, it seems trivial to change the revision data and update the hash in the index without altering the ID.
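To make the concern concrete, here is a toy Python sketch. Everything in it is an assumption for illustration: the dictionary `index`, the helpers `store`, `verify` and `tamper`, and the sample ID are hypothetical and do not describe how bzr actually stores revisions.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def store(index: dict, rev_id: str, data: bytes) -> None:
    # Hypothetical index: revision ID -> (hash of data, data).
    index[rev_id] = (sha256_hex(data), data)


def verify(index: dict, rev_id: str) -> bool:
    # Checks only that the stored hash matches the stored data.
    stored_hash, data = index[rev_id]
    return sha256_hex(data) == stored_hash


def tamper(index: dict, rev_id: str, new_data: bytes) -> None:
    # An attacker with write access replaces data and hash together.
    index[rev_id] = (sha256_hex(new_data), new_data)


index = {}
rev_id = "someone@example.com-20120604221849-abcdefgh"  # made-up ID
store(index, rev_id, b"original revision data")
tamper(index, rev_id, b"malicious revision data")

# The ID is unchanged and the check still passes.
print(verify(index, rev_id))
```

Because the ID does not depend on the data, nothing a client already knows (the ID) lets it notice the change in this model.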
I think Monotone had a good idea in using revision IDs that are a cryptographic hash of the revision contents, metadata and history. I think Git and Mercurial did well in copying that idea. I am under the impression that bzr is relatively willing to change its storage format if there seems to be a benefit. Do you think there is any chance that a future version of bzr will extend the ID to also contain a hash? For example:
<author>-<date>-<branch>-<secure-hash>
So the new ID could carry whatever information it has today, and it would just be made longer by adding a secure hash at the end. The ID would still provide useful information by inspection, and it would also allow the same security guarantees as Monotone, Mercurial and Git.
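A rough Python sketch of what I have in mind follows. The function names `make_revision_id` and `verify_revision`, the choice of SHA-256, and the way the fields are fed to the hash are all my own assumptions, not anything that exists in bzr.

```python
import hashlib


def _feed(hasher, data: bytes) -> None:
    # Length-prefix each field so field boundaries are unambiguous.
    hasher.update(len(data).to_bytes(8, "big"))
    hasher.update(data)


def make_revision_id(author, date, branch, content, parent_ids=()):
    # The hash covers metadata, parent IDs (history) and content.
    hasher = hashlib.sha256()
    for field in (author, date, branch):
        _feed(hasher, field.encode("utf-8"))
    _feed(hasher, str(len(parent_ids)).encode("ascii"))
    for parent in parent_ids:
        _feed(hasher, parent.encode("utf-8"))
    _feed(hasher, content)
    return f"{author}-{date}-{branch}-{hasher.hexdigest()}"


def verify_revision(rev_id, author, date, branch, content, parent_ids=()):
    # Recompute the ID from the data and compare it with the claimed ID.
    return rev_id == make_revision_id(author, date, branch, content, parent_ids)


root = make_revision_id("daniel", "20120604221849", "trunk", b"first commit")
child = make_revision_id(
    "daniel", "20120604222000", "trunk", b"second commit", [root]
)

print(verify_revision(child, "daniel", "20120604222000", "trunk",
                      b"second commit", [root]))   # True
print(verify_revision(child, "daniel", "20120604222000", "trunk",
                      b"tampered commit", [root]))  # False
```

Since each ID includes the parent IDs in its hash, changing an old revision would change its ID and therefore the ID of every descendant.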
Or am I barking up the wrong tree?
Cheers,
Daniel.