[RFC] versioned properties: explicit namespaces: to be or not to be?
Alexander Belchenko
bialix at ukr.net
Sun Apr 13 10:29:40 BST 2008
When I was writing the specification for versioned properties (VP) in August-September 2007,
Aaron Bentley asked me to add explicit namespace support, and I did. Mostly because
at that time I planned to use a different implementation of VP storage.
Currently I have working code that is based on the ConfigObj library for VP storage. Because
ConfigObj does not support explicit namespaces, I did not implement any namespace support
for versioned properties. I would prefer not to rewrite my code again, but I just realized
that ConfigObj actually has one feature that could be used to support namespaces in VP:
subsections.
Currently my code uses a .bzrprop file in the working/revision tree as VP storage. This file
should be versioned, of course, and in fact it is just a plain config-like text file.
The current VP storage format does not have any namespaces, e.g.:
[*.txt]
eol = native
[*.bin]
bin = yes
[bar-file-id]
eol = CRLF
encoding = cp1251
In this case the .bzrprop file is still human-readable.
By analogy with svn properties, namespaced property names might look like this:
svn:ignore
svn:mime-type
bzr:eol
bzr:encoding
In this form it is still human-readable, or at least looks familiar. The content of the
.bzrprop file could then look like this:
[*.txt]
bzr:eol = native
[*.bin]
bzr:bin = yes
[bar-file-id]
bzr:eol = CRLF
bzr:encoding = cp1251
But such a format makes serializing/deserializing versioned properties harder, because the
namespace prefix has to be split off and re-attached by my own code.
This is the main reason why I currently don't have any explicit namespaces.
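To make that extra step concrete, here is a minimal sketch of what a prefixed format would
need on every read and write: splitting a stored key such as bzr:eol into namespace and
name, and joining them back. The helper names split_qualified and join_qualified are
hypothetical, and treating a key without a colon as having an empty namespace is an
assumption for illustration, not part of the VP specification.

```python
def split_qualified(key):
    """Split 'bzr:eol' into ('bzr', 'eol').

    Assumption: a key without a colon belongs to the empty namespace.
    """
    namespace, sep, name = key.partition(":")
    if not sep:
        return "", key
    return namespace, name


def join_qualified(namespace, name):
    """Inverse of split_qualified: ('bzr', 'eol') becomes 'bzr:eol'."""
    if namespace:
        return "%s:%s" % (namespace, name)
    return name
```

Only the first colon is treated as the separator, so a property name that itself contains
a colon would survive a round trip.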
So there is only one variant I can use: the subsection support in ConfigObj.
In this case I don't have to write any complicated serialization and can rely on
ConfigObj. But it has a less human-readable representation, e.g.:
[*.txt]
[[bzr]]
eol = native
[*.bin]
[[bzr]]
bin = yes
[bar-file-id]
[[bzr]]
eol = CRLF
encoding = cp1251
Looking at this file, it is not very obvious that *.txt actually has the property bzr:eol = native.
The VP storage (.bzrprop file) does not necessarily have to be human-readable, but because it
resides in the working tree, the user should at least be able to understand what is inside.
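The subsection layout maps directly onto nested dictionaries, so the familiar bzr:eol
spelling can still be produced for display. The following is only a sketch on plain dicts
standing in for the parsed .bzrprop data; flatten_properties is a hypothetical helper and
is not part of ConfigObj or of my current code.

```python
def flatten_properties(section):
    """Turn {'bzr': {'eol': 'native'}} into {'bzr:eol': 'native'}."""
    flat = {}
    for namespace, props in section.items():
        for name, value in props.items():
            flat["%s:%s" % (namespace, name)] = value
    return flat


# Plain dicts mirroring the subsection example above (assumed parse result).
nested = {
    "*.txt": {"bzr": {"eol": "native"}},
    "*.bin": {"bzr": {"bin": "yes"}},
    "bar-file-id": {"bzr": {"eol": "CRLF", "encoding": "cp1251"}},
}

for pattern in sorted(nested):
    print(pattern, flatten_properties(nested[pattern]))
```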
I admit that namespaces are a good idea, but they have some cons. And to be honest, I would
like not to implement explicit support for them. But I remember my discussion with Aaron on this
topic, so I suppose other people may think differently.
I just want to note that explicit namespaces will impact performance, because they will
require an additional dictionary lookup, i.e. as pseudocode:
def get_property(self, file_id, propname):
    section = self.vpdict.get(file_id)
    if section:
        return section.get(propname)
    return None
vs.
def get_property(self, file_id, propnamespace, propname):
    section = self.vpdict.get(file_id)
    if section:
        namespace = section.get(propnamespace)
        if namespace:
            return namespace.get(propname)
    return None
If I need to call get_property for every file in a working tree with 50000 files, the extra
lookup is repeated 50000 times, which might make a noticeable difference in execution time.
That's why I think it is better to avoid explicit namespaces.
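How large the difference really is can be measured rather than guessed. Below is a rough
sketch using the standard timeit module on plain dictionaries; the data is synthetic
(50000 made-up file ids, each with a single eol property), the function names are
hypothetical, and the real VP storage may behave differently.

```python
import timeit

# Synthetic stand-ins for self.vpdict: 50000 file ids, one property each.
FILE_IDS = ["file-%d" % i for i in range(50000)]
flat = {fid: {"eol": "native"} for fid in FILE_IDS}
nested = {fid: {"bzr": {"eol": "native"}} for fid in FILE_IDS}


def get_flat(vpdict, file_id, propname):
    section = vpdict.get(file_id)
    if section:
        return section.get(propname)
    return None


def get_nested(vpdict, file_id, propnamespace, propname):
    section = vpdict.get(file_id)
    if section:
        namespace = section.get(propnamespace)
        if namespace:
            return namespace.get(propname)
    return None


def scan_flat():
    return [get_flat(flat, fid, "eol") for fid in FILE_IDS]


def scan_nested():
    return [get_nested(nested, fid, "bzr", "eol") for fid in FILE_IDS]


if __name__ == "__main__":
    # Total seconds for 10 full scans of all 50000 files.
    print("flat:  ", timeit.timeit(scan_flat, number=10))
    print("nested:", timeit.timeit(scan_nested, number=10))
```

The numbers will depend on the machine and Python version, so they are only useful as a
relative comparison between the two variants.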
Thoughts?