[RFC] versioned properties: explicit namespaces: to be or not to be?

Alexander Belchenko bialix at ukr.net
Sun Apr 13 10:29:40 BST 2008


When I was writing the specification for versioned properties (VP) in August-September 2007,
Aaron Bentley asked me to add explicit namespace support, and I did so then, mostly because
at that time I planned to use a different implementation of VP storage.

Currently I have working code that is based on the ConfigObj library for VP storage. Because
ConfigObj does not support explicit namespaces, I did not implement any namespace support
for versioned properties. I would prefer not to rewrite my code again, but I just realized
that ConfigObj actually has one feature that could be used to support namespaces in VP:
subsections.

Currently my code uses a .bzrprop file in the working/revision tree as VP storage. This file
should of course be versioned, and in fact it is just a plain config-like text file.
The current VP storage format does not have any namespaces, e.g.:

     [*.txt]
     eol = native

     [*.bin]
     bin = yes

     [bar-file-id]
     eol = CRLF
     encoding = cp1251

In this case the .bzrprop file is still human-readable.

By analogy with svn properties, the namespaces might look like this:

	svn:ignore
	svn:mime-type

	bzr:eol
	bzr:encoding

In this form it's still human-readable, or at least looks familiar. In this case the content of the
.bzrprop file could look like this:

     [*.txt]
     bzr:eol = native

     [*.bin]
     bzr:bin = yes

     [bar-file-id]
     bzr:eol = CRLF
     bzr:encoding = cp1251

But such a format is hard to serialize and deserialize.
This is the main reason why I currently don't have any explicit namespaces.

So there is only one variant I can use: subsection support in ConfigObj.
In this case I don't have to write any complicated serialization and can rely on
ConfigObj. But the representation is less human-readable, e.g.:

     [*.txt]
     [[bzr]]
     eol = native

     [*.bin]
     [[bzr]]
     bin = yes

     [bar-file-id]
     [[bzr]]
     eol = CRLF
     encoding = cp1251

Looking at this file, it's not very obvious that *.txt actually has the property bzr:eol = native.
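
One way to soften the readability concern would be to show the nested data to the user in
the familiar namespace:name form. The following is only a minimal sketch: it uses plain
nested dicts as a stand-in for the parsed subsection layout, and the function name
flatten_properties and the sample data are hypothetical, not part of the existing code or
of ConfigObj.

```python
def flatten_properties(vpdict):
    """Return {section: {"namespace:name": value}} from nested sections.

    vpdict is assumed to have the shape {section: {namespace: {name: value}}},
    i.e. what the subsection-based .bzrprop layout would parse into.
    """
    flat = {}
    for section, namespaces in vpdict.items():
        props = {}
        for namespace, values in namespaces.items():
            for name, value in values.items():
                # Join the two levels into a single qualified key.
                props["%s:%s" % (namespace, name)] = value
        flat[section] = props
    return flat


# Hypothetical data mirroring the subsection example.
nested = {
    "*.txt": {"bzr": {"eol": "native"}},
    "*.bin": {"bzr": {"bin": "yes"}},
    "bar-file-id": {"bzr": {"eol": "CRLF", "encoding": "cp1251"}},
}

for section, props in flatten_properties(nested).items():
    for key, value in sorted(props.items()):
        print("%s  %s = %s" % (section, key, value))
```

With such a helper the on-disk format could stay as ConfigObj subsections, while any
command that lists properties would still display bzr:eol style names.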

The VP storage (.bzrprop file) does not necessarily have to be human-readable, but because it
resides in the working tree, the user should at least be able to understand what's inside.

I admit that namespaces are a good idea, but they have some cons. Frankly, I'd prefer
not to implement explicit support for them. But I remember my discussion with Aaron on this
topic, so I suppose other people may think differently.

I just want to note that explicit namespaces will impact performance, because they will
require an additional dictionary lookup, i.e. in pseudocode:

	def get_property(self, file_id, propname):
		section = self.vpdict.get(file_id)
		if section:
			return section.get(propname)
		return None

vs.

	def get_property(self, file_id, propnamespace, propname):
		section = self.vpdict.get(file_id)
		if section:
			namespace = section.get(propnamespace)
			if namespace:
				return namespace.get(propname)
		return None


Obviously, if I need to call get_property for every file in a working tree with 50000 files, it might
make a big difference in execution time. That's why I think it's better to avoid explicit namespaces.
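
The size of this difference can be measured rather than guessed. The following is a minimal
sketch, assuming the properties are already parsed into plain dicts. The file ids, the sample
data, and the helper names (get_flat, get_namespaced, scan_flat, scan_namespaced) are made up
for illustration; the two lookup functions are standalone adaptations of the pseudocode
methods. No timings are quoted, since they depend on the machine and the Python version.

```python
import timeit

N_FILES = 50000  # tree size used in the discussion

# Hypothetical in-memory stand-ins for the parsed .bzrprop data.
flat = {"file-%d" % i: {"eol": "native"} for i in range(N_FILES)}
nested = {"file-%d" % i: {"bzr": {"eol": "native"}} for i in range(N_FILES)}
file_ids = list(flat)


def get_flat(vpdict, file_id, propname):
    """Lookup without namespaces: section, then property."""
    section = vpdict.get(file_id)
    if section:
        return section.get(propname)
    return None


def get_namespaced(vpdict, file_id, propnamespace, propname):
    """Lookup with namespaces: section, then namespace, then property."""
    section = vpdict.get(file_id)
    if section:
        namespace = section.get(propnamespace)
        if namespace:
            return namespace.get(propname)
    return None


def scan_flat():
    return [get_flat(flat, fid, "eol") for fid in file_ids]


def scan_namespaced():
    return [get_namespaced(nested, fid, "bzr", "eol") for fid in file_ids]


if __name__ == "__main__":
    for scan in (scan_flat, scan_namespaced):
        # Best of three runs, each run scanning the whole tree 10 times.
        best = min(timeit.repeat(scan, number=10, repeat=3))
        print("%-16s %.4f s per 10 scans" % (scan.__name__, best))
```

This only times the in-memory lookups; parsing the .bzrprop file is not included.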

Thoughts?


