newformat format change

Martin Pool martinpool at gmail.com
Tue Oct 4 04:12:35 BST 2005


On 01/10/05, John A Meinel <john at arbash-meinel.com> wrote:

> Well, back in the beginning of the project, it was determined to use XML
> for data storage. It was a really big benefit for the inventory, perhaps
> not quite as much for the Revision information.

I'm not sure it was such a good idea.  Reading and writing XML turns
up as a hot spot in profiles, even though we're using a highly
optimized XML library and the rest of the code is not very optimized.
One part of the problem is that cElementTree has to build the whole
document into an in-memory structure, which is then discarded; with a
simple text form we can just convert directly.
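
As a rough sketch of that difference, assuming a made-up inventory
layout (the element names, attributes and the space-separated text
form below are hypothetical, not bzr's actual formats), and using the
standard library's xml.etree.ElementTree in place of cElementTree:

```python
import xml.etree.ElementTree as ET

# Hypothetical inventory data; not the real bzr on-disk format.
XML_INVENTORY = """<inventory>
<entry file_id="a-1" kind="file" name="README"/>
<entry file_id="b-2" kind="directory" name="src"/>
</inventory>"""

# The same entries in an assumed one-line-per-entry text form.
TEXT_INVENTORY = "a-1 file README\nb-2 directory src\n"


def entries_from_xml(source):
    # The parser builds a full element tree first; we then walk it to
    # build our own tuples, and the tree is thrown away.
    root = ET.fromstring(source)
    return [
        (e.get("file_id"), e.get("kind"), e.get("name"))
        for e in root.iter("entry")
    ]


def entries_from_text(source):
    # Each line is converted straight into a tuple, with no
    # intermediate tree. maxsplit=2 lets the name contain spaces.
    return [tuple(line.split(" ", 2)) for line in source.splitlines() if line]
```

Both functions return the same list of tuples; the text version just
skips the intermediate tree.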

There is also the issue that the XML representation is not completely
determined by the document tree: the same tree can be serialized to
different bytes, which is problematic for computing hashes.  We can
work around this by canonicalizing the XML, but we could also just
avoid it.
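
To illustrate the hashing problem, here is a minimal sketch. The two
documents are invented examples, and it uses
xml.etree.ElementTree.canonicalize (C14N 2.0, available in Python 3.8
and later) as one possible canonicalization; it is not necessarily
what bzr would use:

```python
import hashlib
import xml.etree.ElementTree as ET

# Two serializations of the same (hypothetical) element: attribute
# order, quote style and empty-element syntax all differ.
DOC_A = '<entry name="README" kind="file"/>'
DOC_B = "<entry kind='file' name='README'></entry>"


def sha1_of(text):
    # Hash the exact bytes of the serialized form.
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def canonical_sha1(xml_text):
    # Canonicalize first, so equivalent trees give identical bytes.
    return sha1_of(ET.canonicalize(xml_data=xml_text))


raw_hashes_match = sha1_of(DOC_A) == sha1_of(DOC_B)
canonical_hashes_match = canonical_sha1(DOC_A) == canonical_sha1(DOC_B)
```

The raw hashes differ even though the trees are the same; they only
agree after canonicalization, which is extra work that a text format
with a single fixed serialization would not need.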

The hashcache was specifically not done in XML because it's
particularly important that it be fast, and it's not long-term storage.

I think I'd prefer to change to a plain text format, but it doesn't
seem urgent enough to do it right now.

There's still a good place for XML in, say, "log --format=xml" or
xmlrpc, where we expect to interface with non-Python code.

--
Martin



