[merge] cache encoding
holger krekel
holger at merlinux.de
Sat Aug 12 07:14:12 BST 2006
On Thu, Aug 10, 2006 at 09:07 -0500, John Arbash Meinel wrote:
> Attached is a bundle that caches encode/decode from and to utf8. The
> biggest application for this is the fact that when you commit a new
> kernel tree, it has to annotate every line in the tree with the current
> revision. The specific location that I saw was this line in knit
>
>     return ['%s %s' % (o.encode('utf-8'), t) for o, t in content._lines]
>
> So basically, it was doing a new encode for *every* line. Which with a
> new kernel tree, you have 7.7M lines. This doesn't account for a huge
> portion of the overall time (only about 45s/10min). But it doesn't hurt
> to do it faster.
Ouch. Btw, is there documentation on bzr's general strategy for
dealing with unicode? It does not use the somewhat common scheme
of "always use unicode, only convert at specified barriers", does it?
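
To make the scheme I mean concrete, here is a rough sketch (Python 3,
all function names hypothetical, nothing to do with how bzr is
actually structured): bytes are decoded once on the way in, everything
in between works on unicode strings only, and encoding happens once on
the way out.

```python
import io


def read_lines(stream, encoding='utf-8'):
    """Input barrier: decode each line of a binary stream to str."""
    return [raw.decode(encoding) for raw in stream]


def annotate(lines, revision_id):
    """Internal code: operates on str only, never sees encoded bytes."""
    return ['%s %s' % (revision_id, line) for line in lines]


def write_lines(stream, lines, encoding='utf-8'):
    """Output barrier: encode each str line as it is written out."""
    for line in lines:
        stream.write(line.encode(encoding))


# Usage: in-memory binary streams stand in for files on disk.
source = io.BytesIO('h\u00e9llo\nworld\n'.encode('utf-8'))
target = io.BytesIO()
write_lines(target, annotate(read_lines(source), 'rev-1'))
```

With this layout there is no per-line encode inside the inner loop at
all, at the price of holding decoded text in memory.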
best,
holger