Rethinking conversions to rich-root data
Jelmer Vernooij
jelmer at vernstok.nl
Sat Mar 21 03:43:49 GMT 2009
John Arbash Meinel wrote:
> I've been looking at our index, etc code, and I noticed that we have a
> surprisingly large number of records that are associated with no
> content. I then realized that this was at least partially because
> *every* revision for a conversion is now generating a new root node.
>
> As a specific example, python.org's repository has 256k entries in the
> per-file graph. Of those, 141k of them refer to no content. That is more
> than half (55%). Some of those are because of various directories,
> renames, etc. It is also a bzr-svn conversion, which may affect things.
> But launchpad also has a similar 140k versus 240k entries in the .tix
> that refer to no actual changes. Given that lp has 54k revs, a large
> portion of those are just the root being noted on every revision.
>
Is this Python conversion a bzr-svn 0.4.x or 0.5.x conversion? The
latter should create (significantly?) fewer changed directories fwiw.
> Do we *really* need to fake all of these root changes? I know there were
> like 2 branches out there that actually had a root id for about 3
> revisions. Couldn't we just force that all revision roots from non-rr
> trees are fixed, and avoid preserving this bloat for the rest of history?
Not sure I follow this. Are you suggesting that upgrading a non-rich-root
revision to a rich-root revision should *not* mark the root entry as
changed? That would break consistency with existing upgrades that were
done independently.
Cheers,
Jelmer