Reusable fastimport parser

Robert Collins robert.collins at canonical.com
Wed Apr 22 00:49:48 BST 2009


On Tue, 2009-04-21 at 11:30 -0400, Greg Ward wrote:
> Hi all --
> 
> no doubt you folks are aware that there is a well designed, high
> performance, low overhead, pure Python parser for git's "fastimport"
> format in bzr-fastimport.  The original author of hg-fastimport
> clearly liked it enough that now we have two copies of Ian
> Clatworthy's fastimport parser kicking around the net: hg-fastimport
> is basically a fork of bzr-fastimport with the backend rewritten to
> produce a Mercurial repository.

:).

> Well, I'm now the maintainer of hg-fastimport, since I committed the
> fatal error of sending patches to the original author after he lost
> interest.  D'ohhh.  And I really don't like fixing bugs that other
> people have already fixed, nor do I like it when my fixes do not
> benefit the widest possible audience.

Oh the trials and tribulations of open source :) (Really, it's good that
people *can* take up the burden).

> So... I'd like to heal the fork before it gets bad.  IMHO the best
> option would be to factor out the non-Bazaar-specific bits of
> bzr-fastimport and put them in a common library that both
> bzr-fastimport and hg-fastimport depend on.  Does that sound
> agreeable?  Any better ideas?
> 
> If that sounds good, what about Dulwich?  That's an implementation of
> Git's protocols and file formats in pure Python; seems like a good
> place for a common, reusable fastimport parser to live.  I haven't
> asked the Dulwich maintainers what they think of this idea, but
> whatever.  Thought I'd start here.  Feedback?

Dulwich is being written as part of bzr-git, and Jelmer is on this list
too, so I'm sure you'll get some feedback from him.

For my part, I don't think fastimport is that tightly linked to git -
it's getting extended and tweaked to make it quite general-purpose. I
think it would be fine to have it be a generic library, as long as there
are no performance implications in doing so - when you are processing
millions of entries extra function calls and the like can start to show
up surprisingly high in timing measurements.
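
To make the concern concrete, here is a rough sketch (not code from
bzr-fastimport; parse_inline, parse_layered, is_commit and compare are
hypothetical names made up for illustration). It counts "commit" lines
in a synthetic stream twice, once with the check written inline and
once routed through an extra helper function, and times both. The
absolute numbers depend entirely on the machine and the Python version,
so treat it as a way to measure, not as a result.

```python
import timeit


def parse_inline(lines):
    """Count 'commit' commands with the check written inline in the loop."""
    count = 0
    for line in lines:
        if line.startswith(b"commit "):
            count += 1
    return count


def is_commit(line):
    """Hypothetical helper: the kind of extra layer a generic library might add."""
    return line.startswith(b"commit ")


def parse_layered(lines):
    """Same count, but with one additional function call per line."""
    count = 0
    for line in lines:
        if is_commit(line):
            count += 1
    return count


def compare(n=1_000_000, repeat=3):
    """Return (inline_seconds, layered_seconds), best of `repeat` runs each."""
    # Synthetic input: alternating commit and file-modify lines.
    lines = [b"commit refs/heads/master", b"M 100644 :1 README"] * (n // 2)
    inline = min(timeit.repeat(lambda: parse_inline(lines), number=1, repeat=repeat))
    layered = min(timeit.repeat(lambda: parse_layered(lines), number=1, repeat=repeat))
    return inline, layered


if __name__ == "__main__":
    inline, layered = compare()
    print(f"inline:  {inline:.3f}s")
    print(f"layered: {layered:.3f}s")
```

Running something like this against the real parser, on a real dump,
before and after any refactoring would show whether the extra layer
matters in practice.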

Ian, what do you think?

