Brief article on benchmarks of Python repository with leading DVCSen

Fri Feb 13 16:19:13 GMT 2009

On Fri, 2009-02-13 at 18:06 +0200, Teemu Likonen wrote:
> On 2009-02-13 09:14 (-0600), John Arbash Meinel wrote:
> 
> > I'd just like to point out that if you do the natural thing of:
> >
> >  git init
> >  echo content >> foo
> >  git add foo
> >  git commit -m "create" foo
> >  git mv foo bar
> >  echo more content >> bar
> >  git commit -m "move and add" bar
> >
> > I believe git's auto-detection becomes even less reliable. I realize
> > the "workaround" is to commit in between. However, consider a
> > refactoring, where you then need to change things like "#include
> > <foo.h>" to now be "#include <bar.h>", etc. It seems pretty natural to
> > modify the *content* at the same time that you modify the *tree
> > shape*.
> 
> Git's rename detection does not work with toy examples like this,
> where the content is only a couple of bytes. However, Git calculates a
> similarity score, and if the two files are at least 50% similar (I
> think) the change is detected as a rename. So even though you are
> correct that the detection gets "less reliable", it works very nicely
> in the real world where files have real content. I think that in practice
> there is rarely any need to worry about that.

It fails on large renames as well as relatively small ones. We tried
renaming one of the top-level directories in Samba (with ~1000 files
underneath it), and git just gave up: it didn't show us any history
beyond the point of the rename.
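
For anyone who wants to check this on their own tree, here is a minimal
sketch of the move-and-edit case with a file that has real content. The
temporary repository, the file names, the 200-line file and the "demo"
identity are all assumptions made up for illustration; only the git
commands and options themselves are real.

```shell
#!/bin/sh
# Sketch: does git treat a combined move-and-edit as a rename?
# Repo location, file names and content size are illustrative assumptions.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q .

# Give the file some real content (200 lines) rather than a few bytes.
i=0
while [ "$i" -lt 200 ]; do
    echo "line $i"
    i=$((i + 1))
done > foo
git add foo
git -c user.name=demo -c user.email=demo@example.invalid \
    commit -q -m "create"

# Move and modify in the same commit, as in the quoted example.
git mv foo bar
echo "more content" >> bar
git -c user.name=demo -c user.email=demo@example.invalid \
    commit -q -a -m "move and add"

# -M turns on rename detection; -M50% spells out the default threshold.
status=$(git diff -M50% --name-status HEAD~1 HEAD)
echo "$status"

# --follow continues the history of a single file across renames.
commits=$(git log --follow --format=%h -- bar | wc -l | tr -d ' ')
echo "commits reachable via --follow: $commits"
```

Note that --follow only works for a single file, so it does not help
directly with the history of a whole renamed directory. Git also caps the
number of rename candidates it will compare (diff.renameLimit, or -l<num>
on the command line); whether that cap is what we ran into with the Samba
rename is a guess, not something checked here.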

Cheers,

Jelmer
-- 
Jelmer Vernooij <jelmer at samba.org> - http://samba.org/~jelmer/
Jabber: jelmer at jabber.fsfe.org