[ANNOUNCE] Example Cogito Addon - cogito-bundle
Linus Torvalds
torvalds at osdl.org
Fri Oct 20 18:48:58 BST 2006
On Fri, 20 Oct 2006, Shawn Pearce wrote:
>
> I renamed hundreds of small files in one shot and also did a few
> hundred adds and deletes of other small XML files. Git generated
> a lot of those unrelated adds/deletes as rename/modifies, as their
> content was very similar. Some people involved in the project
> freaked as the files actually had nothing in common with one
> another... except for a lot of XML elements (as they shared the
> same DTD).
Heh. We can probably tweak the heuristics (one of the _great_ things about
content detection is that you can fix it after the fact, unlike the
alternative).
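
To make "tweak the heuristics" concrete, here is a minimal sketch of
content-based rename detection with an adjustable similarity threshold.
It is not git's actual implementation: the names similarity,
detect_renames and threshold are hypothetical, and the line-based ratio
from Python's difflib stands in for whatever similarity measure the real
tool uses. The point is only that pairing is recomputed from content each
time, so raising the threshold later changes what is reported as a rename
without touching the stored history.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Fraction (0.0 to 1.0) of lines shared between two texts."""
    return SequenceMatcher(None, a.splitlines(), b.splitlines()).ratio()


def detect_renames(deleted, added, threshold=0.5):
    """Pair each deleted path with its most similar added path.

    deleted and added map path -> file content. A pair is reported
    only if its similarity is at least `threshold` (an illustrative
    knob, not a claim about any real tool's default).
    """
    renames = []
    remaining = dict(added)  # copy so the caller's dict is not mutated
    for old_path, old_text in deleted.items():
        best_path, best_score = None, threshold
        for new_path, new_text in remaining.items():
            score = similarity(old_text, new_text)
            if score >= best_score:
                best_path, best_score = new_path, score
        if best_path is not None:
            renames.append((old_path, best_path, best_score))
            del remaining[best_path]  # each added file is used at most once
    return renames


# Two unrelated XML files that share only boilerplate from a common schema.
deleted = {"a.xml": "<?xml version='1.0'?>\n<config>\n<name>alpha</name>\n<value>1</value>\n</config>\n"}
added = {"b.xml": "<?xml version='1.0'?>\n<config>\n<name>beta</name>\n<value>2</value>\n</config>\n"}

loose = detect_renames(deleted, added, threshold=0.5)   # boilerplate is enough to pair them
strict = detect_renames(deleted, added, threshold=0.8)  # stricter cutoff reports no rename
```

With the loose threshold the shared XML scaffolding is enough for the two
files to be paired, which is the kind of false rename described in the
quoted report; with the stricter one they are treated as an unrelated
delete and add.
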
That said, I've personally actually found the content-based similarity
analysis to often be quite informative, even when (and perhaps
_especially_ when) it ended up showing something that the actual author of
the thing didn't intend.
So yeah, I've seen a few strange cases myself, but they've actually been
interesting. Like seeing how much of a file was just a copyright license,
and then a file being considered a "copy" just because it didn't actually
introduce any real new code.
Linus