line identity and regression suites

Wed Apr 26 19:09:48 BST 2006

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

John Yates wrote:
> Fair enough.  I can imagine how to map code coverage results
> to this form.  But I do not see how to carry out the second,
> more interesting step, namely, given line-numbers of affected
> lines within a later revision, how I recover line identity.

A brute-force approach would be to diff the current revision with the
introducing revision and use the diff offsets to map each line number
back to its original position.  The original bzr annotate used this
approach, and I can dig up that code if it's of interest.
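
As a rough sketch of the brute-force idea (this is not the original bzr
annotate code), assume both revisions are available as lists of lines.
Python's difflib.SequenceMatcher can then stand in for the diff step.  The
function name map_new_to_old and the sample data are made up for
illustration.

```python
from difflib import SequenceMatcher


def map_new_to_old(old_lines, new_lines):
    """Map 0-based line indices in new_lines to indices in old_lines.

    Only lines the matcher treats as unchanged get an entry; lines that
    were inserted or modified are absent from the result.
    """
    matcher = SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    mapping = {}
    for old_start, new_start, size in matcher.get_matching_blocks():
        for offset in range(size):
            mapping[new_start + offset] = old_start + offset
    return mapping


# Hypothetical file contents: one line inserted after the first line.
old = ["a\n", "b\n", "c\n"]
new = ["a\n", "x\n", "b\n", "c\n"]
mapping = map_new_to_old(old, new)
print(mapping)
```

A line number reported against the later revision (say, by a coverage
tool) can be looked up in the mapping; a missing key means the line has no
counterpart in the introducing revision according to this matcher.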

Greater accuracy could be achieved by diffing the annotated version, but
theoretically, it's still not completely accurate, in that it doesn't
guarantee that the resulting line identity matches the knit's notion of
line identity.  Corner cases here would probably involve adding the same
text twice in a given revision.

It's worth noting that the knit's notion of line identity isn't entirely
accurate either: line identity isn't an objective reality, but a
convenient fiction.  It is deduced from file snapshots.  Knits using
different sequence matchers will have different notions of line identity.

That said, I think diffing or SequenceMatching the annotated version
will be plenty good enough for your purposes.
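
To illustrate why matching the annotated version can be more accurate than
matching raw text, here is a small sketch.  It assumes an annotated
revision is a list of (introducing revision id, line text) pairs; that
representation, the revision ids and the helper name match_lines are
assumptions for the example, not bzr's actual data structures.

```python
from difflib import SequenceMatcher


def match_lines(old_seq, new_seq):
    """Map indices in new_seq to indices in old_seq for matched elements."""
    matcher = SequenceMatcher(None, old_seq, new_seq, autojunk=False)
    return {
        new_start + k: old_start + k
        for old_start, new_start, size in matcher.get_matching_blocks()
        for k in range(size)
    }


# Hypothetical annotations: rev-2 added a second copy of a line that
# rev-1 introduced, placing the copy above the original.
old_annotated = [("rev-1", "retry()\n")]
new_annotated = [("rev-2", "retry()\n"), ("rev-1", "retry()\n")]

# Matching on text alone cannot tell the two copies apart.
by_text = match_lines(
    [text for _, text in old_annotated],
    [text for _, text in new_annotated],
)

# Matching on (revision id, text) pairs picks the copy from rev-1.
by_annotation = match_lines(old_annotated, new_annotated)

print(by_text, by_annotation)
```

With text alone the matcher pairs the old line with the first copy; with
annotations it pairs it with the copy that actually came from rev-1.  This
still does not help when the same text is added twice within a single
revision, since both copies then carry the same annotation.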

> My hope is that a notion of line identity might be sufficiently
> interesting that bzr would see fit to acknowledge it as a
> valid form of annotation.  

I would like that also.  It wouldn't cost a lot of space, and it would
strengthen the next version of knit merge.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFET7ds0F+nu1YWqI0RAm0gAJ0Y1FexKtBxbgdEiXD9AsBQuNq0twCfd0iH
3lpBs6jciOPfKFkoioUsHUo=
=InVH
-----END PGP SIGNATURE-----