[MERGE] push handles file-ids containing quotes correctly

Aaron Bentley aaron.bentley at utoronto.ca
Mon Jul 10 20:07:21 BST 2006


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

John Arbash Meinel wrote:
> When a function is pretty broken, I prefer to clean it up correctly. I
> understand your conservative approach. But doing it correctly in this
> case is just as easy (creating a regex to assert that we aren't missing
> anything is the same work as creating a regex to fix it correctly.).

More work, really, because these negative lookahead assertions don't
come naturally.  But you know, I just want the bug fixed.

> I believe I also wrote this broken code. So I'm more likely to actually
> want it fixed to hide my shame. :) (I believe I just copied it out of
> xml because at the time we had an xml.py so I couldn't just import it,
> and it was a big dependency for a single function).

Gannotate attributes it to Martin.

>>>There is that.  But this approach already assumes that there are no
>>>comments or processing instructions or CDATA sections in the file
>>>version, so assuming the data contains no entities except for the
>>>predefined ones doesn't seem like much more of a leap.  I'd be more
>>>concerned about mangled entities like '&amp' or '&&aammpp;;'.
> 
> 
> Well, this is just reading the XML <file .../> record, which can only
> have attributes, and I didn't think CDATA was allowed in attributes.

A comment, processing instruction or CDATA section may contain a line
that looks like a file element.

>>>I hope the argument for testing push is self evident.  It was broken,
>>>we're fixing it, we don't want regressions.
> 
> So testing push/fetch is at best 'iffy', because they just expected the
> underlying code to do something it wasn't doing (unescaping &apos;).
>
> It was exposed because somebody tried to do a push, and it failed, but
> the bug isn't in 'push' it is in something that push relies upon.

It doesn't really make sense to me.  We weren't guaranteeing that 'push'
worked with branches containing file-ids with quote characters.  I think
that we should, because we do want it to work in that situation.  Unless
we prove that it does, we should assume that it doesn't.  The next bug
may come from something totally unrelated to fetch, like, perhaps,
WorkingTree.

Similarly, I think we do want to guarantee that fetch works with funky
file-ids, no matter how fetch is implemented.  Particularly, say, our hg
repositories and svn repositories.

I think a test suite that doesn't test those edge cases has inadequate
coverage.

> I'm 0 about a push test. I won't block on it being there. I know my
> personal fault is testing too high (look at test_bundle, what was I
> thinking ...) So I'm trying to find the right balance still.

I think more tests can never hurt.  But I agree with you that I wasn't
testing low enough.

>>>But I'd be happy to test file_ids_altered_by_revision_ids as well.

> We actually already have tests that look for <>, and a couple others.
> Just add " and ' to the file-id, and I think it should flush out the bug.

Cool.

> I think it is important to know when it breaks, on something that is
> semi-critical. It is the tradeoff of adding more tests to
> file_ids_affected_by, since you aren't asserting anything about the
> functions it calls, you have to do everything though testing it.

Say we refactored file_ids_affected_by so that it used a different XML
parsing function.  It would suck if that caused it to break again and we
didn't notice.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD4DBQFEsqVp0F+nu1YWqI0RAiuxAJ94eqwg5Vbrbb/aOtgRfUmeKYEvIQCY/RL+
8AB17D0KRVZq09qg3HIsBQ==
=sk9M
-----END PGP SIGNATURE-----




More information about the bazaar mailing list