[MERGE] pre_commit hook (was: [RFC] pre_commit hook)

Wed Aug 15 16:02:50 BST 2007

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Nam Nguyen wrote:
> On 8/15/07, John Arbash Meinel <john at arbash-meinel.com> wrote:
>> John Arbash Meinel has voted approve.
>> Status is now: Semi-approved
>> Comment:
>> To start with, I think this is reasonable as is, I just have a few
>> comments to discuss.
>>
>> +        if change != 'unchanged' and path != '':
>> +            self.affected_ids.setdefault(change, []).append(ie.file_id)
>>
>> ^- I wonder if it makes more sense to fill the dictionary with the
>> different types pointing to empty lists at the start. That way we don't
>> have to worry about setdefault().
> 
> I'm worry that as changes aren't fixed yet, and there is no single
> place to find all inventory entry change types, a hard coded list of
> change types here is not welcome. When change types are fixed,
> initializing the dictionary before using it is definitely the way to
> go.
> 

I would guess that changes are actually fixed, we just haven't actually
written it down yet.
And if someone introduces a new change type, it will actually cause your
code to fail, because the dictionary lookup will miss. (And you fill the
dictionary unconditionally during commit, so it will be pretty obvious.)

Also, I would ask that you *not* suppress recording when the root changes.
+        if change != 'unchanged' and path != '':
+            self.affected_ids.setdefault(change, []).append(ie.file_id)

should become:
+        if change != 'unchanged':
+            self.affected_ids.setdefault(change, []).append(ie.file_id)

The NestedTree code specifically is working on allowing root ids to
change, since it is part of how trees merge together. All new code
should be written with that in mind. If they want, plugins can filter by
path, but we should be presenting it to them.

...
>> ^- I would have preferred these as separate tests, because as it stands
>> the test is a little bit hard to read.
> 
> But that would result in more duplicated code. As it is right now, all
> change types share a same imaginary working session.

If it is obviously duplicated code, then you could factor it into a
helper function. But generally when writing tests, it is a lot better if
you write obvious simple tests. That way one piece failing doesn't
cascade to the rest of the test not running. And when a test starts
failing, it is more obvious what was actually broken.

Also, in this specific case, there is very little redundant code.
Because you don't need to do all the other renames, etc when you are
just deleting an entry.

> 
>> Also, you really should have try/finally for:
>> tree.lock_write():
>> try:
>>    do stuff
>> finally:
>>    tree.unlock()
>>
>> Otherwise, if something fails in the middle, you will get a mixed
>> traceback, when it also warns that the tree wasn't unlocked. Oh, and if
>> you are going to do them one after the other, you should be able to just
>> lock the tree across the whole set. No need to keep unlocking it. (If
>> there is, let us know, because that would be a sign of a problem).
> 
> In fact, I was surprised to find unlocking is required. If you remove
> the unlock/lock lines in the middle, there will be errors regarding
> dirty flag or some thing.

Yeah... I realize now what is happening. Our new "dirstate" code keeps
track of files in a slightly different way than the old code. We have a
way for it to fake the old code paths, but we don't let you update it
twice. We flush to disk on an unlock(), so we know that it is clean.
As we switch to more code that uses the better apis, this will become
less of a problem.

I'm guessing we should probably have a way to force a flush rather than
triggering it on unlock. And maybe commit() should always flush?

> 
>> +            ('pre_commit', 4, rev4, 5, rev5,
>> +             {'modified and renamed': ['rootfile_id']} )
>> +            ],
>>
>> ^- This one specifically makes me wonder if we want to be using the
>> simple "change" value. 'modified', 'renamed', 'added', 'deleted' seem
>> reasonable, 'modified and renamed' seems a little weird to use as a key.
> 
> IIRC, there is a comment in InventoryEntry that says we are going to
> have an enumeration of change types. Until then, this does sound
> weird.

Well, we could just do it now. I'm pretty sure you know all the cases
having worked with it.

> 
>> Not enough to say we shouldn't do it. But just a little.
>>
>> Anyway, these are just thoughts, not anything that I would merge
>> without. Someone else can decide if they want to be more strict about
>> it. Good job overall.
> 
> I'm just scratching my itch. Thank you for guiding me along.
> 
> Cheers
> Nam
> 
> 

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGwxWaJdeBCYSNAAMRAhQLAKCh9QgyW9oOths1R+/zJbCA7nFQPACgg1ZT
LzQ5DsXJegw0fnbBN5znSv4=
=fwJB
-----END PGP SIGNATURE-----