[MERGE] DirState pyrex helpers

John Arbash Meinel john at arbash-meinel.com
Fri Jul 20 18:00:37 BST 2007


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Martin Pool wrote:
> On 7/18/07, John Arbash Meinel <john at arbash-meinel.com> wrote:

...

> C99 doesn't say that it's either of them (iirc).  It can even be a
> nonstandard type like long long on some platforms.  I think unsigned long
> would be safest.

Good enough.

...

> I meant to say, "or at least the type of the parameter".  Some builtin
> errors say nothing about what was passed -- not even whether it was None
> or not -- which seems to put in an extra debugging round trip.

I did go ahead and generally do:

raise TypeError("Parameter 'foo' must be of type String not: %s %r"
		% (type(foo), foo))

It is a bit risky in that 'foo' could be most of a dirblock, and contain
several thousand lines worth of data. But if the type check is failing,
then it is likely not.

> 
>> > +    void *PyTuple_GetItem_void_void "PyTuple_GET_ITEM" (void* tpl, int
>> > index)
>> >
>> > Why is this not _void_int?
>>
>> Because the nomenclature is to indicate that it takes a void* instead
>> of an
>> <object> and *returns* a void* rather than an <object>.

...

>> What I'm really trying to do here is work in terms of the PyObject
>> pointers,
>> rather than a higher level "<object>" abstraction.
>>
>> I could change these to use PyObject* instead of void*, but then I
>> feel it is a
>> bit harder to tell the difference between PyObject *foo, and object
>> foo. And
>> when I first wrote it, I didn't realize you could do PyObject*.
>>
>> If you feel like it would be more obvious/better/something a different
>> way, I'm
>> willing to change it.
> 
> Oh I see.  I think using voids is fine.  Maybe again just a comment
> explaining why it's so would be enough.  Is this a standard Pyrex
> convention?

Honestly, I have no idea what standard Pyrex convention is. I just
picked up Pyrex recently (I haven't hung around in a pyrex community :).

It has been a convention that *I* started using, which is also why I'm
happy to change it if you feel something would be more obvious.

I tried to document this a bit more.

...

> 
> Yes, that's what I meant.  We've had a few bugs because the sort order is
> not what you directly get by comparing the key fields, and all other
> things being equal having the structure such that people don't have to be
> careful to get it right would be better.  I was thinking more about the
> safety than the speed.  It might not be worth changing in dirstate
> now but it's something to consider next time.

Yeah. It has other implications, though. Because of how you would build
up a path back into a full string, etc. I think you would have to do:

path = '/'.join(dirname + [basename])

rather than
path = dirname + '/' + basename


> 
>> > Uh, also you can't rely on ints being 4 characters.  (Unless I'm
>> > missing something and they reliably are in pyrex?)
>>
>> I don't care if they are 4-chars. As long as they are a faster way to
>> iterate
>> through memory. I realized I have a typo, in that I used '%4' instead
>> of '%
>> sizeof(int)'. But I went ahead and fixed that.
> 
> I meant '4 bytes'.

Sure. I just switched the code to use 'sizeof(int)' which is more
correct anyway.

> 
>> > More importantly: for python are we sure we're comparing the utf-8
>> > values, not the interpreted unicode.  I'm pretty sure we are.  But in
>> > utf-8 we will have some characters with the high bit set, and those
>> > will compare the wrong way if you compare them as signed ints and
>> > chars.  I think you definitely need some tests of this on utf-8 paths.
>>
>> Good points. I'll add the tests and switch them over to 'unsigned
>> char' to be
>> explicit. (good catch, on my Mac at least, char* is a signed char, which
>> evaluates incorrectly)
>> I'll also add a PyString_CheckExact() call for the cmp_by_dirs()
>> function,
>> which ensures that you have a real PyString, and not a PyUnicode.
> 
> Thanks, the tests look ok.
> 

So does this count as a +1 or +1 (conditional). I would like to get this
merged earlier rather than later for 0.19.

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGoOo1JdeBCYSNAAMRAjlzAKDTFTghRmETPm6/zaofUMTkcH7NjgCgxyv/
cFG6c/RfzG9ACSvRdaFEsS0=
=YDAE
-----END PGP SIGNATURE-----



More information about the bazaar mailing list