Solving the commit-editor-locks-stuff-up problem.
Vincent Ladeuil
v.ladeuil+lp at free.fr
Sat Mar 21 13:27:28 GMT 2009
>>>>> "robert" == Robert Collins <robert.collins at canonical.com> writes:
robert> On Sat, 2009-03-21 at 11:41 +0100, Vincent Ladeuil wrote:
>> >>>>> "robert" == Robert Collins <robert.collins at canonical.com> writes:
>>
>>
robert> Certainly write operations currently have to write
robert> the entire thing always because its checksum needs
robert> updating.
>>
>> Two cents here:
>>
>> 1) There is a potential bug with the checksum if the working tree
>> is shared (a mounted file system, for example) between a 32-bit host
>> and a 64-bit host. The 32-bit host will write a signed 32-bit
>> checksum while the 64-bit host will write an unsigned one.
robert> I think you're wrong; we had that bug during development; the fixed
robert> checksum validation tests ensure we get the same checksum on all
robert> platforms.
32 bits, somewhere with lots of branches:
  find . -type f -name dirstate -print | xargs head -n2 | grep crc32 | wc -l
  211
  find . -type f -name dirstate -print | xargs head -n2 | grep 'crc32.*-' | wc -l
  87
Not 50% but close enough.
64 bits, somewhere with lots of branches, even if a bit fewer:
  find . -type f -name dirstate -print | xargs head -n2 | grep crc32 | wc -l
  51
  find . -type f -name dirstate -print | xargs head -n2 | grep 'crc32.*-' | wc -l
  0
Very far from 50%.
Not a proof, but I'd be very surprised if statistics were just
playing tricks on me here.
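
For anyone who wants to repeat the count without the shell pipeline, here is a rough Python sketch of the same check. It assumes only what the commands above rely on: the checksum sits on a line containing "crc32" within the first two lines of each dirstate file, and a negative value shows up as a "-" after it. The function name count_dirstate_checksums is made up for this example; it is not part of bzr.

```python
import os


def count_dirstate_checksums(root):
    """Return (total, negative) counts of crc32 header lines under root.

    Hypothetical helper, not part of bzrlib. It mirrors
    `head -n2 | grep crc32` and `head -n2 | grep 'crc32.*-'`.
    """
    total = 0
    negative = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        if "dirstate" not in filenames:
            continue
        path = os.path.join(dirpath, "dirstate")
        with open(path, "rb") as f:
            # Only the first two lines are inspected, like head -n2.
            header = [f.readline() for _ in range(2)]
        for line in header:
            if b"crc32" not in line:
                continue
            total += 1
            # A minus sign anywhere after "crc32" counts as negative.
            if b"-" in line.split(b"crc32", 1)[1]:
                negative += 1
    return total, negative


if __name__ == "__main__":
    total, negative = count_dirstate_checksums(".")
    print("dirstate checksums: %d, negative: %d" % (total, negative))
```

If the checksums were evenly spread over the signed 32-bit range, roughly half of them should come out negative on a host that writes signed values.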
Regression? I didn't investigate very deeply, but it seems to me
the checksum stopped being used at some point...
>> 2) We don't really care so far because nobody uses that checksum.
robert> Really?
I couldn't find any users; pointers welcome.
This was a concern in bbc, since we use crc32 in the search key
functions there. It was addressed with:
bzrlib/_chk_map_py.py:
  def _crc32(bit):
      # Depending on python version and platform, zlib.crc32 will return either a
      # signed (<= 2.5 >= 3.0) or an unsigned (2.5, 2.6).
      # http://docs.python.org/library/zlib.html recommends using a mask to force
      # an unsigned value to ensure the same numeric value (unsigned) is obtained
      # across all python versions and platforms.
      # Note: However, on 32-bit platforms this causes an upcast to PyLong, which
      #       are generally slower than PyInts. However, if performance becomes
      #       critical, we should probably write the whole thing as an extension
      #       anyway.
      #       Though we really don't need that 32nd bit of accuracy. (even 2**24
      #       is probably enough node fan out for realistic trees.)
      return zlib.crc32(bit)&0xFFFFFFFF
So it may be that you tested it with a combination where the
returned value was unsigned.
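
To make the signed/unsigned point concrete, here is a small sketch of why the mask gives the same number on both kinds of host. The helper names (crc32_unsigned, as_signed32, normalize) are invented for this illustration and do not exist in bzrlib; as_signed32 only simulates what a platform returning a signed value would hand back.

```python
import struct
import zlib


def crc32_unsigned(data):
    """CRC-32 of data, forced into the range 0 .. 2**32 - 1."""
    return zlib.crc32(data) & 0xFFFFFFFF


def as_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer.

    This stands in for a zlib.crc32 that returns signed results.
    """
    return struct.unpack("<i", struct.pack("<I", value))[0]


def normalize(value):
    """Map a signed or unsigned 32-bit checksum to its unsigned form."""
    return value & 0xFFFFFFFF


if __name__ == "__main__":
    for data in (b"a", b"hello", b"dirstate"):
        unsigned = crc32_unsigned(data)
        signed = as_signed32(unsigned)
        # Both representations carry the same 32 bits, so masking
        # brings them back to the same number.
        assert normalize(signed) == normalize(unsigned) == unsigned
        print(data, unsigned, signed)
```

The same mask applied when reading a stored checksum would let a negative value written by one host compare equal to the unsigned value computed by another, assuming both describe the same 32 bits.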
Since I couldn't find any user of the dirstate checksum, I
thought it wasn't worth fixing in bzr. If I'm wrong, then this
could be pretty serious.
Vincent