KnitCorrupt error

John Arbash Meinel john at arbash-meinel.com
Tue Aug 28 16:29:28 BST 2007


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Andrew Straw wrote:
> Looking a little further, doing "bzr up" on the computer where things
> work (the Mac) results in no errors. The md5sum of the file reported to
> be corrupt (the .knit file) is identical on the Mac and the linux machine.
> 
> How are the number of lines counted? Could they be different on the Mac
> vs. linux machines?
> 
> -Andrew
> 
> Andrew Straw wrote:

...

>>   File "/usr/lib/python2.5/site-packages/bzrlib/knit.py", line 1510, in _parse_record
>>     version_id))
>> KnitCorrupt: Knit 8e/2007_switzerland_fir-20070827204501-y2ffrzg5l7179dtu-1.knit corrupt: incorrect number of lines 365741 != 365742 for version {astraw at hummer.local-20070827204526-ce7sc2vv3j29clmt}
>>
>> bzr 0.15.0 on python 2.5.1.final.0 (linux2)
>> arguments: ['/usr/bin/bzr', 'pull']
>>
>> ** please send this report to bazaar at lists.ubuntu.com

Well, to start with, we would recommend upgrading from 0.15 to something newer.
0.15 has some known bugs with renaming directories. (It won't lose any
committed data, but the working tree can get confused.)

The error is occurring because we were expecting to read 365742 lines, and we
only found 365741.

If you are willing to send the specific file, I would be willing to look at it.
Though it will contain whatever you committed. I promise not to divulge your
secrets, but you would have to trust me on that.

If you run 'bzr check' on both your Mac and on Ubuntu, does it fail in the same way?

I don't know of any reason for incompatibility between 0.18 and 0.15 for this.

If you want to do the debugging on your end, I can walk you through some things
to try. Basically, it involves manually processing the file, and seeing whether
your values match what bzr is expecting.

If you just want to send me the file, I should only need the .kndx and .knit files.

I'll try to help you fix the issue, but you should realize that if the number
of lines doesn't match, then it is possible that the sha1sum is going to fail as
well, which means we need to figure out what the original text should be.

If it is an entry with 365k lines, I imagine it is a rather large file. Which
might be too big to email. (I can take email messages up to 10MB.)

As a basic overview, the .kndx tells us what byte range we need to read.
Then we read the .knit starting at the given offset, and reading the specified
number of bytes. That is a gzip hunk, which we then decompress.
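As a rough illustration of that overview, here is a minimal Python sketch of
the read-and-decompress step. It is not bzrlib code: read_knit_record and
count_lines are made-up names, and the demo file is fabricated (10 filler
bytes followed by one gzip hunk) so that the example runs on its own. For a
real knit you would pass the offset and length taken from the .kndx.

```python
import gzip
import os
import tempfile


def read_knit_record(knit_path, offset, length):
    """Read `length` bytes starting at byte `offset`, then gunzip them."""
    with open(knit_path, "rb") as f:
        f.seek(offset)
        hunk = f.read(length)
    if len(hunk) != length:
        raise ValueError("short read: wanted %d bytes, got %d" % (length, len(hunk)))
    return gzip.decompress(hunk)


def count_lines(data):
    """Count newline bytes, which is what `wc -l` reports."""
    return data.count(b"\n")


# Stand-in for a .knit file (assumption: all contents here are made up).
# One record = a header line, three content lines, and a trailer line.
record = b"header\nline 1\nline 2\nline 3\ntrailer\n"
hunk = gzip.compress(record)
with tempfile.NamedTemporaryFile(suffix=".knit", delete=False) as tmp:
    tmp.write(b"x" * 10 + hunk + b"bytes belonging to the next record")
    demo_path = tmp.name

data = read_knit_record(demo_path, 10, len(hunk))
total_lines = count_lines(data)
content_lines = total_lines - 2  # drop the header and trailer lines
os.remove(demo_path)
print(total_lines, content_lines)
```

Counting newline bytes rather than using splitlines() keeps the result in
line with what "wc -l" would report, since splitlines() also breaks on bare
carriage returns.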

You can do this with standard unix tools like "dd", "zcat", and "wc".

If you look at the .kndx you can see the start offset, and number of bytes. If
it is "revision-id fulltext 10 100  :", that means it starts at byte 10 and
goes for 100 bytes. So you could do:

dd if=file.knit bs=1 skip=10 count=100 | zcat | wc -l

(dd copies the raw byte range; bs=1 makes skip and count be measured in bytes.)

Which should give you
365742 + 2 = 365744 lines. (We have one header line, and one trailer line.)

I would also be curious to see the output of:
dd if=file.knit bs=1 skip=10 count=100 | zcat | head -n1
and
dd if=file.knit bs=1 skip=10 count=100 | zcat | tail -n1
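If picking the numbers out of the .kndx by eye is error prone, the lookup can
be sketched in Python as well. This is only a sketch under assumptions: the
field layout is inferred from the single example entry above (id, type,
offset, length, then a ":" terminator), and parse_kndx_entry and
expected_total_lines are hypothetical helpers, not part of bzrlib. Anything
between the length and the ":" is kept as-is and not interpreted.

```python
def parse_kndx_entry(line):
    """Split one .kndx entry into its fields (layout assumed from the example)."""
    fields = line.split()
    if len(fields) < 5 or fields[-1] != ":":
        raise ValueError("unexpected .kndx entry: %r" % line)
    return {
        "version_id": fields[0],
        "kind": fields[1],           # e.g. "fulltext"
        "offset": int(fields[2]),    # byte where the gzip hunk starts
        "length": int(fields[3]),    # number of bytes in the gzip hunk
        "rest": fields[4:-1],        # left uninterpreted in this sketch
    }


def expected_total_lines(content_lines):
    """Lines `wc -l` should report: content plus one header and one trailer."""
    return content_lines + 2


entry = parse_kndx_entry("revision-id fulltext 10 100  :")
print(entry["offset"], entry["length"])
print(expected_total_lines(365742))
```

The offset and length from the entry are the two numbers to hand to dd as
skip and count.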

John
=:->

PS> If you do send the file, obviously don't send it to the list. I'll be
offline for a couple of hours from now. But I'll be back afterwards.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFG1D9XJdeBCYSNAAMRAkChAJ4t+Ccc67qbrL810s8F94qNf6lUjACePTQ+
V5jV+AkXussd80H/GMgspg0=
=pDQs
-----END PGP SIGNATURE-----
