repository corruptions; possible causes?
John Arbash Meinel
john at arbash-meinel.com
Wed Aug 1 16:14:05 BST 2007
Sabin Iacob wrote:
> Hello
> I've been experiencing some strange issues with the trunk bzr in the
> last week:
>
> * the project I am working on for my employer is a branch stored in svn;
> after a while, bzr [diff|missing|push] and generally anything that had
> to do with diff raised a MemoryError; I was able to branch from svn
> again and use (unix) diff to recover my changes and commit them; it
> still won't push ("branches have diverged", but merge says "nothing to
> do"), but that's another issue
So you are saying that 'bzr diff' in a bzr tree is giving you MemoryError?
That sounds really weird, and is certainly something to investigate.
I could imagine it happening if you had some really large files, or something
like that. I'm pretty sure unix diff does a disk-based diff, so it doesn't need
to load the whole file into memory, whereas we do. (But I might be wrong about
unix diff.)
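To test the large-file theory, a minimal sketch like the following would list the biggest files in a working tree. It is plain Python, not part of bzr. The function name `find_large_files` and the 50 MB default threshold are assumptions chosen for illustration, and the only bzr-specific detail is that the `.bzr` control directory is skipped.

```python
import os


def find_large_files(root, threshold_bytes=50 * 1024 * 1024):
    """Return (size, path) pairs for files of at least threshold_bytes.

    Hypothetical helper: the threshold is an arbitrary assumption, not a
    limit that bzr itself defines.
    """
    large = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune the .bzr control directory so only working files are listed.
        dirnames[:] = [d for d in dirnames if d != ".bzr"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                # Unreadable or vanished file: skip it rather than abort.
                continue
            if size >= threshold_bytes:
                large.append((size, path))
    # Largest files first.
    large.sort(reverse=True)
    return large
```

Calling `find_large_files(".")` from the top of the tree returns the candidates largest first. An empty list would suggest the MemoryError has some other cause.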
> * on another (dirstate-tags) branch bzr would segfault (?), or say
> something about a missing revision and abort (this actually happened to
> bzr.dev, too, I couldn't branch an earlier revision from it)
Well, a 'segfault' is a bit different from raising an exception and dying. I
realize it may feel the same to you, but for us it is very different. For a
normal installation of bzr (pre-pyrex at least) a segfault generally means that
something in your OS or hardware is having a problem. In my personal experience,
I only get segfaults with bzr when Mac OS X is going crazy. (And at that time
other programs are randomly segfaulting, too, and I need to reboot.)
The only other real avenue for a segfault is a bad C/Pyrex extension. I know
bzr-svn is using an svn wrapper which has had some issues in the past. (It at
least had a memory leak in one of the fairly important functions.)
I can't say for sure about "missing revision and abort". It sounds like your
Branch claimed to be at revision 10, but your repository only contained
information up to revision 9. Without seeing the tracebacks/errors/etc it is
hard to say much more than that. (Either way, though, I would expect you to be
able to start at an earlier revision and branch off of it).
It is also possible that you had physical disk corruption. (We've encountered
hardware that ended up putting a string of 100+ NULLs right in the middle of a
file, and we are quite good about only appending to those files.)
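To look for that particular kind of damage, a rough sketch like the following scans a single file for long runs of NUL bytes. It is plain Python, not a bzr tool. The name `find_nul_runs` and the default run length of 100 are assumptions taken from the description above, and the sketch reads the whole file into memory, so it is only suitable for files that fit comfortably in RAM.

```python
import re


def find_nul_runs(path, min_run=100):
    """Return (offset, length) for each run of at least min_run NUL bytes.

    Hypothetical helper for illustration; it loads the entire file at once.
    """
    with open(path, "rb") as f:
        data = f.read()
    # Match min_run or more consecutive zero bytes.
    pattern = re.compile(rb"\x00{%d,}" % min_run)
    return [(m.start(), m.end() - m.start()) for m in pattern.finditer(data)]
```

A hit is only a hint, since some binary files legitimately contain long stretches of zero bytes. A run of NULs in the middle of an otherwise append-only file would still be worth a closer look.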
> * and finally, I had some changes to bzr-svn I had committed (tests
> didn't run on my system unless some names were fully qualified), and
> trying to merge this morning I got
>
> bzr: ERROR: Revisions have no common ancestor:
> iacobs at m0n5t3r.info-20070727163435-gbf1rp10ri60a63l
> jelmer at samba.org-20070727111824-wnyzsuwiowm12553
So you did a branch from bzr-svn, did some changes and committed, and then you
were trying to merge and it was failing. (Just to make sure I understand)
As this is an open-source project, would you be willing to post your branches?
I can say something weird is happening; I just don't know what yet.
>
> and trying to branch from it would say
>
> bzr: ERROR: Could not install revisions:
> iacobs at m0n5t3r.info-20070727163435-gbf1rp10ri60a63l
This certainly
>
>
> I kept the offending branch this time; where can I look for some
> forensic info? for starters, I'm curious if this is disk/filesystem
> related (which would mean that I have to back up my laptop) or just
> something to be expected from trunk/bleeding edge software (in which
> case I don't mind :) )
Most of the primary devs (myself included) run the latest tip of bzr.dev all
the time. (I will bzr update multiple times per day.) Our trunk should be
extremely stable. (It is only updated after the PQM has run 7k unit tests on
it.)
It has been that way for more than a year now, and I've never had true data
corruption like you are seeing. We have had only a couple of small regressions
in a long time. (0.15 introduced a few working tree regressions, and 0.12 had a
rather small regression because of a missing 'import' that wasn't noticed if
you had bzrtools installed. I don't remember many more.)
>
> Dell 1300, Linux wireless-dev.git 2.6.22-rc3 (used in the hope they
> actually fix bcm43xx sometimes), xfs, using suspend to ram unless I
> really have to shutdown/reboot.
I hate to say that it sounds like hardware issues, but it pretty much does. If
you want to post the tarballs somewhere, or mail them to me, I'll be willing to
give them a look.
John
=:->