[Bug 840641] Re: nova ftbfs (Sphinx? Segmentation fault)
Bug Watch Updater
840641 at bugs.launchpad.net
Sat Oct 28 03:28:09 UTC 2017
Launchpad has imported 28 comments from the remote bug at
https://bugzilla.redhat.com/show_bug.cgi?id=746771.
If you reply to an imported comment from within Launchpad, your comment
will be sent to the remote bug automatically. Read more about
Launchpad's inter-bugtracker facilities at
https://help.launchpad.net/InterBugTracking.
------------------------------------------------------------------------
On 2011-10-17T18:19:48+00:00 Sergey wrote:
Description of problem:
openstack-nova-* services fail to start except openstack-nova-volume
server1 kernel: [ 1139.822573] nova-network[1634]: segfault at bf995000 ip 00c7c6b7 sp bf9936a8 error 6 in libc-2.14.90.so[b36000+1a7000]
server1 systemd[1]: openstack-nova-network.service: main process exited, code=killed, status=11
server1 systemd[1]: Unit openstack-nova-network.service entered failed state.
server1 kernel: [ 1140.217545] nova-api[1601]: segfault at bfd9a000 ip 0040b6a9 sp bfd964d8 error 6 in libc-2.14.90.so[2c5000+1a7000]
server1 systemd[1]: openstack-nova-api.service: main process exited, code=killed, status=11
server1 systemd[1]: Unit openstack-nova-api.service entered failed state.
server1 kernel: [ 1141.045279] nova-scheduler[1656]: segfault at 6e692f61 ip 00255950 sp bfb22778 error 4 in libc-2.14.90.so[110000+1a7000]
server1 systemd[1]: openstack-nova-scheduler.service: main process exited, code=killed, status=11
server1 systemd[1]: Unit openstack-nova-scheduler.service entered failed state.
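
The kernel lines above can be decoded mechanically. The following is a minimal Python sketch, not part of any tool mentioned in this report: the helper name parse_segfault and its regular expression are illustrative assumptions. It extracts the fields of a "segfault at" line and computes the offset of the faulting instruction inside the library (ip minus the mapping base shown in brackets). The error number is read using the standard x86 page-fault bits (bit 0: page was present, bit 1: write access, bit 2: user mode).

```python
import re

# Hypothetical helper (not from the bug report): parse a kernel
# "segfault at ..." log line such as the ones quoted above.
SEGFAULT_RE = re.compile(
    r"(?P<proc>[\w.-]+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>\d+) "
    r"in (?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_segfault(line):
    """Return a dict describing the fault, or None if the line does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m.group("err"))
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    return {
        "process": m.group("proc"),
        "pid": int(m.group("pid")),
        "library": m.group("lib"),
        # Offset of the faulting instruction within the mapped library.
        "offset_in_library": hex(ip - base),
        # x86 page-fault error code bits.
        "access": "write" if err & 2 else "read",
        "user_mode": bool(err & 4),
        "page_present": bool(err & 1),
    }
```

Applied to the nova-network and nova-api lines, this gives libc offsets 0x1466b7 and 0x1466a9, both user-mode writes to unmapped pages. That suggests, but does not prove, that the two crashes happen in the same libc routine.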
Version-Release number of selected component (if applicable):
1) glibc-2.14.90-12.i686
glibc-common-2.14.90-12.i686
2) Linux server1.example.com 3.1.0-0.rc9.git0.0.fc16.i686.PAE #1 SMP Wed Oct 5 15:51:55 UTC 2011 i686 i686 i386 GNU/Linux
3) openstack-swift-doc-1.4.0-2.fc16.noarch
openstack-swift-proxy-1.4.0-2.fc16.noarch
openstack-swift-container-1.4.0-2.fc16.noarch
openstack-swift-1.4.0-2.fc16.noarch
openstack-glance-2011.3-1.fc16.noarch
openstack-keystone-1.0-0.3.d4.1213.fc16.noarch
openstack-swift-auth-1.4.0-2.fc16.noarch
openstack-swift-account-1.4.0-2.fc16.noarch
openstack-nova-2011.3-3.fc16.noarch
openstack-swift-object-1.4.0-2.fc16.noarch
How reproducible:
every time
Steps to Reproduce:
1. for svc in api objectstore compute network volume scheduler; do sudo service openstack-nova-$svc start; done
Actual results:
Segfault
Reply at:
https://bugs.launchpad.net/ubuntu/+source/nova/+bug/840641/comments/4
------------------------------------------------------------------------
On 2011-10-17T18:28:31+00:00 Sergey wrote:
Service openstack-nova-volume is in running state
Service openstack-nova-network is in running state after "Create user, project and network" step from https://fedoraproject.org/wiki/QA:Testcase_create_OpenStack_user_project_and_network
$> sudo nova-manage user admin markmc
$> sudo nova-manage project create markmc markmc
$> sudo nova-manage network create markmc 10.0.0.0/24 1 256 --bridge=br0
Reply at:
https://bugs.launchpad.net/ubuntu/+source/nova/+bug/840641/comments/5
------------------------------------------------------------------------
On 2011-10-18T16:30:21+00:00 Mark wrote:
Thanks for the report Sergey
I wonder is this because you're on i686? I do my testing on x86_64
Could you try updating to
https://admin.fedoraproject.org/updates/FEDORA-2011-14504
If you still see the segfault, try installing glibc-debuginfo and see if
you can get a stack trace
http://fedoraproject.org/wiki/StackTraces
Reply at:
https://bugs.launchpad.net/ubuntu/+source/nova/+bug/840641/comments/6
------------------------------------------------------------------------
On 2011-10-18T16:31:37+00:00 Mark wrote:
*** Bug 746767 has been marked as a duplicate of this bug. ***
Reply at:
https://bugs.launchpad.net/ubuntu/+source/nova/+bug/840641/comments/7
------------------------------------------------------------------------
On 2011-10-18T19:14:39+00:00 Sergey wrote:
Thanks for the reply,
The latest update didn't help, so I tried to do some debugging.
glibc-common-2.14.90-12.999.i686
glibc-debuginfo-2.14.90-12.999.i686
glibc-2.14.90-12.999.i686
glibc-debuginfo-common-2.14.90-12.999.i686
I'm not strong with gdb, especially since we have Python code here. Here
are my steps; please tell me if I'm doing something wrong:
[serg@server1 glance]$ gdb --args python
GNU gdb (GDB) Fedora (7.3.50.20110722-9.fc16)
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/bin/python...Reading symbols from /usr/lib/debug/usr/bin/python.debug...done.
done.
(gdb) run /usr/bin/glance
Starting program: /usr/bin/python /usr/bin/glance index
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/libthread_db.so.1".
Detaching after fork from child process 2350.
Detaching after fork from child process 2352.
Program received signal SIGSEGV, Segmentation fault.
__memcpy_ssse3_rep () at ../sysdeps/i386/i686/multiarch/memcpy-ssse3-rep.S:158
158 movdqu (%eax), %xmm0
this is with nova:
[serg@server1 glance]$ sudo gdb --args python
GNU gdb (GDB) Fedora (7.3.50.20110722-9.fc16)
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/bin/python...Reading symbols from /usr/lib/debug/usr/bin/python.debug...done.
done.
(gdb) run /usr/bin/nova-api
Starting program: /usr/bin/python /usr/bin/nova-api
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/libthread_db.so.1".
Detaching after fork from child process 2605.
Detaching after fork from child process 2607.
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/eventlet/hubs/hub.py", line 336, in fire_timers
timer()
File "/usr/lib/python2.7/site-packages/eventlet/hubs/timer.py", line 56, in __call__
cb(*args, **kw)
SystemError: error return without exception set
Program received signal SIGSEGV, Segmentation fault.
__memcpy_ssse3_rep () at ../sysdeps/i386/i686/multiarch/memcpy-ssse3-rep.S:1169
1169 movzbl -1(%eax), %ecx
I found a lot of information about printing the stack and so on; please let me know what information you want and the best way to get it.
Thank you
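
One way to collect a full backtrace without typing into an interactive session is gdb's batch mode. The sketch below only builds the command line; the helper name gdb_backtrace_command and the default interpreter path are assumptions chosen to mirror the sessions above. The gdb options used (-batch, -ex, --args) and the "thread apply all bt full" command are standard gdb features.

```python
import shlex

# Illustrative helper (an assumption, not from the report): build a gdb
# command line that runs a Python script and dumps every thread's backtrace
# once the process stops, e.g. on SIGSEGV.
def gdb_backtrace_command(script, *script_args, interpreter="/usr/bin/python"):
    return [
        "gdb",
        "-batch",                           # exit after the commands finish
        "-ex", "run",                       # start the program
        "-ex", "thread apply all bt full",  # backtraces with local variables
        "--args", interpreter, script, *script_args,
    ]


if __name__ == "__main__":
    cmd = gdb_backtrace_command("/usr/bin/nova-api")
    # Print a copy-pasteable shell command; pass `cmd` to subprocess.run()
    # to execute it instead.
    print(" ".join(shlex.quote(part) for part in cmd))
```

Redirecting the output of the resulting command to a file gives something that can be attached to the bug directly.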
Reply at:
https://bugs.launchpad.net/ubuntu/+source/nova/+bug/840641/comments/8
------------------------------------------------------------------------
On 2011-10-18T19:25:13+00:00 Sergey wrote:
Created attachment 528877
I hope right debug info for starting nova-api
I hope I've made the right dump for nova-api starting service
Reply at:
https://bugs.launchpad.net/ubuntu/+source/nova/+bug/840641/comments/9
------------------------------------------------------------------------
On 2011-10-19T00:29:30+00:00 Pádraig wrote:
Hmm, memcpy. That reminds me of the infamous bug 638477
I wonder is something in python land doing a memcpy on overlapping regions?
I saw this crash in python (not related to nova) in the x86_64 F16 TC1 build.
Note that was using an earlier version of glibc (2.14.90-10).
I also noticed bug 737765 which may be related?
I've not noticed any issues with a fully updated system (glibc-2.14.90-12).
The last package update to libpython on my functioning system was Jul 8th,
so that is an unlikely source of the issue (given its ubiquity also).
Hmm I wonder is there some lib used by openstack using memcpy incorrectly.
This is a long shot, but this gives a hit:
readelf -Ws /usr/lib64/python2.7/site-packages/greenlet.so | grep memcpy
I had a quick look at the greenlet code and it seemed OK but I'm not sure.
BTW Sergey it would help if you could in gdb: thread apply all bt full
:) Scratch that, you've already attached that in comment 5.
Well would you look at that. greenlet.c
So it goes from there to python and back and then through some assembly
and finally to the memcpy in slp_restore_state().
I'd put a breakpoint on that function and step through,
to see if there were overlapping regions passed to memcpy.
If so I'd change both instances of memcpy in that file to memmove.
It could also be greenlet messing up the stack or heap or something.
For kicks here is the crashing line. That's one crazy complicated memcpy:
http://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/i386/i686/multiarch/memcpy-ssse3-rep.S;hb=HEAD#l1169
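
To illustrate the overlap check suggested above: memcpy has undefined behaviour when source and destination overlap, while memmove is specified to handle that case. The following is a minimal sketch in Python using ctypes, not greenlet's actual code; the helper regions_overlap is a hypothetical name introduced for the example.

```python
import ctypes


def regions_overlap(dst, src, n):
    """True if [dst, dst+n) and [src, src+n) share at least one byte."""
    return n > 0 and abs(dst - src) < n


# Demonstration buffer; create_string_buffer appends a trailing NUL byte.
buf = ctypes.create_string_buffer(b"abcdefgh")
base = ctypes.addressof(buf)

# Shift the first six bytes two positions to the right: the source
# (bytes 0..5) and destination (bytes 2..7) overlap by four bytes.
dst, src, n = base + 2, base, 6
assert regions_overlap(dst, src, n)

# memmove is specified to handle overlap; memcpy is not.
ctypes.memmove(dst, src, n)
print(buf.raw[:8])
```

The same predicate could be evaluated by hand at a gdb breakpoint on the memcpy call, using the destination, source and length arguments.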
Reply at:
https://bugs.launchpad.net/ubuntu/+source/nova/+bug/840641/comments/10
------------------------------------------------------------------------
On 2011-10-19T12:26:24+00:00 Mark wrote:
Wow, this is an interesting bug alright
It definitely looks like a greenlet bug to me. See this thread:
http://groups.google.com/group/gevent/browse_thread/thread/fee2097e2f3bae5e
Moving to greenlet. I'll see if I can isolate what the upstream fix was,
or whether we can just update to a newer version with the fix
Reply at:
https://bugs.launchpad.net/ubuntu/+source/nova/+bug/840641/comments/11
------------------------------------------------------------------------
On 2011-10-19T14:05:41+00:00 Mark wrote:
Okay, I asked the Ubuntu maintainer (Dave Walker) and he pointed me to
this:
https://bugs.launchpad.net/ubuntu/+source/nova/+bug/840641
and they used this patch to fix their issue:
https://bitbucket.org/ambroff/greenlet/changeset/2d5b17472757/raw/
certainly seems like the right area of code
Reply at:
https://bugs.launchpad.net/ubuntu/+source/nova/+bug/840641/comments/12
------------------------------------------------------------------------
On 2011-10-19T14:10:39+00:00 Mark wrote:
Also, as an aside - it looks like 0.3.1 is over 18 months old and
there's a bunch of unreleased stuff in hg upstream. Some anonymous
coward :) just asked about this yesterday:
https://bitbucket.org/ambroff/greenlet/issue/32/greenlet-release-cycle
Reply at:
https://bugs.launchpad.net/ubuntu/+source/nova/+bug/840641/comments/13
------------------------------------------------------------------------
On 2011-10-19T14:22:09+00:00 Pádraig wrote:
Oh patch in comment 8 looks promising. I was reviewing the latest
greenlet code last night rather than 0.3.1 :(
Reply at:
https://bugs.launchpad.net/ubuntu/+source/nova/+bug/840641/comments/14
------------------------------------------------------------------------
On 2011-10-19T15:20:45+00:00 Mark wrote:
btw - dmalcolm is looking into this now and can reproduce the issue on
i686
Reply at:
https://bugs.launchpad.net/ubuntu/+source/nova/+bug/840641/comments/15
------------------------------------------------------------------------
On 2011-10-19T15:21:56+00:00 Dave wrote:
Created attachment 529032
Patch to add %check section to python-greenlet.spec
Frame #2 of attachment 528877 is within slp_restore_state (), which is
heavily dependent on CPU architecture and compiler version (calling
conventions too, I believe).
I noticed that python-greenlet.spec doesn't have a %check section.
Attached is a patch to that file to add one, using upstream's test and
benchmarking suite, so that we get some automatic test coverage on
different architectures.
$ koji build --scratch f16 /home/david/coding/dist-git-new/python-greenlet/python-greenlet-0.3.1-5.fc17.src.rpm
Task info: http://koji.fedoraproject.org/koji/taskinfo?taskID=3444123
Upon running that, I see a segfault in the i686 build within the %check section:
http://koji.fedoraproject.org/koji/getfile?taskID=3444125&name=build.log
test_generator (tests.test_generator.GeneratorTests) ... /var/tmp/rpm-tmp.ZnHd7D: line 32: 24482 Segmentation fault /usr/bin/python setup.py test
I am bringing up an i686 test box to investigate further.
Reply at:
https://bugs.launchpad.net/ubuntu/+source/nova/+bug/840641/comments/16
------------------------------------------------------------------------
On 2011-10-19T15:53:31+00:00 Pádraig wrote:
So i686 is the common factor.
There have been recent changes there
https://bitbucket.org/ambroff/greenlet/history/platform/switch_x86_unix.h
Though x86_64 has fixes too
https://bitbucket.org/ambroff/greenlet/history/platform/switch_amd64_unix.h
Reply at:
https://bugs.launchpad.net/ubuntu/+source/nova/+bug/840641/comments/17
------------------------------------------------------------------------
On 2011-10-19T18:30:12+00:00 Dave wrote:
Created attachment 529078
Proposed changes to package
FWIW, I tried applying
https://bitbucket.org/ambroff/greenlet/changeset/2d5b17472757
to our build, and the test suite then ran to completion on both architectures:
http://koji.fedoraproject.org/koji/taskinfo?taskID=3444511
I'm attaching a patch to git python-greenlet, which adds that patch
(slightly fixed up to apply cleanly) and modifies the specfile to apply
it, and run the upstream test suite.
Reply at:
https://bugs.launchpad.net/ubuntu/+source/nova/+bug/840641/comments/18
------------------------------------------------------------------------
On 2011-10-19T20:48:54+00:00 Pádraig wrote:
*** Bug 746330 has been marked as a duplicate of this bug. ***
Reply at:
https://bugs.launchpad.net/ubuntu/+source/nova/+bug/840641/comments/19
------------------------------------------------------------------------
On 2011-10-24T15:59:13+00:00 Pádraig wrote:
Created attachment 529915
upstream i686 assembly fix
As I suspected, the i686 assembly patch from upstream also fixes the
issue (independently of the other patch). I've included the previous
upstream patch too, as that has been tested extensively on other
systems.
Note ppc64 is crashing with both of the above.
I applied the upstream ppc_linux.asm file too to no avail.
So for the moment I've done ExcludeArch ppc64 in the spec file.
This will bork dependencies for ppc64 though right?
Builds of the attached patch are available here:
http://koji.fedoraproject.org/koji/taskinfo?taskID=3456678
Reply at:
https://bugs.launchpad.net/ubuntu/+source/nova/+bug/840641/comments/20
------------------------------------------------------------------------
On 2011-10-24T17:39:32+00:00 Pádraig wrote:
Created attachment 529934
upstream i686 assembly fix 2
I've updated the patch to build on ppc64 and just exclude the checks.
http://koji.fedoraproject.org/koji/taskinfo?taskID=3456815
You can apply this patch with `git am`
Note I've not got commit access so can't apply this.
Note F16 submission closes today :(
Reply at:
https://bugs.launchpad.net/ubuntu/+source/nova/+bug/840641/comments/21
------------------------------------------------------------------------
On 2011-10-30T16:55:01+00:00 TR wrote:
Any news on when this will be pushed to F16 testing?
Thanks
Reply at:
https://bugs.launchpad.net/ubuntu/+source/nova/+bug/840641/comments/22
------------------------------------------------------------------------
On 2011-10-30T18:16:38+00:00 Pádraig wrote:
Well I've not received commit access yet, but I expect the resultant
build to be the same as that in comment 17 which you can install/test
directly. I'll escalate getting commit access next week
Reply at:
https://bugs.launchpad.net/ubuntu/+source/nova/+bug/840641/comments/23
------------------------------------------------------------------------
On 2011-10-30T18:32:43+00:00 Kevin wrote:
Sorry, I missed the commit request. I've approved it in pkgdb...
Reply at:
https://bugs.launchpad.net/ubuntu/+source/nova/+bug/840641/comments/24
------------------------------------------------------------------------
On 2011-11-01T02:56:55+00:00 Fedora wrote:
python-greenlet-0.3.1-6.fc15 has been submitted as an update for Fedora 15.
https://admin.fedoraproject.org/updates/python-greenlet-0.3.1-6.fc15
Reply at:
https://bugs.launchpad.net/ubuntu/+source/nova/+bug/840641/comments/25
------------------------------------------------------------------------
On 2011-11-01T02:57:04+00:00 Fedora wrote:
python-greenlet-0.3.1-6.el6 has been submitted as an update for Fedora EPEL 6.
https://admin.fedoraproject.org/updates/python-greenlet-0.3.1-6.el6
Reply at:
https://bugs.launchpad.net/ubuntu/+source/nova/+bug/840641/comments/26
------------------------------------------------------------------------
On 2011-11-01T02:57:12+00:00 Fedora wrote:
python-greenlet-0.3.1-6.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/python-greenlet-0.3.1-6.fc16
Reply at:
https://bugs.launchpad.net/ubuntu/+source/nova/+bug/840641/comments/27
------------------------------------------------------------------------
On 2011-11-02T00:04:53+00:00 Fedora wrote:
Package python-greenlet-0.3.1-6.el6:
* should fix your issue,
* was pushed to the Fedora EPEL 6 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=epel-testing python-greenlet-0.3.1-6.el6'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-EPEL-2011-4822
then log in and leave karma (feedback).
Reply at:
https://bugs.launchpad.net/ubuntu/+source/nova/+bug/840641/comments/28
------------------------------------------------------------------------
On 2011-11-12T03:24:33+00:00 Fedora wrote:
python-greenlet-0.3.1-6.fc16 has been pushed to the Fedora 16 stable
repository. If problems still persist, please make note of it in this
bug report.
Reply at:
https://bugs.launchpad.net/ubuntu/+source/nova/+bug/840641/comments/29
------------------------------------------------------------------------
On 2011-11-25T02:03:53+00:00 Fedora wrote:
python-greenlet-0.3.1-6.el6 has been pushed to the Fedora EPEL 6 stable
repository. If problems still persist, please make note of it in this
bug report.
Reply at:
https://bugs.launchpad.net/ubuntu/+source/nova/+bug/840641/comments/30
------------------------------------------------------------------------
On 2011-11-25T02:04:00+00:00 Fedora wrote:
python-greenlet-0.3.1-6.fc15 has been pushed to the Fedora 15 stable
repository. If problems still persist, please make note of it in this
bug report.
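
To check whether an installed build is at least the fixed one named in these update notices (python-greenlet-0.3.1-6), the name-version-release string can be compared programmatically. This is a minimal sketch under stated assumptions: the helper names parse_nvr and has_greenlet_fix are hypothetical, and it assumes that any 0.3.1 release numbered 6 or higher still carries the patches.

```python
def parse_nvr(nvr):
    """Split an RPM name-version-release string into its three parts."""
    name, version, release = nvr.rsplit("-", 2)
    return name, version, release


def has_greenlet_fix(nvr):
    """Assumption: for python-greenlet 0.3.1, release 6 or later carries the fix."""
    name, version, release = parse_nvr(nvr)
    if name != "python-greenlet" or version != "0.3.1":
        raise ValueError("this check only knows about python-greenlet 0.3.1")
    # The release looks like "6.fc16"; the leading integer is the build number.
    return int(release.split(".", 1)[0]) >= 6
```

The input string can be produced with rpm -q --qf '%{NAME}-%{VERSION}-%{RELEASE}\n' python-greenlet, which omits the architecture suffix that a plain rpm -q may append.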
Reply at:
https://bugs.launchpad.net/ubuntu/+source/nova/+bug/840641/comments/31
** Changed in: nova (Fedora)
Status: Unknown => Fix Released
** Changed in: nova (Fedora)
Importance: Unknown => Undecided
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to nova in Ubuntu.
https://bugs.launchpad.net/bugs/840641
Title:
nova ftbfs (Sphinx? Segmentation fault)
Status in nova package in Ubuntu:
Invalid
Status in python-greenlet package in Ubuntu:
Fix Released
Status in nova source package in Oneiric:
Invalid
Status in python-greenlet source package in Oneiric:
Fix Released
Status in nova package in Fedora:
Fix Released
Bug description:
Can be reproduced in sbuild but not pbuilder.
creating doc/build/doctrees
creating doc/build/html
Running Sphinx v1.0.7
make[1]: *** [override_dh_auto_build] Segmentation fault
make[1]: Leaving directory `/build/buildd/nova-2011.3~rc~20110901.1523'
make: *** [build] Error 2
dpkg-buildpackage: error: debian/rules build gave error exit status 2
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/nova/+bug/840641/+subscriptions