[Bug 1518430] Re: liberty: ~busy loop on epoll_wait being called with zero timeout

OpenStack Infra 1518430 at bugs.launchpad.net
Fri Nov 11 03:12:34 UTC 2016


Reviewed:  https://review.openstack.org/394963
Committed: https://git.openstack.org/cgit/openstack/oslo.messaging/commit/?id=f4bf955879747338b0ab7a6ca41d0dea74508353
Submitter: Jenkins
Branch:    stable/newton

commit f4bf955879747338b0ab7a6ca41d0dea74508353
Author: John Eckersberg <jeckersb at redhat.com>
Date:   Fri Oct 14 11:02:47 2016 -0400

    rabbit: Avoid busy loop on epoll_wait with heartbeat+eventlet
    
    Calling threading.Event.wait() when using eventlet results in a busy
    loop calling epoll_wait, because the Python 2.x
    threading.Condition.wait() implementation busy-waits by calling
    sleep() with very small values (0.0005..0.05s).  Because sleep() is
    monkey-patched by eventlet, this results in many very short timers
    being added to the eventlet hub, and forces eventlet to constantly
    epoll_wait looking for new data unnecessarily.
    
    This utilizes a new Event from eventletutils which conditionalizes the
    event primitive depending on whether or not eventlet is being used.
    If it is, eventlet.event.Event is used instead of threading.Event.
    The eventlet.event.Event implementation does not suffer from the same
    busy-wait sleep problem.  If eventlet is not used, the previous
    behavior is retained.
    
    For Newton backport, this bundles the Event from eventletutils
    directly in oslo.messaging under the _utils module.  It is taken from:
    
    https://review.openstack.org/#/c/389739/
    
    combined with the followup fix:
    
    https://review.openstack.org/#/c/394460/
    
    Change-Id: I5c211092d282e724d1c87ce4d06b6c44b592e764
    Depends-On: Id33c9f8c17102ba1fe24c12b053c336b6d265501
    Closes-bug: #1518430
    (cherry picked from commit a6c193f3eba62cdcbfe04d0fa93e95352bcfb1c3)
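
For illustration, here is a minimal sketch of the conditional Event
approach the commit describes (names are simplified here; the real
helper lives in oslo.utils' eventletutils, and the followup fix above
additionally handles waking waiters when clear() is called):

    import threading

    import eventlet.event
    import eventlet.patcher
    import eventlet.timeout


    class _EventletEvent(object):
        """threading.Event-like wrapper over eventlet.event.Event.

        eventlet's event blocks on the hub instead of polling with
        short sleeps, so waiters do not spin in epoll_wait.
        """

        def __init__(self):
            self.clear()

        def clear(self):
            # eventlet events are one-shot; swap in a fresh one to "reset".
            self._set = False
            self._event = eventlet.event.Event()

        def is_set(self):
            return self._set

        def set(self):
            if not self._set:  # send() may only be called once
                self._set = True
                self._event.send(True)

        def wait(self, timeout=None):
            # Timeout(..., False) is silent: wait() simply returns when
            # it fires instead of raising.
            with eventlet.timeout.Timeout(timeout, False):
                self._event.wait()
            return self.is_set()


    def Event():
        # Match the primitive to the runtime: eventlet's event when the
        # thread module is monkey-patched, stock threading.Event otherwise.
        if eventlet.patcher.is_monkey_patched('thread'):
            return _EventletEvent()
        return threading.Event()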


** Changed in: cloud-archive/newton
       Status: New => Fix Committed

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to Ubuntu Cloud Archive.
https://bugs.launchpad.net/bugs/1518430

Title:
  liberty: ~busy loop on epoll_wait being called with zero timeout

Status in Ubuntu Cloud Archive:
  Fix Committed
Status in Ubuntu Cloud Archive icehouse series:
  New
Status in Ubuntu Cloud Archive liberty series:
  New
Status in Ubuntu Cloud Archive mitaka series:
  New
Status in Ubuntu Cloud Archive newton series:
  Fix Committed
Status in OpenStack Compute (nova):
  Invalid
Status in oslo.messaging:
  Fix Released
Status in nova package in Ubuntu:
  Invalid
Status in python-oslo.messaging package in Ubuntu:
  New
Status in python-oslo.messaging source package in Trusty:
  New
Status in python-oslo.messaging source package in Xenial:
  New
Status in python-oslo.messaging source package in Yakkety:
  New
Status in python-oslo.messaging source package in Zesty:
  New

Bug description:
  Context: OpenStack juju/maas deploy using the 15.10 charms release
  on trusty, with:
    openstack-origin: "cloud:trusty-liberty"
    source: "cloud:trusty-updates/liberty"

  * Several OpenStack nova- and neutron- services, at least
  nova-compute, neutron-server, nova-conductor,
  neutron-openvswitch-agent and neutron-vpn-agent, show near-busy
  looping on epoll_wait() calls, most frequently with a zero timeout.
  - strace and ltrace captures from nova-compute (chosen because it
    runs as a single process):
    http://paste.ubuntu.com/13371248/ (ltrace, strace)

  For comparison, this is how it looks on a kilo deploy:
  - http://paste.ubuntu.com/13371635/

  * 'top' sample from a nova-cloud-controller unit of
     this completely idle stack:
    http://paste.ubuntu.com/13371809/

  FYI *not* seeing this behavior on keystone, glance, cinder,
  ceilometer-api.

  As this issue is present in several components, it likely comes
  from a common library (oslo.concurrency?); FYI, the bug was filed
  against nova itself as a starting point for debugging.

  Note: The description in the following bug gives a good overview of
  the issue and points to a possible fix for oslo.messaging:
  https://bugs.launchpad.net/mos/+bug/1380220
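
  For reference, a minimal script that reproduces the pattern described
  above (assuming Python 2 with eventlet installed; run it under
  strace -f -e trace=epoll_wait to watch the stream of short-timeout
  polls):

    import eventlet
    eventlet.monkey_patch()

    import threading

    # On Python 2, threading.Event.wait() sits on Condition.wait(),
    # which polls in tiny sleep() increments (0.0005..0.05s).  With
    # sleep() monkey-patched, each increment becomes a short eventlet
    # timer, so the hub keeps calling epoll_wait with near-zero timeouts.
    ev = threading.Event()
    ev.wait(10)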

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1518430/+subscriptions


