[Bug 1448650] Re: rpc.server do not consume messages after message acknowledge failure

James Page james.page at ubuntu.com
Thu Jun 25 08:50:54 UTC 2015


** Also affects: oslo.messaging (Ubuntu Trusty)
   Importance: Undecided
       Status: New

** Also affects: oslo.messaging (Ubuntu Wily)
   Importance: Undecided
       Status: New

** Also affects: oslo.messaging (Ubuntu Vivid)
   Importance: Undecided
       Status: New

** Changed in: oslo.messaging (Ubuntu Wily)
       Status: New => Fix Released

** Changed in: oslo.messaging (Ubuntu Vivid)
   Importance: Undecided => High

** Changed in: oslo.messaging (Ubuntu Trusty)
   Importance: Undecided => High

** Changed in: oslo.messaging (Ubuntu Wily)
   Importance: Undecided => High

-- 
You received this bug notification because you are a member of Ubuntu
Sponsors Team, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/1448650

Title:
  rpc.server do not consume messages after message acknowledge failure

Status in Messaging API for OpenStack:
  Fix Released
Status in oslo.messaging package in Ubuntu:
  Fix Released
Status in oslo.messaging source package in Trusty:
  New
Status in oslo.messaging source package in Vivid:
  New
Status in oslo.messaging source package in Wily:
  Fix Released

Bug description:
  def start(self):

      @excutils.forever_retry_uncaught_exceptions
      def _executor_thread():
          try:
              while self._running:
                  incoming = self.listener.poll()
                  if incoming is not None:
                      self._dispatch(incoming)
          except greenlet.GreenletExit:
              return

  Class Connection does a lot of work to ensure that operations on a connection can be recovered after a reconnection. However, after the incoming message has been received, a connection error can be raised during message acknowledgement and caught by excutils.forever_retry_uncaught_exceptions. At that point do_consume is False, which means the connection calls drain_events without "registering" the consumers on the queues again. kombu.Connection.drain_events then establishes a new connection instead of raising a connection error.
  The relevant kombu code is listed below.
  def drain_events(self, **kwargs):
      return self.transport.drain_events(self.connection, **kwargs)

  @property
  def connection(self):
      if not self._closed:
          if not self.connected:
              self.declared_entities.clear()
              self._default_channel = None
              self._connection = self._establish_connection()
              self._closed = False
          return self._connection
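
  The effect of a connection property that reconnects on access can be
  illustrated with a toy model. This is only a sketch: ToyBroker and
  LazyConnection are hypothetical stand-ins invented for illustration and
  are not kombu or oslo.messaging classes. It assumes that consumers
  registered on one link are not carried over to a new link.

```python
# Toy model only: these classes are hypothetical, not kombu's API.
class ToyBroker:
    """Tracks which consumers are registered on the current link."""

    def __init__(self):
        self.link_id = 0
        self.consumers = set()

    def new_link(self):
        # A fresh link starts with no consumers registered on it.
        self.link_id += 1
        self.consumers = set()
        return self.link_id


class LazyConnection:
    """Mimics a connection property that reconnects when accessed."""

    def __init__(self, broker):
        self.broker = broker
        self._link = None

    @property
    def connected(self):
        return self._link is not None

    @property
    def connection(self):
        if not self.connected:
            # Silent reconnect: the caller never sees an error.
            self._link = self.broker.new_link()
        return self._link

    def register_consumer(self, queue):
        _ = self.connection  # make sure a link exists
        self.broker.consumers.add(queue)

    def drop(self):
        # Simulate the link failing, e.g. during an acknowledgement.
        self._link = None

    def drain_events(self):
        # Touching .connection re-establishes the link without raising.
        _ = self.connection
        return sorted(self.broker.consumers)


broker = ToyBroker()
conn = LazyConnection(broker)
conn.register_consumer("compute")
before = conn.drain_events()  # consumer is registered on link 1
conn.drop()
after = conn.drain_events()   # link 2 exists, but nothing is registered
print(before, after)
```

  In this model the second drain_events call succeeds and the connection
  reports itself as connected, yet no consumer is registered, which
  matches the symptom of a server that no longer receives messages.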

  ---------------------------

  [Impact]

  This patch addresses an issue where the underlying kombu library
  disconnects from the rabbitmq-servers, which prevents oslo.messaging
  from properly going through the reconnect sequence, including the
  recreation of expected queues. This causes messages to be lost and
  leaves the cloud generally dysfunctional until services are restarted.
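
  One way to keep the reconnect sequence from being skipped is to check
  the connection state before draining events and to raise an error that
  the retry logic treats as recoverable. The following is a minimal
  sketch under that assumption, not the actual patch: ToyConnection,
  ConnectionLost and consume are hypothetical names used only here.

```python
class ConnectionLost(Exception):
    """Hypothetical recoverable error type for this sketch."""


class ToyConnection:
    """Hypothetical connection whose drain_events reconnects silently."""

    def __init__(self):
        self.connected = False
        self.consumers = []
        self.reconnects = 0

    def reconnect(self, queues):
        # Full reconnect sequence: new link plus consumer registration.
        self.connected = True
        self.reconnects += 1
        self.consumers = list(queues)

    def drain_events(self):
        if not self.connected:
            # Silent reconnect: the link is back, the consumers are not.
            self.connected = True
            self.consumers = []
        return list(self.consumers)


def consume(conn, queues):
    """Drain events, forcing the full reconnect path if the link is down."""
    try:
        if not conn.connected:
            # Surface the lost link instead of letting it heal silently.
            raise ConnectionLost("link is down")
        return conn.drain_events()
    except ConnectionLost:
        conn.reconnect(queues)
        return conn.drain_events()


# Without the guard: the consumers are lost after a drop.
unguarded = ToyConnection()
unguarded.reconnect(["compute"])
unguarded.connected = False
unguarded_result = unguarded.drain_events()

# With the guard: the consumers are registered again.
guarded = ToyConnection()
guarded.reconnect(["compute"])
guarded.connected = False
guarded_result = consume(guarded, ["compute"])
print(unguarded_result, guarded_result)
```

  The trade-off in this sketch is that any false "not connected" report
  triggers a full reconnect, which is the same risk noted under
  [Regression Potential].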

  [Test Case]

  Note: these steps are for trusty-icehouse, including the latest
  oslo.messaging library (1.3.0-0ubuntu1.1 at the time of this writing).

  Deploy an OpenStack cloud w/ multiple rabbit nodes and then abruptly
  kill one of the rabbit nodes (e.g. force panic, etc). Observe that the
  nova services do detect that the node went down and report that they
  are reconnected, but messages are still reporting as timed out, nova
  service-list still reports compute nodes as down, etc.

  [Regression Potential]

  There is the possibility that there will be more reconnect attempts
  from the oslo.messaging library if there is a false positive in the
  underlying kombu connection reported as disconnected. This should be
  unlikely since this is bringing the oslo.messaging code into sync with
  the underlying library, but it is a possibility.

  [Other Info]

  The attempt to drive reconnection logic was fixed in a recent SRU of
  oslo.messaging (version 1.3.0-0ubuntu1.1). This is an additional fix
  that is required in order to allow the oslo.messaging library to not
  go into a zombie-fied connection state.

To manage notifications about this bug go to:
https://bugs.launchpad.net/oslo.messaging/+bug/1448650/+subscriptions


