[Bug 1521958] Re: rabbit: starvation of connections for reply

Wed Apr 20 13:28:39 UTC 2016

This bug was fixed in the package python-oslo.messaging - 2.5.0-1ubuntu2~cloud0
---------------

 python-oslo.messaging (2.5.0-1ubuntu2~cloud0) trusty-liberty; urgency=medium
 .
   * New update for the Ubuntu Cloud Archive.
 .
 python-oslo.messaging (2.5.0-1ubuntu2) wily; urgency=medium
 .
   [ Jorge Niedbalski ]
   * d/p/make-reply-and-fanout-queues-expire-instead-of-auto-delete.patch:
     Make reply and fanout queues expire instead of auto-delete (LP: #1515278).
 .
   [ Corey Bryant ]
   * d/p/dont-hold-connection-when-reply-fail.patch: Cherry-picked
     patch from upstream VCS to fix the amqp reply logic when
     connections are lost (LP: #1521958).

** Changed in: cloud-archive/liberty
       Status: Fix Committed => Fix Released

https://bugs.launchpad.net/bugs/1521958

Title:
  rabbit: starvation of connections for reply

Status in Ubuntu Cloud Archive:
  Invalid
Status in Ubuntu Cloud Archive juno series:
  New
Status in Ubuntu Cloud Archive liberty series:
  Fix Released
Status in oslo.messaging:
  Fix Released
Status in python-oslo.messaging package in Ubuntu:
  Invalid
Status in oslo.messaging source package in Trusty:
  New
Status in oslo.messaging source package in Vivid:
  Won't Fix
Status in python-oslo.messaging source package in Wily:
  Fix Released

Bug description:
  Hi,

  When a client dies, restarts, or stops while it is waiting for more replies than the rpc_connection_pool size,
  the server holds all connections from the pool during the retry logic, in case the client comes back with the same reply_queue_id (which happens only if rabbit is restarted, not the client).

  Cheers,

  ---------------------------

  [Impact]

  This patch addresses an issue where, when multiple clients lose
  network connectivity, die, restart, or stop, the server holds all
  connections from the pool (rpc_connection_pool size) during the retry
  logic, in case a client comes back with the same reply_queue_id (which
  happens only if rabbit is restarted, not the client). As a result,
  nova-conductor reconnects to rabbit indefinitely when a large number
  of nova-compute hosts are deployed, until all of the connections held
  for the old reply messages have expired. For a large-scale cloud, this
  breaks high availability.
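
  The starvation described above can be illustrated with a small,
  self-contained simulation. This is only a sketch and not the
  oslo.messaging code: ConnectionPool, reply_holding, reply_releasing and
  can_serve_new_request are hypothetical names, the retry loop is reduced
  to a sleep, and the "releasing" variant is an assumption about the
  general shape of the fix (do not hold a pooled connection while waiting
  to retry a failed reply).

```python
import queue
import threading
import time


class ConnectionPool:
    """Hypothetical bounded pool standing in for the RPC connection pool."""

    def __init__(self, size):
        self._free = queue.Queue()
        for i in range(size):
            self._free.put(f"conn-{i}")

    def acquire(self, timeout=None):
        # Raises queue.Empty if no connection becomes free in time.
        return self._free.get(timeout=timeout)

    def release(self, conn):
        self._free.put(conn)


def reply_holding(pool, retries, delay):
    """Problem pattern: keep the connection for the whole retry loop."""
    conn = pool.acquire()
    try:
        for _ in range(retries):
            time.sleep(delay)  # reply queue is missing; wait, then retry
    finally:
        pool.release(conn)


def reply_releasing(pool, retries, delay):
    """Assumed fix pattern: return the connection between attempts."""
    for _ in range(retries):
        conn = pool.acquire()
        pool.release(conn)  # attempt failed; release before waiting
        time.sleep(delay)


def can_serve_new_request(reply_fn, pool_size=2, dead_clients=2):
    """Return True if a new request can get a connection while replies retry."""
    pool = ConnectionPool(pool_size)
    workers = [
        threading.Thread(target=reply_fn, args=(pool, 3, 0.2))
        for _ in range(dead_clients)
    ]
    for w in workers:
        w.start()
    time.sleep(0.1)  # give the reply workers time to take connections
    try:
        conn = pool.acquire(timeout=0.2)
        pool.release(conn)
        served = True
    except queue.Empty:
        served = False
    for w in workers:
        w.join()
    return served


if __name__ == "__main__":
    print("holding:  ", can_serve_new_request(reply_holding))
    print("releasing:", can_serve_new_request(reply_releasing))
```

  With as many pending replies to dead clients as there are pooled
  connections, the holding variant leaves nothing for new requests until
  the retries finish, while the releasing variant keeps the pool usable.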

  [Test Case]

  Note: these steps are for trusty-icehouse, including the latest
  oslo.messaging library (1.3.0-0ubuntu1.2 at the time of this writing).

  Deploy an OpenStack cloud with multiple rabbit nodes and multiple nova
  compute hosts, then cut off the network between the OpenStack services
  and RabbitMQ. Observe that nova-conductor reconnects to the rabbit
  nodes indefinitely.

  [Regression Potential]

  None.
