[Bug 1391784] Re: HA failure when no IP address is bound to the VIP interface

James Page james.page at ubuntu.com
Thu Feb 23 19:24:54 UTC 2017

** Changed in: charm-swift-proxy
   Importance: Undecided => High

** Changed in: charm-swift-proxy
       Status: New => Triaged

** Changed in: charm-swift-proxy
     Assignee: (unassigned) => James Page (james-page)

** Changed in: swift-proxy (Juju Charms Collection)
       Status: Triaged => Invalid

You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to cinder in Juju Charms Collection.
Matching subscriptions: charm-bugs

  HA failure when no IP address is bound to the VIP interface

Status in OpenStack swift-proxy charm:
  Triaged
Status in cinder package in Juju Charms Collection:
  Fix Released
Status in glance package in Juju Charms Collection:
  Fix Released
Status in keystone package in Juju Charms Collection:
  Fix Released
Status in neutron-api package in Juju Charms Collection:
  Fix Released
Status in nova-cloud-controller package in Juju Charms Collection:
  Fix Released
Status in openstack-dashboard package in Juju Charms Collection:
  Fix Released
Status in percona-cluster package in Juju Charms Collection:
Status in swift-proxy package in Juju Charms Collection:
  Invalid

Bug description:
  Proxying from juju ML:

  We've been working on setting up an OpenStack cluster on Trusty for a
  few months now using Juju and MAAS, although we've yet to go into
  production. I had everything working fine, including HA deployments of
  Keystone, Glance, Percona, etc.

  The older versions of the charms supported HA using the config
  settings vip, vip_cidr and vip_iface. Without me making any
  modifications to these charms, I successfully deployed all of the
  above charms with the bog-standard hacluster charm.
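
  Roughly, those old options fed straight into the resources handed to
  hacluster; a minimal sketch of how an ha-relation-joined hook would
  consume them (assuming charm-helpers; the resource names here are
  illustrative, not the exact charm code):

      # Sketch of the old-style ha-relation-joined handling (illustrative).
      from charmhelpers.core.hookenv import config, relation_set

      def ha_joined():
          # The three options the older charms exposed: the VIP itself,
          # its netmask length, and the NIC Pacemaker should bind it to.
          vip = config('vip')              # e.g. 203.0.113.10
          vip_cidr = config('vip_cidr')    # e.g. 24
          vip_iface = config('vip_iface')  # e.g. eth0

          # Hand Pacemaker an IPaddr2 resource built directly from the
          # config; no inspection of locally bound addresses is needed.
          resources = {'res_ks_vip': 'ocf:heartbeat:IPaddr2'}
          resource_params = {
              'res_ks_vip': 'params ip="%s" cidr_netmask="%s" nic="%s"'
                            % (vip, vip_cidr, vip_iface),
          }
          relation_set(resources=resources,
                       resource_params=resource_params)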

  Over the weekend I've been updating to Juno, and I naturally updated
  to the latest stable charms from the Charm Store. Breaking changes
  have been introduced to these charms such that they no longer support
  my deployment. My OpenStack cluster promptly broke in a nasty way. I'm
  *really* glad this isn't a production environment, but these kinds of
  non-backward-compatible breakages do give me cause for concern going
  forward.

  To explain how this broke, I'll first need to explain how our network
  was deployed:

      In order to not burn through many public IPs, we assign RFC1918 IPs to *every server* by DHCP.
      We run at least two instances of critical services.
      Public IPs are assigned primarily by Pacemaker.
      Public and Private subnets coexist on a single Layer-2 network.
      Nodes that do not directly participate in the Public subnet still have direct access (not via a router) to the Public IPs courtesy of the DHCP option (rfc3442-classless-static-routes). It turns out that Linux hosts in different subnets can directly communicate with one another on the same layer-2 network without the need of a router.

  This set-up was highly efficient in terms of consumption of valuable
  public IP addresses, without forcing inter-subnet communications via
  an unnecessary hop. The only trick that we had to pull off was getting
  the DHCP server to give out the rfc3442-classless-static-routes, which
  was simple.

  The old OpenStack charms with their simple vip, vip_cidr and vip_iface
  options worked perfectly with this set-up. The new charms cannot
  support this at all, as they have become, in my view, "too clever".
  They now insist that the vip can only be bound to an interface that
  already has an IP in the same subnet.
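
  Concretely, the address-matching logic that trips this up is along the
  following lines (a rough sketch using charm-helpers'
  get_iface_for_address(); the real charm code differs in detail):

      # Sketch of the newer "find a matching interface" behaviour.
      # get_iface_for_address() returns the local interface that already
      # holds an address in the same subnet as the given address, or None.
      from charmhelpers.contrib.network.ip import (
          get_iface_for_address,
          get_netmask_for_address,
      )

      def vip_resource_params(vip):
          iface = get_iface_for_address(vip)
          if iface is None:
              # This is the case described above: no interface carries a
              # public IP, so the hook cannot decide where to bind the VIP.
              raise Exception('No local interface in the subnet of %s' % vip)
          netmask = get_netmask_for_address(vip)
          return ('params ip="%s" cidr_netmask="%s" nic="%s"'
                  % (vip, netmask, iface))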

  If I have to bind public IPs to every server (IPs that they will never
  use) just in order to have Pacemaker assign the vip, I'll burn through
  a lot of IPs in the most pointless way imaginable.

  I've modified the keystone and openstack-dashboard charms to re-
  introduce the old functionality in a way that doesn't break the new
  multiple-IP functionality. I'll paste my keystone patch below to give
  you an idea of what I think is needed. This hasn't been thoroughly
  tested, but it seems to work. Pacemaker can at least set the public IP
  address again.
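
  In outline, the kind of fallback described would look roughly like this
  (a hypothetical sketch, not the actual patch; the helper names come
  from charm-helpers and the structure is simplified):

      # Illustrative sketch only: prefer the new subnet-matching lookup,
      # but fall back to the old vip_iface/vip_cidr options when no local
      # interface sits in the VIP's subnet.
      from charmhelpers.contrib.network.ip import (
          get_iface_for_address,
          get_netmask_for_address,
      )
      from charmhelpers.core.hookenv import config

      def vip_params(vip):
          iface = get_iface_for_address(vip)
          netmask = get_netmask_for_address(vip)
          if iface is None:
              # No public address bound locally: trust the explicit
              # configuration instead, as the older charms did.
              iface = config('vip_iface')
              netmask = config('vip_cidr')
          return ('params ip="%s" cidr_netmask="%s" nic="%s"'
                  % (vip, netmask, iface))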

  If there is some other (better) way to achieve the same level of IP
  address allocation efficiency and performance without patching the
  OpenStack charms, please point me in the right direction.


To manage notifications about this bug go to:

More information about the Ubuntu-openstack-bugs mailing list