[Bug 1827690] Re: [19.04][stein] barbican-worker is down: Requested revision 1a0c2cdafb38 overlaps with other requested revisions 39cf2e645cba

Márton Kiss 1827690 at bugs.launchpad.net
Thu Oct 8 10:07:24 UTC 2020


After additional debugging and research I can confirm that the issue is
the result of a race condition: multiple barbican-worker units try to
populate the database with schema data and upgrade the alembic version
simultaneously.

As a result the barbican-worker service will stop because it is trying
to recreate tables already created by other worker units.
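
The failure mode described above is a classic check-then-create race. A
minimal sketch of it follows, assuming an in-memory SQLite database as a
stand-in for the real MySQL backend and a hypothetical table name
"secrets". It is not Barbican's actual code; it only forces the
interleaving in which both workers check before either one creates.

```python
import sqlite3


def table_exists(conn, name):
    """Return True if a table with this name is already in the database."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type='table' AND name=?", (name,)
    ).fetchone()
    return row is not None


def simulate_race():
    """Replay the bad interleaving: both workers check, then both create."""
    conn = sqlite3.connect(":memory:")
    # Step 1: both workers look for the table before either has created it.
    seen = {
        "worker-a": table_exists(conn, "secrets"),
        "worker-b": table_exists(conn, "secrets"),
    }
    outcomes = []
    # Step 2: each worker acts on its (now stale) observation.
    for worker, sees_table in seen.items():
        if sees_table:
            outcomes.append((worker, "skipped"))
            continue
        try:
            conn.execute("CREATE TABLE secrets (id INTEGER PRIMARY KEY)")
            outcomes.append((worker, "created"))
        except sqlite3.OperationalError as exc:
            # The second worker fails because the table now exists.
            outcomes.append((worker, "failed: %s" % exc))
    conn.close()
    return outcomes


if __name__ == "__main__":
    for worker, outcome in simulate_race():
        print(worker, outcome)
```

The first worker succeeds and the second one fails on the duplicate
CREATE TABLE, which matches the symptom of a worker exiting shortly
after start.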

I applied the following manual steps as a workaround:

1, stop all barbican-workers and the jujud agent
2, drop the db tables
3, clean up the alembic db change state files, if present: rm -rf /usr/lib/python3/dist-packages/barbican/model/migration/alembic_migrations/versions/*_db_change.py
4, start barbican-worker on a single unit (this populates the db tables up to the head revision)
5, restart all barbican-workers and start the jujud agent

As a result, all units should be on the same db revision:
$ juju run --application barbican 'barbican-db-manage current -V | grep Revision'
- Stdout: |2
        Revision ID: 39cf2e645cba
  UnitId: barbican/0
- Stdout: |2
        Revision ID: 39cf2e645cba
  UnitId: barbican/1
- Stdout: |2
        Revision ID: 39cf2e645cba
  UnitId: barbican/2
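
The per-unit check above could be automated along these lines. This is
only a sketch: it assumes the juju output keeps the shape shown above
(a "Revision ID:" line followed by the matching "UnitId:" line), and the
function names are made up for illustration.

```python
import re

SAMPLE_OUTPUT = """\
- Stdout: |2
        Revision ID: 39cf2e645cba
  UnitId: barbican/0
- Stdout: |2
        Revision ID: 39cf2e645cba
  UnitId: barbican/1
- Stdout: |2
        Revision ID: 39cf2e645cba
  UnitId: barbican/2
"""


def revisions_by_unit(juju_output):
    """Map each unit name to the list of revision IDs printed before it."""
    revisions = {}
    pending = []
    for line in juju_output.splitlines():
        rev = re.search(r"Revision ID:\s*([0-9a-f]+)", line)
        unit = re.search(r"UnitId:\s*(\S+)", line)
        if rev:
            pending.append(rev.group(1))
        elif unit:
            revisions[unit.group(1)] = pending
            pending = []
    return revisions


def all_on_same_revision(revisions):
    """True only if every unit reports exactly one, identical revision."""
    distinct = {tuple(revs) for revs in revisions.values()}
    return len(distinct) == 1 and len(next(iter(distinct))) == 1


if __name__ == "__main__":
    found = revisions_by_unit(SAMPLE_OUTPUT)
    print(found)
    print("consistent:", all_on_same_revision(found))
```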

The root cause of the problem is the default behaviour of the
barbican-worker services: they populate the database after service
start, which can lead to the race condition.

A proper permanent charm fix should be:
1, as [1] mentions in the section "Install and configure components / 3. Populate the Key Manager service database", the charm should set db_auto_create to false in /etc/barbican/barbican.conf
2, the leader unit populates the database
3, the barbican-worker services are started only after the leader has finished the db schema upgrade
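
The three points above can be sketched roughly as follows. This is not
the charm's code. It assumes db_auto_create lives in the [DEFAULT]
section of barbican.conf, and the helper names and action strings are
hypothetical. Rewriting the file with configparser also drops comments,
so a real charm would more likely render the option from its config
template.

```python
import configparser
import io


def disable_db_auto_create(conf_text):
    """Return the config text with db_auto_create forced to false.

    Assumption: the option belongs to the [DEFAULT] section.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    parser["DEFAULT"]["db_auto_create"] = "false"
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


def unit_action(is_leader, schema_at_head):
    """Decide what a unit should do next (hypothetical action names).

    Only the leader migrates; every worker starts after the schema is
    at the head revision, so no two units create tables concurrently.
    """
    if schema_at_head:
        return "start-worker"
    if is_leader:
        return "migrate-db"
    return "wait-for-leader"


if __name__ == "__main__":
    print(disable_db_auto_create("[DEFAULT]\ndb_auto_create = true\n"))
    print(unit_action(is_leader=False, schema_at_head=False))
```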

[1] https://docs.openstack.org/barbican/stein/install/install-ubuntu.html

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to barbican in Ubuntu.
https://bugs.launchpad.net/bugs/1827690

Title:
  [19.04][stein] barbican-worker is down: Requested revision
  1a0c2cdafb38 overlaps with other requested revisions 39cf2e645cba

Status in OpenStack Barbican Charm:
  In Progress
Status in barbican package in Ubuntu:
  New

Bug description:
  After deploying barbican (stein, 19.04 charms) I got into a situation
  where both workers (two-node HA) are down due to a failed alembic
  migration:

  https://private-fileshare.canonical.com/~dima/charm-dumps/04-05-2019-barbican-0-var-log.tar.gz
  https://private-fileshare.canonical.com/~dima/charm-dumps/04-05-2019-barbican-1-var-log.tar.gz

  https://paste.ubuntu.com/p/J6GvwkXMWc/ - bundle.yaml

  May 03 20:51:11 juju-35b20e-3-lxd-0 systemd[1]: Started OpenStack Barbican Key Management Workers.
  May 03 20:51:39 juju-35b20e-3-lxd-0 systemd[1]: barbican-worker.service: Main process exited, code=exited, status=1/FAILURE
  May 03 20:51:39 juju-35b20e-3-lxd-0 systemd[1]: barbican-worker.service: Failed with result 'exit-code'.

  juju status barbican
  Model      Controller  Cloud/Region  Version  SLA          Timestamp
  openstack  samaas      samaas        2.6-rc1  unsupported  23:27:23Z

  App                 Version  Status   Scale  Charm           Store       Rev  OS      Notes
  barbican            8.0.0    blocked      2  barbican        jujucharms   18  ubuntu
  barbican-vault      1.2.2    active       2  barbican-vault  jujucharms    2  ubuntu
  hacluster-barbican           active       2  hacluster       jujucharms   54  ubuntu

  Unit                     Workload  Agent  Machine  Public address  Ports              Message
  barbican/0               blocked   idle   3/lxd/0  10.232.46.135   9311/tcp,9312/tcp  Services not running that should be: barbican-worker
    barbican-vault/1       active    idle            10.232.46.135                      Unit is ready
    hacluster-barbican/1   active    idle            10.232.46.135                      Unit is ready and clustered
  barbican/1*              blocked   idle   2/lxd/0  10.232.46.130   9311/tcp,9312/tcp  Services not running that should be: barbican-worker
    barbican-vault/0*      active    idle            10.232.46.130                      Unit is ready
    hacluster-barbican/0*  active    idle            10.232.46.130                      Unit is ready and clustered

  https://paste.ubuntu.com/p/BS3fHw287r/
  2019-05-03 21:13:46.582 115638 ERROR barbican alembic.util.exc.CommandError: Requested revision 1a0c2cdafb38 overlaps with other requested revisions 39cf2e645cba

  mysql> use barbican;
  mysql> select * from alembic_version;
  +--------------+
  | version_num  |
  +--------------+
  | 1a0c2cdafb38 |
  | 39cf2e645cba |
  +--------------+
  2 rows in set (0.00 sec)



