[Bug 1940976] Re: Race condition in zone serial generation on concurrent changes to recordsets
Michael Johnson
1940976 at bugs.launchpad.net
Tue Oct 12 00:10:46 UTC 2021
Yes, that is what I am asking about.
The configuration setting:
[coordination]
backend_url = <DLM URL>
It needs to be set so that the threads use the distributed lock manager.
You can also look for this warning message:
https://github.com/openstack/designate/blob/05343d4226822da8b9776201ea18e000d366573d/designate/coordination.py#L72
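A quick way to check the setting above is to parse the service configuration and look for the option. This is only a minimal sketch: it uses Python's standard configparser rather than oslo.config, so it may not handle every construct a real designate.conf can contain, and the helper name and the redis URL in the usage note are made up for illustration.

```python
import configparser


def coordination_backend(conf_text):
    """Return the [coordination] backend_url value, or None if it is unset.

    Hypothetical helper, not part of Designate. conf_text is the content
    of the configuration file as a string.
    """
    # interpolation=None keeps '%' characters in values from being
    # interpreted; strict=False tolerates duplicate sections or options.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    url = parser.get("coordination", "backend_url", fallback=None)
    # Treat an empty value the same as a missing one.
    return url or None
```

For example, coordination_backend("[coordination]\nbackend_url = redis://127.0.0.1:6379\n") returns the URL, while a file without a [coordination] section returns None.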
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to Ubuntu Cloud Archive.
https://bugs.launchpad.net/bugs/1940976
Title:
Race condition in zone serial generation on concurrent changes to
recordsets
Status in Ubuntu Cloud Archive:
New
Status in Designate:
New
Status in designate package in Ubuntu:
New
Bug description:
I discovered a reproducible race condition when updating multiple
recordsets of a single zone at the same time. There was an issue,
https://bugs.launchpad.net/bugs/1871332, about multiple designate
instances and their coordination / distributed locking, but I also
observe the issue with just a single instance and its multiple worker
threads targeting the same zone. This happens quite easily when
using IaC tooling like Terraform, which uses multiple threads and
multiple connections when talking to a cloud API.
To trigger the race condition I used this piece of terraform to create three recordsets:
--- cut ---
resource "openstack_dns_recordset_v2" "testrecords" {
  count       = 3
  zone_id     = data.openstack_dns_zone_v2.myzone.id
  name        = "record-${count.index}.${data.openstack_dns_zone_v2.myzone.name}"
  description = "test-${count.index}"
  ttl         = 60
  type        = "A"
  records     = ["127.0.0.1"]
}
--- cut ---
Those three records are created independently and concurrently, and in the end the zone on the nameserver does not contain all of the records. When creating just one more record afterwards, all the records are written / updated in the zonefile properly, so this is due to the serial being updated inconsistently.
Looking at the code for how the serial is created:
https://opendev.org/openstack/designate/src/branch/master/designate/utils.py#L137,
it clearly appears to be subject to race conditions when multiple
threads update the zone concurrently, each using the previously current
zone timestamp read from the database and incrementing it "in code".
There is a not-yet-merged patchset by Nicolas Bock that does not refer to a bug but apparently changes the way the serial is created, using an update statement in the database to increase the serial: https://review.opendev.org/c/openstack/designate/+/776173
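To illustrate the difference between incrementing "in code" and incrementing in the database, here is a minimal sketch using an in-memory SQLite table. It is not Designate's code: the table layout, the zone id 'myzone' and the function names are assumptions for illustration, the increment is simplified to serial + 1, and the interleaving is simulated sequentially (every worker reads before any worker writes) instead of using real threads.

```python
import sqlite3


def make_db(serial=1000):
    """Create a throwaway table holding one zone row (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE zones (id TEXT PRIMARY KEY, serial INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO zones (id, serial) VALUES ('myzone', ?)", (serial,))
    return conn


def read_serial(conn):
    row = conn.execute("SELECT serial FROM zones WHERE id = 'myzone'").fetchone()
    return row[0]


def bump_in_code(conn, workers=3):
    """Read-modify-write: each worker increments the value it read earlier."""
    # All workers read the same serial before anyone writes.
    stale = [read_serial(conn) for _ in range(workers)]
    for serial in stale:
        # Every worker writes stale + 1, overwriting the others' updates.
        conn.execute(
            "UPDATE zones SET serial = ? WHERE id = 'myzone'", (serial + 1,)
        )
    return read_serial(conn)


def bump_in_database(conn, workers=3):
    """Let the database compute the new value inside a single UPDATE."""
    for _ in range(workers):
        conn.execute("UPDATE zones SET serial = serial + 1 WHERE id = 'myzone'")
    return read_serial(conn)
```

With three workers starting from serial 1000, bump_in_code(make_db()) ends at 1001 because two of the three increments are lost, while bump_in_database(make_db()) ends at 1003. A serial that advances only once for three changes would be consistent with the nameserver missing some of the records.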
To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1940976/+subscriptions