[kolla] [train] [designate] Terraform "repave" causes DNS records to become orphaned
We have customers who use Terraform to build their clusters. They do a thing that they call “repave”, where they run an Ansible playbook that calls “terraform destroy” and then immediately calls “terraform apply” to rebuild the cluster. It looks like Designate is not able to keep up, and it fails to delete one or more of the DNS records. We have 3 records: IPv4 forward (A) and reverse (PTR), and IPv6 forward (AAAA).

When Designate fails to delete a record, it becomes orphaned. On the next “repave” the record is not deleted, because it’s not associated with the new VM, and we see errors in designate-sink.log:

2023-02-13 02:49:40.824 27 ERROR oslo_messaging.notify.dispatcher [parameters: {'id': '1282a6780f2f493c81ed20bc62ef370f', 'version': 1, 'created_at': datetime.datetime(2023, 2, 13, 2, 49, 40, 814726), 'zone_shard': 97, 'tenant_id': '130b797392d24b408e73c2be545d0a20', 'zone_id': '0616b8e0852540e59fd383cfb678af32', 'recordset_id': '1fc5a9eaea824d0f8b53eb91ea9ff6e2', 'data': '10.22.0.210', 'hash': 'e3270256501fceb97a14d4133d394880', 'managed': 1, 'managed_plugin_type': 'handler', 'managed_plugin_name': 'our_nova_fixed', 'managed_resource_type': 'instance', 'managed_resource_id': '842833cb9410404bbd5009eb6e0bf90a', 'status': 'PENDING', 'action': 'UPDATE', 'serial': 1676256582}]
…
2023-02-13 02:49:40.824 27 ERROR oslo_messaging.notify.dispatcher designate.exceptions.DuplicateRecord: Duplicate Record

The orphaned record is causing a MariaDB collision, because a record with that name and IP already exists. When this happens with an IPv6 record, it looks like Designate tries to create the IPv6 record, fails, and then does not try to create the IPv4 record at all, which causes trouble because Terraform waits for name resolution to work.

The obvious solution is to tell the TF users to introduce a delay between “destroy” and “apply”, but that would be non-trivial for them, and we would prefer to fix it on our end. What can I do to make Designate gracefully handle cases where a cluster is deleted and then immediately rebuilt with the same names and IPs? Also, how can I clean up these orphaned records? I’ve been asking the customer to destroy, then deleting the record, and then asking them to rebuild, but that is a manual process for them. Is it possible to link the orphaned record to the new VM so that it will be deleted on the next “repave”?

Example:

This VM was built today:

$ os server show f5e75688-5fa9-41b6-876f-289e0ebc04b9 | grep launched_at
| OS-SRV-USG:launched_at | 2023-02-16T02:48:49.000000 |

The A record was created in January:

$ os recordset show 0616b8e0852540e59fd383cfb678af32 1fc5a9ea-ea82-4d0f-8b53-eb91ea9ff6e2 | grep created_at
| created_at | 2023-01-25T02:48:52.000000 |
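For the manual cleanup, something along these lines should work from an admin shell, using the zone and recordset IDs from the error and the example above. This is only a sketch: the --all-projects and --edit-managed options (the latter sets the X-Designate-Edit-Managed-Records header so that Designate-managed records can be modified) are assumed to be available in your python-designateclient version, so verify them before relying on this:

# list the recordsets in the affected zone and compare created_at against the VMs
# (--all-projects and --edit-managed are assumptions; check "openstack recordset delete --help")
$ openstack recordset list 0616b8e0852540e59fd383cfb678af32 --all-projects -c id -c name -c type -c records -c created_at

# delete the orphaned, managed recordset as admin
$ openstack recordset delete 0616b8e0852540e59fd383cfb678af32 1fc5a9ea-ea82-4d0f-8b53-eb91ea9ff6e2 --all-projects --edit-managed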
On Thu, Feb 16, 2023 at 12:57 PM Albert Braden <ozzzo@yahoo.com> wrote:
The obvious solution is to tell the TF users to introduce a delay between “destroy” and “apply”, but that would be non-trivial for them, and we would prefer to fix it on our end. What can I do to make Designate gracefully handle cases where a cluster is deleted and then immediately rebuilt with the same names and IPs? Also, how can I clean up these orphaned records? I’ve been asking the customer to destroy, then deleting the record, and then asking them to rebuild, but that is a manual process for them. Is it possible to link the orphaned record to the new VM so that it will be deleted on the next “repave”?
Or perhaps the Terraform module should wait until the resource is fully gone, in case the delete is actually asynchronous, the same way that a VM delete is?
--
Mohammed Naser
VEXXHOST, Inc.
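As a rough sketch of that kind of wait, the “repave” playbook (or a wrapper script around it) could poll Designate between the two Terraform steps until the cluster’s recordsets are really gone, instead of relying on a fixed sleep. The zone ID and hostname below are placeholders, and the loop is only an illustration of the idea, not a tested recipe:

terraform destroy -auto-approve

# wait until the managed records for the old cluster have actually disappeared
# (<zone-id> and node01.example.com. are placeholders)
while openstack recordset list <zone-id> -c name -f value | grep -q 'node01.example.com.'; do
    sleep 5
done

terraform apply -auto-approve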
I wonder if it’s the same (or a similar) issue I asked about in November [1]. Do you have an HA cloud with multiple control nodes? One of our customers also uses Terraform to deploy clusters, and they have to add a sleep between the destroy and create commands, otherwise a wrong (already deleted) project ID gets applied. We figured out that it was the keystone role cache, but we still haven’t found a way to get both reasonable performance (we tried different cache settings) and quicker Terraform redeployments.

[1] https://lists.openstack.org/pipermail/openstack-discuss/2022-November/031122...

Quoting Mohammed Naser <mnaser@vexxhost.com>:
Or perhaps the Terraform module should wait until the resource is fully gone, in case the delete is actually asynchronous, the same way that a VM delete is?
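On the keystone side, the knobs that look relevant for the role cache are the per-section cache options in keystone.conf; with kolla-ansible the usual place for this would be a config override such as /etc/kolla/config/keystone.conf. The option names below are written from memory of the keystone configuration reference, so treat them as an assumption to verify for your release rather than a recommendation:

# assumed option names -- check the [role] section in the keystone config reference
[role]
# keep role caching enabled, but expire entries much sooner than the global
# [cache] expiration_time default, so deleted projects/roles fall out of the cache faster
caching = true
cache_time = 10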
Yes, we have 3 controllers per region. Theoretically we could write some TF code that would wait for the deletions to finish before rebuilding; the hard part would be getting our customers to deploy it. For them, TF is just a thing that builds servers so that they can work, and asking them to change it would be a heavy burden. I'm hoping to find a way to fix it in OpenStack.

On Thursday, February 16, 2023, 03:14:30 PM EST, Eugen Block <eblock@nde.ag> wrote:

Do you have an HA cloud with multiple control nodes?
I agree. I had also hoped to get some more insights here on the list, but no response yet. Maybe I should create a bug report for this role cache issue; that could draw some attention to it.

Quoting Albert Braden <ozzzo@yahoo.com>:
Yes, we have 3 controllers per region. Theoretically we could write some TF code that would wait for the deletions to finish before rebuilding; the hard part would be getting our customers to deploy it. For them, TF is just a thing that builds servers so that they can work, and asking them to change it would be a heavy burden. I'm hoping to find a way to fix it in OpenStack.
I created https://bugs.launchpad.net/keystone/+bug/2007982

Quoting Eugen Block <eblock@nde.ag>:
I agree. I had also hoped to get some more insights here on the list, but no response yet. Maybe I should create a bug report for this role cache issue; that could draw some attention to it.