[nova] Can we drop the kilo era ComputeNode host/service_id compat code now?

Sylvain Bauza sbauza at redhat.com
Wed Jun 26 13:45:12 UTC 2019


On Tue, Jun 25, 2019 at 10:14 PM Matt Riedemann <mriedemos at gmail.com> wrote:

> There are still quite a few TODOs in the code [1][2][3] from a kilo era
> blueprint [4]. At this point I'm pretty sure you can't start up the
> nova-compute service without getting a ComputeNode record with a host
> and hypervisor_hostname field set (we don't set the
> ComputeNode.service_id anywhere anymore as far as I can tell, except in
> some ComputeNode RPC compat code [5]).
>
> I've stumbled across all of this code before, but was looking at it
> again today because I have a very simple change I need to make which is
> going from a ComputeNode object and getting the related nova-compute
> Service object for that node.
>
> Looking at the code one might think this is reasonable:
>
> service = objects.Service.get_by_id(ctxt, compute_node.service_id)
>
> But compute_node.service_id is likely None. Or how about:
>
> service = objects.Service.get_by_compute_host(ctxt, compute_node.host)
>
> But ComputeNode.host is also nullable (though likely should have a value
> as noted above).
>
>
Yeah, basically, before this blueprint a ComputeNode record only had a
hypervisor_hostname value and a service_id, which was a FK to the related
Service record.
Given we preferred a (host, hypervisor_hostname) tuple as the key for the
ComputeNode record, we deprecated service_id and added a new field named
'host'.

To populate that new field, we looked at the related Service record to get
the value; once a record was migrated, the 'host' value was provided
directly.
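
Roughly, that backward-compat path boils down to something like this (a
simplified sketch of the idea, not the literal code in
nova/objects/compute_node.py):

    from nova import objects

    # Sketch: backfill 'host' on an old compute_nodes record from its
    # linked Service record when only service_id is set.
    def backfill_host_from_service(ctxt, compute_node):
        if compute_node.host is None and compute_node.service_id is not None:
            service = objects.Service.get_by_id(ctxt, compute_node.service_id)
            compute_node.host = service.host
        return compute_node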

That said, since compute and conductor services could be running different
release versions, we wanted to keep that backward-compatibility path around.
Now we're waaaaay past Kilo, so I think we no longer need to support the
compatibility ;-)


> This is a long way of me saying this code is all gross and we should
> clean it up, which means making sure all of this Kilo era compat code
> for old records is no longer necessary, i.e. that all of those records
> should be migrated by now. But how should we check?
>
> I *think* this might just be as simple as a "nova-status upgrade check"
> check which scans the cells looking for (non-deleted) compute_nodes
> records where host is NULL and reports an error if any are found. I
> believe the recovery action for an operator that hits this is to delete
> the busted compute_nodes record and restart the nova-compute service so
> a new compute node record is created. I would really think that anything
> this scan would find would be orphaned compute_nodes records that could
> just be deleted since another compute_nodes record probably already
> exists for the same hypervisor_hostname value. IOW, I don't think we
> need an online data migration routine for this.
>
>
Yeah, agreed with the above. I don't think we need an online data migration
for this, and I'm pretty sure a nova-status upgrade check should be enough.
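
A rough sketch of the kind of check I have in mind (hand-waving on the cell
plumbing, not the actual nova-status code):

    from oslo_upgradecheck import upgradecheck

    from nova import objects

    def check_compute_node_host(all_cell_contexts):
        # Flag non-deleted compute_nodes records that never got a 'host'
        # value; those are almost certainly orphaned Kilo-era leftovers.
        orphans = []
        for cell_ctxt in all_cell_contexts:
            for node in objects.ComputeNodeList.get_all(cell_ctxt):
                if node.host is None:
                    orphans.append(node.hypervisor_hostname)
        if orphans:
            return upgradecheck.Result(
                upgradecheck.Code.FAILURE,
                'Found compute_nodes records without a host: %s. Delete '
                'them and restart nova-compute so the node record gets '
                'recreated.' % ', '.join(orphans))
        return upgradecheck.Result(upgradecheck.Code.SUCCESS)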

-Sylvain



> Hopefully at least one person (Sylvain) can agree with me here and the
> plan of action I've put forth.
>
> [1] https://github.com/openstack/nova/blob/91647a9b711a8102c79bb17c6b4dff24ad6f8f58/nova/db/sqlalchemy/models.py#L123
> [2] https://github.com/openstack/nova/blob/91647a9b711a8102c79bb17c6b4dff24ad6f8f58/nova/objects/compute_node.py#L150
> [3] https://github.com/openstack/nova/blob/91647a9b711a8102c79bb17c6b4dff24ad6f8f58/nova/objects/compute_node.py#L263
> [4] https://blueprints.launchpad.net/nova/+spec/detach-service-from-computenode
> [5] https://github.com/openstack/nova/blob/91647a9b711a8102c79bb17c6b4dff24ad6f8f58/nova/objects/compute_node.py#L118
>
> --
>
> Thanks,
>
> Matt
>
>