[nova] Can we drop the kilo era ComputeNode host/service_id compat code now?

Matt Riedemann mriedemos at gmail.com
Tue Jun 25 20:05:05 UTC 2019


There are still quite a few TODOs in the code [1][2][3] from a Kilo-era 
blueprint [4]. At this point I'm pretty sure you can't start up the 
nova-compute service without ending up with a ComputeNode record that 
has both the host and hypervisor_hostname fields set (we don't set 
ComputeNode.service_id anywhere anymore as far as I can tell, except in 
some ComputeNode RPC compat code [5]).

I've stumbled across all of this code before, but was looking at it 
again today because I have a very simple change I need to make: going 
from a ComputeNode object to the related nova-compute Service object 
for that node.

Looking at the code one might think this is reasonable:

service = objects.Service.get_by_id(ctxt, compute_node.service_id)

But compute_node.service_id is likely None. Or how about:

service = objects.Service.get_by_compute_host(ctxt, compute_node.host)

But ComputeNode.host is also nullable (though likely should have a value 
as noted above).
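
To make the problem concrete, here is a minimal sketch of the lookup 
with the nullable field handled explicitly. The ComputeNode and Service 
classes below are stand-in dataclasses, not nova's real objects, and 
service_for_node, ServiceLookupError and the services_by_host dict are 
hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ComputeNode:
    # Stand-in for nova's ComputeNode: host and service_id are nullable.
    hypervisor_hostname: str
    host: Optional[str] = None
    service_id: Optional[int] = None


@dataclass
class Service:
    # Stand-in for the nova-compute Service record.
    id: int
    host: str


class ServiceLookupError(Exception):
    """Raised when a compute node cannot be tied to a service."""


def service_for_node(node: ComputeNode,
                     services_by_host: Dict[str, Service]) -> Service:
    """Return the Service for a node, refusing to guess if host is unset."""
    if node.host is None:
        # This is the old-record case the compat code exists for.
        raise ServiceLookupError(
            "compute node %s has no host set" % node.hypervisor_hostname)
    try:
        return services_by_host[node.host]
    except KeyError:
        raise ServiceLookupError(
            "no nova-compute service found for host %s" % node.host
        ) from None
```

If host could be guaranteed non-NULL, the first branch (and the compat 
code behind it) would go away and the lookup would be a plain 
get-by-host.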

This is a long way of me saying this code is all gross and we should 
clean it up. That means making sure all of this Kilo-era compat code 
for old records is no longer necessary, which in turn means all of 
those records should have been migrated by now. But how should we check?

I *think* this might just be as simple as a "nova-status upgrade check" 
check which scans the cells looking for (non-deleted) compute_nodes 
records where host is NULL and reports an error if any are found. I 
believe the recovery action for an operator that hits this is to delete 
the busted compute_nodes record and restart the nova-compute service so 
a new compute node record is created. I would really think that anything 
this scan would find would be orphaned compute_nodes records that could 
just be deleted since another compute_nodes record probably already 
exists for the same hypervisor_hostname value. IOW, I don't think we 
need an online data migration routine for this.
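
A rough sketch of what that scan could look like for a single cell 
follows. It runs against an in-memory SQLite table rather than a real 
cell database and is not wired into nova-status. The trimmed-down 
schema, the function names, and the assumption that non-deleted rows 
have deleted = 0 are all illustrative, not taken from the nova code.

```python
import sqlite3

# Hypothetical, trimmed-down compute_nodes table; the real one has many
# more columns. Assumes soft-deleted rows have a non-zero "deleted" value.
SCHEMA = """
CREATE TABLE compute_nodes (
    id INTEGER PRIMARY KEY,
    host TEXT,
    hypervisor_hostname TEXT,
    deleted INTEGER NOT NULL DEFAULT 0
)
"""


def make_demo_cell():
    """Build an in-memory cell database with one busted record."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT INTO compute_nodes VALUES (?, ?, ?, ?)",
        [
            (1, "compute-1", "hv-1", 0),  # healthy record
            (2, None, "hv-1", 0),         # orphan: host is NULL
            (3, None, "hv-2", 3),         # soft-deleted, must be ignored
        ],
    )
    return conn


def find_hostless_compute_nodes(conn):
    """Return (id, hypervisor_hostname) for non-deleted rows with no host."""
    return conn.execute(
        "SELECT id, hypervisor_hostname FROM compute_nodes "
        "WHERE host IS NULL AND deleted = 0 ORDER BY id"
    ).fetchall()


def check_cell(conn):
    """Return a (status, detail) pair in the spirit of an upgrade check."""
    bad = find_hostless_compute_nodes(conn)
    if bad:
        ids = ", ".join(str(row[0]) for row in bad)
        return ("failure",
                "compute_nodes records with NULL host: %s" % ids)
    return ("success", "")
```

A real check would run the same query once per cell and fail if any 
cell reports a hit. The recovery action stays the one described above: 
delete the busted record and restart nova-compute.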

Hopefully at least one person (Sylvain) can agree with me here and the 
plan of action I've put forth.

[1] 
https://github.com/openstack/nova/blob/91647a9b711a8102c79bb17c6b4dff24ad6f8f58/nova/db/sqlalchemy/models.py#L123
[2] 
https://github.com/openstack/nova/blob/91647a9b711a8102c79bb17c6b4dff24ad6f8f58/nova/objects/compute_node.py#L150
[3] 
https://github.com/openstack/nova/blob/91647a9b711a8102c79bb17c6b4dff24ad6f8f58/nova/objects/compute_node.py#L263
[4] 
https://blueprints.launchpad.net/nova/+spec/detach-service-from-computenode
[5] 
https://github.com/openstack/nova/blob/91647a9b711a8102c79bb17c6b4dff24ad6f8f58/nova/objects/compute_node.py#L118

-- 

Thanks,

Matt



More information about the openstack-discuss mailing list