Cleanup database(s)

Sean Mooney smooney at redhat.com
Mon Mar 8 14:48:41 UTC 2021


On Mon, 2021-03-08 at 14:18 +0000, Eugen Block wrote:
> Thank you, Sean.
> 
> > so you need to do
> > openstack compute service list to get the compute service ids
> > then do
> > openstack compute service delete <id-1> <id-2> ...
> > 
> > you need to make sure that you only remove the unused old services,
> > but i think that would fix your issue.
> 
> That's the thing, they don't show up in the compute service list. But  
> I also found them in the resource_providers table, only the old  
> compute nodes appear here:
> 
> MariaDB [nova]> select name from nova_api.resource_providers;
> +--------------------------+
> | name                     |
> +--------------------------+
> | compute1.fqdn            |
> | compute2.fqdn            |
> | compute3.fqdn            |
> | compute4.fqdn            |
> +--------------------------+
ah, in that case the compute service delete is meant to remove the RPs too,
but if the RP had stale allocations at the time of the delete, the RP delete will fail.

what you probably need to do in this case is check whether the RPs still have allocations and, if so,
verify that the allocations are owned by VMs that no longer exist.
if that is the case, you should be able to delete the allocations and then the RP.
if the allocations are related to active VMs that are now on the rebuilt nodes, then you will have to try to
heal the allocations.
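as a rough sketch of that flow (assuming the osc-placement client plugin is
installed; all UUIDs below are placeholders you would substitute from your own
environment):

```shell
# 1. find the UUIDs of the old resource providers
openstack resource provider list

# 2. check whether a provider still has allocations against it
openstack resource provider show <rp-uuid> --allocations

# 3. for each consumer (server) UUID found above, confirm the VM is gone
openstack server show <consumer-uuid>

# 4. if the consumer no longer exists, delete its stale allocation ...
openstack resource provider allocation delete <consumer-uuid>

# 5. ... and then the now-empty resource provider itself
openstack resource provider delete <rp-uuid>
```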

there is an openstack client extension called osc-placement that you can install to help.
we also have a heal-allocations command in nova-manage that may help, but the next step would be to validate
whether the old RPs are still in use or not. from there you can then work to align nova's and placement's view with
the real topology.
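for reference, a minimal sketch of the heal-allocations step, run on a
controller node (exact flags depend on your nova release):

```shell
# report what would change without applying anything
nova-manage placement heal_allocations --dry-run --verbose

# heal allocations for all instances; can be limited with --instance <uuid>
nova-manage placement heal_allocations
```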

that could involve removing the old compute nodes from the compute_nodes table or marking them as deleted, but
both the nova db and placement need to be kept in sync to correct your current issue.
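if you do go the database route, something like the following is one possible
sketch; this is a last resort, so back up the database first. it assumes
nova's soft-delete convention (the deleted column is set to the row's id
rather than the row being removed), and the hostnames match the stale entries
from your earlier query:

```sql
-- mark only the old, still-undeleted compute node rows as deleted
USE nova;
UPDATE compute_nodes
   SET deleted = id, deleted_at = NOW()
 WHERE host IN ('compute1', 'compute2', 'compute3', 'compute4')
   AND deleted = 0;
```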

> 
> 
> Zitat von Sean Mooney <smooney at redhat.com>:
> 
> > On Mon, 2021-03-08 at 13:18 +0000, Eugen Block wrote:
> > > Hi *,
> > > 
> > > I have a quick question, last year we migrated our OpenStack to a
> > > highly available environment through a reinstall of all nodes. The
> > > migration went quite well, we're working happily in the new cloud but
> > > the databases still contain deprecated data. For example, the
> > > nova-scheduler logs lines like these on a regular basis:
> > > 
> > > /var/log/nova/nova-scheduler.log:2021-02-19 12:02:46.439 23540 WARNING
> > > nova.scheduler.host_manager [...] No compute service record found for
> > > host compute1
> > > 
> > > This is one of the old compute nodes that has been reinstalled and is
> > > now compute01. I tried to find the right spot to delete some lines in
> > > the DB but there are a couple of places so I wanted to check and ask
> > > you for some insights.
> > > 
> > > The scheduler messages seem to originate in
> > > 
> > > /usr/lib/python3.6/site-packages/nova/scheduler/host_manager.py
> > > 
> > > ---snip---
> > >          for cell_uuid, computes in compute_nodes.items():
> > >              for compute in computes:
> > >                  service = services.get(compute.host)
> > > 
> > >                  if not service:
> > >                      LOG.warning(
> > >                          "No compute service record found for host  
> > > %(host)s",
> > >                          {'host': compute.host})
> > >                      continue
> > > ---snip---
> > > 
> > > So I figured it could be this table in the nova DB:
> > > 
> > > ---snip---
> > > MariaDB [nova]> select host,deleted from compute_nodes;
> > > +-----------+---------+
> > > | host      | deleted |
> > > +-----------+---------+
> > > | compute01 |       0 |
> > > | compute02 |       0 |
> > > | compute03 |       0 |
> > > | compute04 |       0 |
> > > | compute05 |       0 |
> > > | compute1  |       0 |
> > > | compute2  |       0 |
> > > | compute3  |       0 |
> > > | compute4  |       0 |
> > > +-----------+---------+
> > > ---snip---
> > > 
> > > What would be the best approach here to clean up a little? I believe
> > > it would be safe to simply purge those lines containing the old
> > > compute node, but there might be a smoother way. Or maybe there are
> > > more places to purge old data from?
> > so the step you probably missed was deleting the old compute service records
> > 
> > so you need to do
> > openstack compute service list to get the compute service ids
> > then do
> > openstack compute service delete <id-1> <id-2> ...
> > 
> > you need to make sure that you only remove the unused old services,
> > but i think that would fix your issue.
> > 
> > > 
> > > I'd appreciate any ideas.
> > > 
> > > Regards,
> > > Eugen
> > > 
> > > 
> 
> 
> 





More information about the openstack-discuss mailing list