Cleanup database(s)

Eugen Block eblock at nde.ag
Mon Mar 8 15:28:19 UTC 2021


Hi,

> there is an openstack client extension called osc-placement that you
> can install to help.
> we also have a heal allocations command in nova-manage that may help
> but the next step would be to validate
> if the old RPs are still in use or not. from there you can then work
> to align nova's and placement's view with
> the real topology.

I read about that in the docs, but there's no RPM for our distro
(openSUSE), so I guess we'll have to build it from source.
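
It looks like osc-placement is also published on PyPI and plugs into
the regular openstack client, so if packaging becomes a hassle we could
probably install it straight into the client's environment; not tested
on openSUSE yet, so just a sketch:

   pip install osc-placement
   openstack resource provider list    # quick check that the plugin loads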

> what you proably need to do in this case is check if the RPs still  
> have allocations and if so
> verify that the allocation are owned by vms that nolonger exist.

Is this the right place to look?

MariaDB [nova]> select count(*) from nova_api.allocations;
+----------+
| count(*) |
+----------+
|      263 |
+----------+


MariaDB [nova]> select resource_provider_id,consumer_id from  
nova_api.allocations limit 10;
+----------------------+--------------------------------------+
| resource_provider_id | consumer_id                          |
+----------------------+--------------------------------------+
|                    3 | fce8f56e-e50b-47ef-bbf5-87b91336b2d4 |
|                    3 | fce8f56e-e50b-47ef-bbf5-87b91336b2d4 |
|                    3 | fce8f56e-e50b-47ef-bbf5-87b91336b2d4 |
|                    3 | 67d95ce0-7902-40db-8ad7-ef0ce350bcb4 |
|                    3 | 67d95ce0-7902-40db-8ad7-ef0ce350bcb4 |
|                    3 | 67d95ce0-7902-40db-8ad7-ef0ce350bcb4 |
|                    1 | 0caaebae-56a6-45d8-a486-f3294ab321e8 |
|                    1 | 0caaebae-56a6-45d8-a486-f3294ab321e8 |
|                    1 | 0caaebae-56a6-45d8-a486-f3294ab321e8 |
|                    1 | 339d0585-b671-4afa-918b-a772bfc36da8 |
+----------------------+--------------------------------------+
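
To save checking every consumer_id by hand, I'm thinking a left join
against the instances table should surface allocations whose consumer
no longer exists (assuming all consumers are instances; migrations can
own allocations too, so this is only a first pass):

MariaDB [nova]> select a.resource_provider_id, a.consumer_id from
nova_api.allocations a left join nova.instances i on i.uuid =
a.consumer_id where i.uuid is null or i.deleted != 0;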

MariaDB [nova]> select name,id from nova_api.resource_providers;
+--------------------------+----+
| name                     | id |
+--------------------------+----+
| compute1.fqdn            |  3 |
| compute2.fqdn            |  1 |
| compute3.fqdn            |  2 |
| compute4.fqdn            |  4 |
+--------------------------+----+

I only checked four of those consumer_id entries and all of them are
existing VMs; I'll need to check the rest tomorrow. So I guess we
should try to get the osc-placement tool running for us.
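
Once it's running, I'd expect the cleanup to look roughly like this
(command names from the osc-placement docs, the UUIDs are placeholders;
I'll double-check everything before actually deleting):

   openstack resource provider list
   # show what a suspicious consumer still has allocated
   openstack resource provider allocation show <consumer_uuid>
   # if that VM no longer exists, drop its allocations
   openstack resource provider allocation delete <consumer_uuid>
   # once a stale RP has no allocations left, remove it
   openstack resource provider delete <rp_uuid>
   # and for active VMs that ended up on the rebuilt nodes
   nova-manage placement heal_allocations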

Thanks, that already helped a lot!

Eugen


Quoting Sean Mooney <smooney at redhat.com>:

> On Mon, 2021-03-08 at 14:18 +0000, Eugen Block wrote:
>> Thank you, Sean.
>>
>> > so you need to do
>> > openstack compute service list to get the compute service ids
>> > then do
>> > openstack compute service delete <id-1> <id-2> ...
>> >
>> > you need to make sure that you only remove the unused old services
>> > but i think that would fix your issue.
>>
>> That's the thing, they don't show up in the compute service list. But
>> I also found them in the resource_providers table; only the old
>> compute nodes appear here:
>>
>> MariaDB [nova]> select name from nova_api.resource_providers;
>> +--------------------------+
>> | name                     |
>> +--------------------------+
>> | compute1.fqdn            |
>> | compute2.fqdn            |
>> | compute3.fqdn            |
>> | compute4.fqdn            |
>> +--------------------------+
> ah in that case the compute service delete is meant to remove the RPs too
> but if the RP had stale allocations at the time of the delete the RP
> delete will fail
>
> what you probably need to do in this case is check if the RPs still
> have allocations and if so
> verify that the allocations are owned by VMs that no longer exist.
> if that is the case you should be able to delete the allocations and
> then the RP
> if the allocations are related to active VMs that are now on the
> rebuilt nodes then you will have to try and
> heal the allocations.
>
> there is an openstack client extension called osc-placement that you
> can install to help.
> we also have a heal allocations command in nova-manage that may help
> but the next step would be to validate
> if the old RPs are still in use or not. from there you can then work
> to align nova's and placement's view with
> the real topology.
>
> that could involve removing the old compute nodes from the
> compute_nodes table or marking them as deleted but
> both the nova db and placement need to be kept in sync to correct your
> current issue.
>
>>
>>
>> Quoting Sean Mooney <smooney at redhat.com>:
>>
>> > On Mon, 2021-03-08 at 13:18 +0000, Eugen Block wrote:
>> > > Hi *,
>> > >
>> > > I have a quick question. Last year we migrated our OpenStack to a
>> > > highly available environment through a reinstall of all nodes. The
>> > > migration went quite well and we're working happily in the new cloud,
>> > > but the databases still contain stale data. For example, the
>> > > nova-scheduler logs lines like these on a regular basis:
>> > >
>> > > /var/log/nova/nova-scheduler.log:2021-02-19 12:02:46.439 23540 WARNING
>> > > nova.scheduler.host_manager [...] No compute service record found for
>> > > host compute1
>> > >
>> > > This is one of the old compute nodes that has been reinstalled and is
>> > > now compute01. I tried to find the right spot to delete some lines in
>> > > the DB but there are a couple of places so I wanted to check and ask
>> > > you for some insights.
>> > >
>> > > The scheduler messages seem to originate in
>> > >
>> > > /usr/lib/python3.6/site-packages/nova/scheduler/host_manager.py
>> > >
>> > > ---snip---
>> > >          for cell_uuid, computes in compute_nodes.items():
>> > >              for compute in computes:
>> > >                  service = services.get(compute.host)
>> > >
>> > >                  if not service:
>> > >                      LOG.warning(
>> > >                          "No compute service record found for host
>> > > %(host)s",
>> > >                          {'host': compute.host})
>> > >                      continue
>> > > ---snip---
>> > >
>> > > So I figured it could be this table in the nova DB:
>> > >
>> > > ---snip---
>> > > MariaDB [nova]> select host,deleted from compute_nodes;
>> > > +-----------+---------+
>> > > | host      | deleted |
>> > > +-----------+---------+
>> > > | compute01 |       0 |
>> > > | compute02 |       0 |
>> > > | compute03 |       0 |
>> > > | compute04 |       0 |
>> > > | compute05 |       0 |
>> > > | compute1  |       0 |
>> > > | compute2  |       0 |
>> > > | compute3  |       0 |
>> > > | compute4  |       0 |
>> > > +-----------+---------+
>> > > ---snip---
>> > >
>> > > What would be the best approach here to clean up a little? I believe
>> > > it would be safe to simply purge those lines containing the old
>> > > compute nodes, but there might be a smoother way. Or maybe there are
>> > > more places to purge old data from?
>> > so the step you probably missed was deleting the old compute
>> > service records
>> >
>> > so you need to do
>> > openstack compute service list to get the compute service ids
>> > then do
>> > openstack compute service delete <id-1> <id-2> ...
>> >
>> > you need to make sure that you only remove the unused old services
>> > but i think that would fix your issue.
>> >
>> > >
>> > > I'd appreciate any ideas.
>> > >
>> > > Regards,
>> > > Eugen
>> > >
>> > >
>>
>>
>>





