Cleanup database(s)

Eugen Block eblock at nde.ag
Mon Mar 8 20:57:35 UTC 2021


> I read about that in the docs, but there's no RPM for our distro
> (openSUSE), so I guess we'll have to build it from source.

I should have read the docs more carefully. I installed the
osc-placement plug-in on a test machine and will play around with the
options.
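
If I'm reading the osc-placement docs correctly, the commands I'll start
with should be something like this (resource provider and consumer UUIDs
taken from the queries quoted below, so treat it as a rough plan rather
than a tested recipe):

openstack resource provider list
openstack resource provider show <rp_uuid>
openstack resource provider usage show <rp_uuid>
openstack resource provider allocation show <consumer_uuid>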

Thanks again!


Quoting Eugen Block <eblock at nde.ag>:

> Hi,
>
>> There is an OpenStack client extension called osc-placement that you
>> can install to help.
>> We also have a heal allocations command in nova-manage that may help,
>> but the next step would be to validate
>> whether the old RPs are still in use or not. From there you can then
>> work to align Nova's and Placement's view with
>> the real topology.
>
> I read about that in the docs, but there's no RPM for our distro
> (openSUSE), so I guess we'll have to build it from source.
>
>> What you probably need to do in this case is check whether the RPs
>> still have allocations, and if so,
>> verify that the allocations are owned by VMs that no longer exist.
>
> Is this the right place to look?
>
> MariaDB [nova]> select count(*) from nova_api.allocations;
> +----------+
> | count(*) |
> +----------+
> |      263 |
> +----------+
>
>
> MariaDB [nova]> select resource_provider_id,consumer_id from  
> nova_api.allocations limit 10;
> +----------------------+--------------------------------------+
> | resource_provider_id | consumer_id                          |
> +----------------------+--------------------------------------+
> |                    3 | fce8f56e-e50b-47ef-bbf5-87b91336b2d4 |
> |                    3 | fce8f56e-e50b-47ef-bbf5-87b91336b2d4 |
> |                    3 | fce8f56e-e50b-47ef-bbf5-87b91336b2d4 |
> |                    3 | 67d95ce0-7902-40db-8ad7-ef0ce350bcb4 |
> |                    3 | 67d95ce0-7902-40db-8ad7-ef0ce350bcb4 |
> |                    3 | 67d95ce0-7902-40db-8ad7-ef0ce350bcb4 |
> |                    1 | 0caaebae-56a6-45d8-a486-f3294ab321e8 |
> |                    1 | 0caaebae-56a6-45d8-a486-f3294ab321e8 |
> |                    1 | 0caaebae-56a6-45d8-a486-f3294ab321e8 |
> |                    1 | 339d0585-b671-4afa-918b-a772bfc36da8 |
> +----------------------+--------------------------------------+
>
> MariaDB [nova]> select name,id from nova_api.resource_providers;
> +--------------------------+----+
> | name                     | id |
> +--------------------------+----+
> | compute1.fqdn            |  3 |
> | compute2.fqdn            |  1 |
> | compute3.fqdn            |  2 |
> | compute4.fqdn            |  4 |
> +--------------------------+----+
>
> I only checked four of those consumer_id entries and all are
> existing VMs; I'll need to check all of them tomorrow. So I guess we
> should try to get the osc-placement tool running for us.
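>
> Rather than checking them one by one, I might try a join against the
> instances table first, something along these lines (untested, and some
> consumer_ids can belong to migration records rather than instances, so
> I'd only use it as a hint):
>
> MariaDB [nova]> select distinct a.consumer_id from nova_api.allocations a
>     left join nova.instances i on i.uuid = a.consumer_id
>     where i.uuid is null or i.deleted != 0;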
>
> Thanks, that already helped a lot!
>
> Eugen
>
>
> Quoting Sean Mooney <smooney at redhat.com>:
>
>> On Mon, 2021-03-08 at 14:18 +0000, Eugen Block wrote:
>>> Thank you, Sean.
>>>
>>>> You need to run
>>>> openstack compute service list to get the compute service IDs,
>>>> then run
>>>> openstack compute service delete <id-1> <id-2> ...
>>>>
>>>> You need to make sure that you only remove the unused old services,
>>>> but I think that would fix your issue.
>>>
>>> That's the thing: they don't show up in the compute service list. But
>>> I also found them in the resource_providers table; only the old
>>> compute nodes appear here:
>>>
>>> MariaDB [nova]> select name from nova_api.resource_providers;
>>> +--------------------------+
>>> | name                     |
>>> +--------------------------+
>>> | compute1.fqdn            |
>>> | compute2.fqdn            |
>>> | compute3.fqdn            |
>>> | compute4.fqdn            |
>>> +--------------------------+
>> Ah, in that case the compute service delete is meant to remove the RPs
>> too, but if the RP had stale allocations at the time of the delete, the
>> RP delete will fail.
>>
>> What you probably need to do in this case is check whether the RPs
>> still have allocations, and if so,
>> verify that the allocations are owned by VMs that no longer exist.
>> If that is the case, you should be able to delete the allocations and
>> then the RP.
>> If the allocations are related to active VMs that are now on the
>> rebuilt nodes, then you will have to try and
>> heal the allocations.
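>>
>> With osc-placement installed, that would be roughly the following
>> (double check the UUIDs before deleting anything):
>>
>> openstack resource provider allocation show <consumer_uuid>
>> openstack resource provider allocation delete <consumer_uuid>
>> openstack resource provider delete <rp_uuid>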
>>
>> There is an OpenStack client extension called osc-placement that you
>> can install to help.
>> We also have a heal allocations command in nova-manage that may help,
>> but the next step would be to validate
>> whether the old RPs are still in use or not. From there you can then
>> work to align Nova's and Placement's view with
>> the real topology.
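>>
>> The healing part would be something along the lines of the command
>> below, run from a node that has nova-manage available (newer releases
>> also have a --dry-run option if you want to see what it would do first):
>>
>> nova-manage placement heal_allocations --verbose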
>>
>> That could involve removing the old compute nodes from the
>> compute_nodes table or marking them as deleted, but
>> both the Nova DB and Placement need to be kept in sync to correct your
>> current issue.
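>>
>> If you end up doing that in the database, marking the old rows as
>> deleted would look roughly like this (take a backup first; Nova's
>> soft-delete convention sets the deleted column to the row id):
>>
>> MariaDB [nova]> update compute_nodes set deleted = id, deleted_at = now()
>>     where host in ('compute1','compute2','compute3','compute4') and deleted = 0;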
>>
>>>
>>>
>>> Quoting Sean Mooney <smooney at redhat.com>:
>>>
>>>> On Mon, 2021-03-08 at 13:18 +0000, Eugen Block wrote:
>>>> > Hi *,
>>>> >
>>>> > I have a quick question. Last year we migrated our OpenStack to a
>>>> > highly available environment through a reinstall of all nodes. The
>>>> > migration went quite well and we're working happily in the new cloud,
>>>> > but the databases still contain stale data. For example, the
>>>> > nova-scheduler logs lines like these on a regular basis:
>>>> >
>>>> > /var/log/nova/nova-scheduler.log:2021-02-19 12:02:46.439 23540 WARNING
>>>> > nova.scheduler.host_manager [...] No compute service record found for
>>>> > host compute1
>>>> >
>>>> > This is one of the old compute nodes that has been reinstalled and is
>>>> > now compute01. I tried to find the right spot to delete some lines in
>>>> > the DB, but there are a couple of places, so I wanted to check and ask
>>>> > you for some insights.
>>>> >
>>>> > The scheduler messages seem to originate in
>>>> >
>>>> > /usr/lib/python3.6/site-packages/nova/scheduler/host_manager.py
>>>> >
>>>> > ---snip---
>>>> >          for cell_uuid, computes in compute_nodes.items():
>>>> >              for compute in computes:
>>>> >                  service = services.get(compute.host)
>>>> >
>>>> >                  if not service:
>>>> >                      LOG.warning(
>>>> >                          "No compute service record found for host %(host)s",
>>>> >                          {'host': compute.host})
>>>> >                      continue
>>>> > ---snip---
>>>> >
>>>> > So I figured it could be this table in the nova DB:
>>>> >
>>>> > ---snip---
>>>> > MariaDB [nova]> select host,deleted from compute_nodes;
>>>> > +-----------+---------+
>>>> > | host      | deleted |
>>>> > +-----------+---------+
>>>> > | compute01 |       0 |
>>>> > | compute02 |       0 |
>>>> > | compute03 |       0 |
>>>> > | compute04 |       0 |
>>>> > | compute05 |       0 |
>>>> > | compute1  |       0 |
>>>> > | compute2  |       0 |
>>>> > | compute3  |       0 |
>>>> > | compute4  |       0 |
>>>> > +-----------+---------+
>>>> > ---snip---
>>>> >
>>>> > What would be the best approach here to clean up a little? I believe
>>>> > it would be safe to simply purge the rows for the old
>>>> > compute nodes, but there might be a smoother way. Or maybe there are
>>>> > more places to purge old data from?
>>>> So the step you probably missed was deleting the old compute
>>>> service records.
>>>>
>>>> You need to run
>>>> openstack compute service list to get the compute service IDs,
>>>> then run
>>>> openstack compute service delete <id-1> <id-2> ...
>>>>
>>>> You need to make sure that you only remove the unused old services,
>>>> but I think that would fix your issue.
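>>>>
>>>> E.g. something along these lines (the IDs are just placeholders, take
>>>> them from your own list output):
>>>>
>>>> openstack compute service list --service nova-compute
>>>> openstack compute service delete <old-compute1-id> <old-compute2-id>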
>>>>
>>>> >
>>>> > I'd appreciate any ideas.
>>>> >
>>>> > Regards,
>>>> > Eugen
>>>> >
>>>> >
>>>
>>>
>>>





