Cleanup database(s)
Eugen Block
eblock at nde.ag
Mon Mar 8 15:28:19 UTC 2021
Hi,
> there is an openstack client extension called osc-placement that you
> can install to help.
> we also have a heal allocation command in nova-manage that may help
> but the next step would be to validate
> if the old RPs are still in use or not. from there you can then work
> to align nova's and placement's view with
> the real topology.
I read about that in the docs, but there's no RPM for our distro
(openSUSE), so I guess we'll have to build it from source.
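It also seems to be available on PyPI, so maybe something like this
works instead of building a package ourselves (untested on our distro):

pip install osc-placement

After that the additional "openstack resource provider ..." commands
should show up in the openstack client.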
> what you probably need to do in this case is check if the RPs still
> have allocations and if so
> verify that the allocations are owned by vms that no longer exist.
Is this the right place to look?
MariaDB [nova]> select count(*) from nova_api.allocations;
+----------+
| count(*) |
+----------+
| 263 |
+----------+
MariaDB [nova]> select resource_provider_id,consumer_id from
nova_api.allocations limit 10;
+----------------------+--------------------------------------+
| resource_provider_id | consumer_id |
+----------------------+--------------------------------------+
| 3 | fce8f56e-e50b-47ef-bbf5-87b91336b2d4 |
| 3 | fce8f56e-e50b-47ef-bbf5-87b91336b2d4 |
| 3 | fce8f56e-e50b-47ef-bbf5-87b91336b2d4 |
| 3 | 67d95ce0-7902-40db-8ad7-ef0ce350bcb4 |
| 3 | 67d95ce0-7902-40db-8ad7-ef0ce350bcb4 |
| 3 | 67d95ce0-7902-40db-8ad7-ef0ce350bcb4 |
| 1 | 0caaebae-56a6-45d8-a486-f3294ab321e8 |
| 1 | 0caaebae-56a6-45d8-a486-f3294ab321e8 |
| 1 | 0caaebae-56a6-45d8-a486-f3294ab321e8 |
| 1 | 339d0585-b671-4afa-918b-a772bfc36da8 |
+----------------------+--------------------------------------+
MariaDB [nova]> select name,id from nova_api.resource_providers;
+--------------------------+----+
| name | id |
+--------------------------+----+
| compute1.fqdn | 3 |
| compute2.fqdn | 1 |
| compute3.fqdn | 2 |
| compute4.fqdn | 4 |
+--------------------------+----+
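Once osc-placement is running I assume the same data can be pulled
from the placement API instead of raw SQL, roughly like this (UUIDs
are placeholders):

openstack resource provider list
openstack resource provider show <rp-uuid> --allocations
openstack resource provider allocation show <consumer-uuid>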
I only checked four of those consumer_id entries and all are existing
VMs; I'll need to check the rest tomorrow. So I guess we should try
to get the osc-placement tool running for us.
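To avoid checking each consumer_id by hand I might also try a join
against the instances table, something like this (untested, and I'm
not sure whether consumers can also be migrations rather than VMs):

MariaDB [nova]> select a.resource_provider_id, a.consumer_id
    from nova_api.allocations a
    left join nova.instances i on i.uuid = a.consumer_id
    where i.uuid is null or i.deleted != 0;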
Thanks, that already helped a lot!
Eugen
Quoting Sean Mooney <smooney at redhat.com>:
> On Mon, 2021-03-08 at 14:18 +0000, Eugen Block wrote:
>> Thank you, Sean.
>>
>> > so you need to do
>> > openstack compute service list to get the compute service ids
>> > then do
>> > openstack compute service delete <id-1> <id-2> ...
>> >
>> > you need to make sure that you only remove the unused old services
>> > but i think that would fix your issue.
>>
>> That's the thing, they don't show up in the compute service list. But
>> I also found them in the resource_providers table; only the old
>> compute nodes appear here:
>>
>> MariaDB [nova]> select name from nova_api.resource_providers;
>> +--------------------------+
>> | name                     |
>> +--------------------------+
>> | compute1.fqdn            |
>> | compute2.fqdn            |
>> | compute3.fqdn            |
>> | compute4.fqdn            |
>> +--------------------------+
> ah in that case the compute service delete is meant to remove the RPs too
> but if the RP had stale allocations at the time of the delete the RP
> delete will fail
>
> what you probably need to do in this case is check if the RPs still
> have allocations and if so
> verify that the allocations are owned by vms that no longer exist.
> if that is the case you should be able to delete the allocation and
> then the RP
> if the allocations are related to active vms that are now on the
> rebuilt nodes then you will have to try and
> heal the allocations.
>
> there is an openstack client extension called osc-placement that you
> can install to help.
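> e.g. to remove an orphaned allocation and then the rp itself it
> would be roughly (uuids are placeholders):
>
> openstack resource provider allocation delete <consumer-uuid>
> openstack resource provider delete <rp-uuid>
>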
> we also have a heal allocation command in nova-manage that may help
> but the next step would be to validate
> if the old RPs are still in use or not. from there you can then work
> to align nova's and placement's view with
> the real topology.
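>
> the heal command is roughly:
>
> nova-manage placement heal_allocations --verbose
>
> i think newer releases also have a --dry-run flag so you can see
> what it would change first.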
>
> that could involve removing the old compute nodes from the
> compute_nodes table or marking them as deleted but
> both the nova db and placement need to be kept in sync to correct your
> current issue.
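>
> if you do touch the db directly note that nova soft deletes rows by
> setting the deleted column to the row id, so it would be something
> like this (untested, back up the db first):
>
> update nova.compute_nodes set deleted = id, deleted_at = now()
>     where host in ('compute1','compute2','compute3','compute4');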
>
>>
>>
>> Quoting Sean Mooney <smooney at redhat.com>:
>>
>> > On Mon, 2021-03-08 at 13:18 +0000, Eugen Block wrote:
>> > > Hi *,
>> > >
>> > > I have a quick question: last year we migrated our OpenStack to a
>> > > highly available environment through a reinstall of all nodes. The
>> > > migration went quite well and we're working happily in the new cloud,
>> > > but the databases still contain stale data. For example, the
>> > > nova-scheduler logs lines like these on a regular basis:
>> > >
>> > > /var/log/nova/nova-scheduler.log:2021-02-19 12:02:46.439 23540 WARNING
>> > > nova.scheduler.host_manager [...] No compute service record found for
>> > > host compute1
>> > >
>> > > This is one of the old compute nodes that has been reinstalled and is
>> > > now compute01. I tried to find the right spot to delete some lines in
>> > > the DB, but there are a couple of candidate places, so I wanted to
>> > > check with you and ask for some insights.
>> > >
>> > > The scheduler messages seem to originate in
>> > >
>> > > /usr/lib/python3.6/site-packages/nova/scheduler/host_manager.py
>> > >
>> > > ---snip---
>> > > for cell_uuid, computes in compute_nodes.items():
>> > >     for compute in computes:
>> > >         service = services.get(compute.host)
>> > >
>> > >         if not service:
>> > >             LOG.warning(
>> > >                 "No compute service record found for host %(host)s",
>> > >                 {'host': compute.host})
>> > >             continue
>> > > ---snip---
>> > >
>> > > So I figured it could be this table in the nova DB:
>> > >
>> > > ---snip---
>> > > MariaDB [nova]> select host,deleted from compute_nodes;
>> > > +-----------+---------+
>> > > | host      | deleted |
>> > > +-----------+---------+
>> > > | compute01 |       0 |
>> > > | compute02 |       0 |
>> > > | compute03 |       0 |
>> > > | compute04 |       0 |
>> > > | compute05 |       0 |
>> > > | compute1  |       0 |
>> > > | compute2  |       0 |
>> > > | compute3  |       0 |
>> > > | compute4  |       0 |
>> > > +-----------+---------+
>> > > ---snip---
>> > >
>> > > What would be the best approach here to clean up a little? I believe
>> > > it would be safe to simply purge the rows containing the old
>> > > compute nodes, but there might be a smoother way. Or maybe there are
>> > > more places to purge old data from?
>> > so the step you probably missed was deleting the old compute
>> > service records
>> >
>> > so you need to do
>> > openstack compute service list to get the compute service ids
>> > then do
>> > openstack compute service delete <id-1> <id-2> ...
>> >
>> > you need to make sure that you only remove the unused old services
>> > but i think that would fix your issue.
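>> >
>> > to narrow the list down you can filter it, e.g.
>> >
>> > openstack compute service list --service nova-compute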
>> >
>> > >
>> > > I'd appreciate any ideas.
>> > >
>> > > Regards,
>> > > Eugen
>> > >
>> > >
>>
>>
>>