[placement] Train upgrade warning
Seth Tunstall
sethp at gurukuli.co.uk
Wed Nov 4 16:54:32 UTC 2020
Hello,
In case it helps anyone else searching for this in future: Melanie's
suggestion to clean out the orphaned consumers worked perfectly in my
situation.
The last two I had were apparently left over from the original build of
this environment. I brute-force cleaned them out of the DB manually:
DELETE FROM nova_cell0.block_device_mapping WHERE
nova_cell0.block_device_mapping.instance_uuid IN (SELECT uuid FROM
nova_api.consumers WHERE nova_api.consumers.uuid NOT IN (SELECT
nova_api.allocations.consumer_id FROM nova_api.allocations));
DELETE FROM nova_cell0.instance_faults WHERE
nova_cell0.instance_faults.instance_uuid IN (SELECT uuid FROM
nova_api.consumers WHERE nova_api.consumers.uuid NOT IN (SELECT
nova_api.allocations.consumer_id FROM nova_api.allocations));
DELETE FROM nova_cell0.instance_extra WHERE
nova_cell0.instance_extra.instance_uuid IN (SELECT uuid FROM
nova_api.consumers WHERE nova_api.consumers.uuid NOT IN (SELECT
nova_api.allocations.consumer_id FROM nova_api.allocations));
DELETE FROM nova_cell0.instance_info_caches WHERE
nova_cell0.instance_info_caches.instance_uuid IN (SELECT uuid FROM
nova_api.consumers WHERE nova_api.consumers.uuid NOT IN (SELECT
nova_api.allocations.consumer_id FROM nova_api.allocations));
DELETE FROM nova_cell0.instance_system_metadata WHERE
nova_cell0.instance_system_metadata.instance_uuid IN (SELECT uuid FROM
nova_api.consumers WHERE nova_api.consumers.uuid NOT IN (SELECT
nova_api.allocations.consumer_id FROM nova_api.allocations));
DELETE FROM nova_cell0.instances WHERE nova_cell0.instances.uuid IN
(SELECT uuid FROM nova_api.consumers WHERE nova_api.consumers.uuid NOT
IN (SELECT nova_api.allocations.consumer_id FROM nova_api.allocations));
Caveat: I am not intimately familiar with how the ORM handles these DB
tables, so I may have done something stupid here.
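One way to reduce that risk is to run the shared subquery as a read-only
SELECT first and check which consumer UUIDs it returns before deleting
anything. The sketch below is an illustration, not something from this
thread: it uses Python's sqlite3 module with an in-memory database
attached under the name nova_api as a stand-in for the real MySQL
schema, and the table contents ("aaa", "bbb", "ccc") are made-up toy
data. Only the table and column names are taken from the statements
above.

```python
import sqlite3

# Same predicate as the DELETE statements above, as a read-only SELECT:
# consumers whose uuid does not appear in any allocation.
ORPHAN_CONSUMERS_SQL = """
SELECT uuid FROM nova_api.consumers
WHERE uuid NOT IN (SELECT consumer_id FROM nova_api.allocations)
ORDER BY uuid
"""


def build_toy_db():
    """Create an in-memory stand-in for the nova_api schema (toy data)."""
    conn = sqlite3.connect(":memory:")
    # Attach a second in-memory database so the schema-qualified names work.
    conn.execute("ATTACH DATABASE ':memory:' AS nova_api")
    conn.execute("CREATE TABLE nova_api.consumers (uuid TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE nova_api.allocations (consumer_id TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO nova_api.consumers VALUES (?)",
        [("aaa",), ("bbb",), ("ccc",)],
    )
    # Only "bbb" still has an allocation; "aaa" and "ccc" are orphans.
    conn.executemany(
        "INSERT INTO nova_api.allocations VALUES (?)", [("bbb",)]
    )
    return conn


def orphan_consumers(conn):
    """Return the consumer UUIDs that have no matching allocation."""
    return [row[0] for row in conn.execute(ORPHAN_CONSUMERS_SQL)]


if __name__ == "__main__":
    print(orphan_consumers(build_toy_db()))
```

Against a real deployment the same SELECT could be run from the MySQL
client; if it returns more rows than expected, that is a hint to stop
before running the DELETE statements.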
I tried to run:
nova-manage db archive_deleted_rows --verbose --until-complete --all-cells
but nova-manage complained that it didn't recognise --all-cells
Thanks very much for your help, Melanie
Seth
On 30/10/2020 16:50, melanie witt wrote:
> On 10/30/20 01:37, Seth Tunstall wrote:
>> Hello,
>>
>> On 10/28/20 12:01, melanie witt wrote:
>>> The main idea of the row deletions is to delete "orphan" records
>>> which are records tied to an instance's lifecycle when that instance
>>> no longer exists. Going forward, nova will delete these records itself
>>> at instance deletion time but did not in the past because of bugs, and
>>> any records generated before a bug was fixed will become orphaned once
>>> the associated instance is deleted.
>>
>> I've done the following in this order:
>>
>> nova-manage api_db sync
>>
>> nova-manage db sync
>>
>> (to bring the DBs up to the version I'm upgrading to (Train))
>>
>> nova-manage db archive_deleted_rows --verbose --until-complete
>
> The thing I notice here ^ is that you didn't (but should) use
> --all-cells to also clean up based on the nova_cell0 database (where
> instances that failed scheduling go). If you've ever had an instance go
> into ERROR state for failing the scheduling step and you deleted it, its
> nova_api.instance_mappings record would be a candidate for being
> archived (removed).
>
> <snip>
>
>> # placement-status upgrade check
>> +-----------------------------------------------------------------------+
>> | Upgrade Check Results                                                 |
>> +-----------------------------------------------------------------------+
>> | Check: Missing Root Provider IDs                                      |
>> | Result: Success                                                       |
>> | Details: None                                                         |
>> +-----------------------------------------------------------------------+
>> | Check: Incomplete Consumers                                           |
>> | Result: Warning                                                       |
>> | Details: There are -2 incomplete consumers table records for existing |
>> | allocations. Run the "placement-manage db                             |
>> | online_data_migrations" command.                                      |
>> +-----------------------------------------------------------------------+
>>
>> argh! again a negative number! But at least it's only 2, which is well
>> within the realm of manual fixes.
>
> The only theory I have for how this occurred is you have 2 consumers
> that are orphaned due to missing the nova_cell0 during database
> archiving ... Like if you have a couple of deleted instances in
> nova_cell0 and thus still have nova_api.instance_mappings and without
> --all-cells those instance_mappings didn't get removed and so affected
> the manual cleanup query you ran (presence of instance_mappings
> prevented deletion of 2 orphaned consumers).
>
> If that's not it, then I'm afraid I don't have any other ideas at the
> moment.
>
> -melanie