[placement] Train upgrade warning

Seth Tunstall sethp at gurukuli.co.uk
Wed Nov 4 16:54:32 UTC 2020


Hello,

In case it helps anyone else searching for this in future: Melanie's 
suggestion to clean out the orphaned consumers worked perfectly in my 
situation.

The last two I had were apparently left over from the original build of 
this environment. I brute-force cleaned them out of the DB manually:

DELETE FROM nova_cell0.block_device_mapping WHERE 
nova_cell0.block_device_mapping.instance_uuid IN (SELECT uuid FROM 
nova_api.consumers WHERE nova_api.consumers.uuid NOT IN (SELECT 
nova_api.allocations.consumer_id FROM nova_api.allocations));

DELETE FROM nova_cell0.instance_faults WHERE 
nova_cell0.instance_faults.instance_uuid IN (SELECT uuid FROM 
nova_api.consumers WHERE nova_api.consumers.uuid NOT IN (SELECT 
nova_api.allocations.consumer_id FROM nova_api.allocations));

DELETE FROM nova_cell0.instance_extra WHERE 
nova_cell0.instance_extra.instance_uuid IN (SELECT uuid FROM 
nova_api.consumers WHERE nova_api.consumers.uuid NOT IN (SELECT 
nova_api.allocations.consumer_id FROM nova_api.allocations));

DELETE FROM nova_cell0.instance_info_caches WHERE 
nova_cell0.instance_info_caches.instance_uuid IN (SELECT uuid FROM 
nova_api.consumers WHERE nova_api.consumers.uuid NOT IN (SELECT 
nova_api.allocations.consumer_id FROM nova_api.allocations));

DELETE FROM nova_cell0.instance_system_metadata WHERE 
nova_cell0.instance_system_metadata.instance_uuid IN (SELECT uuid FROM 
nova_api.consumers WHERE nova_api.consumers.uuid NOT IN (SELECT 
nova_api.allocations.consumer_id FROM nova_api.allocations));

DELETE FROM nova_cell0.instances WHERE nova_cell0.instances.uuid IN 
(SELECT uuid FROM nova_api.consumers WHERE nova_api.consumers.uuid NOT 
IN (SELECT nova_api.allocations.consumer_id FROM nova_api.allocations));

Caveat: I am not intimately familiar with how the ORM handles these DB 
tables, so I may have done something stupid here.
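
For anyone repeating this, it is probably safer to list the orphaned 
consumers before deleting anything. Below is a minimal sketch of such a 
dry run. It uses Python's sqlite3 with two cut-down, hypothetical 
stand-ins for nova_api.consumers and nova_api.allocations (the real 
tables have more columns and live in MySQL/MariaDB), and it reuses the 
NOT IN subquery from the DELETE statements above in read-only form. The 
second function is only a guess at how a negative count could arise: it 
assumes the upgrade check subtracts the number of consumers rows from 
the number of distinct consumer_id values in allocations, which I have 
not confirmed against the placement source.

```python
import sqlite3

# Hypothetical, cut-down stand-ins for nova_api.consumers and
# nova_api.allocations; the real tables have more columns.
SCHEMA = """
CREATE TABLE consumers (uuid TEXT PRIMARY KEY);
CREATE TABLE allocations (
    id INTEGER PRIMARY KEY,
    consumer_id TEXT NOT NULL
);
"""

# Same subquery as in the DELETE statements, but only selecting.
ORPHANS_SQL = """
SELECT uuid FROM consumers
WHERE uuid NOT IN (SELECT consumer_id FROM allocations)
ORDER BY uuid
"""

# Assumed shape of the check's arithmetic (not confirmed):
# distinct consumers seen in allocations minus rows in consumers.
DIFF_SQL = """
SELECT (SELECT COUNT(DISTINCT consumer_id) FROM allocations)
     - (SELECT COUNT(*) FROM consumers)
"""


def find_orphaned_consumers(conn):
    """Return the UUIDs of consumers that have no allocations."""
    return [row[0] for row in conn.execute(ORPHANS_SQL)]


def allocation_minus_consumer_count(conn):
    """Return distinct allocation consumers minus consumers rows."""
    return conn.execute(DIFF_SQL).fetchone()[0]


def build_demo_db():
    """Build an in-memory DB with one live and two orphaned consumers."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO consumers (uuid) VALUES (?)",
        [("live-1",), ("orphan-a",), ("orphan-b",)],
    )
    conn.executemany(
        "INSERT INTO allocations (consumer_id) VALUES (?)",
        [("live-1",), ("live-1",)],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    demo = build_demo_db()
    for uuid in find_orphaned_consumers(demo):
        print(uuid)
    print(allocation_minus_consumer_count(demo))
```

Against the real databases, the same SELECT (with the nova_api. 
prefixes restored) can be run in the mysql client first, and the DELETE 
statements only once the listed UUIDs look right.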

I tried to run:

nova-manage db archive_deleted_rows --verbose --until-complete --all-cells

but nova-manage complained that it didn't recognise --all-cells

Thanks very much for your help, Melanie

Seth


On 30/10/2020 16:50, melanie witt wrote:
> On 10/30/20 01:37, Seth Tunstall wrote:
>> Hello,
>>
>> On 10/28/20 12:01, melanie witt wrote:
>>> The main idea of the row deletions is to delete "orphan" records 
>>> which are records tied to an instance's lifecycle when that instance 
>>> no longer exists. Going forward, nova will delete these records itself 
>>> at instance deletion time but did not in the past because of bugs, and 
>>> any records generated before a bug was fixed will become orphaned once 
>>> the associated instance is deleted.
>>
>> I've done the following in this order:
>>
>> nova-manage api_db sync
>>
>> nova-manage db sync
>>
>> (to bring the DBs up to the version I'm upgrading to, Train)
>>
>> nova-manage db archive_deleted_rows --verbose --until-complete
> 
> The thing I notice here ^ is that you didn't (but should) use 
> --all-cells to also clean up based on the nova_cell0 database (where 
> instances that failed scheduling go). If you've ever had an instance go 
> into ERROR state for failing the scheduling step and you deleted it, its 
> nova_api.instance_mappings record would be a candidate for being 
> archived (removed).
> 
> <snip>
> 
>> # placement-status upgrade check
>> +-----------------------------------------------------------------------+
>> | Upgrade Check Results                                                 |
>> +-----------------------------------------------------------------------+
>> | Check: Missing Root Provider IDs                                      |
>> | Result: Success                                                       |
>> | Details: None                                                         |
>> +-----------------------------------------------------------------------+
>> | Check: Incomplete Consumers                                           |
>> | Result: Warning                                                       |
>> | Details: There are -2 incomplete consumers table records for existing |
>> | allocations. Run the "placement-manage db                             |
>> | online_data_migrations" command.                                      |
>> +-----------------------------------------------------------------------+
>>
>> argh! again a negative number! But at least it's only 2, which is well 
>> within the realm of manual fixes.
> 
> The only theory I have for how this occurred is you have 2 consumers 
> that are orphaned due to missing the nova_cell0 during database 
> archiving ... Like if you have a couple of deleted instances in 
> nova_cell0 and thus still have nova_api.instance_mappings and without 
> --all-cells those instance_mappings didn't get removed and so affected 
> the manual cleanup query you ran (presence of instance_mappings 
> prevented deletion of 2 orphaned consumers).
> 
> If that's not it, then I'm afraid I don't have any other ideas at the 
> moment.
> 
> -melanie


