[heat] Resource replacement terminates at DELETE_COMPLETE

Zane Bitter zbitter at redhat.com
Tue Jun 25 19:24:03 UTC 2019


On 22/06/19 11:30 AM, Erik McCormick wrote:
> Hi everyone!
> 
> I have a situation with a heat stack where it has an Octavia Load 
> Balancer resource which it thinks it's already replaced and so will not 
> recreate it.
> 
> Resource api_lb with id 3978 already replaced by 3999; not checking check 
> /var/lib/kolla/venv/lib/python2.7/site-packages/heat/engine/check_resource.py:310 
> :

Ruh-roh. What version of Heat are you using? There has been at least one 
known bug related to that check. The one that I can find easily is 
https://storyboard.openstack.org/#!/story/2001974 (fixed in Rocky; 
backported to Queens and Pike). I think there might have been earlier 
issues found but they predated the existence of that log message (those 
were fun to debug). The log message was added in Queens 
(https://review.opendev.org/533015) so in theory whatever version you're 
running, the fix should be available in the latest stable release - 
though if memory serves that only prevents the issue rather than 
recovering from it.

You'll be happy to hear that the check was eliminated forever in Stein: 
https://review.opendev.org/600278

> It goes to a DELETE_COMPLETE state and just sits there. The stack stays 
> UPDATE_IN_PROGRESS and nothing else moves. It doesn't even time out 
> after 4 hours.
> 
> Doing a stack check puts everything as CHECK_COMPLETE, even the 
> non-existent load balancers. I can mark the LB and its components 
> unhealthy and start another update, but this just repeats the cycle.
> 
> This all started with some Octavia shenanigans which ended with all the 
> load balancers being deleted manually. I have 2 similar stacks which 
> recreated fine, but this one went through the cycle several other times 
> as we were trying to fix the LB problem. This is a super edge case, but 
> hopefully someone has another idea how to get out of it.

If you're up for some database hacking, removing that (DELETE_COMPLETE) 
resource ought to get you unblocked:

 > DELETE FROM resource WHERE id=3978;

Obviously take appropriate precautions, back up the DB first, &c.
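
For anyone who would rather not type the DELETE by hand, here is a rough 
sketch of one way to guard it: look at the row first, delete only if it 
really is a DELETE_COMPLETE resource that has been replaced, and roll 
back unless exactly one row was affected. Treat it as an illustration 
rather than a tested procedure. The function name, the toy table and its 
rows are hypothetical, and the column names (action, status, replaced_by) 
are an assumption about the Heat schema that should be checked against 
your own database. It runs against an in-memory SQLite stand-in; a MySQL 
or MariaDB driver would typically need placeholder="%s".

```python
import sqlite3


def delete_replaced_resource(conn, resource_id, placeholder="?"):
    """Delete one row from `resource`, only if it looks like the stale one.

    Hypothetical helper. Returns True if exactly one row was deleted and
    committed; otherwise rolls back and returns False.
    """
    cur = conn.cursor()
    try:
        # Assumed columns: action, status, replaced_by (verify locally).
        cur.execute(
            "SELECT action, status, replaced_by FROM resource WHERE id = "
            + placeholder,
            (resource_id,),
        )
        row = cur.fetchone()
        if row is None:
            conn.rollback()
            return False
        action, status, replaced_by = row
        # Only touch a row that is DELETE_COMPLETE and marked as replaced.
        if (action, status) != ("DELETE", "COMPLETE") or replaced_by is None:
            conn.rollback()
            return False
        cur.execute(
            "DELETE FROM resource WHERE id = " + placeholder,
            (resource_id,),
        )
        if cur.rowcount != 1:
            conn.rollback()
            return False
        conn.commit()
        return True
    except Exception:
        conn.rollback()
        raise


def make_toy_db():
    """Build an in-memory stand-in table with made-up rows for a dry run."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE resource (id INTEGER PRIMARY KEY, name TEXT, "
        "action TEXT, status TEXT, replaced_by INTEGER)"
    )
    conn.executemany(
        "INSERT INTO resource VALUES (?, ?, ?, ?, ?)",
        [
            (3978, "api_lb", "DELETE", "COMPLETE", 3999),
            (3999, "api_lb", "CREATE", "COMPLETE", None),
        ],
    )
    conn.commit()
    return conn
```

Calling delete_replaced_resource(make_toy_db(), 3978) removes only the 
stale row; the same call with 3999 refuses, because that row is not in 
DELETE_COMPLETE. None of this replaces the database backup.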

cheers,
Zane.


