[nova][ops] What should the compute service delete behavior be wrt resource providers with allocations?

Mohammed Naser mnaser at vexxhost.com
Wed Jun 12 23:26:06 UTC 2019


On Wed, Jun 12, 2019 at 4:44 PM Matt Riedemann <mriedemos at gmail.com> wrote:
>
> Before [1] when deleting a compute service in the API we did not check
> to see if the compute service was hosting any instances and just blindly
> deleted the service and related compute_node(s) records which orphaned
> the resource provider(s) for those nodes.
>
> With [2] we built on that and would clean up the (first [3]) compute
> node resource provider by first deleting any allocations for instances
> still on that host - which, because of the check in [1], should be none -
> and then deleting the resource provider itself.
>
> [2] forgot about ironic, where a single compute service can be managing
> many (hundreds or even thousands of) baremetal compute nodes, so I
> wrote [3] to delete *all* resource providers for compute nodes tied to
> the service - again barring there being any instances running on the
> service, because of the check added in [1].
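>
> To make that concrete, here is a rough sketch of the flow that
> [1]+[2]+[3] add up to - approximate names for the nova internals, not
> the code verbatim, and the helpers are stand-ins:
>
>     # Sketch only: instance_count_on_host / get_compute_nodes /
>     # HTTPConflict are stand-ins for the real nova/webob pieces.
>     def delete_compute_service(context, service):
>         if instance_count_on_host(context, service.host):
>             # The [1] check: refuse to delete while instances remain.
>             raise HTTPConflict()
>         for node in get_compute_nodes(context, service):
>             # [3]: iterate *all* nodes, not just the first, so the
>             # ironic many-nodes-per-service case is covered.
>             # [2]: cascade deletes allocations for instances still on
>             # the node (should be none after [1]), then the provider.
>             delete_resource_provider(context, node, cascade=True)
>         service.destroy()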
>
> What we've failed to realize until recently is that there are cases
> where deleting the resource provider can still fail because there are
> allocations we haven't cleaned up, namely:
>
> 1. Residual allocations for evacuated instances from a source host.
>
> 2. Allocations held by a migration record for an unconfirmed (or not yet
> complete) migration.
>
> Because the delete_resource_provider method isn't checking for those, we
> can get ResourceProviderInUse errors which are then ignored [4]. Since
> that error is ignored, we continue on to delete the compute service
> record [5], effectively orphaning the providers (which is what [2] was
> meant to fix). I have recreated the evacuate scenario in a functional
> test here [6].
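>
> The ignore at [4] is essentially this pattern (paraphrased, not the
> exact code):
>
>     # Paraphrase of [4]/[5]: the provider delete swallows the error,
>     # so the service delete proceeds and the provider is orphaned.
>     try:
>         _delete_provider(rp_uuid)
>     except ResourceProviderInUse:
>         # Evacuation leftovers or migration-held allocations land
>         # here; today we just log a warning and move on.
>         LOG.warning('Unable to delete resource provider %s', rp_uuid)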
>
> The question is what should we do about the fix? I'm getting lost
> thinking about this in a vacuum so trying to get some others to help
> think about it.
>
> Clearly with [1] we said you shouldn't be able to delete a compute
> service that has instances on it because that corrupts our resource
> tracking system. If we extend that to any allocations held against
> providers for that compute service, then the fix might be as simple as
> not ignoring the ResourceProviderInUse error and failing if we can't
> delete the provider(s).
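>
> In code terms that's just letting the error bubble up - a sketch, with
> approximate names:
>
>     # Option 1 sketch: stop swallowing ResourceProviderInUse and turn
>     # it into a 409 so we never delete the service record out from
>     # under live allocations.
>     try:
>         delete_resource_provider(context, node, cascade=True)
>     except ResourceProviderInUse:
>         raise HTTPConflict('Resource providers for host %s still have '
>                            'allocations; clean those up first.'
>                            % service.host)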
>
> The question I'm struggling with is what does an operator do for the two
> cases mentioned above, not-yet-complete migrations and evacuated
> instances? For migrations, that seems pretty simple - wait for the
> migration to complete and confirm it (reverting a cold migration or
> resize would put the instance back on the compute service host you're
> trying to delete).
>
> The nastier thing is the allocations tied to an evacuated instance since
> those don't get cleaned up until the compute service is restarted [7].
> If the operator never intends on restarting that compute service and
> just wants to clear the data, then they have to manually delete the
> allocations for the resource providers associated with that host before
> they can delete the compute service, which kind of sucks.
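>
> For reference, that manual cleanup amounts to something like the
> following against the placement REST API (PLACEMENT and TOKEN are
> placeholders for a real endpoint and admin token; the osc-placement
> CLI can do the same thing per consumer):
>
>     # Hypothetical operator script, not something nova ships today.
>     import requests
>
>     PLACEMENT = 'http://placement.example.com'  # placeholder endpoint
>     TOKEN = '...'                                # placeholder token
>     HEADERS = {'x-auth-token': TOKEN}
>
>     def purge_provider_allocations(rp_uuid):
>         # List the consumers holding allocations against the provider.
>         allocs = requests.get(
>             '%s/resource_providers/%s/allocations' % (PLACEMENT, rp_uuid),
>             headers=HEADERS).json()['allocations']
>         for consumer_uuid in allocs:
>             # NB: this drops the consumer's allocations on *every*
>             # provider, not just this one; heal_allocations can put
>             # the destination-host allocations back afterward.
>             requests.delete(
>                 '%s/allocations/%s' % (PLACEMENT, consumer_uuid),
>                 headers=HEADERS)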
>
> What are our options?
>
> 1. Don't delete the compute service if we can't clean up all resource
> providers - make sure not to orphan any providers. Manual cleanup by
> the operator may be necessary.

I'm personally in favor of this. I think that currently a lot of operators
don't really think about the placement service much (or perhaps don't
really know what it's doing).

There's a lack of transparency in the data that exists in that service; a
lot of users will actually rely on the information fed by *nova* and not
*placement*.

Because of this, I've seen a lot of deployments with stale placement
records, or clouds where the hypervisors are not used efficiently because
of a bunch of stale resource allocations that were never cleaned up (and
counting on deployers watching the logs for warnings... eh).

I would be more in favor of failing a delete if it would leave the cloud
in an inconsistent state than brute-forcing a delete that leaves you in a
messy state where you need to log into the database to unkludge things.

> 2. Change delete_resource_provider cascade=True logic to remove all
> allocations for the provider before deleting it, i.e. for
> not-yet-complete migrations and evacuated instances. For the evacuated
> instance allocations this is likely OK since restarting the source
> compute service is going to do that cleanup anyway. Also, if you delete
> the source compute service during a migration, confirming or reverting
> the resize later will likely fail since we'd be casting to something
> that is gone (and we'd orphan those allocations). Maybe we need a
> functional recreate test for the unconfirmed migration scenario before
> deciding on this?
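>
> Roughly, that widened cascade would look like this (a sketch; the
> allocation helpers are stand-ins):
>
>     # Option 2 sketch: purge *all* allocations held against the
>     # provider (instance, evacuation leftover, or migration-record
>     # consumers) before deleting it.
>     def delete_resource_provider(context, node, cascade=True):
>         if cascade:
>             for consumer_uuid in get_allocations_for_provider(node.uuid):
>                 delete_allocations_for_consumer(context, consumer_uuid)
>         _delete_provider(node.uuid)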
>
> 3. Other things I'm not thinking of? Should we add a force parameter to
> the API to allow the operator to forcefully delete (#2 above) if #1
> fails? Force parameters are hacky and usually seem to cause more
> problems than they solve, but it does put the control in the operator's
> hands.
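>
> A force flag would essentially glue #1 and #2 together - hypothetical
> sketch only, and any such parameter would need a new microversion:
>
>     # Hypothetical: fail by default (#1), purge the allocations only
>     # when the operator insists (#2).
>     def delete(self, req, service_id):
>         force = req.params.get('force', False)  # hypothetical parameter
>         try:
>             self._delete_providers(context, service,
>                                    purge_allocations=force)
>         except ResourceProviderInUse:
>             raise HTTPConflict('Providers still have allocations; pass '
>                                'force=True to delete them as well.')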
>
> If we did remove allocations for an instance when deleting its compute
> service host, the operator should be able to get them back by running
> the "nova-manage placement heal_allocations" CLI - assuming they restart
> the compute service on that host. This would have to be tested, of course.
>
> Help me, Obi-Wan Kenobi. You're my only hope.
>
> [1] https://review.opendev.org/#/q/I0bd63b655ad3d3d39af8d15c781ce0a45efc8e3a
> [2] https://review.opendev.org/#/q/I7b8622b178d5043ed1556d7bdceaf60f47e5ac80
> [3] https://review.opendev.org/#/c/657016/
> [4]
> https://github.com/openstack/nova/blob/cb0cfc90e1e03e82c42187ec60f46fb8fd590a06/nova/scheduler/client/report.py#L2180
> [5]
> https://github.com/openstack/nova/blob/cb0cfc90e1e03e82c42187ec60f46fb8fd590a06/nova/api/openstack/compute/services.py#L279
> [6] https://review.opendev.org/#/c/663737/
> [7]
> https://github.com/openstack/nova/blob/cb0cfc90e1e03e82c42187ec60f46fb8fd590a06/nova/compute/manager.py#L706
>
> --
>
> Thanks,
>
> Matt
>


-- 
Mohammed Naser — vexxhost
-----------------------------------------------------
D. 514-316-8872
D. 800-910-1726 ext. 200
E. mnaser at vexxhost.com
W. http://vexxhost.com


