[nova][ptg] Summary: Implicit trait-based filters
mriedemos at gmail.com
Wed May 8 20:41:25 UTC 2019
On 5/6/2019 1:44 PM, Eric Fried wrote:
> There's another implicit trait-based filter that bears mentioning:
> Excluding disabled compute hosts.
> We have code that disables a compute service when "something goes wrong"
> in various ways. This code should decorate the compute node's resource
> provider with a COMPUTE_SERVICE_DISABLED trait, and every GET
> /allocation_candidates request should include
> ?required=!COMPUTE_SERVICE_DISABLED, so that we don't retrieve
> allocation candidates for disabled hosts.
> mriedem has started to prototype the code for this.
> Action: Spec to be written. Code to be polished up. Possibly aspiers to
> be involved in this bit as well.
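To make the quoted proposal concrete, here is a minimal sketch of how a
request with the forbidden trait could be built. The helper name
allocation_candidates_url and the sample resource amounts are made up
for illustration and are not taken from the PoC; only the trait name and
the ?required=!TRAIT form come from the summary above.

```python
from urllib.parse import urlencode

# Trait name as proposed in the summary above.
DISABLED_TRAIT = "COMPUTE_SERVICE_DISABLED"


def allocation_candidates_url(resources, required=()):
    """Build a GET /allocation_candidates URL that always forbids
    providers carrying the disabled trait (hypothetical helper)."""
    # A leading "!" marks a trait as forbidden rather than required.
    traits = list(required) + ["!" + DISABLED_TRAIT]
    query = {
        "resources": ",".join(
            f"{rc}:{amount}" for rc, amount in sorted(resources.items())
        ),
        "required": ",".join(traits),
    }
    # Keep "!", "," and ":" readable instead of percent-encoding them.
    return "/allocation_candidates?" + urlencode(query, safe="!,:")
```

Calling allocation_candidates_url({"VCPU": 1, "MEMORY_MB": 512}) would
append the forbidden trait to whatever traits the caller already
requires, so every request excludes disabled hosts.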
Here is the spec. There are noted TODOs and quite a few alternatives
listed, mostly alternatives to the proposed design and what's in my PoC.
One thing my PoC didn't cover was the service group API, which
automatically reports a service as up or down. I think that will have
to be incorporated into this, but how best to do that without having
this 'disabled' trait management everywhere might be tricky. My PoC
tries to make the compute the single place we manage the trait, but
that's also problematic if we lose a race with the API to disable a
compute before the compute dies, or if MQ drops the call, etc.
We might need/want to hook into the update_available_resource periodic
to heal / sync the trait if we have an issue like that, or on startup
during upgrade, and we likely also need a CLI to sync the trait status
manually - at least to aid with the upgrade.
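As a rough sketch of what that heal / sync step might look like,
assuming the service's disabled flag is the source of truth and the
trait just follows it. The function name sync_disabled_trait and its
arguments are hypothetical and not from the PoC; a real version would
read the flag from the service record and write the result back to
placement.

```python
# Trait name as proposed in the summary quoted above.
DISABLED_TRAIT = "COMPUTE_SERVICE_DISABLED"


def sync_disabled_trait(service_disabled, provider_traits):
    """Return (new_traits, changed) so the provider's traits match the
    service's disabled flag (hypothetical helper, no I/O)."""
    current = set(provider_traits)
    desired = set(current)
    if service_disabled:
        # Service is disabled: make sure the trait is present.
        desired.add(DISABLED_TRAIT)
    else:
        # Service is enabled: drop a stale trait if one was left behind.
        desired.discard(DISABLED_TRAIT)
    # "changed" tells the caller whether an update to placement is needed.
    return desired, desired != current
```

The same function could back both the periodic task and a manual CLI,
since it only computes the desired trait set and leaves the actual
update to the caller.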
Who knew that managing a status reporting daemon could be complicated
(oh right everyone).