[openstack-dev] [nova][placement] Trying to summarize bp/glance-image-traits scheduling alternatives for rebuild

Eric Fried openstack at fried.cc
Mon Apr 23 20:26:25 UTC 2018


Semantically, GET /allocation_candidates where we don't actually want to
allocate anything (i.e. we don't want to use the returned candidates) is
goofy, and talking about what the result would look like when there's no
`resources` is going to spider into some weird questions.

Like what does the response payload look like?  In the "good" scenario,
you would be expecting an allocation_request like:

            "allocations": {
                $rp_uuid: {
                    "resources": {
                        # Nada
                    }
                }
            }

...which is something we discussed recently [1] in relation to "anchor"
providers, and killed.

No, the question you're really asking in this case is, "Do the resource
providers in this tree contain (or not contain) these traits?"  Which,
to me, translates directly to:

 GET /resource_providers?in_tree=$rp_uuid&required={$TRAIT|!$TRAIT, ...}

...which we already support.  The answer is a list of providers. Compare
that to the providers from which resources are already allocated, and
Bob's your uncle.
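
A rough sketch of that check in Python, for illustration only. It assumes
the response body carries a "resource_providers" list whose entries have a
"uuid" key, and that the caller already knows which providers the instance
has allocations against. The helper names are made up here, and no HTTP
call is made:

```python
from urllib.parse import urlencode


def build_query(rp_uuid, required=(), forbidden=()):
    """Build the GET /resource_providers URL for one provider tree.

    Forbidden traits are expressed with a leading "!" in `required`.
    """
    traits = list(required) + ["!" + t for t in forbidden]
    params = {"in_tree": rp_uuid}
    if traits:
        params["required"] = ",".join(traits)
    # Leave "," and "!" unescaped so the query stays readable.
    return "/resource_providers?" + urlencode(params, safe=",!")


def matching_allocated_providers(response_body, allocated_rp_uuids):
    """Return the allocated providers that also satisfy the trait query."""
    matched = {rp["uuid"] for rp in response_body["resource_providers"]}
    return matched & set(allocated_rp_uuids)
```

An empty result from matching_allocated_providers() would mean none of the
providers the instance is consuming from matched the image's traits.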

(I do find it messy/weird that the required/forbidden traits in the
image meta are supposed to apply *anywhere* in the provider tree.  But I
get that that's probably going to make the most sense.)
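
To make the "anywhere in the tree" reading concrete, here is a small
sketch, assuming the traits of every provider in the tree have already
been collected into a dict keyed by provider UUID. The function name and
input shape are hypothetical. Note this is a tree-wide check, which, as I
read the API, is looser than a per-provider `required` filter where a
single provider has to carry the whole list:

```python
def tree_satisfies(traits_by_provider, required=(), forbidden=()):
    """Check image traits against the union of traits in one tree.

    traits_by_provider: {rp_uuid: iterable of trait names} (assumed shape).
    A required trait may live on any provider in the tree; a forbidden
    trait must not appear on any of them.
    """
    all_traits = set().union(*traits_by_provider.values())
    return set(required) <= all_traits and all_traits.isdisjoint(forbidden)
```

For example, with trait A on the root provider and trait B on a nested
one, required=[A, B] passes here even though no single provider has both.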

[1] http://lists.openstack.org/pipermail/openstack-dev/2018-April/129408.html

On 04/23/2018 02:48 PM, Matt Riedemann wrote:
> We seem to be at a bit of an impasse in this spec amendment [1] so I
> want to try and summarize the alternative solutions as I see them.
> 
> The overall goal of the blueprint is to allow defining traits via image
> properties, like flavor extra specs. Those image-defined traits are used
> to filter hosts during scheduling of the instance. During server create,
> that filtering happens during the normal "GET /allocation_candidates"
> call to placement.
> 
> The problem is during rebuild with a new image that specifies new
> required traits. A rebuild is not a move operation, but we run through
> the scheduler filters to make sure the new image (if one is specified)
> is valid for the host on which the instance is currently running.
> 
> We don't currently call "GET /allocation_candidates" during rebuild
> because that could inadvertently filter out the host we know we need
> [2]. Also, since flavors don't change for rebuild, we haven't had a need
> for getting allocation candidates during rebuild since we're not
> allocating new resources (pretend bug 1763766 [3] does not exist for now).
> 
> Now that we know the problem, here are some of the solutions that have
> been discussed in the spec amendment, again, only for rebuild with a new
> image that has new traits:
> 
> 1. Fail in the API saying you can't rebuild with a new image with new
> required traits.
> 
> Pros:
> 
> - Simple way to keep the new image off a host that doesn't support it.
> - Similar solution to volume-backed rebuild with a new image.
> 
> Cons:
> 
> - Confusing user experience since they might be able to rebuild with
> some new images but not others with no clear explanation about the
> difference.
> 
> 2. Have the ImagePropertiesFilter call "GET
> /resource_providers/{rp_uuid}/traits" and compare the compute node root
> provider traits against the new image's required traits.
> 
> Pros:
> 
> - Avoids having to call "GET /allocation_candidates" during rebuild.
> - Simple way to compare the required image traits against the compute
> node provider traits.
> 
> Cons:
> 
> - Does not account for nested providers so the scheduler could reject
> the image due to its required traits which actually apply to a nested
> provider in the tree. This is somewhat related to bug 1763766.
> 
> 3. Slight variation on #2 except build a set of all traits from all
> providers in the same tree.
> 
> Pros:
> 
> - Handles the nested provider traits issue from #2.
> 
> Cons:
> 
> - Duplicates filtering in ImagePropertiesFilter that could otherwise
> happen in "GET /allocation_candidates".
> 
> 4. Add a microversion to change "GET /allocation_candidates" to make two
> changes:
> 
> a) Add an "in_tree" filter like in "GET /resource_providers". This would
> be needed to limit the scope of what gets returned since we know we only
> want to check against one specific host (the current host for the
> instance).
> 
> b) Make "resources" optional since on a rebuild we don't want to
> allocate new resources (again, notwithstanding bug 1763766).
> 
> Pros:
> 
> - We can call "GET /allocation_candidates?in_tree=<current node rp
> UUID>&required=<new image required traits>" and if nothing is returned,
> we know the new image's required traits don't work with the current node.
> - The filtering is baked into "GET /allocation_candidates" and not
> client-side in ImagePropertiesFilter.
> 
> Cons:
> 
> - Requires changes to the "GET /allocation_candidates" API, which will
> be more complicated and more up-front work. I don't have a good idea of
> how hard this would be to add, though, since we already have the same
> "in_tree" logic in "GET /resource_providers".
> - Potentially slows down the completion of the overall blueprint.
> 
> ===========
> 
> My personal thoughts are, I don't like option 1 since it adds technical
> debt which we'll eventually just need to solve later (think about [4]).
> Similar feelings for #2. #3 might be a short-term solution until #4 is
> done, but I think the best long-term solution to this problem is #4.
> 
> [1] https://review.openstack.org/#/c/560718/
> [2] https://review.openstack.org/#/c/546357/
> [3] https://bugs.launchpad.net/nova/+bug/1763766
> [4] https://review.openstack.org/#/c/532407/
> 


