[nova] Validation for requested host/node on server create

Matt Riedemann mriedemos at gmail.com
Thu May 23 16:37:03 UTC 2019


On 5/23/2019 9:46 AM, Surya Seetharaman wrote:
>     1. Omit the validation in the API and let the scheduler do the
>     validation.
> 
>     Pros: no performance impact in the API when creating server(s)
> 
>     Cons: if the host/node does not exist, the user will get a 202 response
>     and eventually a NoValidHost error, which is not a great user
>     experience, although it is what happens today with the
>     availability_zone forced host/node behavior we already have [3], so
>     maybe it's acceptable.
> 
> 
> 
> What I had in mind when suggesting this was to actually return a 
> Host/NodeNotFound exception from the host_manager [1] if the host or 
> node doesn't exist, instead of conflating that with the NoValidHost 
> exception, which is usually associated with host capacity and so isn't 
> really accurate here. I know it has already been implemented as a 
> NoValidHost [2] but we could change this.

The point is that by the time we hit this, we've given the user a 202 
and eventually scheduling is going to fail. It doesn't matter if it's 
NoValidHost or HostNotFound or NodeNotFound or MyToiletIsBroken; the 
server is going to go to ERROR state and the user has to figure out why 
from the fault information, which is poor UX IMO.
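
To illustrate the distinction we're talking about, the kind of 
host_manager change Surya is describing might look roughly like the 
sketch below. The names here are hypothetical, not the actual 
nova.scheduler.host_manager code:

    # Hypothetical sketch: distinguish "host doesn't exist" from
    # "no host has capacity" when filtering on a requested host.
    from nova import exception

    def _get_requested_host_state(host_states, requested_host):
        matches = [hs for hs in host_states
                   if hs.host == requested_host]
        if not matches:
            # Surya's suggestion: raise a "not found" error rather
            # than NoValidHost since this is not a capacity problem.
            raise exception.ComputeHostNotFound(host=requested_host)
        return matches[0]

Either way, by the time this code runs the API has already returned 
the 202.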

> 
>     3. Validate both the host and node in the API. This can be broken down:
> 
>     a) If only host is specified, do #2 above.
>     b) If only node is specified, iterate the cells looking for the node
>     (or query a resource provider with that name in placement, which
>     would avoid down-cell issues).
>     c) If both host and node are specified, get the HostMapping and from
>     that look up the ComputeNode in the given cell (per the HostMapping).
> 
>     Pros: fail-fast behavior in the API if the host and/or node do not
>     exist.
> 
>     Cons: performance hit in the API to validate the host/node, and
>     redundancy with the scheduler in finding the ComputeNode to get its
>     uuid for the in_tree filtering on GET /allocation_candidates.
> 
> 
> I don't mind if we do this, as long as we don't hit all the cells twice 
> (in the API and the scheduler), which like you said could be avoided by 
> going to placement.

Yeah, I think we can more efficiently check for the node using placement 
(this was Sean Mooney's idea while we were talking about it yesterday in 
IRC).
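
To make option 3 a bit more concrete, the API-side validation could look 
roughly like the following sketch. This is illustrative only, not a 
proposed patch: error handling is elided and get_provider_by_name is a 
stand-in for whatever placement client call we end up using.

    # Rough sketch of API-side host/node validation for option 3.
    from nova import context as nova_context
    from nova import objects

    def validate_host_node(ctxt, report_client, host=None, node=None):
        if host and node:
            # 3c: the HostMapping tells us the cell; then look up the
            # ComputeNode in that cell.
            hm = objects.HostMapping.get_by_host(ctxt, host)
            with nova_context.target_cell(ctxt, hm.cell_mapping) as cctxt:
                return objects.ComputeNode.get_by_host_and_nodename(
                    cctxt, host, node)
        if node:
            # 3b: instead of iterating all the cells, ask placement for
            # a resource provider with that name, which also avoids
            # down-cell issues.
            return report_client.get_provider_by_name(ctxt, node)
        # 3a / #2: host only; the HostMapping existence check is enough
        # (get_by_host raises HostMappingNotFound if it's missing).
        return objects.HostMapping.get_by_host(ctxt, host)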

> 
> 
>     Note that if we do find the ComputeNode in the API, we could also
>     (later?) make a change to the Destination object to add a node_uuid
>     field so we can pass that through on the RequestSpec from
>     API->conductor->scheduler, which should remove the need for the
>     duplicate query in the scheduler code for the in_tree logic.
> 
> 
> I guess we discussed this in a similar(ly different) situation and 
> decided against it [3].

I'm having a hard time dredging up the context on that conversation, but 
unless I'm mistaken I think that was about the RequestGroup vs the 
Destination object. Because of when and where the RequestGroup stuff 
happens today, we can't really use that from the API to set in_tree 
early, which is why the API code is only setting the 
RequestSpec.requested_destination (Destination object) field with the 
host/node values.
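
As a sketch of what I mean (the node_uuid field below is the 
hypothetical new field and doesn't exist today; the rest is the current 
plumbing):

    # In the API, after looking up / validating the ComputeNode:
    from nova import objects

    destination = objects.Destination(host=host, node=node)
    # Hypothetical new field carrying the node's uuid so the scheduler
    # wouldn't have to look up the ComputeNode again:
    # destination.node_uuid = compute_node.uuid
    request_spec.requested_destination = destination

The scheduler could then feed that uuid straight into the in_tree query 
parameter on GET /allocation_candidates instead of re-querying the cell 
database for the ComputeNode.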

-- 

Thanks,

Matt


