[nova] Validation for requested host/node on server create
mriedemos at gmail.com
Thu May 23 16:37:03 UTC 2019
On 5/23/2019 9:46 AM, Surya Seetharaman wrote:
> 1. Omit the validation in the API and let the scheduler do the validation.
> Pros: no performance impact in the API when creating server(s)
> Cons: if the host/node does not exist, the user will get a 202 response
> and eventually a NoValidHost error, which is not a great user experience,
> although it is what happens today with the availability_zone forced
> host/node behavior we already have, so maybe it's acceptable.
> What I had in mind when suggesting this was to actually return a
> Host/NodeNotFound exception from the host_manager instead of
> conflating that with the NoValidHost exception, since it's actually not
> a NoValidHost (which is usually associated with host capacity) when the
> host or node doesn't exist. I know that it has already been implemented
> as a NoValidHost, but we could change this.
The point is that by the time we hit this, we've given the user a 202 and
eventually scheduling is going to fail. It doesn't matter if it's
NoValidHost or HostNotFound or NodeNotFound or MyToiletIsBroken; the
server is going to go to ERROR state and the user has to figure out why
from the fault information, which is poor UX IMO.
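To make that concrete, here is a minimal sketch (illustrative names only, not actual nova code) of why the exception type doesn't help the user in option 1: the API has already returned 202 by the time scheduling runs, so any scheduler-side failure only surfaces later as an instance fault.

```python
# Hypothetical sketch: the exception class the scheduler raises does not
# change what the user sees, because the API already returned 202.

class NoValidHost(Exception):
    pass

class HostNotFound(Exception):
    pass

def create_server(req_host, known_hosts):
    # API path under option 1: no validation, so this always "succeeds"
    # with a 202 regardless of whether req_host exists.
    return 202

def schedule(req_host, known_hosts):
    # Scheduler path, runs asynchronously after the 202. Raising
    # HostNotFound here instead of NoValidHost is more precise, but the
    # server still ends up in ERROR state from the user's perspective.
    if req_host not in known_hosts:
        raise HostNotFound(req_host)
    return req_host

hosts = {"compute1", "compute2"}
assert create_server("missing-host", hosts) == 202  # user sees success

instance_state = "ACTIVE"
try:
    schedule("missing-host", hosts)
except HostNotFound:
    instance_state = "ERROR"  # user must dig into the fault info
assert instance_state == "ERROR"
```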
> 3. Validate both the host and node in the API. This can be broken down:
> a) If only the host is specified, do #2 above.
> b) If only the node is specified, iterate the cells looking for the node
> (or query a resource provider with that name in placement, which would
> avoid down-cell issues).
> c) If both host and node are specified, get the HostMapping and from
> that look up the ComputeNode in the given cell (per the HostMapping).
> Pros: fail-fast behavior in the API if the host and/or node do not
> exist.
> Cons: performance hit in the API to validate the host/node, and
> redundancy with the scheduler to find the ComputeNode to get its uuid
> for the in_tree filtering on GET /allocation_candidates.
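A rough sketch of what the option 3 (a/b/c) checks could look like, using in-memory stand-ins for the HostMapping table, the cell databases, and placement. All names here are illustrative, not actual nova code:

```python
# Stand-ins for the real lookups; in nova these would be the HostMapping
# table in the API DB, a ComputeNode query in the mapped cell, and a
# placement GET /resource_providers?name=<node> call.
HOST_MAPPINGS = {"compute1": "cell1"}               # host -> cell
CELL_NODES = {"cell1": {("compute1", "node-a")}}    # cell -> (host, node)
PLACEMENT_PROVIDERS = {"node-a": "uuid-1"}          # provider name -> uuid

def validate_destination(host=None, node=None):
    """Fail fast in the API if the requested host/node does not exist."""
    if host:
        # (a) and (c): host given, so the HostMapping tells us the cell.
        cell = HOST_MAPPINGS.get(host)
        if cell is None:
            raise ValueError("Compute host %s could not be found." % host)
        if node and (host, node) not in CELL_NODES[cell]:
            # (c): look up the node in that one cell only.
            raise ValueError("Node %s not found on host %s." % (node, host))
    elif node:
        # (b): no host, so rather than iterating every cell DB, ask
        # placement for a resource provider with that name (this also
        # avoids down-cell issues).
        if node not in PLACEMENT_PROVIDERS:
            raise ValueError("Compute node %s could not be found." % node)

validate_destination(host="compute1", node="node-a")  # passes silently
try:
    validate_destination(node="node-z")
    raise AssertionError("expected a fast failure")
except ValueError:
    pass  # the API can turn this into a 400 before returning 202
```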
> I don't mind if we did this as long as we don't hit all the cells twice
> (in the API and the scheduler), which like you said could be avoided by
> going to placement.
Yeah I think we can more efficiently check for the node using placement
(this was Sean Mooney's idea while talking about it yesterday in IRC).
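For reference, placement's GET /resource_providers API accepts a name filter, so the node check can be a single call rather than a per-cell scan. A small sketch of parsing such a response (the sample payload is trimmed to just the fields we care about):

```python
def node_uuid_from_placement(resp_json, node_name):
    """Pull the provider uuid for a compute node out of a placement
    GET /resource_providers?name=<node_name> response.

    Compute node resource providers are named after their
    hypervisor_hostname, so a name match identifies the node.
    """
    for rp in resp_json.get("resource_providers", []):
        if rp["name"] == node_name:
            return rp["uuid"]
    return None  # no such provider: the node does not exist

# Trimmed example of the placement response body.
sample = {"resource_providers": [{"name": "node-a", "uuid": "uuid-1"}]}
assert node_uuid_from_placement(sample, "node-a") == "uuid-1"
assert node_uuid_from_placement(sample, "node-z") is None
```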
> Note that if we do find the ComputeNode in the API, we could also
> (later?) make a change to the Destination object to add a node_uuid
> field so we can pass that through on the RequestSpec from
> API->conductor->scheduler and that should remove the need for the
> duplicate query in the scheduler code for the in_tree logic.
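A hedged sketch of what that could look like: a node_uuid field on the Destination object that the scheduler reads when building the in_tree query parameter for GET /allocation_candidates. Note in_tree is a real placement query parameter, but the node_uuid field is the proposal here, not merged nova code:

```python
# Illustrative sketch only: Destination here mirrors the proposed shape
# of the nova object, it is not the real versioned object.

class Destination:
    def __init__(self, host=None, node=None, node_uuid=None):
        self.host = host
        self.node = node
        self.node_uuid = node_uuid  # proposed new field

def allocation_candidates_params(dest):
    """Build query params for GET /allocation_candidates.

    If the API already resolved the node to a provider uuid, the
    scheduler can pass it straight through as in_tree instead of
    re-querying the cell DB for the ComputeNode.
    """
    params = {}
    if dest.node_uuid:
        # in_tree restricts candidates to that provider's tree.
        params["in_tree"] = dest.node_uuid
    return params

dest = Destination(host="compute1", node="node-a", node_uuid="uuid-1")
assert allocation_candidates_params(dest) == {"in_tree": "uuid-1"}
assert allocation_candidates_params(Destination()) == {}
```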
> I guess we discussed this in a similar(ly different) situation and
> decided against it.
I'm having a hard time dredging up the context on that conversation, but
unless I'm mistaken I think that was talking about the RequestGroup vs
the Destination object. Because of when and where the RequestGroup stuff
happens today, we can't really use that from the API to set in_tree
early, which is why the API code is only setting the
RequestSpec.requested_destination (Destination object) field with the
requested host/node.