[openstack-dev] [nova][placement] Trying to summarize bp/glance-image-traits scheduling alternatives for rebuild

Matt Riedemann mriedemos at gmail.com
Wed May 2 13:46:51 UTC 2018

On 4/23/2018 4:51 PM, Arvind N wrote:
> For #1, we can make the explanation very clear that we rejected the 
> request because the original traits specified in the original image and 
> the new traits specified in the new image do not match and hence rebuild 
> is not supported.

We don't reject rebuild requests today where you rebuild with a new 
image as long as that new image passes the scheduler filters for the 
host on which the instance is already running. I don't see why we'd just 
immediately fail in the API because the new image has required traits, 
when we have no idea, from the nova-api service, whether or not those 
image-defined required traits are going to match the current host or 
not. That's just adding technical debt to rebuild, like we have for 
rebuilding a volume-backed instance with a new image (you can't do it 
today because it wasn't thought about early enough in the design process).

> For #2,
> Other Cons:
>  1. None of the filters currently make other API requests and my
>     understanding is we want to avoid reintroducing such a pattern. But
>     it is definitely a workable solution.

For a rebuild-specific request (which we can determine already), I'm OK 
with this - we're already not calling GET /allocation_candidates in this 
case, so if people are worried about performance, it's just a trade of 
one REST API call for another.
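
A rough sketch of what a rebuild-only trait check in a filter could look 
like. This is not nova code: the function name, its arguments and the 
get_provider_traits callable (standing in for the single REST call to 
placement) are all hypothetical, and it only looks at one provider's 
traits, not nested providers.

```python
def image_traits_filter(required_traits, host_rp_uuid, is_rebuild,
                        get_provider_traits):
    """Return True if the host passes the image's required traits.

    get_provider_traits is a hypothetical callable that stands in for
    one REST call to placement and returns the provider's traits.
    """
    # Assumption: outside of rebuild, trait matching is handled when
    # candidates are fetched, so no extra API call is made here.
    if not is_rebuild or not required_traits:
        return True
    host_traits = set(get_provider_traits(host_rp_uuid))
    # Every trait required by the new image must be on the current host.
    return set(required_traits) <= host_traits


# Usage with a fake lookup that records each simulated API call.
calls = []


def fake_get_provider_traits(rp_uuid):
    calls.append(rp_uuid)
    return ["HW_CPU_X86_AVX2"]


ok = image_traits_filter(
    ["HW_CPU_X86_AVX2"], "host-1", True, fake_get_provider_traits)
rejected = image_traits_filter(
    ["HW_NIC_OFFLOAD_X"], "host-1", True, fake_get_provider_traits)
skipped = image_traits_filter(
    ["HW_NIC_OFFLOAD_X"], "host-1", False, fake_get_provider_traits)
```

The lookup only runs on the rebuild path, which is the "one REST API 
call for another" trade described above.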

>  2. If the user disables the image properties filter, then traits-based
>     filtering will not be run in the rebuild case

The user doesn't disable the filter, the operator does, and likely for 
good reason. I don't see a problem with this.

> For #3,
> Even though it handles the nested provider, there is a potential issue.
> Let's say a host has two SRIOV NICs: one is a normal SRIOV NIC (VF1), 
> the other has some kind of offload feature (VF2). (Described by Alex.)
> The initial instance launch happens with VF1 allocated; the rebuild 
> launches with a modified request with traits=HW_NIC_OFFLOAD_X, so 
> basically we want the instance to be allocated VF2.
> But the original allocation is against VF1, and since in a rebuild the 
> original allocations are not changed, we have the wrong allocations.

I don't know what to say about this. We shouldn't have any quantitative 
resource allocation changes as a result of a rebuild. This actually 
sounds like a case for option #4: use GET /allocation_candidates to 
check whether rebuilding the instance with the new image and its new 
required traits, but on the same host, would result in new allocation 
requests, and if so, fail the rebuild. We can (only?) determine that 
via the response from GET /allocation_candidates.
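
To make that comparison concrete, here is a minimal sketch. It assumes 
a simplified data shape where both the instance's current allocations 
and each candidate map a resource provider UUID to a "resources" dict; 
the helper names and the UUID strings are made up for illustration and 
this is not how nova implements anything today.

```python
def normalize(allocations):
    """Reduce allocations to {provider_uuid: {resource_class: amount}}."""
    return {
        rp_uuid: dict(alloc["resources"])
        for rp_uuid, alloc in allocations.items()
    }


def rebuild_keeps_allocations(current_allocations, allocation_requests):
    """Return True if some candidate equals the current allocations."""
    current = normalize(current_allocations)
    return any(
        normalize(request["allocations"]) == current
        for request in allocation_requests
    )


# Hypothetical host with two VFs modelled as separate child providers.
current = {
    "host-uuid": {"resources": {"VCPU": 2, "MEMORY_MB": 2048}},
    "vf1-uuid": {"resources": {"SRIOV_NET_VF": 1}},
}

# Candidates as they might look if the new image required
# HW_NIC_OFFLOAD_X and only the second VF provider had that trait.
candidates_new_traits = [
    {"allocations": {
        "host-uuid": {"resources": {"VCPU": 2, "MEMORY_MB": 2048}},
        "vf2-uuid": {"resources": {"SRIOV_NET_VF": 1}},
    }},
]

# Candidates when the new image's traits are met by the same providers.
candidates_same = [{"allocations": current}]

must_fail = not rebuild_keeps_allocations(current, candidates_new_traits)
can_proceed = rebuild_keeps_allocations(current, candidates_same)
```

In the VF1/VF2 example, no candidate matches the existing allocations, 
so the rebuild would be failed rather than left with wrong allocations.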



