[openstack-dev] [nova] [placement] [operators] Optional resource asking or not?

Dan Smith dms at danplanet.com
Wed Jan 25 16:32:51 UTC 2017


> Update on that agreement: I made the necessary modification to the
> proposal [1] so that it no longer verifies the filters. We now send a
> request to the Placement API by introspecting the flavor, and we get
> back a list of potential destinations.

Thanks!

> When I began that modification, I knew there was a functional test
> for server groups that needed changes to match our agreement. I
> consequently made that change in a separate patch [2] as a
> prerequisite for [1].
> 
> I then spotted a problem that we didn't identify during the discussion:
> when checking a destination, the legacy filters for CPU, RAM and disk
> don't verify the maximum capacity of the host, they only multiply the
> total size by the allocation ratio, so our proposal works for them.
> Now, when using the placement service, it fails because somewhere in the
> DB call needed for returning the destinations, we also verify a specific
> field named max_unit [3].
> 
> Consequently, the proposal we agreed on does not give feature parity
> between Newton and Ocata. If you follow our instructions, you will still
> get different results from a placement perspective between what was in
> Newton and what will be in Ocata.

To summarize some discussion on IRC:

The max_unit field limits the maximum size of any single allocation and
is not scaled by the allocation_ratio (for good reason). Right now,
computes report a max_unit equal to their total for CPU and RAM
resources. So the difference in behavior here is that placement will not
choose hosts where the instance would single-handedly overcommit the
entire host. Multiple instances still could, per the rules of the
allocation_ratio.
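
To make the difference concrete, here is a minimal sketch of the two checks as described above. It is a simplified model, not nova or placement code: the Inventory class and both function names are made up for illustration, and only the total, allocation_ratio and max_unit fields come from this thread.

```python
from dataclasses import dataclass


@dataclass
class Inventory:
    """Hypothetical, simplified view of one resource class on one host."""
    total: int
    allocation_ratio: float
    max_unit: int
    used: int = 0


def fits_scaled_capacity(inv, requested):
    # Legacy-filter style check as described in the thread:
    # only compare against total * allocation_ratio.
    return inv.used + requested <= inv.total * inv.allocation_ratio


def fits_with_max_unit(inv, requested):
    # Placement style check as described in the thread: same capacity
    # check, plus a per-allocation cap that is NOT scaled by the ratio.
    return requested <= inv.max_unit and fits_scaled_capacity(inv, requested)


# A host reporting 8 VCPUs, 16x overcommit, and max_unit equal to total.
vcpu = Inventory(total=8, allocation_ratio=16.0, max_unit=8)

print(fits_scaled_capacity(vcpu, 12))  # True: 12 <= 8 * 16
print(fits_with_max_unit(vcpu, 12))    # False: 12 > max_unit of 8

# Several smaller instances can still overcommit the host together.
vcpu.used = 8
print(fits_with_max_unit(vcpu, 8))     # True: 8 <= 8 and 16 <= 128
```

A single 12-VCPU request is rejected only by the second check, while two 8-VCPU instances are accepted by both, which is the "multiple instances still could" case.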

The consensus seems to be that this is entirely sane behavior that the
previous core and ram filters weren't considering. If there's a good
reason to allow computes to report that they're willing to take a
larger-than-100% single allocation, then we can make that change later,
but the justification seems lacking at the moment.

> Technically speaking, the functional test is a canary bird, telling you
> that you get NoValidHosts while it was working previously.

My opinion, which is shared by several other people, is that this test
is broken. It's trying to overcommit the host with a single instance,
and in fact, it's doing it unintentionally for some resources that just
weren't checked before the move to placement. Changing the test to
properly reflect the resources on the host should be the path forward
and Sylvain is working on that now.

The other concern that was raised was that since CoreFilter is not
necessarily enabled on all clouds, cpu_allocation_ratio is not being
honored on those systems today. Moving to placement with Ocata will
cause that value to be used, which may be incorrect for certain
overly-committed clouds which had previously ignored it. However, I
think we need not be too concerned as the defaults for these values are
16x overcommit for CPU and 1.5x overcommit for RAM. Those are probably
on the upper limit of sane for most environments, but also large enough
to not cause any sort of immediate panic while people realize (if they
didn't read the release notes) that they may want to tweak them.
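
For anyone who wants to estimate their exposure, a rough sketch along these lines could flag hosts whose current VCPU usage already exceeds the ratio-scaled total. The function name and the host data are hypothetical; only the 16x and 1.5x defaults come from the text above, and real usage numbers would have to come from your own deployment.

```python
# Default overcommit ratios mentioned in this thread.
DEFAULT_CPU_ALLOCATION_RATIO = 16.0
DEFAULT_RAM_ALLOCATION_RATIO = 1.5


def hosts_over_limit(hosts, ratio):
    """Return the sorted names of hosts whose usage exceeds total * ratio.

    `hosts` maps a host name to a (total, used) tuple for one resource.
    """
    return sorted(
        name for name, (total, used) in hosts.items() if used > total * ratio
    )


# Made-up VCPU numbers for two hosts with 8 physical cores each.
vcpu_usage = {
    "compute1": (8, 100),  # 100 <= 8 * 16, still within the default ratio
    "compute2": (8, 140),  # 140 >  8 * 16, already past the default ratio
}

print(hosts_over_limit(vcpu_usage, DEFAULT_CPU_ALLOCATION_RATIO))
# ['compute2']
```

Hosts that show up in such a list are the ones that would stop receiving new instances until the ratio is raised or usage drops.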

--Dan


