On Tue, Mar 2, 2021 at 9:44 PM Ken D'Ambrosio <ken@jots.org> wrote:
Hey, all.  Turns out we really need anti-affinity running on our (very
freaking old -- Juno) clouds.  I'm trying to find docs that describe its
functionality, and am failing.  If I enable it, and (say) have 10
hypervisors, and 12 VMs to fire off, what happens when VM #11 goes to
fire?  Does it fail, or does the scheduler just continue to at least
*try* to maintain as few possible on each hypervisor?


No, it's a hard stop. To be clear, that means that instances *from the same instance group* (with an anti-affinity policy) are never placed on the same compute node.
With your example, if you have 10 compute nodes, an anti-affinity instance group can support at most 10 instances, because an eleventh instance created in the same group would get a NoValidHost error.
This being said, you can of course create more than 10 instances, provided they don't share the same group.
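To make the hard-stop behavior concrete, here is a minimal toy sketch of the idea (invented names, not Nova's actual filter code): a host only passes the anti-affinity filter if no member of the same group already runs on it, and when every host is eliminated the boot fails.

```python
# Toy model of hard anti-affinity scheduling (NOT Nova's real code;
# all names here are invented for illustration).

class NoValidHost(Exception):
    """Raised when the filter eliminates every host, as Nova does on boot."""

def pick_host(hosts, placements, group_id):
    """Return the first host with no instance of group_id on it."""
    for host in hosts:
        if group_id not in placements.get(host, set()):
            return host
    raise NoValidHost("anti-affinity filter eliminated all hosts")

# 10 hypothetical compute nodes.
hosts = [f"compute{i}" for i in range(10)]
placements = {}  # host -> set of group ids with a member on that host

# The first 10 instances of group "g1" each land on a distinct host...
for _ in range(10):
    host = pick_host(hosts, placements, "g1")
    placements.setdefault(host, set()).add("g1")

# ...and the eleventh boot in the same group fails.
try:
    pick_host(hosts, placements, "g1")
except NoValidHost:
    print("instance 11: NoValidHost")
```

Instances outside the group are unaffected: the filter only looks at membership of the group being scheduled.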

As for what Sean said, soft anti-affinity is a new feature that arrived in the Liberty timeframe. The difference from hard anti-affinity is that it's no longer a hard stop: you can have more than 10 instances within your group, and the scheduler will just try to spread them as evenly as possible (using weighers) across all your computes.
FWIW, I wouldn't recommend backporting the feature, as it not only provides the soft-affinity and soft-anti-affinity weighers but also adds those policies to the os-server-groups API. You should rather think about upgrading to Liberty if you really care about soft affinity, or not use the filter at all. There are other ways to spread instances across computes (for example using aggregates) that don't give you NoValidHost exceptions on boot.

-Sylvain

Thanks!

-Ken