anti-affinity: what're the mechanics?
Hey, all. Turns out we really need anti-affinity running on our (very freaking old -- Juno) clouds. I'm trying to find docs that describe its functionality, and am failing. If I enable it, and (say) have 10 hypervisors, and 12 VMs to fire off, what happens when VM #11 goes to fire? Does it fail, or does the scheduler just continue to at least *try* to maintain as few as possible on each hypervisor?
Thanks!
-Ken
On Tue, 2021-03-02 at 15:31 -0500, Ken D'Ambrosio wrote:
Hey, all. Turns out we really need anti-affinity running on our (very freaking old -- Juno) clouds. I'm trying to find docs that describe its functionality, and am failing. If I enable it, and (say) have 10 hypervisors, and 12 VMs to fire off, what happens when VM #11 goes to fire? Does it fail, or does the scheduler just continue to at least *try* to maintain as few as possible on each hypervisor?
In Juno I believe we only have hard anti-affinity via the filter. I believe it predates the soft affinity/anti-affinity weigher, so it will error out. The behaviour will depend on whether you did a multi-create or booted the VMs serially. If you boot them serially, you should be able to boot 10 VMs. If you do a multi-create, it depends on whether you set the min value or not: if you don't set the min value, I think only 10 will boot and the last two will error; if you set --min 12 --max 12, I think they will all go to error or be deleted. I have not checked that, but I believe we are meant to try to roll back in that case.
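Roughly, the two cases look like this with the Juno-era nova CLI (a sketch only; the image, flavor and group UUID below are placeholders):

  # create a server group with the hard anti-affinity policy
  nova server-group-create my-group anti-affinity

  # serial boots: each request is scheduled on its own, so the first 10
  # land on distinct hypervisors and the 11th fails with NoValidHost
  nova boot --image <image> --flavor <flavor> \
      --hint group=<group-uuid> vm-01

  # multi-create: whether the whole request is rolled back depends on
  # the min count, as described above
  nova boot --image <image> --flavor <flavor> \
      --hint group=<group-uuid> \
      --min-count 12 --max-count 12 my-vms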
The soft anti-affinity weigher was added by https://github.com/openstack/nova/commit/72ba18468e62370522e07df796f5ff74ae1... in Mitaka. If you want to be able to boot all 12, you need a weigher like that.
For the most part you can probably backport that directly from master and use it in Juno, as I don't think we have materially altered the way the filters work that much. But the weighers, like the filters, are also pluggable, so you can backport it externally and load it out of tree if you wanted to. That is probably your best bet.
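As a very rough sketch of what such an out-of-tree weigher could look like against Juno's weigher plugin interface (the class name, option name and use of the group_hosts set are assumptions for illustration, not the actual Mitaka implementation):

  from oslo.config import cfg   # Juno still uses the oslo.config namespace package

  from nova.scheduler import weights

  soft_anti_affinity_opts = [
      cfg.FloatOpt('soft_anti_affinity_weight_multiplier',
                   default=1.0,
                   help='Multiplier for this (hypothetical) soft anti-affinity weigher.'),
  ]
  CONF = cfg.CONF
  CONF.register_opts(soft_anti_affinity_opts)


  class SoftAntiAffinityWeigher(weights.BaseHostWeigher):
      """Prefer hosts that do not already hold a member of the server group.

      Unlike the hard ServerGroupAntiAffinityFilter, a weigher never removes
      hosts, so instances 11 and 12 still land somewhere.
      """

      def weight_multiplier(self):
          return CONF.soft_anti_affinity_weight_multiplier

      def _weigh_object(self, host_state, weight_properties):
          # group_hosts is the same set the anti-affinity filter consults:
          # hosts already running a member of the requested server group.
          group_hosts = weight_properties.get('group_hosts') or set()
          # Penalize hosts that already hold a group member. (The real
          # Mitaka weigher counts members per host; this sketch only
          # distinguishes "has one" from "has none".)
          return -1.0 if host_state.host in group_hosts else 0.0

To be picked up, the class has to be importable on the scheduler node and listed in scheduler_weight_classes in nova.conf (which defaults to nova.scheduler.weights.all_weighers).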
On Tue, Mar 2, 2021 at 9:44 PM Ken D'Ambrosio ken@jots.org wrote:
Hey, all. Turns out we really need anti-affinity running on our (very freaking old -- Juno) clouds. I'm trying to find docs that describe its functionality, and am failing. If I enable it, and (say) have 10 hypervisors, and 12 VMs to fire off, what happens when VM #11 goes to fire? Does it fail, or does the scheduler just continue to at least *try* to maintain as few as possible on each hypervisor?
No, it's a hard stop. To be clear, that means instances *from the same instance group* (with an anti-affinity policy) are never placed on the same compute node. With your example, this means that if you have 10 compute nodes, an anti-affinity instance group can support at most 10 instances, because an eleventh instance created in the same group would get a NoValidHost. That being said, you can of course create more than 10 instances, provided they don't share the same group.
Regarding what Sean said, soft anti-affinity was a new feature delivered in the Liberty timeframe. The difference from hard anti-affinity is that it is no longer a hard stop: you can have more than 10 instances within your group; the scheduler will just try to spread them as well as it can (using weighers) across all your computes. FWIW, I wouldn't attempt backporting the feature, as it not only provides the soft-affinity and soft-anti-affinity weighers but also adds those policies to the os-server-groups API. You should rather think about upgrading to Liberty if you really care about soft affinity, or not use the filter. There are other possibilities for spreading instances across computes (for example using aggregates) that don't give you NoValidHost exceptions on boot.
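For comparison, on a release that has the feature, a soft anti-affinity group looks something like this (placeholder names; depending on your client you may also need to request a new enough compute API microversion, e.g. --os-compute-api-version 2.15):

  # the group accepts any number of members; the scheduler only
  # *prefers* to spread them, so an 11th instance still boots
  nova server-group-create my-soft-group soft-anti-affinity
  nova boot --image <image> --flavor <flavor> \
      --hint group=<group-uuid> vm-11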
-Sylvain
participants (3)
- Ken D'Ambrosio
- Sean Mooney
- Sylvain Bauza