[openstack-dev] [nova] Fixing race condition with server groups and affinity policy

Alex Cantu miguel.cantu at RACKSPACE.COM
Tue Aug 23 02:51:37 UTC 2016

According to [1]:


1) It's possible to hit a similar race condition for server groups with the "affinity" policy. Suppose we create a new group and then create two instances simultaneously. The scheduler sees an empty group for each, assigns them to different compute nodes, and the policy is violated. We should add a check in _validate_instance_group_policy() to cover the "affinity" case.

2) It's possible to create two instances simultaneously, have them be scheduled to conflicting hosts, both of them detect the problem in _validate_instance_group_policy(), both of them get sent back for rescheduling, and both of them get assigned to conflicting hosts *again*, resulting in an error. In order to fix this I propose that instead of checking against all other instances in the group, we only check against instances that were created before the current instance.


I've been trying to improve upon Chris' solution here [2], but honestly I'm not sure I'm approaching this correctly. Chris' solution is to consider only the older members when validating group policy (ignoring any members younger than the instance being validated), which eliminates both of the cases mentioned above. I don't know enough about the scheduler and instance group code to judge whether the solution is sound, hence my plea for help here :)
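
To make the idea concrete, here is a minimal sketch of the "only check against older members" rule. It is not nova code: Member, older_members() and violates_policy() are hypothetical names used for illustration, and breaking created_at ties by uuid is an assumption of the sketch, not part of Chris' proposal.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass(frozen=True)
class Member:
    """Hypothetical stand-in for a server group member (not a nova object)."""
    uuid: str
    created_at: datetime
    host: Optional[str]  # None until the instance has landed on a compute node


def older_members(current: Member, members: List[Member]) -> List[Member]:
    """Return only the members created strictly before `current`.

    Assumption: ties on created_at are broken by uuid so that two instances
    created in the same second still have a strict order.
    """
    key = (current.created_at, current.uuid)
    return [m for m in members if (m.created_at, m.uuid) < key]


def violates_policy(current: Member, members: List[Member], policy: str) -> bool:
    """Check `current` against older members only, never younger ones."""
    hosts = {m.host for m in older_members(current, members) if m.host is not None}
    if policy == "affinity":
        # Violation if an older member already sits on a different host.
        return bool(hosts) and current.host not in hosts
    if policy == "anti-affinity":
        # Violation if an older member already sits on the same host.
        return current.host in hosts
    raise ValueError("unknown policy: %s" % policy)


if __name__ == "__main__":
    t0 = datetime(2016, 8, 23, 2, 51, 37)
    first = Member("aaaa", t0, "compute1")
    second = Member("bbbb", t0 + timedelta(seconds=1), "compute2")
    group = [first, second]
    # The older instance has no older members, so it keeps its host.
    print(violates_policy(first, group, "affinity"))   # False
    # Only the younger instance sees the conflict and gets rescheduled.
    print(violates_policy(second, group, "affinity"))  # True
```

The point of the ordering is that, of two instances racing each other, at most one is sent back for rescheduling, so they cannot both bounce and collide again as in case 2.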

I've attached my attempt at a slightly cleaner implementation as a git-formatted patch. It's the same solution Chris implemented, just moved into the setup_hosts() method to avoid creating a new remotable method.
I haven't gotten the tests to pass yet; I'm having trouble getting the expected filters to work.

I'm pretty new to the OpenStack code base. Is there any way anyone here could give me some direction? Is the solution correct? What am I missing? How can I fill in the gaps? Is this even still a valid issue?





-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Fix-race-in-server-group-policy-validation.patch
Type: text/x-patch
Size: 7516 bytes
Desc: 0001-Fix-race-in-server-group-policy-validation.patch
URL: <http://lists.openstack.org/pipermail/openstack-dev/attachments/20160823/00f3f469/attachment.bin>
