[openstack-dev] [nova] question about e41fb84 "fix anti-affinity race condition on boot"
John Garbutt
john at johngarbutt.com
Mon Mar 17 17:54:32 UTC 2014
On 15 March 2014 18:39, Chris Friesen <chris.friesen at windriver.com> wrote:
> Hi,
>
> I'm curious why the specified git commit chose to fix the anti-affinity race
> condition by aborting the boot and triggering a reschedule.
>
> It seems to me that it would have been more elegant for the scheduler to do
> a database transaction that would atomically check that the chosen host was
> not already part of the group, and then add the instance (with the chosen
> host) to the group. If the check fails then the scheduler could update the
> group_hosts list and reschedule. This would prevent the race condition in
> the first place rather than detecting it later and trying to work around it.
>
> This would require setting the "host" field in the instance at the time of
> scheduling rather than the time of instance creation, but that seems like it
> should work okay. Maybe I'm missing something though...
We already handle memory races the same way today: when a request
races against the scheduler, the boot is aborted and rescheduled.
Given the scheduler split, writing that value into the nova db from
the scheduler would be a step backwards, and it would probably break
lots of code that assumes the host is not set until much later.
John
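
For readers following the thread, the abort-and-reschedule pattern
discussed above can be sketched roughly as follows. This is only an
illustration under stated assumptions, not the code from e41fb84: every
name here (schedule, claim_on_host, boot, RescheduleRequired) is
hypothetical, the group membership is a plain dict rather than the nova
db, and concurrency on the compute host itself is ignored.

```python
# Hypothetical sketch of "late check on the host, then reschedule".
# None of these names are real nova APIs.


class RescheduleRequired(Exception):
    """Raised by the late check when the chosen host violates the policy."""


def schedule(hosts, group_hosts, tried):
    """Pick the first host not believed to be in the group.

    group_hosts may be stale, which is what makes the race possible.
    """
    for host in hosts:
        if host not in group_hosts and host not in tried:
            return host
    raise RuntimeError("no valid host found")


def claim_on_host(host, instance, group_members):
    """Late anti-affinity check performed where the instance would boot."""
    if host in group_members.values():
        raise RescheduleRequired(host)
    group_members[instance] = host


def boot(instance, hosts, stale_group_hosts, group_members, max_attempts=3):
    """Schedule, validate on the host, and retry elsewhere on conflict."""
    tried = set()
    group_hosts = set(stale_group_hosts)
    for _ in range(max_attempts):
        host = schedule(hosts, group_hosts, tried)
        try:
            claim_on_host(host, instance, group_members)
            return host
        except RescheduleRequired:
            # Abort this attempt, refresh the view, and pick another host.
            tried.add(host)
            group_hosts = set(group_members.values())
    raise RuntimeError("retries exhausted")


# Instance "a" already landed on host1, but the scheduler's view is stale.
members = {"a": "host1"}
chosen = boot("b", ["host1", "host2"], set(), members)
print(chosen)
```

In this sketch the scheduler never writes a host into the shared
state; only the late check does, which mirrors the point that the host
is not set until much later.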