“I would generally say that it is not a race, although it is undefined behavior.”
Is simply calling the “add host to aggregate” API concurrently undefined behavior?
My sequence of operations is:
To rule out our code as the issue, I was able to reproduce the behavior using devstack on master, using the nova_fake driver with 10 fake compute services and the aggregate_instance_extra_specs filter instead
of ironic and the blazar-nova filter.
As long as the N “add_host_to_aggregate” calls to nova_api are made in parallel, there is a good chance that the host_state aggregate info passed to the filters will not agree with the values in the
DB.
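
For anyone who wants to try the same thing, here is a minimal sketch of how the parallel calls can be driven. It is not the exact script I used: add_host is a caller-supplied callable (hypothetical here) that would wrap whatever client you use to call the “add host to aggregate” API, and the aggregate and host names in the usage part are made up. A barrier holds all workers until every one is ready, so the N calls really do overlap.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def add_hosts_concurrently(add_host, aggregate, hosts):
    """Call add_host(aggregate, host) for every host at nearly the same time.

    add_host is a caller-supplied callable (an assumption of this sketch);
    in a real reproduction it would wrap the nova "add host to aggregate"
    API call. Returns a dict mapping each host to the call's return value,
    or to the exception it raised.
    """
    hosts = list(hosts)
    if not hosts:
        return {}

    # Every worker waits here until all of them are ready, so the calls
    # are released together instead of trickling out one by one.
    barrier = threading.Barrier(len(hosts))

    def worker(host):
        barrier.wait()
        try:
            return add_host(aggregate, host)
        except Exception as exc:  # keep going so every host gets a result
            return exc

    # One thread per host, otherwise the barrier could never be satisfied.
    with ThreadPoolExecutor(max_workers=len(hosts)) as pool:
        results = list(pool.map(worker, hosts))
    return dict(zip(hosts, results))


if __name__ == "__main__":
    # Stand-in for the real API call; host and aggregate names are made up.
    def fake_add_host(aggregate, host):
        return f"added {host} to {aggregate}"

    fake_hosts = [f"fake-compute-{i}" for i in range(10)]
    for host, result in add_hosts_concurrently(
        fake_add_host, "test-aggregate", fake_hosts
    ).items():
        print(host, "->", result)
```

After the calls return, compare the aggregate membership in the DB with what the filters log for each host_state.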
This doesn’t depend on launching instances quickly after making the changes; the inconsistency never seems to resolve until nova-scheduler is restarted.
-Mike