[openstack-dev] [nova] Bug 1781710 killing the check queue

Chris Friesen chris.friesen at windriver.com
Wed Jul 18 18:05:13 UTC 2018


On 07/18/2018 10:14 AM, Matt Riedemann wrote:
> As can be seen from logstash [1] this bug is hurting us pretty bad in the check
> queue.
>
> I thought I originally had this fixed with [2] but that turned out to only be
> part of the issue.
>
> I think I've identified the problem but I have failed to write a recreate
> regression test [3] because (I think) it's due to random ordering of which
> request spec we select to send to the scheduler during a multi-create request
> (and I tried making that predictable by sorting the instances by uuid in both
> conductor and the scheduler but that didn't make a difference in my test).

Can we get rid of multi-create?  It keeps causing complications, and it already 
has weird behaviour if you ask for min_count=X and max_count=Y and only X 
instances can be scheduled.  (Currently it fails with NoValidHost, but it should 
arguably start up X instances.)
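
To make the min_count/max_count point concrete, here is a toy sketch in Python. It is not nova code: the function names, the `capacity` argument, and the local `NoValidHost` class are made up for illustration, and the "all or nothing" function is a simplified model of the behaviour described above, not a copy of the scheduler logic.

```python
class NoValidHost(Exception):
    """Stand-in for nova's NoValidHost; not the real exception class."""


def schedule_all_or_nothing(min_count, max_count, capacity):
    """Simplified model of the behaviour described above.

    Assumption: the request fails unless all max_count instances fit,
    even when min_count of them could have been scheduled.
    """
    if capacity < max_count:
        raise NoValidHost(f"only {capacity} of {max_count} instances fit")
    return max_count


def schedule_best_effort(min_count, max_count, capacity):
    """Arguable alternative: boot as many as fit, provided min_count is met."""
    if capacity < min_count:
        raise NoValidHost(f"only {capacity} of {min_count} instances fit")
    return min(max_count, capacity)


if __name__ == "__main__":
    # min_count=2, max_count=5, but only 2 instances can be scheduled.
    print(schedule_best_effort(2, 5, 2))
    try:
        schedule_all_or_nothing(2, 5, 2)
    except NoValidHost as exc:
        print("NoValidHost:", exc)
```

With min_count=2, max_count=5 and room for only 2 instances, the first function raises while the second returns 2, which is the difference I'm pointing at.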

> After talking with Sean Mooney, we have another fix which is self-contained to
> the scheduler [5] so we wouldn't need to make any changes to the RequestSpec
> handling in conductor. It's admittedly a bit hairy, so I'm asking for some eyes
> on it since either way we go, we should get going soon before we hit the FF and
> RC1 rush which *always* kills the gate.

One of your options mentioned using RequestSpec.num_instances to decide whether
a request is part of a multi-create.  Is there any reason to persist
RequestSpec.num_instances?  It seems like it's only applicable to the initial
request, since after that each instance is managed individually.

Chris


