[openstack-dev] [nova] Bug 1781710 killing the check queue
Chris Friesen
chris.friesen at windriver.com
Wed Jul 18 18:05:13 UTC 2018

On 07/18/2018 10:14 AM, Matt Riedemann wrote:
> As can be seen from logstash [1] this bug is hurting us pretty bad in the check
> queue.
>
> I thought I originally had this fixed with [2] but that turned out to only be
> part of the issue.
>
> I think I've identified the problem but I have failed to write a recreate
> regression test [3] because (I think) it's due to random ordering of which
> request spec we select to send to the scheduler during a multi-create request
> (and I tried making that predictable by sorting the instances by uuid in both
> conductor and the scheduler but that didn't make a difference in my test).

Can we get rid of multi-create? It keeps causing complications, and it already
has weird behaviour if you ask for min_count=X and max_count=Y and only X
instances can be scheduled. (Currently it fails with NoValidHost, but it should
arguably start up X instances.)
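
To make the min_count/max_count point concrete, here is a rough Python sketch.
It is not nova code: instances_to_boot, the schedulable argument, the
best_effort flag and the local NoValidHost class are all made up for
illustration, and the all-or-nothing branch is only a simplified model of the
current behaviour described above.

```python
class NoValidHost(Exception):
    """Local stand-in for the scheduler error; NOT nova's exception class."""


def instances_to_boot(min_count, max_count, schedulable, best_effort):
    """Return how many instances a multi-create request would start.

    schedulable is a hypothetical input: how many instances the scheduler
    could actually place. best_effort=False models the current behaviour
    (all max_count instances or nothing); best_effort=True models the
    alternative of starting whatever fits, as long as min_count is met.
    """
    if min_count > max_count:
        raise ValueError("min_count must not exceed max_count")
    if schedulable >= max_count:
        # Everything fits, so both models agree.
        return max_count
    if best_effort and schedulable >= min_count:
        # Fewer than max_count fit, but the minimum is still satisfied.
        return schedulable
    raise NoValidHost(
        "only %d of %d requested instances can be scheduled"
        % (schedulable, max_count)
    )
```

With min_count=2, max_count=5 and room for only 2 instances, the
best_effort=False call raises NoValidHost, while best_effort=True returns 2.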
> After talking with Sean Mooney, we have another fix which is self-contained to
> the scheduler [5] so we wouldn't need to make any changes to the RequestSpec
> handling in conductor. It's admittedly a bit hairy, so I'm asking for some eyes
> on it since either way we go, we should get going soon before we hit the FF and
> RC1 rush which *always* kills the gate.

One of your options mentioned using RequestSpec.num_instances to decide if it's
in a multi-create. Is there any reason to persist RequestSpec.num_instances?
It seems like it's only applicable to the initial request, since after that each
instance is managed individually.

Chris