[openstack-dev] [nova] Bug 1781710 killing the check queue

work at seanmooney.info
Wed Jul 18 22:58:00 UTC 2018


On Wed, 2018-07-18 at 15:14 -0500, Matt Riedemann wrote:
> On 7/18/2018 1:13 PM, melanie witt wrote:
> > > 
> > > Can we get rid of multi-create?  It keeps causing complications,
> > > and 
> > > it already
> > > has weird behaviour if you ask for min_count=X and max_count=Y
> > > and only X
> > > instances can be scheduled.  (Currently it fails with
> > > NoValidHost, but 
> > > it should
> > > arguably start up X instances.)
> > 
> > We've discussed that before but I think users do use it and
> > appreciate 
> > the ability to boot instances in batches (one request). The
> > behavior you 
> > describe could be changed with a microversion, though I'm not sure
> > if 
> > that would mean we have to preserve old behavior with the previous 
> > microversion.
> 
> Correct, we can't just remove it since that's a backward
> incompatible 
> microversion change. Plus, NFV people *love* it.

Do they? A lot of NFV folks use Heat, OSM or ONAP to drive their
deployments. I'm not sure if any of those actually use the multi-create
support. But yes, people probably do use it.
> 
> > 
> > > > After talking with Sean Mooney, we have another fix which is 
> > > > self-contained to
> > > > the scheduler [5] so we wouldn't need to make any changes to
> > > > the 
> > > > RequestSpec
> > > > handling in conductor. It's admittedly a bit hairy, so I'm
> > > > asking for 
> > > > some eyes
> > > > on it since either way we go, we should get going soon before
> > > > we hit 
> > > > the FF and
> > > > RC1 rush which *always* kills the gate.
> > > 
> > > One of your options mentioned using RequestSpec.num_instances to 
> > > decide if it's
> > > in a multi-create.  Is there any reason to persist 
> > > RequestSpec.num_instances?
> > > It seems like it's only applicable to the initial request, since
> > > after 
> > > that each
> > > instance is managed individually. 
> 
> Yes, I agree RequestSpec.num_instances is something we shouldn't
> persist 
> since it's only applicable to the initial server create (you can't 
> multi-migrate a group of instances, for example - but I'm sure
> people 
> have asked for that at some point), and it should be set per call to
> the 
> scheduler, but that's a wider-ranging change since it would touch 
> several parts of conductor, plus the request spec, plus the 
> ServerGroupAntiAffinitySchedulerFilter.

I might be a little biased, but I think the localised change in the
scheduler makes sense for now, and we should clean this up in Stein.

General update:
I spent some time this afternoon debugging Matt's regression test
https://review.openstack.org/#/c/583339
and it now works as intended, with the addition of disabling the late
check on the compute node in the regression test to mimic devstack.

Matt has rebased https://review.openstack.org/#/c/583347 on top of
the regression test and it's currently in the CI queue.
Hopefully that will pass soon.

While the change is less than ideal, it is backportable downstream if
needed, whereas the wider change would not be easy to backport, so that
is a plus in the short term.

> Honestly I'm OK with doing either, and I don't think they are
> mutually 
> exclusive things, so we could make num_instances a per-request thing
> in 
> the future for sanity reasons.
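
As a footnote to the min_count/max_count point quoted near the top, here
is a minimal sketch of the two behaviours being compared. It models only
what that message describes, not nova's actual code: the function names,
the "schedulable" argument and the local NoValidHost class are all
hypothetical stand-ins for illustration.

```python
class NoValidHost(Exception):
    """Local stand-in for the scheduler error named in the thread."""


def instances_started_current(min_count, max_count, schedulable):
    """Behaviour as described in the thread: all-or-nothing on max_count.

    `schedulable` is an assumed input: how many instances the scheduler
    could actually place right now.
    """
    if schedulable < max_count:
        # Even if min_count could be satisfied, the whole request fails.
        raise NoValidHost("could not place %d instances" % max_count)
    return max_count


def instances_started_proposed(min_count, max_count, schedulable):
    """Behaviour argued for in the thread: start as many as fit,
    as long as at least min_count can be placed."""
    if schedulable < min_count:
        raise NoValidHost("could not place %d instances" % min_count)
    return min(max_count, schedulable)


if __name__ == "__main__":
    # min_count=2, max_count=5, but only 2 instances fit.
    print(instances_started_proposed(2, 5, 2))
    try:
        instances_started_current(2, 5, 2)
    except NoValidHost as exc:
        print("current behaviour fails:", exc)
```

Changing from the first function to the second is the kind of behaviour
change that, per the discussion above, would need a new microversion.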


