[Openstack-operators] [nova][ironic][scheduler][placement] IMPORTANT: Getting rid of the automated reschedule functionality

Eric Fried openstack at fried.cc
Mon May 22 21:15:53 UTC 2017


Dan, et al-

> Well, (a) today you can't really externally retry a single instance
> build without just creating a new one. The new one could suffer the same
> fate, but that's why we just did the auto-disable feature for nova-compute.

Whoa, but that only kicks in after 10 failed tries (by default).  And if
e.g. a build bounced because the instance is too big for the host, but
other, smaller instances come in and succeed in the meantime, the
auto-disable could wind up being put off indefinitely.  Doesn't sound
like a complete answer to this issue.
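
To make the concern concrete, here is a rough sketch of the failure mode.
It is not nova code: it assumes the auto-disable counts *consecutive*
build failures and resets on any successful build, which is what the
scenario above implies. The class and method names are hypothetical.

```python
class BuildFailureTracker:
    """Toy model of a per-host auto-disable counter (hypothetical names)."""

    def __init__(self, threshold=10):
        # Assumed: the host is disabled after `threshold` consecutive failures.
        self.threshold = threshold
        self.consecutive_failures = 0
        self.disabled = False

    def record(self, success):
        """Record one build result and return whether the host is disabled."""
        if success:
            # Assumed: any successful build resets the streak.
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1
            if self.consecutive_failures >= self.threshold:
                self.disabled = True
        return self.disabled


def simulate(results, threshold=10):
    """Feed a sequence of build results (True = success) to a fresh tracker."""
    tracker = BuildFailureTracker(threshold)
    for ok in results:
        tracker.record(ok)
    return tracker.disabled
```

Under those assumptions, a too-big instance that keeps failing while
small instances keep succeeding in between (simulate([False, True] * 100))
never trips the threshold, whereas ten failures in a row
(simulate([False] * 10)) does.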

> Thing (b) is that if we fix rebuild so it works on a failed
> shell-of-an-instance from a boot operation, we could easily exclude the
> host it failed on, but it'd require some additional logic.

Right, so I think the need for that "additional logic" was my point.

Today you can limit the set of compute hosts to try by specifying an
"availability zone".  Perhaps the answer here is to support some kind of
"exclude these hosts" list on a "fresh" deploy.
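
Something like the following is what I have in mind, as a sketch only.
None of this is an existing nova or scheduler API; the function, the
exclude_hosts parameter, and the host dict layout are made up for
illustration.

```python
def candidate_hosts(all_hosts, availability_zone=None, exclude_hosts=()):
    """Return hosts eligible for a fresh deploy (hypothetical helper).

    all_hosts: list of dicts like {"name": "cn1", "az": "az1"}
    availability_zone: if given, only hosts in this zone are kept
    exclude_hosts: names of hosts to skip, e.g. ones that already failed
    """
    excluded = set(exclude_hosts)
    return [
        host
        for host in all_hosts
        if (availability_zone is None or host["az"] == availability_zone)
        and host["name"] not in excluded
    ]


# Made-up inventory for illustration.
HOSTS = [
    {"name": "cn1", "az": "az1"},
    {"name": "cn2", "az": "az1"},
    {"name": "cn3", "az": "az2"},
]
```

With that inventory, candidate_hosts(HOSTS, "az1", exclude_hosts=["cn1"])
leaves only cn2, so a retry after a failure on cn1 would not land there
again. The caller would still have to track which hosts failed, which is
the "additional logic" in question.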

But is the cure worse than the disease?

-efried