[Openstack-operators] [nova][ironic][scheduler][placement] IMPORTANT: Getting rid of the automated reschedule functionality

Jay Pipes jaypipes at gmail.com
Tue May 23 16:39:03 UTC 2017


On 05/23/2017 12:34 PM, Marc Heckmann wrote:
> On Tue, 2017-05-23 at 11:44 -0400, Jay Pipes wrote:
>> On 05/23/2017 09:48 AM, Marc Heckmann wrote:
>>> For the anti-affinity use case, it's really useful for small or
>>> medium-sized operators who want to provide some form of failure
>>> domains to users but do not have the resources to create AZs at DC
>>> or even at rack or row scale. Don't forget that as soon as you
>>> introduce AZs, you need to grow those AZs at the same rate and have
>>> the same flavor offerings across those AZs.
>>>
>>> For the retry thing, I think enough people have chimed in to echo
>>> the general sentiment.
>>
>> The purpose of my ML post was around getting rid of retries, not the
>> usefulness of affinity groups. That seems to have been missed,
>> however.
>>
>> Do you or David have any data on how often you've actually seen
>> retries
>> due to the last-minute affinity constraint violation in real world
>> production?
> 
> No, I don't have any data, unfortunately, mostly because we haven't
> advertised the feature to end users yet. We are only now in a position
> to do so because, previously, a bug caused nova-scheduler's RAM usage
> to grow when the config flag required to enable the feature was on.

k.

> I have, however, seen retries triggered on hypervisors for other
> reasons. I can try to dig up why specifically if that would be useful.
> I will add that we do not use Ironic at all.
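To make the "last-minute affinity constraint violation" concrete: the scheduler can pick a host that satisfies a group's anti-affinity policy, but a concurrent boot may land another group member on that same host before the compute node re-validates the policy, at which point the late check fails and a retry is triggered. A minimal sketch of that late check (not actual Nova code; the function and data shapes here are illustrative only):

```python
# Illustrative sketch (not Nova code) of the "last-minute" anti-affinity
# check that can trigger a reschedule. `placements` maps instance id to
# the host it landed on; the check asks whether any other member of the
# server group is already running on the candidate host.

def violates_anti_affinity(group_members, host, placements):
    """Return True if any member of the server group already runs on
    `host`, i.e. placing another member there would break the policy."""
    return any(placements.get(member) == host for member in group_members)

# The scheduler picked "node1" for a new group member, but a racing
# request already placed member "a" there first:
placements = {"a": "node1"}
assert violates_anti_affinity(["a"], "node1", placements)      # retry fires
assert not violates_anti_affinity(["a"], "node2", placements)  # safe host
```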

Yeah, any data you can get about real-world retry causes would be 
awesome. Note that all "resource over-consumption" causes of retries 
will be going away once we do claims in the scheduler. So, really, we're 
looking for data on the *other* causes of retries.
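For clarity on why scheduler-side claims remove the over-consumption class of retries: today the scheduler picks a host optimistically and the resource claim happens later on the compute node, where it can fail and force a reschedule. Claiming atomically at scheduling time makes that particular failure impossible. A minimal sketch of the claim-in-scheduler idea (not Nova code; `Host` and `Scheduler` here are illustrative stand-ins):

```python
# Illustrative sketch (not Nova code) of claiming resources atomically
# at scheduling time, so a later "resources exhausted" failure on the
# compute node cannot occur and no retry is needed for that cause.

import threading

class Host:
    def __init__(self, name, vcpus):
        self.name = name
        self.free_vcpus = vcpus
        self.lock = threading.Lock()

class Scheduler:
    def __init__(self, hosts):
        self.hosts = hosts

    def schedule_with_claim(self, vcpus_needed):
        """Select a host and claim its resources in one atomic step.
        By the time we return, the capacity is already reserved, so the
        compute node cannot later discover it was over-committed."""
        for host in self.hosts:
            with host.lock:
                if host.free_vcpus >= vcpus_needed:
                    host.free_vcpus -= vcpus_needed  # the claim itself
                    return host
        raise RuntimeError("NoValidHost")
```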

Thanks much in advance!

-jay
