[Openstack] Folsom nova-scheduler race condition?

Day, Phil philip.day at hp.com
Wed Oct 10 07:44:32 UTC 2012


> Per my understanding, this shouldn't happen no matter how fast you create instances, since the requests are
> queued and the scheduler updates its resource information after it processes each request.  The only possibility
> I can think of that could cause the problem you met is that there is more than one scheduler doing the scheduling.

I think the new retry logic is meant to be safe even if there is more than one scheduler, as the requests are effectively serialised when they get to the compute manager, which can then reject any that would break its actual resource limits?
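For reference, a rough sketch of what turning that retry path on might look like in nova.conf (this assumes the RetryFilter and the scheduler_max_attempts option that I believe came in with the Folsom retry work, so please check the names and defaults in your release):

scheduler_default_filters=RetryFilter,AvailabilityZoneFilter,RamFilter,CoreFilter,ComputeFilter
# number of scheduling attempts per request; assumed default is 3
scheduler_max_attempts=3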

-----Original Message-----
From: openstack-bounces+philip.day=hp.com at lists.launchpad.net [mailto:openstack-bounces+philip.day=hp.com at lists.launchpad.net] On Behalf Of Huang Zhiteng
Sent: 10 October 2012 04:28
To: Jonathan Proulx
Cc: openstack at lists.launchpad.net
Subject: Re: [Openstack] Folsom nova-scheduler race condition?

On Tue, Oct 9, 2012 at 10:52 PM, Jonathan Proulx <jon at jonproulx.com> wrote:
> Hi All,
>
> Looking for a sanity test before I file a bug.  I very recently 
> upgraded my install to Folsom (on top of Ubuntu 12.04/kvm).  My 
> scheduler settings in nova.conf are:
>
> scheduler_available_filters=nova.scheduler.filters.standard_filters
> scheduler_default_filters=AvailabilityZoneFilter,RamFilter,CoreFilter,ComputeFilter
> least_cost_functions=nova.scheduler.least_cost.compute_fill_first_cost_fn
> compute_fill_first_cost_fn_weight=1.0
> cpu_allocation_ratio=1.0
>
> This had been working under Essex to fill systems based on available
> RAM and to not exceed a 1:1 allocation ratio of CPU resources.  With
> Folsom, if I specify a moderately large number of instances to boot, or
> spin up single instances in a tight shell loop, they all get
> scheduled on the same compute node, well in excess of the number of
> available vCPUs.  If I start them one at a time (using --poll in a
> shell loop so each instance is started before the next launches) then
> I get the expected allocation behaviour.
>
Per my understanding, this shouldn't happen no matter how fast you create instances, since the requests are queued and the scheduler updates its resource information after it processes each request.  The only possibility I can think of that could cause the problem you met is that there is more than one scheduler doing the scheduling.
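To make the two launch patterns being compared concrete, roughly (a sketch only; $IMAGE and $FLAVOR are placeholders):

# tight loop: each "nova boot" returns as soon as the API accepts the request
for i in $(seq 1 10); do nova boot --image $IMAGE --flavor $FLAVOR test-$i; done

# serialised: --poll waits for each instance to become active before the next boot
for i in $(seq 1 10); do nova boot --poll --image $IMAGE --flavor $FLAVOR test-$i; done

As described above, the first pattern piles everything onto one compute node, while the second gives the expected spread.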
> I see https://bugs.launchpad.net/nova/+bug/1011852, which seems to
> attempt to address this issue, but as I read it that "fix" is based on
> retrying failures.  Since KVM is capable of overcommitting both CPU
> and memory, I don't seem to get retryable failures, just really bad
> performance.
>
> Am I missing something about this fix, or perhaps there's a reported bug
> I didn't find in my search, or is this really a bug no one has
> reported?
>
> Thanks,
> -Jon
>



--
Regards
Huang Zhiteng




