[openstack-dev] Nova scheduler sub-group meeting agenda 6/11
Mike Wilson
geekinutah at gmail.com
Wed Jun 12 18:14:22 UTC 2013
Wow, I missed this thread completely, sorry. I just went over the meeting
notes and I'd like to add what I can from our own experience with the
scheduler at Bluehost.
The first issue we had was the fanout_cast to the schedulers from the
compute nodes. With a large number of nodes, all of the schedulers'
processing time goes to receiving and applying these updates. I wasn't
the one who dug into this and tore it out, but I believe we determined
that, for us, it was sufficient to pull the information from the DB and
rely on that. In any case, we need a single reporting path instead of
reporting both to the DB and to the individual schedulers, as was
discussed in the meeting.
Personally, I think the fanout_cast needs to go away. If updating
capabilities over RPC is desired, that's fine, but it shouldn't be
broadcast-style communication. It would be better to have the schedulers
share host state, with one scheduler at a time taking an update and
applying it to the shared store. That way you can just spin up more
schedulers when the current set isn't keeping up.
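A rough sketch of the idea in toy Python (this is not real nova code;
SharedHostState and everything around it is invented for illustration):

import threading
import time

class SharedHostState(object):
    """Toy shared store: writes are serialized, readers take snapshots."""

    def __init__(self):
        self._lock = threading.Lock()
        self._hosts = {}  # host name -> capabilities dict

    def apply_update(self, host, capabilities):
        # One scheduler at a time applies an update; no fanout needed.
        with self._lock:
            self._hosts[host] = dict(capabilities, updated_at=time.time())

    def snapshot(self):
        # Schedulers filter/weigh against a consistent copy.
        with self._lock:
            return dict(self._hosts)

store = SharedHostState()
store.apply_update("node-0001", {"free_ram_mb": 2048, "vcpus_used": 2})

def scheduler_worker(name):
    hosts = store.snapshot()
    print("%s sees %d host(s)" % (name, len(hosts)))

# Scale out by adding workers; they all read the same store.
for i in range(3):
    threading.Thread(target=scheduler_worker, args=("sched-%d" % i,)).start()

The point is that each update is applied exactly once instead of being
broadcast to every scheduler, so adding schedulers adds read capacity
without adding any write traffic.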
The second issue is the filtering problem Phil brought up. For us this
was the larger issue, and the reason we dropped in our own scheduler
instead of trying to fix the existing one. There are a few filters that
don't need to iterate over the whole host list to be applied: filters
that select or exclude specific hosts, for example, should be applied to
the collection as a whole rather than tested against each host in turn
(rough sketch below). Btw, I'm geekinutah on IRC, feel free to msg me
about Bluehost stuff anytime.
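To make the collection-level point concrete, a toy sketch (plain Python,
not nova's actual filter API; all the names here are invented):

# Collection-level filters prune the candidate set up front, so only
# the survivors go through the per-host loop.
hosts = dict(("node-%04d" % i, {"free_ram_mb": 1024 * (i % 8)})
             for i in range(16000))

def exclude_hosts(candidates, excluded):
    # Whole-collection filter: one set difference, no per-host calls.
    return candidates - excluded

def has_enough_ram(host, ram_required_mb):
    # Genuinely per-host: has to inspect each host's state.
    return hosts[host]["free_ram_mb"] >= ram_required_mb

candidates = set(hosts)
candidates = exclude_hosts(candidates, set(["node-0000", "node-0001"]))
candidates = [h for h in candidates if has_enough_ram(h, 2048)]
print("%d candidate hosts" % len(candidates))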
-Mike Wilson
On Wed, Jun 12, 2013 at 11:31 AM, Joe Gordon <joe.gordon0 at gmail.com> wrote:
>
>
>
> On Mon, Jun 10, 2013 at 3:11 PM, Dugger, Donald D <
> donald.d.dugger at intel.com> wrote:
>
>> Current list of topics we're going over is:
>>
>> 1) Extending data in host state
>> 2) Utilization based scheduling
>> 3) Whole host allocation capability
>> 4) Coexistence of different schedulers
>> 5) Rack aware scheduling
>> 6) List scheduler hints via API
>> 7) Host directory service
>> 8) The future of the scheduler
>> 9) Network bandwidth aware scheduling (and wider aspects)
>> 10) ensembles/vclusters
>>
>> We've done a first pass over all of these, so next will be follow-ups
>> to see where we are. But first, a new issue was raised at the last
>> meeting:
>>
>> 11) Scheduler scalability
>>
>> The assertion was that BlueHost has created an OpenStack cluster with
>> ~16,000 nodes and the scheduler didn't scale; they had to throw it out
>> completely and put in a simple random-selection scheduler. Scalability
>> of the scheduler is obviously a concern, so I'd like to spend this
>> meeting discussing the topic. (If someone from BlueHost could attend,
>> that would be great.)
>>
>
>
> This is what I am basing my information on (starting at 9:45):
> http://www.openstack.org/summit/portland-2013/session-videos/presentation/using-openstack-in-a-traditional-hosting-environment
> Compute nodes broadcast updates to the schedulers every minute, which
> for 16k nodes works out to about 266 messages a second on average
> (16,000 / 60). And with the scheduler being single threaded, processing
> these messages will keep the scheduler(s) busy doing little besides
> handling compute broadcasts.
>
>
>>
>> --
>> Don Dugger
>> "Censeo Toto nos in Kansa esse decisse." - D. Gale
>> Ph: 303/443-3786