[openstack-dev] Scheduler proposal

Joshua Harlow harlowja at fastmail.com
Tue Oct 13 03:49:44 UTC 2015


Ian Wells wrote:
> On 10 October 2015 at 23:47, Clint Byrum <clint at fewbar.com> wrote:
>
>     >  Per before, my suggestion was that every scheduler tries to maintain a copy
>     >  of the cloud's state in memory (in much the same way, per the previous
>     >  example, as every router on the internet tries to make a route table out of
>     >  what it learns from BGP).  They don't have to be perfect.  They don't have
>     >  to be in sync.  As long as there's some variability in the decision making,
>     >  they don't have to update when another scheduler schedules something (and
>     >  you can make the compute node send an immediate update when a new VM is
>     >  run, anyway).  They all stand a good chance of scheduling VMs well
>     >  simultaneously.
>     >
>
>     I'm quite in favor of eventual consistency and retries. Even if we had
>     a system of perfect updating of all state records everywhere, it would
>     break sometimes and I'd still want to not trust any record of state as
>     being correct for the entire distributed system. However, there is an
>     efficiency win gained by staying _close_ to correct. It is actually a
>     function of the expected entropy. The more concurrent schedulers, the
>     more entropy there will be to deal with.
>
>
> ... and the fewer the servers in total, the larger the entropy as a
> proportion of the whole system (if that's a thing, it's a long time
> since I did physical chemistry).  But consider the use cases:
>
> 1. I have a small cloud, I run two schedulers for redundancy.  There's a
> good possibility that, when the cloud is loaded, the schedulers make
> poor decisions occasionally.  We'd have to consider how likely that was,
> certainly.
>
> 2. I have a large cloud, and I run 20 schedulers for redundancy.
> There's a good chance that a scheduler is out of date on its
> information.  But there could be several hundred hosts willing to
> satisfy a scheduling request, and even among the ones with incorrect
> information there's a low chance that any are close to the threshold
> where they won't run the VM in question, so there are good odds it
> will pick a host that's happy to satisfy the request.
>
>
>     >  But to be fair, we're throwing made up numbers around at this point.  Maybe
>     >  it's time to work out how to test this for scale in a harness - which is
>     >  the bit of work we all really need to do this properly, or there's no proof
>     >  we've actually helped - and leave people to code their ideas up?
>
>     I'm working on adding meters for rates and amounts of messages and
>     queries that the system does right now for performance purposes. Rally,
>     though, is the place where I'd go to ask "how fast can we schedule
>     things right now?".
>
>
> My only concern is that we're testing a real cloud at scale and I
> haven't got any more firstborn to sell for hardware, so I wonder if we
> can fake up a compute node in our test harness.

Does the OpenStack Foundation have access to a scaling area that can be
used by the community for this kind of experimental work? It seems like
infra or others should be able to make that possible? Maybe we could
sacrifice a summit and instead of spending the money on that we (as a
community) could spend the money on a really nice scale lab for the
community ;)
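
Short of real hardware, the "fake up a compute node" idea quoted above can
at least be sketched in a few lines. What follows is a toy simulation under
loud assumptions, not anything taken from nova or Rally: the names
FakeComputeNode, Scheduler, simulate and refresh_every are all made up for
illustration, every VM takes exactly one slot, nothing is ever deleted, and
schedulers take turns rather than running concurrently. Each scheduler keeps
its own possibly stale in-memory view, picks randomly among hosts it believes
have room, and retries when the node refuses the claim.

```python
import random
from dataclasses import dataclass


@dataclass
class FakeComputeNode:
    """Stand-in for a compute node: tracks free slots, accepts or rejects claims."""
    name: str
    free_slots: int

    def claim(self) -> bool:
        # The node is the source of truth; a stale scheduler may be refused here.
        if self.free_slots > 0:
            self.free_slots -= 1
            return True
        return False


class Scheduler:
    """Keeps its own, possibly stale, in-memory copy of every node's free slots."""

    def __init__(self, nodes, rng):
        self.nodes = {n.name: n for n in nodes}
        self.rng = rng
        self.view = {}
        self.refresh()

    def refresh(self):
        # Hypothetical periodic update; real nodes would push or be polled.
        self.view = {name: n.free_slots for name, n in self.nodes.items()}

    def schedule(self, max_retries=10):
        """Place one VM; return the count of rejected attempts, or None on failure."""
        for rejected in range(max_retries + 1):
            candidates = [name for name, free in self.view.items() if free > 0]
            if not candidates:
                return None
            # Random choice is the "variability" that keeps schedulers apart.
            choice = self.rng.choice(candidates)
            if self.nodes[choice].claim():
                self.view[choice] -= 1
                return rejected
            # Refused: our view of this host was stale, so correct it and retry.
            self.view[choice] = 0
        return None


def simulate(num_hosts, slots_per_host, num_schedulers, num_requests,
             refresh_every, seed=0):
    """Round-robin requests over the schedulers and tally the outcomes."""
    rng = random.Random(seed)
    nodes = [FakeComputeNode(f"host{i}", slots_per_host) for i in range(num_hosts)]
    schedulers = [Scheduler(nodes, rng) for _ in range(num_schedulers)]
    placed = rejected = failed = 0
    for i in range(num_requests):
        if i % refresh_every == 0:
            for s in schedulers:
                s.refresh()
        result = schedulers[i % num_schedulers].schedule()
        if result is None:
            failed += 1
        else:
            placed += 1
            rejected += result
    return {"placed": placed, "rejected": rejected, "failed": failed}
```

Something like simulate(10, 4, 2, 40, 20) versus simulate(500, 4, 20, 1500, 200)
would give a first, very rough feel for the small-cloud and large-cloud cases
above, with "rejected" counting how often a stale view cost a retry. Real
numbers would still need a real harness.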

> --
> Ian.


