[openstack-dev] A simple way to improve nova scheduler

Soren Hansen soren at linux2go.dk
Thu Sep 26 20:53:09 UTC 2013


Hey, sorry for necroposting. I completely missed this thread when it was
active, but Russel just pointed it out to me on Twitter earlier today and
I couldn't help myself.


2013/7/19 Sandy Walsh <sandy.walsh at rackspace.com>:
> On 07/19/2013 05:01 PM, Boris Pavlovic wrote:
> Sorry, I was commenting on Soren's suggestion from way back (essentially
> listening on a separate exchange for each unique flavor ... so no
> scheduler was needed at all). It was a great idea, but fell apart rather
> quickly.

I don't recall that we ever really had that discussion, but it's been a while :)

Yes, when moving beyond simple flavours, the idea as initially proposed
falls apart.  I see two ways to fix that:

 * Don't move beyond simple flavours. Seriously. Amazon have been pretty
   darn successful with just their simple instance types.

 * If you must make things complicated, use fanout to send a reservation
   request:

   - Send out reservation requests to everyone listening (*)

   - Compute nodes able to accommodate the request reserve the resources
     in question and respond directly to the requestor. Those unable to
     accommodate the request do nothing.

   - Requestor (scheduler, API server, whatever) picks a winner amongst
     the respondents and broadcasts a message announcing the winner of
     the request.

   - The winning node acknowledges acceptance of the task to the
     requestor and gets to work.

   - Every other node that responded also sees the broadcast and cancels
     the reservation.

   - Reservations time out after 5 seconds, so a lost broadcast doesn't
     result in reserved-but-never-used resources.

   - If no one has volunteered to accept the reservation request within a
     couple of seconds, broadcast wider.

(*) "Everyone listening" isn't necessarily every node. Maybe you have
one topic for nodes that are at less than 10% utilisation, one for less
than 25% utilisation, etc. First broadcast to those at 10% or less, then
move on to 25%, etc.
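
To make the steps above a bit more concrete, here is a rough in-process
sketch in Python. It is not Nova code and it uses no real message bus:
"broadcast" is just a loop over node objects, and the clock is passed in
by hand. ComputeNode, schedule(), the RAM-only resource model, the
"first respondent wins" policy and the (10%, 25%, everyone) tiers are
all made up for illustration. Only the 5 second timeout comes from the
list above.

```python
RESERVATION_TTL = 5.0  # seconds; the timeout suggested in the proposal


class ComputeNode:
    """Hypothetical compute node that tracks free RAM only."""

    def __init__(self, name, free_ram_mb, total_ram_mb):
        self.name = name
        self.free_ram_mb = free_ram_mb
        self.total_ram_mb = total_ram_mb
        self.reservations = {}  # request id -> (ram_mb, expiry time)

    def utilisation(self):
        return 1.0 - self.free_ram_mb / self.total_ram_mb

    def _expire(self, now):
        # Release reservations whose winner announcement never arrived.
        for req_id, (ram_mb, expires) in list(self.reservations.items()):
            if expires <= now:
                self.free_ram_mb += ram_mb
                del self.reservations[req_id]

    def on_request(self, req_id, ram_mb, now):
        """Reserve and answer with our name, or stay silent (None)."""
        self._expire(now)
        if ram_mb > self.free_ram_mb:
            return None
        self.free_ram_mb -= ram_mb
        self.reservations[req_id] = (ram_mb, now + RESERVATION_TTL)
        return self.name

    def on_winner(self, req_id, winner, now):
        """Handle the winner broadcast; True means 'I accept the task'."""
        self._expire(now)
        if req_id not in self.reservations:
            return False
        ram_mb, _ = self.reservations.pop(req_id)
        if winner == self.name:
            return True  # resources stay committed; get to work
        self.free_ram_mb += ram_mb  # lost: cancel the reservation
        return False


def schedule(req_id, ram_mb, nodes, tiers=(0.10, 0.25, 1.0), now=0.0):
    """Broadcast to the least utilised tier first, then widen."""
    for limit in tiers:
        listeners = [n for n in nodes if n.utilisation() <= limit]
        replies = [n.on_request(req_id, ram_mb, now) for n in listeners]
        respondents = [r for r in replies if r is not None]
        if not respondents:
            continue  # nobody volunteered: broadcast wider
        winner = respondents[0]  # placeholder policy; any choice works
        acks = [n.name for n in listeners if n.on_winner(req_id, winner, now)]
        return winner if winner in acks else None
    return None
```

For example, with nodes a, b and c at 5%, 20% and 70% utilisation, a
request that fits on a is answered by a alone, because only a listens on
the first tier. Once a is full, the next request gets no answer on the
first tier and is re-broadcast to the 25% tier, where b picks it up.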

This is just off the top of my head. I'm sure it can be improved upon. A
lot. My point is just that there are plenty of alternatives to the
omniscient schedulers that we've been used to for 3 years now.

-- 
Soren Hansen             | http://linux2go.dk/
Ubuntu Developer         | http://www.ubuntu.com/
OpenStack Developer      | http://www.openstack.org/


