[openstack-dev] [Magnum] Scheduling for Magnum

Adrian Otto adrian.otto at rackspace.com
Sat Feb 7 00:44:52 UTC 2015


Magnum Team,

In our initial spec, we addressed the subject of resource scheduling. Our plan was to begin with a naive scheduler that places resources on a specified Node and can sequentially fill Nodes if one is not specified.
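
To make the naive behavior concrete, here is a minimal sketch of what such a scheduler might look like. It is an illustration only, not code from Magnum or the spec: the Node class, its capacity field, and the naive_schedule function are hypothetical names, and "full" is assumed to mean a simple per-node container count limit.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    # Hypothetical node model: "capacity" is an assumed container-count limit.
    name: str
    capacity: int
    containers: List[str] = field(default_factory=list)

    def has_room(self) -> bool:
        return len(self.containers) < self.capacity


def naive_schedule(nodes: List[Node], container: str,
                   target: Optional[str] = None) -> Node:
    """Place a container on the named node, or on the first node with room."""
    if target is not None:
        # A node was specified: use it or fail, never fall back to another.
        for node in nodes:
            if node.name == target:
                if not node.has_room():
                    raise RuntimeError(f"node {target} is full")
                node.containers.append(container)
                return node
        raise LookupError(f"unknown node: {target}")

    # No node specified: fill nodes sequentially, in list order.
    for node in nodes:
        if node.has_room():
            node.containers.append(container)
            return node
    raise RuntimeError("no node has free capacity")
```

With two nodes of capacity 2, three unpinned containers would land on the first node twice and then on the second, which is the "sequential fill" behavior described above.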

Magnum supports multiple conductor backends[1], one of which is our Kubernetes backend. We also have a native Docker backend that we would like to enhance so that when pods or containers are created, the target nodes can be selected according to user-supplied filters. Some examples of this are:

Constraint, Affinity, Anti-Affinity, Health
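
As a rough illustration of how these filter types could compose, here is a minimal sketch. It does not reflect the Nova or Swarm implementations; the Node fields and the filter function names are assumptions made up for this example, and each filter is simply a predicate that a candidate node must pass.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class Node:
    # Hypothetical node model used only for this sketch.
    name: str
    labels: Dict[str, str] = field(default_factory=dict)
    containers: Set[str] = field(default_factory=set)
    healthy: bool = True


NodeFilter = Callable[[Node], bool]


def constraint_filter(key: str, value: str) -> NodeFilter:
    # Constraint: the node must carry a matching label, e.g. storage=ssd.
    return lambda node: node.labels.get(key) == value


def affinity_filter(container: str) -> NodeFilter:
    # Affinity: the node must already run the named container.
    return lambda node: container in node.containers


def anti_affinity_filter(container: str) -> NodeFilter:
    # Anti-affinity: the node must not run the named container.
    return lambda node: container not in node.containers


def health_filter() -> NodeFilter:
    # Health: the node must be reported healthy.
    return lambda node: node.healthy


def select_nodes(nodes: List[Node], filters: List[NodeFilter]) -> List[Node]:
    """Return the nodes that pass every user-supplied filter."""
    return [node for node in nodes if all(f(node) for f in filters)]
```

A caller would pass the user's filters as a list, for example select_nodes(nodes, [constraint_filter("storage", "ssd"), health_filter()]), and then pick one node from the surviving candidates.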

We have multiple options for solving this challenge. Here are a few:

1) Cherry pick scheduler code from Nova, which already has a working filter scheduler design.
2) Integrate swarmd to leverage its scheduler[2]. 
3) Wait for Gantt[3], when the Nova scheduler will be moved out of Nova. This is expected to happen about a year from now, possibly sooner.
4) Write our own filter scheduler, inspired by Nova.

I suggest that we should have a scheduling implementation for our native Docker backend before Gantt is ready. It is unrealistic to expect the Magnum team to accelerate Gantt’s progress, because significant changes must first be made in Nova. The Nova team is best equipped to make them, and the work requires active participation from Nova’s core review team, which has limited bandwidth and other priorities. I think we all agree that we would prefer to use Gantt if it were available sooner.

I suggest we also rule out option 4, because it amounts to re-inventing the wheel.

That leaves us with options 1 and 2 in the short term. The disadvantage of either approach is that we will likely need to remove it and replace it with Gantt (or a derivative work) once Gantt matures. The advantage of option 1 is that Python code already exists for this, and we know it works well in Nova. We could cherry pick it and drop it directly into Magnum. The advantage of option 2 is that we leverage the talents of the developers working on Swarm, which is better than option 4. New features are likely to surface in parallel with our efforts, so we would enjoy the benefits of those without expending work in our own project.

So, how do you feel about options 1 and 2? Which do you feel is more suitable for Magnum? What other options should we consider that might be better than either of these choices?

I have a slight preference for option 2 (integrating with Swarm), but I could be persuaded to pick option 1, or something even more brilliant. Please discuss.

Thanks,

Adrian

[1] https://github.com/stackforge/magnum/tree/master/magnum/conductor/handlers
[2] https://github.com/docker/swarm/tree/master/scheduler/
[3] https://wiki.openstack.org/wiki/Gantt

