[Openstack] distributed and heterogeneous schedulers

Jay Pipes jaypipes at gmail.com
Tue Apr 12 16:49:24 UTC 2011


Hi Brian, comments inline :)

On Tue, Apr 12, 2011 at 12:34 PM, Brian Schott <bfschott at gmail.com> wrote:
>
> I'm trying to understand how best to implement our architecture-aware scheduler for Diablo:
> https://blueprints.launchpad.net/nova/+spec/schedule-instances-on-heterogeneous-architectures
>
> Right now our scheduler is similar in approach to SimpleScheduler with a few extra filters on instances and compute_nodes table queries for the cpu_arch and xpu_arch fields that we added.  For example, for "-t cg1.4xlarge" GPU instance type the scheduler reads instance_types.cpu_arch="x86_64" and instance_types.xpu_arch = "fermi", then filters the respective compute_node and instance fields. http://wiki.openstack.org/HeterogeneousInstanceTypes
>
> That's OK for Cactus, but going beyond that, I'm struggling to reconcile these different blueprints:
> https://blueprints.launchpad.net/nova/+spec/advanced-scheduler
> https://blueprints.launchpad.net/nova/+spec/distributed-scheduler
>
> - How is the instance_metadata table used?  I see the "cpu_arch, xpu_arch" and other fields we added as of the same class of data as vcpus, local_gb, or mem_mb fields, which is why I put them in the instances table.  Virtualization type is of a similar class.  I think of meta-data as less defined constraints passed to the scheduler like "near vol-12345678".

:( I've brought this up before as well. The term metadata is used
incorrectly to refer to custom key/value attributes of something
instead of referring to data about the data (for instance, the type
and length constraints of a data field).

Unfortunately, because the OpenStack API uses the actual term
"metadata" in the API, that's what the table was named and that's how
key/value pairs are referred to in the code.

We have at least three choices here:

1) Continue to add fields to the instances table (or compute_nodes
table) for these main attributes like cpu_arch, etc.
2) Use the custom key/value table (instance_metadata) to store these
attribute names and their values
3) Do both 1) and 2)

I would prefer that we use 1) above for fields that are common to all
nodes (and thus can be NOT NULL fields in the database and be properly
indexed), and that all other attributes that are not common to all
nodes use the instance_metadata table.
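
To make that split concrete, here is a minimal sketch using SQLite from
Python. The table names (nodes, node_metadata), the column names, and the
find_nodes helper are hypothetical stand-ins for illustration, not the
actual Nova schema. Treating xpu_arch as a key/value attribute is also
just an assumption made for the example.

```python
import sqlite3

# Hypothetical, simplified schema; not the real Nova tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Choice 1: attributes every node has are real, indexed columns.
    CREATE TABLE nodes (
        id       INTEGER PRIMARY KEY,
        hostname TEXT NOT NULL,
        cpu_arch TEXT NOT NULL
    );
    CREATE INDEX nodes_cpu_arch_idx ON nodes (cpu_arch);

    -- Choice 2: attributes only some nodes have are key/value rows.
    CREATE TABLE node_metadata (
        node_id INTEGER NOT NULL REFERENCES nodes (id),
        name    TEXT NOT NULL,
        value   TEXT NOT NULL,
        PRIMARY KEY (node_id, name)
    );
""")

conn.executemany(
    "INSERT INTO nodes (id, hostname, cpu_arch) VALUES (?, ?, ?)",
    [(1, "node-a", "x86_64"), (2, "node-b", "x86_64"), (3, "node-c", "arm")],
)
# Only node-b carries an accelerator, so only it gets a key/value row.
conn.execute(
    "INSERT INTO node_metadata (node_id, name, value) VALUES (?, ?, ?)",
    (2, "xpu_arch", "fermi"),
)


def find_nodes(conn, cpu_arch, extra=None):
    """Return hostnames matching the column filter and any key/value filters."""
    sql = "SELECT n.hostname FROM nodes n WHERE n.cpu_arch = ?"
    params = [cpu_arch]
    for name, value in (extra or {}).items():
        # Each key/value requirement becomes one EXISTS subquery.
        sql += (
            " AND EXISTS (SELECT 1 FROM node_metadata m"
            " WHERE m.node_id = n.id AND m.name = ? AND m.value = ?)"
        )
        params += [name, value]
    sql += " ORDER BY n.hostname"
    return [row[0] for row in conn.execute(sql, params)]


print(find_nodes(conn, "x86_64"))
print(find_nodes(conn, "x86_64", {"xpu_arch": "fermi"}))
```

The common attribute is filtered through an indexed NOT NULL column,
while each optional attribute costs an extra subquery against the
key/value table, which is the trade-off between choices 1) and 2).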

Thoughts?

> - Will your capabilities scheduler, constraint scheduler, and/or distributed schedulers understand different available hardware resources on compute nodes?

I was assuming they would "understand" different available hardware
resources by querying a database table that housed attributes
pertaining to a single host or a group of hosts (a zone).
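
As a rough sketch of what I mean, assuming the attributes have already
been read out of such a table into plain dictionaries: the names below
(effective_attrs, matching_hosts, the zone and host names) are
hypothetical, and letting host-level attributes override zone-level
ones is an assumption for the example, not a settled design.

```python
def effective_attrs(host, zone_attrs):
    """Merge zone-level attributes with host-level ones.

    Assumption for this sketch: a host's own attributes override
    whatever its zone declares.
    """
    merged = dict(zone_attrs.get(host["zone"], {}))
    merged.update(host["attrs"])
    return merged


def matching_hosts(hosts, zone_attrs, required):
    """Return names of hosts whose merged attributes satisfy every requirement."""
    return [
        host["name"]
        for host in hosts
        if all(
            effective_attrs(host, zone_attrs).get(key) == value
            for key, value in required.items()
        )
    ]


# Hypothetical data standing in for rows read from the database.
zone_attrs = {
    "zone-gpu": {"cpu_arch": "x86_64", "xpu_arch": "fermi"},
    "zone-std": {"cpu_arch": "x86_64"},
}
hosts = [
    {"name": "host1", "zone": "zone-gpu", "attrs": {}},
    {"name": "host2", "zone": "zone-std", "attrs": {}},
    {"name": "host3", "zone": "zone-std", "attrs": {"xpu_arch": "fermi"}},
]

# Requirements such as those an instance type like cg1.4xlarge might imply.
required = {"cpu_arch": "x86_64", "xpu_arch": "fermi"}
print(matching_hosts(hosts, zone_attrs, required))
```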

> - Should there be an instance_types_metadata table for things like "cpu_arch" rather than our current approach?

There could be, if those fields were added as main attributes on the
instances table. If those attributes are added to the
instance_metadata table as custom key/value pairs, then no, that
wouldn't make much sense.

> As long as we can inject a "-t cg1.4xlarge" at one end and have that get routed to a compute node with GPU hardware on the other end, we're not tied to the centralized database implementation.

I don't see how having the database implementation be centralized or
not affects the above statement. Could you elaborate?

> PS: I sent this to the mailing list a week ago and didn't get a reply, now can't even find this in the openstack list archive.  Anyone else having their posts quietly rejected?

I saw the original, if you are referring to this one:

https://lists.launchpad.net/openstack/msg01645.html

Cheers!
-jay



