[openstack-dev] Bare-metal node scheduling

Michael J Fork mjfork at us.ibm.com
Mon Oct 15 14:24:28 UTC 2012


Being late to the discussion, I will skip most of the points that have
already been covered, but I wanted to capture my intent from the original
mailing list discussion in August.

Mark McLoughlin <markmc at redhat.com> wrote on 10/08/2012 02:49:03 AM:

> From: Mark McLoughlin <markmc at redhat.com>
> To: OpenStack Development Mailing List
<openstack-dev at lists.openstack.org>,
> Date: 10/08/2012 02:51 AM
> Subject: [openstack-dev] Bare-metal node scheduling
>
> Hi,
>
> I'm reviewing the first of the "general bare-metal provisioning"
> patches:
>
>   https://review.openstack.org/13920
>
> and I'm really very concerned at how invasive this is to the core
> compute scheduling infrastructure.
>
> Basically, we're adding infrastructure so that a virt driver in a single
> compute service can cause the resources of multiple "nodes" to be
> advertised to the scheduler.

Based on the e-mail thread at the time (primarily the team working on
bare-metal, plus Vish and myself), this was the desired behavior: the
scheduler selects a node and then sends the provision request, naming the
targeted node, to the service.  The service just provides the
communication/control path.
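
A minimal sketch of that flow, with illustrative class and method names
(these are not the actual nova interfaces, and rpc_cast stands in for
whatever RPC mechanism is used):

    # Hypothetical sketch of the intended flow; names are illustrative,
    # not actual nova code.

    class Scheduler(object):
        def schedule_run_instance(self, request_spec, nodes):
            # Each entry describes one schedulable node, which may be a
            # bare-metal machine fronted by a single compute service.
            node = max(nodes, key=lambda n: n['free_ram_mb'])
            # Send the provision request to the owning service, naming
            # the exact node that service should act on.
            self.rpc_cast(node['service_host'], 'run_instance', {
                'instance': request_spec,
                'target_node': node['node_id'],
            })

    class ProxyComputeService(object):
        def run_instance(self, instance, target_node):
            # The service is only the communication/control path: it
            # provisions the node the scheduler already chose.
            self.driver.spawn(instance, node=target_node)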

> Making already confusing core infrastructure much more confusing for the
> sake of a single virt driver seems like a bad idea.
>
> What we're doing is allowing the scheduler to choose a compute node
> based on the details of the individual bare-metal nodes available via
> the compute node. However, the compute node is still responsible for
> choosing which bare-metal node to provision.

I didn't review the code in detail, but this is different from what I would
expect - the scheduler should pick the bare-metal node to provision, and the
"proxy" node running the service should be responsible for acting on that
choice.  This model applies whenever a single service provides the control
plane for many boxes (in addition to bare-metal, there is vCenter, oVirt,
HMC on IBM System p, etc. - it could also apply to Cinder, with a single
service talking to a SAN and registering all the pools from it).  When
compute pooling is in use, a different approach would be needed to advertise
resources (e.g. 32GB of RAM may be free in aggregate, but the largest
contiguous block may be 16GB); see the sketch below.  However, when the
systems are in a manual placement mode, the OpenStack scheduler would be
ideal.
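
To make the pooling caveat concrete, here is one hypothetical way a pooling
driver might report both aggregate and largest-schedulable resources, so
the scheduler does not assume 32GB free in aggregate means a 32GB instance
will fit (the dict structure is an assumption, not an existing nova API):

    # Illustrative only: summarize a pool of backend hosts for the
    # scheduler, distinguishing total free RAM from the largest unit
    # any single instance could actually be given.

    def get_available_resource(hosts):
        # 'hosts' is a list of dicts like {'free_ram_mb': 16384};
        # this structure is hypothetical.
        free = [h['free_ram_mb'] for h in hosts]
        return {
            'total_free_ram_mb': sum(free),  # e.g. 32768 in aggregate
            'max_unit_ram_mb': max(free),    # but only 16384 contiguous
        }

    print(get_available_resource([{'free_ram_mb': 16384},
                                  {'free_ram_mb': 16384}]))
    # {'total_free_ram_mb': 32768, 'max_unit_ram_mb': 16384}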

Michael

-------------------------------------------------
Michael Fork
Cloud Architect - OpenStack, Emerging Solutions
IBM Systems & Technology Group