[openstack-dev] [nova][dpm] multiple nova-compute services on *one* host?

Markus Zoeller mzoeller at linux.vnet.ibm.com
Wed Jan 18 13:03:21 UTC 2017


TL;DR:
Is it advisable to run multiple nova-compute services within the same
operating system while each nova-compute service manages a different
(remote) hypervisor?

The longer version:
Co-workers and I are working on a new (out-of-tree [1]) driver for a
system z hypervisor [2]. A model of what we came up with looks like this:

                      compute-node(hostname=hansel)
    +-------------------------------------------------------------+
    | +------------+         +------------+        +------------+ |
    | |nova1.conf  |         |nova2.conf  |        |nova3.conf  | |
    | |  host=foo  |         |  host=bar  |        |  host=baz  | |
    | +-----^------+         +-----^------+        +-----^------+ |
    |       |                      |                     |        |
    | +-----+------+         +-----+------+        +-----+------+ |
    | |            |         |            |        |            | |
    | |nova-compute|         |nova-compute|        |nova-compute| |
    | |            |         |            |        |            | |
    | +------------+         +------------+        +------------+ |
    +-------------------------------------------------------------+
            |                      |                     |
            |                   +--v--+                  |
            +-------------------> HMC <------------------+
                                +--+--+
                                   |
            +--------------------------------------------+
    cpc1    |                      |                     |     cpc2
    +---------------------------------------+  +------------------+
    |       |                      |        |  |         |        |
    | +-----v------+         +-----v------+ |  |  +------v-----+  |
    | |            |         |            | |  |  |            |  |
    | | cpc-subset |         | cpc-subset | |  |  | cpc-subset |  |
    | | name=foo   |         | name=bar   | |  |  | name=baz   |  |
    | |            |         |            | |  |  |            |  |
    | +------------+         +------------+ |  |  +------------+  |
    |                                       |  |                  |
    +---------------------------------------+  +------------------+

The hypervisor itself runs inside a CPC [3]. All communication
with these hypervisors needs to go through the REST API of a so-called
"HMC". A cpc-subset is a logical subset of the overall resources
available inside a CPC. That's where the Nova instances will live,
as so-called "partitions".

This sub-setting means there is no longer a 1-to-1 relationship between
a nova-compute service and a host/hypervisor. We already verified that
this works in a small testing environment.

The diagram above shows that we configured the `host` config option
with a value which is *not* related to the hostname of the compute node
(nor to its IP address or FQDN).
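Concretely, one of the three config files from the diagram could look
roughly like this. This is a minimal sketch: the `compute_driver` path
and the `[dpm]` option names (`hmc`, `cpc_subset_name`) are
illustrative assumptions, not necessarily the real nova-dpm options:

    # nova1.conf -- `host` is deliberately unrelated to the
    # compute node's hostname "hansel"
    [DEFAULT]
    host = foo
    compute_driver = nova_dpm.virt.dpm.driver.DPMDriver

    [dpm]
    # hypothetical options: which HMC to talk to, which subset to manage
    hmc = 192.0.2.10
    cpc_subset_name = foo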

The docs of the config option `[DEFAULT].host` make me believe this is
*not* valid, as they say:

    "Hostname, FQDN or IP address of this host. Must be valid within
    AMQP key."

The first sentence is the one which raised doubts about whether our
model is a valid one. The second sentence "weakens" the first one a
little: a valid AMQP name can be totally different from a hostname,
FQDN, or IP address.
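To illustrate why a valid AMQP name may be all that really matters:
oslo.messaging derives the per-server queue name from the topic and the
configured `host` (roughly "<topic>.<host>"), so any string that yields
distinct, AMQP-safe queue names should work. A toy illustration in
plain Python (a deliberate simplification, not actual Nova code):

```python
# Toy model of how a compute service is addressed over RPC:
# the configured `host` acts as a routing-key component, not as a
# network address. oslo.messaging builds queue names roughly as
# "<topic>.<server>" for a Target(topic=..., server=host).

def rpc_queue_name(topic: str, host: str) -> str:
    """Simplified per-server queue naming for an RPC target."""
    return f"{topic}.{host}"

# Three nova-compute services on one node -> three distinct queues:
for h in ("foo", "bar", "baz"):
    print(rpc_queue_name("compute", h))
# prints:
#   compute.foo
#   compute.bar
#   compute.baz
```

Nothing in this keying scheme requires the `host` value to resolve over
the network; it only has to be unique and AMQP-safe.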

The functional tests (e.g. [4]), on the other hand, make me believe our
model is a valid one, as the tests have code like this:

    self.start_service('compute', host='fake-host')
    self.start_service('compute', host='fake-host2')

Also, the developer docs [5] say:

    "The one major exception is nova-compute, where a single process
    runs on the hypervisor it is managing (except when using the VMware
    or Ironic drivers)."

Our model is close to the one of Ironic IMO.

We also considered using one single nova-compute service for all CPCs.
We rejected that idea, as we came to the conclusion that this would be
a single point of failure which is also complicated to configure. The
interaction with the Neutron networking agent was also not
straightforward.

Long story short: are we bending the rules here, or did I overlook code
usages of `[DEFAULT].host` where it *has to be* a network-related
attribute like an IP address, FQDN, or hostname?
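For completeness, in deployment terms our model is just the same binary
started three times with different config files on the same node, along
these lines (a sketch; the file paths are assumptions):

    # one nova-compute per cpc-subset, each with its own [DEFAULT].host
    nova-compute --config-file /etc/nova/nova1.conf &
    nova-compute --config-file /etc/nova/nova2.conf &
    nova-compute --config-file /etc/nova/nova3.conf &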

References:
[1] https://github.com/openstack/nova-dpm
[2] https://blueprints.launchpad.net/nova/+spec/dpm-driver
[3]
https://www.ibm.com/support/knowledgecenter/zosbasics/com.ibm.zos.zmainframe/zconc_mfhwterms.htm
[4]
https://github.com/openstack/nova/blob/bcbfee183e74f696085fcd5c18aff333fc5f1403/nova/tests/unit/conductor/test_conductor.py#L1468-L1469
[5] http://docs.openstack.org/developer/nova/architecture.html


-- 
Regards, Markus Zoeller (markus_z)
