[Openstack] [openstack-dev] Discussion about where to put database for bare-metal provisioning (review 10726)

VTJ NOTSU Arata notsu at virtualtech.jp
Mon Aug 27 23:30:40 UTC 2012


Hi Michael,

> Looking at line 203 in nova/scheduler/filter_scheduler.py, the target host in the cast call is weighted_host.host_state.host and not a service host. (My guess is this will likely require a fair number of changes in the scheduler area to change cast calls to target a service host instead of a compute node)

weighted_host.host_state.host still seems to be service['host']...
Please look at it again with me.

# First, HostManager.get_all_host_states:
# host_manager.py:264
         compute_nodes = db.compute_node_get_all(context)
         for compute in compute_nodes:
# service is from services table (joined-loaded with compute_nodes)
             service = compute['service']
             if not service:
                 LOG.warn(_("No service for compute ID %s") % compute['id'])
                 continue
             host = service['host']
             capabilities = self.service_states.get(host, None)
# go to HostState constructor:
# the 1st parameter 'host' is service['host']
             host_state = self.host_state_cls(host, topic,
                     capabilities=capabilities,
                     service=dict(service.iteritems()))

# host_manager.py:101
     def __init__(self, host, topic, capabilities=None, service=None):
         self.host = host
         self.topic = topic
# here, HostState.host is service['host']

Then update_from_compute_node(compute) is called, but it leaves self.host unchanged.
WeightedHost.host_state is this HostState, so the host used at filter_scheduler.py:203 is service['host']. That means we can keep using the existing RPC-target code. Am I missing something?
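
To make this concrete, here is a minimal standalone sketch (not nova code;
build_host_state_map and the field names are made up for illustration) of the
idea behind the '<host>/<bm_node_id>' keying: the map key may combine the
service host and the bare-metal node id, while each HostState still carries
the plain service host that the scheduler's cast targets.

class HostState(object):
    def __init__(self, host, nodename=None):
        self.host = host          # service['host'], i.e. the nova-compute queue
        self.nodename = nodename  # bare-metal node id; None for ordinary hosts

def build_host_state_map(compute_nodes):
    host_state_map = {}
    for compute in compute_nodes:
        host = compute['service_host']
        nodename = compute.get('nodename')
        key = '%s/%s' % (host, nodename) if nodename else host
        host_state_map[key] = HostState(host, nodename)
    return host_state_map

# Two bare-metal nodes behind one nova-compute service:
states = build_host_state_map([
    {'service_host': 'bespin101', 'nodename': '1'},
    {'service_host': 'bespin101', 'nodename': '2'},
])
assert states['bespin101/1'].host == 'bespin101'
assert states['bespin101/2'].host == 'bespin101'
# So the scheduler can keep casting to <topic>.<host_state.host>, the service
# hostname, no matter what the map key looks like.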

Thanks,
Arata


(2012/08/28 6:45), Michael J Fork wrote:
> VTJ NOTSU Arata <notsu at virtualtech.jp> wrote on 08/27/2012 05:19:40 PM:
>
>  > From: VTJ NOTSU Arata <notsu at virtualtech.jp>
>  > To: Michael J Fork/Rochester/IBM at IBMUS,
>  > Cc: David Kang <dkang at isi.edu>, OpenStack Development Mailing List
>  > <openstack-dev at lists.openstack.org>, openstack-bounces
>  > +mjfork=us.ibm.com at lists.launchpad.net,
>  > "openstack at lists.launchpad.net (openstack at lists.launchpad.net)"
>  > <openstack at lists.launchpad.net>
>  > Date: 08/27/2012 05:19 PM
>  > Subject: Re: [Openstack] [openstack-dev] Discussion about where to
>  > put database for bare-metal provisioning (review 10726)
>  >
>  > Hello all,
>  >
>  > It seems that the requirement for keys of HostManager.service_state
>  > is just that they be unique;
>  > these do not have to be valid hostnames or queues (existing code already
>  > casts messages to <topic>.<service-hostname>, doesn't it, Michael?).
>
> Looking at line 203 in nova/scheduler/filter_scheduler.py, the target host in the cast call is weighted_host.host_state.host and not a service host. (My guess is this will likely require a fair number of changes in the scheduler area to change cast calls to target a service host instead of a compute node)
>
>  > So, I tried
>  > '<host>/<bm_node_id>' as 'host' of capabilities. Then,
>  > HostManager.service_state is:
>  >       { <host>/<bm_node_id> : { <service> : { cap k : v }}}.
>  > So far, it works fine. How about this way?
>
> I will defer to Vish here, but seems like a reasonable solution.
>
>  > I have pasted the relevant code at the bottom of this mail just to make sure.
>  > NOTE: I added a new column 'nodename' to compute_nodes to store bm_node_id,
>  > but storing it in 'hypervisor_hostname' may be the right solution.
>
> Again, I will defer to Vish, but seems like using the existing "hypervisor_hostname" would be correct (otherwise I have no idea what that field would have been intended for).
>
>  > (The whole code is in our GitHub repository (NTTdocomo-openstack/nova,
>  > branch 'multinode'); multiple resource_trackers are also implemented.)
>  >
>  > Thanks,
>  > Arata
>  >
>  >
>  > diff --git a/nova/scheduler/host_manager.py b/nova/scheduler/host_manager.py
>  > index 33ba2c1..567729f 100644
>  > --- a/nova/scheduler/host_manager.py
>  > +++ b/nova/scheduler/host_manager.py
>  > @@ -98,9 +98,10 @@ class HostState(object):
>  >       previously used and lock down access.
>  >       """
>  >
>  > -    def __init__(self, host, topic, capabilities=None, service=None):
>  > +    def __init__(self, host, topic, capabilities=None, service=None,
>  > +                 nodename=None):
>  >           self.host = host
>  >           self.topic = topic
>  > +        self.nodename = nodename
>  >
>  >           # Read-only capability dicts
>  >
>  > @@ -175,8 +176,8 @@ class HostState(object):
>  >           return True
>  >
>  >       def __repr__(self):
>  > -        return ("host '%s': free_ram_mb:%s free_disk_mb:%s" %
>  > -                (self.host, self.free_ram_mb, self.free_disk_mb))
>  > +        return ("host '%s' / nodename '%s': free_ram_mb:%s free_disk_mb:%s" %
>  > +                (self.host, self.nodename, self.free_ram_mb,
>  > +                 self.free_disk_mb))
>  >
>  >
>  >   class HostManager(object):
>  > @@ -268,11 +269,16 @@ class HostManager(object):
>  >                   LOG.warn(_("No service for compute ID %s") % compute['id'])
>  >                   continue
>  >               host = service['host']
>  > -            capabilities = self.service_states.get(host, None)
>  > +            if compute['nodename']:
>  > +                host_node = '%s/%s' % (host, compute['nodename'])
>  > +            else:
>  > +                host_node = host
>  > +            capabilities = self.service_states.get(host_node, None)
>  >               host_state = self.host_state_cls(host, topic,
>  >                       capabilities=capabilities,
>  > -                    service=dict(service.iteritems()))
>  > +                    service=dict(service.iteritems()),
>  > +                    nodename=compute['nodename'])
>  >               host_state.update_from_compute_node(compute)
>  > -            host_state_map[host] = host_state
>  > +            host_state_map[host_node] = host_state
>  >
>  >           return host_state_map
>  >
>  > diff --git a/nova/virt/baremetal/driver.py b/nova/virt/baremetal/driver.py
>  > index 087d1b6..dbcfbde 100644
>  > --- a/nova/virt/baremetal/driver.py
>  > +++ b/nova/virt/baremetal/driver.py
>  > (skip...)
>  > +    def _create_node_cap(self, node):
>  > +        dic = self._node_resources(node)
>  > +        dic['host'] = '%s/%s' % (FLAGS.host, node['id'])
>  > +        dic['cpu_arch'] = self._extra_specs.get('cpu_arch')
>  > +        dic['instance_type_extra_specs'] = self._extra_specs
>  > +        dic['supported_instances'] = self._supported_instances
>  > +        # TODO: put node's extra specs
>  > +        return dic
>  >
>  >       def get_host_stats(self, refresh=False):
>  > -        return self._get_host_stats()
>  > +        caps = []
>  > +        context = nova_context.get_admin_context()
>  > +        nodes = bmdb.bm_node_get_all(context,
>  > +                                     service_host=FLAGS.host)
>  > +        for node in nodes:
>  > +            node_cap = self._create_node_cap(node)
>  > +            caps.append(node_cap)
>  > +        return caps
>  >
>  >
>  > (2012/08/28 5:55), Michael J Fork wrote:
>  > > openstack-bounces+mjfork=us.ibm.com at lists.launchpad.net wrote on
>  > 08/27/2012 02:58:56 PM:
>  > >
>  > >  > From: David Kang <dkang at isi.edu>
>  > >  > To: Vishvananda Ishaya <vishvananda at gmail.com>,
>  > >  > Cc: OpenStack Development Mailing List <openstack-
>  > >  > dev at lists.openstack.org>, "openstack at lists.launchpad.net \
>  > >  > (openstack at lists.launchpad.net\)" <openstack at lists.launchpad.net>
>  > >  > Date: 08/27/2012 03:06 PM
>  > >  > Subject: Re: [Openstack] [openstack-dev] Discussion about where to
>  > >  > put database for bare-metal provisioning (review 10726)
>  > >  > Sent by: openstack-bounces+mjfork=us.ibm.com at lists.launchpad.net
>  > >  >
>  > >  >
>  > >  >  Hi Vish,
>  > >  >
>  > >  >  I think I understand your idea.
>  > >  > One service entry with multiple bare-metal compute_node entries is
>  > >  > registered at the start of bare-metal nova-compute.
>  > >  > 'hypervisor_hostname' must be different for each bare-metal machine
>  > >  > (such as 'bare-metal-0001.xxx.com', 'bare-metal-0002.xxx.com', etc.).
>  > >  > But their IP addresses must be the IP address of the bare-metal
>  > >  > nova-compute, such that an instance is cast not to the bare-metal
>  > >  > machine directly but to the bare-metal nova-compute.
>  > >
>  > > I believe the change here is to cast out the message to the
>  > > <topic>.<service-hostname>. Existing code sends it to the compute_node
>  > > hostname (see line 202 of nova/scheduler/filter_scheduler.py,
>  > > specifically host=weighted_host.host_state.host). Changing that to cast
>  > > to the service hostname would send the message to the bare-metal proxy
>  > > node and should not have an effect on current deployments since the
>  > > service hostname and the host_state.host would always be equal.
>  > > This model will also let you keep the bare-metal compute node IP in
>  > > the compute node table.
>  > >
>  > >  >  One extension we need on the scheduler side is using (host,
>  > >  > hypervisor_hostname) instead of (host) alone in host_manager.py.
>  > >  > 'HostManager.service_state' is { <host> : { <service> : { cap k : v }}}.
>  > >  > It needs to be changed to { <host> : { <service> : {
>  > >  > <hypervisor_name> : { cap k : v }}}}.
>  > >  > Most functions of HostState need to be changed to use the (host,
>  > >  > hypervisor_name) pair to identify a compute node.
>  > >
>  > > Would an alternative here be to change the top level "host" to be
>  > the hypervisor_hostname and enforce uniqueness?
>  > >
>  > >  >  Are we on the same page, now?
>  > >  >
>  > >  >  Thanks,
>  > >  >  David
>  > >  >
>  > >  > ----- Original Message -----
>  > >  > > Hi David,
>  > >  > >
>  > >  > > I just checked out the code more extensively and I don't see why you
>  > >  > > need to create a new service entry for each compute_node entry. The
>  > >  > > code in host_manager to get all host states explicitly gets all
>  > >  > > compute_node entries. I don't see any reason why multiple compute_node
>  > >  > > entries can't share the same service. I don't see any place in the
>  > >  > > scheduler that is grabbing records by "service" instead of by "compute
>  > >  > > node", but if there is one that I missed, it should be fairly easy to
>  > >  > > change it.
>  > >  > >
>  > >  > > The compute_node record is created in the compute/resource_tracker.py
>  > >  > > as of a recent commit, so I think the path forward would be to make
>  > >  > > sure that one of the records is created for each bare metal node by
>  > >  > > the bare metal compute, perhaps by having multiple resource_trackers.
>  > >  > >
>  > >  > > Vish
>  > >  > >
>  > >  > > On Aug 27, 2012, at 9:40 AM, David Kang <dkang at isi.edu> wrote:
>  > >  > >
>  > >  > > >
>  > >  > > >  Vish,
>  > >  > > >
>  > >  > > >  I think I don't understand your statement fully.
>  > >  > > > Unless we use different hostnames, (hostname, hypervisor_hostname)
>  > >  > > > must be the
>  > >  > > > same for all bare-metal nodes under a bare-metal nova-compute.
>  > >  > > >
>  > >  > > >  Could you elaborate the following statement a little bit more?
>  > >  > > >
>  > >  > > >> You would just have to use a little more than hostname. Perhaps
>  > >  > > >> (hostname, hypervisor_hostname) could be used to update the entry?
>  > >  > > >>
>  > >  > > >
>  > >  > > >  Thanks,
>  > >  > > >  David
>  > >  > > >
>  > >  > > >
>  > >  > > >
>  > >  > > > ----- Original Message -----
>  > >  > > >> I would investigate changing the capabilities to key off of
>  > >  > > >> something other than hostname. It looks from the table structure
>  > >  > > >> like compute_nodes could have a many-to-one relationship with
>  > >  > > >> services.
>  > >  > > >> You would just have to use a little more than hostname. Perhaps
>  > >  > > >> (hostname, hypervisor_hostname) could be used to update the entry?
>  > >  > > >>
>  > >  > > >> Vish
>  > >  > > >>
>  > >  > > >> On Aug 24, 2012, at 11:23 AM, David Kang <dkang at isi.edu> wrote:
>  > >  > > >>
>  > >  > > >>>
>  > >  > > >>>  Vish,
>  > >  > > >>>
>  > >  > > >>>  I've tested your code and did more testing.
>  > >  > > >>> There are a couple of problems.
>  > >  > > >>> 1. host name should be unique. If not, any repetitive updates of
>  > >  > > >>> new
>  > >  > > >>> capabilities with the same host name are simply overwritten.
>  > >  > > >>> 2. We cannot generate arbitrary host names on the fly.
>  > >  > > >>>   The scheduler (I tested filter scheduler) gets host names from
>  > >  > > >>>   db.
>  > >  > > >>>   So, if a host name is not in the 'services' table, it is not
>  > >  > > >>>   considered by the scheduler at all.
>  > >  > > >>>
>  > >  > > >>> So, to make your suggestions possible, nova-compute should register
>  > >  > > >>> N different host names in 'services' table, and N corresponding
>  > >  > > >>> entries in 'compute_nodes' table.
>  > >  > > >>> Here is an example:
>  > >  > > >>>
>  > >  > > >>> mysql> select id, host, binary, topic, report_count, disabled,
>  > >  > > >>>        availability_zone from services;
>  > >  > > >>> +----+-------------+----------------+-----------+--------------+----------+-------------------+
>  > >  > > >>> | id | host        | binary         | topic     | report_count | disabled | availability_zone |
>  > >  > > >>> +----+-------------+----------------+-----------+--------------+----------+-------------------+
>  > >  > > >>> |  1 | bespin101   | nova-scheduler | scheduler |        17145 |        0 | nova              |
>  > >  > > >>> |  2 | bespin101   | nova-network   | network   |        16819 |        0 | nova              |
>  > >  > > >>> |  3 | bespin101-0 | nova-compute   | compute   |        16405 |        0 | nova              |
>  > >  > > >>> |  4 | bespin101-1 | nova-compute   | compute   |            1 |        0 | nova              |
>  > >  > > >>> +----+-------------+----------------+-----------+--------------+----------+-------------------+
>  > >  > > >>>
>  > >  > > >>> mysql> select id, service_id, hypervisor_hostname from compute_nodes;
>  > >  > > >>> +----+------------+------------------------+
>  > >  > > >>> | id | service_id | hypervisor_hostname    |
>  > >  > > >>> +----+------------+------------------------+
>  > >  > > >>> |  1 |          3 | bespin101.east.isi.edu |
>  > >  > > >>> |  2 |          4 | bespin101.east.isi.edu |
>  > >  > > >>> +----+------------+------------------------+
>  > >  > > >>>
>  > >  > > >>>  Then, the nova db (compute_nodes table) has entries for all
>  > >  > > >>> bare-metal nodes.
>  > >  > > >>> What do you think of this approach?
>  > >  > > >>> Do you have any better approach?
>  > >  > > >>>
>  > >  > > >>>  Thanks,
>  > >  > > >>>  David
>  > >  > > >>>
>  > >  > > >>>
>  > >  > > >>>
>  > >  > > >>> ----- Original Message -----
>  > >  > > >>>> To elaborate, something like the below. I'm not absolutely sure
>  > >  > > >>>> you need to be able to set service_name and host, but this gives
>  > >  > > >>>> you the option to do so if needed.
>  > >  > > >>>>
>  > >  > > >>>> diff --git a/nova/manager.py b/nova/manager.py
>  > >  > > >>>> index c6711aa..c0f4669 100644
>  > >  > > >>>> --- a/nova/manager.py
>  > >  > > >>>> +++ b/nova/manager.py
>  > >  > > >>>> @@ -217,6 +217,8 @@ class SchedulerDependentManager(Manager):
>  > >  > > >>>>
>  > >  > > >>>>      def update_service_capabilities(self, capabilities):
>  > >  > > >>>>          """Remember these capabilities to send on next periodic update."""
>  > >  > > >>>> +        if not isinstance(capabilities, list):
>  > >  > > >>>> +            capabilities = [capabilities]
>  > >  > > >>>>          self.last_capabilities = capabilities
>  > >  > > >>>>
>  > >  > > >>>>      @periodic_task
>  > >  > > >>>> @@ -224,5 +226,8 @@ class SchedulerDependentManager(Manager):
>  > >  > > >>>>          """Pass data back to the scheduler at a periodic interval."""
>  > >  > > >>>>          if self.last_capabilities:
>  > >  > > >>>>              LOG.debug(_('Notifying Schedulers of capabilities ...'))
>  > >  > > >>>> -            self.scheduler_rpcapi.update_service_capabilities(context,
>  > >  > > >>>> -                self.service_name, self.host, self.last_capabilities)
>  > >  > > >>>> +            for capability_item in self.last_capabilities:
>  > >  > > >>>> +                name = capability_item.get('service_name', self.service_name)
>  > >  > > >>>> +                host = capability_item.get('host', self.host)
>  > >  > > >>>> +                self.scheduler_rpcapi.update_service_capabilities(context,
>  > >  > > >>>> +                    name, host, capability_item)
>  > >  > > >>>>
>  > >  > > >>>> On Aug 21, 2012, at 1:28 PM, David Kang <dkang at isi.edu> wrote:
>  > >  > > >>>>
>  > >  > > >>>>>
>  > >  > > >>>>>  Hi Vish,
>  > >  > > >>>>>
>  > >  > > >>>>>  We are trying to change our code according to your comment.
>  > >  > > >>>>> I want to ask a question.
>  > >  > > >>>>>
>  > >  > > >>>>>>>> a) modify driver.get_host_stats to be able to return a list
>  > >  > > >>>>>>>> of host stats instead of just one. Report the whole list back
>  > >  > > >>>>>>>> to the scheduler. We could modify the receiving end to accept
>  > >  > > >>>>>>>> a list as well or just make multiple calls to
>  > >  > > >>>>>>>> self.update_service_capabilities(capabilities)
>  > >  > > >>>>>
>  > >  > > >>>>>  Modifying driver.get_host_stats to return a list of host stats
>  > >  > > >>>>>  is easy.
>  > >  > > >>>>> Making multiple calls to
>  > >  > > >>>>> self.update_service_capabilities(capabilities) doesn't seem to
>  > >  > > >>>>> work, because 'capabilities' is overwritten each time.
>  > >  > > >>>>>
>  > >  > > >>>>>  Modifying the receiving end to accept a list seems to be easy.
>  > >  > > >>>>> However, since 'capabilities' is assumed to be a dictionary by all
>  > >  > > >>>>> other scheduler routines, it looks like we would have to change
>  > >  > > >>>>> all of them to handle 'capabilities' as a list of dictionaries.
>  > >  > > >>>>>
>  > >  > > >>>>>  If my understanding is correct, it would affect many parts of
>  > >  > > >>>>>  the scheduler.
>  > >  > > >>>>> Is this what you recommended?
>  > >  > > >>>>>
>  > >  > > >>>>>  Thanks,
>  > >  > > >>>>>  David
>  > >  > > >>>>>
>  > >  > > >>>>>
>  > >  > > >>>>> ----- Original Message -----
>  > >  > > >>>>>> This was an immediate goal: the bare-metal nova-compute node
>  > >  > > >>>>>> could keep an internal database, but report capabilities through
>  > >  > > >>>>>> nova in the common way with the changes below. Then the scheduler
>  > >  > > >>>>>> wouldn't need access to the bare metal database at all.
>  > >  > > >>>>>>
>  > >  > > >>>>>> On Aug 15, 2012, at 4:23 PM, David Kang <dkang at isi.edu> wrote:
>  > >  > > >>>>>>
>  > >  > > >>>>>>>
>  > >  > > >>>>>>> Hi Vish,
>  > >  > > >>>>>>>
>  > >  > > >>>>>>> Is this discussion about a long-term goal or for this Folsom
>  > >  > > >>>>>>> release?
>  > >  > > >>>>>>>
>  > >  > > >>>>>>> We still believe that the bare-metal database is needed
>  > >  > > >>>>>>> because there is no automated way for bare-metal nodes to
>  > >  > > >>>>>>> report their capabilities to their bare-metal nova-compute node.
>  > >  > > >>>>>>>
>  > >  > > >>>>>>> Thanks,
>  > >  > > >>>>>>> David
>  > >  > > >>>>>>>
>  > >  > > >>>>>>>>
>  > >  > > >>>>>>>> I am interested in finding a solution that enables bare-metal
>  > >  > > >>>>>>>> and virtualized requests to be serviced through the same
>  > >  > > >>>>>>>> scheduler, where the compute_nodes table has a full view of
>  > >  > > >>>>>>>> schedulable resources. This would seem to simplify the
>  > >  > > >>>>>>>> end-to-end flow while opening up some additional use cases
>  > >  > > >>>>>>>> (e.g. dynamic allocation of a node from bare-metal to
>  > >  > > >>>>>>>> hypervisor and back).
>  > >  > > >>>>>>>>
>  > >  > > >>>>>>>> One approach would be to have a proxy running a single
>  > >  > > >>>>>>>> nova-compute daemon fronting the bare-metal nodes. That
>  > >  > > >>>>>>>> nova-compute daemon would report up many HostState objects
>  > >  > > >>>>>>>> (1 per bare-metal node) to become entries in the compute_nodes
>  > >  > > >>>>>>>> table and accessible through the scheduler HostManager object.
>  > >  > > >>>>>>>>
>  > >  > > >>>>>>>>
>  > >  > > >>>>>>>>
>  > >  > > >>>>>>>>
>  > >  > > >>>>>>>> The HostState object would set cpu_info, vcpus, memory_mb and
>  > >  > > >>>>>>>> local_gb values to be used for scheduling, with the
>  > >  > > >>>>>>>> hypervisor_host field holding the bare-metal machine address
>  > >  > > >>>>>>>> (e.g. for IPMI based commands) and hypervisor_type = NONE. The
>  > >  > > >>>>>>>> bare-metal Flavors are created with an extra_spec of
>  > >  > > >>>>>>>> hypervisor_type = NONE, and the corresponding
>  > >  > > >>>>>>>> compute_capabilities_filter would reduce the available hosts to
>  > >  > > >>>>>>>> those bare_metal nodes. The scheduler would need to understand
>  > >  > > >>>>>>>> that hypervisor_type = NONE means you need an exact-fit (or
>  > >  > > >>>>>>>> best-fit) host vs weighting them (perhaps through the
>  > >  > > >>>>>>>> multi-scheduler). The scheduler would cast out the message to
>  > >  > > >>>>>>>> the <topic>.<service-hostname> (code today uses the HostState
>  > >  > > >>>>>>>> hostname), with the compute driver having to understand if it
>  > >  > > >>>>>>>> must be serviced elsewhere (but does not break any existing
>  > >  > > >>>>>>>> implementations since it is 1 to 1).
>  > >  > > >>>>>>>>
>  > >  > > >>>>>>>>
>  > >  > > >>>>>>>>
>  > >  > > >>>>>>>>
>  > >  > > >>>>>>>>
>  > >  > > >>>>>>>> Does this solution seem workable? Anything I missed?
>  > >  > > >>>>>>>>
>  > >  > > >>>>>>>> The bare metal driver is already proxying for the other nodes,
>  > >  > > >>>>>>>> so it sounds like we need a couple of things to make this
>  > >  > > >>>>>>>> happen:
>  > >  > > >>>>>>>>
>  > >  > > >>>>>>>>
>  > >  > > >>>>>>>> a) modify driver.get_host_stats to be able to return a list of
>  > >  > > >>>>>>>> host stats instead of just one. Report the whole list back to
>  > >  > > >>>>>>>> the scheduler. We could modify the receiving end to accept a
>  > >  > > >>>>>>>> list as well or just make multiple calls to
>  > >  > > >>>>>>>> self.update_service_capabilities(capabilities)
>  > >  > > >>>>>>>>
>  > >  > > >>>>>>>> b) make a few minor changes to the scheduler to make sure
>  > >  > > >>>>>>>> filtering still works. Note the changes here may be very helpful:
>  > >  > > >>>>>>>>
>  > >  > > >>>>>>>> https://review.openstack.org/10327
>  > >  > > >>>>>>>>
>  > >  > > >>>>>>>> c) we have to make sure that instances launched on those nodes
>  > >  > > >>>>>>>> take up the entire host state somehow. We could probably do this
>  > >  > > >>>>>>>> by making sure that the instance_type ram, mb, gb etc. matches
>  > >  > > >>>>>>>> what the node has, but we may want a new boolean field "used" if
>  > >  > > >>>>>>>> those aren't sufficient.
>  > >  > > >>>>>>>>
>  > >  > > >>>>>>>>
>  > >  > > >>>>>>>> This approach seems pretty good. We could potentially get rid
>  > >  > > >>>>>>>> of the shared bare_metal_node table. I guess the only other
>  > >  > > >>>>>>>> concern is how you populate the capabilities that the bare metal
>  > >  > > >>>>>>>> nodes are reporting. I guess an api extension that rpcs to a
>  > >  > > >>>>>>>> baremetal node to add the node. Maybe someday this could be
>  > >  > > >>>>>>>> autogenerated by the bare metal host looking in its arp table
>  > >  > > >>>>>>>> for dhcp requests! :)
>  > >  > > >>>>>>>>
>  > >  > > >>>>>>>>
>  > >  > > >>>>>>>> Vish
>  > >  > > >>>>>>>>
>  > >  > > >>>>>>>
>  > >  > > >>>>>>
>  > >  > > >>>>>>
>  > >  > > >>>>>
>  > >  > > >>>>
>  > >  > > >>>>
>  > >  >
>  > >  >
>  > >
>  > > Michael
>  > >
>  > > -------------------------------------------------
>  > > Michael Fork
>  > > Cloud Architect, Emerging Solutions
>  > > IBM Systems & Technology Group
>  > >
>  > >
>  > >
>  >
>
> Michael
>
> -------------------------------------------------
> Michael Fork
> Cloud Architect, Emerging Solutions
> IBM Systems & Technology Group
>


-- 
VirtualTech Japan Inc. (http://VirtualTech.jp)
Engineering Dept., Development Section Manager: Arata Notsu (notsu at VirtualTech.jp)

Dai-3 Nishi-Aoyama Bldg. 8F, 1-8-1 Shibuya, Shibuya-ku, Tokyo 150-0002, Japan
TEL: 03-6419-7841  FAX: 03-5774-9462



