[openstack-dev] [nova] Manage multiple clusters using a single nova service

Vaddi, Kiran Kumar kiran-kumar.vaddi at hp.com
Wed Jul 23 10:39:56 UTC 2014


Answers to some of your concerns are inline below.

> Why can't ESXi hosts run the nova-compute service? Is it like the
> XenServer driver that has a pitifully old version of Python (2.4) that
> constrains the code that is possible to run on it? If so, then I don't
> really think the poor constraints of the hypervisor dom0 should mean
> that Nova should change its design principles to accommodate. The
> XenServer driver uses custom agents to get around this issue, IIRC. Why
> can't the VCenter driver?

ESXi hosts are generally operated in lockdown mode, in which installing agents is not allowed.
All communication with the ESXi hosts, and all tasks on them, must go through vCenter.

> The fact that each connection to vCenter uses 140MB of memory is
> completely ridiculous. You can thank crappy SOAP for that, I believe.

Yes, and the problem grows if we create multiple services, since each service opens its own connection to vCenter.
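
As a rough back-of-the-envelope sketch of that growth (assuming ~140 MB per vCenter connection, the figure quoted in this thread, and one connection per nova-compute service; the function name is only illustrative):

```python
# Rough estimate only. Assumptions (not measured here):
#   - each nova-compute service holds one vCenter connection
#   - each connection costs ~140 MB, the figure quoted in this thread
MB_PER_CONNECTION = 140


def total_memory_gb(num_services, mb_per_connection=MB_PER_CONNECTION):
    """Return the approximate memory, in GB, used by all connections."""
    return num_services * mb_per_connection / 1024.0


if __name__ == "__main__":
    # One service for everything, versus one service per cluster.
    for services in (1, 32, 64):
        print("%2d service(s) -> ~%.2f GB" % (services, total_memory_gb(services)))
```

With one service per cluster this comes to roughly 4.4 GB for 32 clusters and about twice that for 64.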

> I just do not support the idea that Nova needs to
> change its fundamental design in order to support the *design* of other
> host management platforms.

The current implementation doesn't make Nova change its design; the scheduling decisions are still made by Nova.
It's only the deployment that has changed. I agree that there are no separate topic-exchange queues for each cluster.
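
To illustrate the idea only (this is a toy sketch, not the actual driver or scheduler code; every class, method, and field name below is hypothetical), a single service can expose each cluster as its own node, and the scheduler still chooses among the nodes:

```python
class ClusterNode(object):
    """Hypothetical per-cluster record kept by the single service."""

    def __init__(self, name, free_ram_mb):
        self.name = name
        self.free_ram_mb = free_ram_mb


class SingleServiceDriver(object):
    """Toy driver: one service, many clusters, one node per cluster."""

    def __init__(self, clusters):
        self._clusters = dict((c.name, c) for c in clusters)

    def list_nodes(self):
        # Each cluster is reported as a separate compute node.
        return sorted(self._clusters)

    def node_resources(self, nodename):
        cluster = self._clusters[nodename]
        return {"node": cluster.name, "free_ram_mb": cluster.free_ram_mb}


def pick_node(driver, requested_ram_mb):
    """Toy scheduler: choose the fitting node with the most free RAM."""
    resources = [driver.node_resources(n) for n in driver.list_nodes()]
    fitting = [r for r in resources if r["free_ram_mb"] >= requested_ram_mb]
    if not fitting:
        return None
    return max(fitting, key=lambda r: r["free_ram_mb"])["node"]


if __name__ == "__main__":
    driver = SingleServiceDriver([
        ClusterNode("cluster-a", 2048),
        ClusterNode("cluster-b", 8192),
    ])
    print(pick_node(driver, 4096))
```

The placement decision stays outside the driver; only the number of services deployed differs.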

Thanks
Kiran

> -----Original Message-----
> From: Jay Pipes [mailto:jaypipes at gmail.com]
> Sent: Tuesday, July 22, 2014 9:30 AM
> To: openstack-dev at lists.openstack.org
> Subject: Re: [openstack-dev] [nova] Manage multiple clusters using a single
> nova service
> 
> On 07/14/2014 04:34 AM, Vaddi, Kiran Kumar wrote:
> > Hi,
> >
> > In the Juno summit, it was discussed that the existing approach of
> > managing multiple VMware Clusters using a single nova compute service
> > is not preferred and the approach of one nova compute service
> > representing one cluster should be looked into.
> 
> Even this is outside what I consider to be best practice for Nova,
> frankly. The model of scale-out inside Nova is to have a nova-compute
> worker responsible for only the distinct set of compute resources that
> are provided by a single bare metal node.
> 
> Unfortunately, with the introduction of the bare-metal driver in Nova,
> as well as the "clustered hypervisors" like VCenter and Hyper-V, this
> architectural design point was shot in the head, and now it is only
> possible to scale the nova-compute <-> hypervisor communication layer
> using a scale-up model instead of a scale-out model. This is a big deal,
> and unfortunately, not enough discussion has been had around this, IMO.
> 
> The proposed blueprint(s) around this and the code patches I've seen are
> moving Nova in the opposite direction it needs to go, IMHO.
> 
> > We would like to retain the existing approach (till we have resolved
> > the issues) for the following reasons:
> >
> > 1. Even though a single service is managing all the clusters,
> > logically it is still one compute per cluster. To the scheduler each
> > cluster is represented as individual computes. Even in the driver
> > each cluster is represented separately.
> 
> How is this so? In Kanagaraj Manickam's proposed blueprint about this
> [1], the proposed implementation would fork one process for each
> hypervisor or cluster. However, the problem with this is that the
> scheduler uses the single service record for the nova-compute worker to
> determine whether or not the node is available to place resources on.
> The servicegroup API would need to be refactored (rewritten, really) to
> change its definition of a service to instead of being a single daemon,
> now being a single process running within that daemon. Since the daemon
> only responds to a single RPC target endpoint and rpc.call direct and
> topic exchanges, all of that code would then need to be rewritten, or
> code would need to be added to nova.manager to dispatch events sent to
> the nova-compute's single RPC topic-exchange to one of the specific
> processes that is responsible for a particular cluster.
> 
> In short, a huge chunk of code would need to be refactored in order to
> make Nova's worldview amenable to the design choices of certain
> clustered hypervisors. That, IMHO, is not something to be taken lightly,
> and not something we should even consider without a REALLY good reason.
> And the use case of "Openstack is an platform and its good to provide
> flexibility in it to accommodate different needs." is not a really good
> reason, IMO.
> 
> > 2. Since ESXi, unlike KVM, does not allow the nova-compute service to
> > run on the hypervisor, the service has to be run externally on a
> > different server. It's easier from an administration perspective to
> > manage a single service than multiple services.
> 
> Why can't ESXi hosts run the nova-compute service? Is it like the
> XenServer driver that has a pitifully old version of Python (2.4) that
> constrains the code that is possible to run on it? If so, then I don't
> really think the poor constraints of the hypervisor dom0 should mean
> that Nova should change its design principles to accommodate. The
> XenServer driver uses custom agents to get around this issue, IIRC. Why
> can't the VCenter driver?
> 
> > 3. Every connection to vCenter uses up ~140MB in the driver. If we
> > were to manage each cluster with an individual service, the memory
> > consumed for 32 clusters would be high (~4GB). The newer versions
> > support 64 clusters!
> 
> The fact that each connection to vCenter uses 140MB of memory is
> completely ridiculous. You can thank crappy SOAP for that, I believe.
> 
> That said, Nova should not be changing its design principles to
> accommodate poor software of a driver.
> 
> It raises questions on why exactly folks are even using OpenStack at all
> if they want to continue to use VCenter for host management, DRS, DPM,
> and the like.
> 
> What advantage are they getting from OpenStack?
> 
> If the idea is to move off of expensive VCenter-licensed clusters and on
> to a pure OpenStack infrastructure then, I don't see a point in
> supporting *more* clustered hypervisor features in the driver code at
> all. If the idea is to just "use what we know, don't rock the enterprise
> IT boat", then why use OpenStack at all?
> 
> Look, I'm all for compatibility and transferability of different image
> formats, different underlying hypervisors, and the dream of
> interoperable clouds. I'm happy to see Nova support a wide variety of
> disk image formats and hypervisor features (note: VCenter isn't a
> hypervisor). I just do not support the idea that Nova needs to
> change its fundamental design in order to support the *design* of other
> host management platforms.
> 
> Best,
> -jay
> 
> [1] https://review.openstack.org/#/c/103054/
> 


