[openstack-dev] [Nova] virt driver architecture

Russell Bryant rbryant at redhat.com
Tue May 14 20:49:32 UTC 2013


On 05/13/2013 12:58 PM, Dan Wendlandt wrote:
> 
> 
> 
> On Fri, May 10, 2013 at 11:36 AM, Russell Bryant
> <rbryant at redhat.com> wrote:
> 
>     On 05/10/2013 12:25 PM, Armando Migliaccio wrote:
>     > Um...I wonder if we are saying the same thing: I am thinking that for
>     > implementing a nova-api proxy one would need to provide their
>     > compute_api_class, that defaults to nova.compute.api.API.
>     Incidentally,
>     > when using cells this becomes nova.compute.cells_api.ComputeCellsAPI
>     > (for the top-level). By implementing the compute API class, surely you
>     > don't necessarily need to hook into how Nova works, no?
> 
>     We were talking about different things.  I was talking about a REST API
>     proxy ... OpenStack Compute API -> whatever API.
> 
> 
> I may be at fault here for introducing the word "proxy", but I think it
> may be confusing things a bit, as many of the cases where you would want
> to use something like vCenter are not really proxies.
> 
> The way I see it, there are two extremes:
> 1) The current virt-driver approach, with one nova-compute per
> "host", where "host" is a single unit of capacity in terms of
> scheduling, etc.  In KVM-world a "host" is a hypervisor node.  In the
> current vCenter driver, this is a cluster, with vCenter exposing one
> large capacity and spreading workloads evenly.  This approach leverages
> all scheduling logic available within nova.scheduler, uses nova DB
> model, etc.

I would add that we assume that Nova is in complete control of the
compute resources in this case, meaning that there is not another system
making changes to instances.  That's where we start to run into problems
with putting the cluster-based drivers at this level.

> 2) A true "API proxy" approach, possibly implemented using cells.  All
> scheduling/placement, data modeling, etc. logic would be implemented by
> a back-end system such as vCenter and one cannot leverage existing nova
> scheduler logic or database models.  I think this would mean that the
> nova-api, nova-scheduler, and nova-compute code used with the
> virt-driver model would not be used, and the new cell driver would have
> to create its own versions of these.

I would actually break this up into two cases.

2.a) A true API proxy. You have an existing virt management solution
(vCenter, oVirt, whatever), and you want to interact with it using the
OpenStack APIs.  For this, I would propose not using Nova (or any
OpenStack component) at all.  Instead, I would implement the API in the
project/product itself, or use something built to be an API proxy, like
deltacloud.

2.b) A cell-based nova deployment.  A cell may be a compute cell (what
exists today) where Nova is managing all of the compute resources.
(Here is where the proposal comes in) A cell could also be a different,
existing virt management solution.  In that case, the other system is
responsible for everything that a compute cell does today, but does it in
its own way and is responsible for reporting state up to the API cell.
Systems would of course be welcome to reuse Nova components if
applicable, such as nova-scheduler.
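
(For reference, the wiring Armando mentioned above is just configuration.
Roughly, in the API cell's nova.conf -- option names from memory, so
double-check against the cells docs:

    [DEFAULT]
    compute_api_class = nova.compute.cells_api.ComputeCellsAPI

    [cells]
    enable = True
    name = api

The child cells then run the usual nova-compute/nova-scheduler services,
or, under this proposal, whatever the existing virt management system
provides instead.)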

> However, what is being proposed in the blueprint 

Let's step back for just a second.  :-)

The intent of this thread was not to focus on one blueprint.  What I'd
really like to do is just sit back and think about how we think things
*should* look long term.

As for the blueprint [1] on one nova-compute talking to N clusters
instead of 1 ... fine.  I get that vCenter is supported today and this
really isn't an architectural shift from what we have.  I don't really
have a reason to say no to that specific blueprint.

What we're discussing here, which has morphed into a high-level
architectural proposal (existing virt management systems become a cell),
is larger in scope.  It's about where support for connecting to existing
virt management systems should live in the long term.  In the context of
the VMware stuff, it's whether vCenter should be in the virt driver layer
at all.

> is actually something
> in between these two extremes and in fact closer to the virt-driver
> model.  I suspect the proposal sees the following benefits to a model
> that is closer to the existing virt-driver (note: I am not working
> closely with the author, so I am guessing here based on my own views): 
> -  the nova scheduler logic is actually very beneficial even when you
> are using something like vCenter.  It lets you do a KVM + VMware
> deployment, where workloads are directed to VMware vs. KVM by the
> scheduler based on disk type, host aggregates, etc.  It also lets you
> expose different vCenter clusters with different properties via host
> aggregates (e.g., an HA cluster and a non-HA cluster).  According to the
> docs I've read on cells (may be out of date), it seems like the current
> cell scheduler is very simple (i.e., random), so doing this with cells
> would require adding similar intelligence at the cell scheduler layer.  
> Additionally, I'm aware of people who would like to use nova's pluggable
> scheduling to even do fine-grained per-hypervisor scheduling on a
> "cluster" platform like vCenter (which for a cluster with DRS enabled
> would make sense). 

You could re-use the host scheduler in your cell if you really wanted to.

The cell scheduler will soon have filter/weight support just like the
host scheduler, so we should be able to do equally intelligent
scheduling at the cell level.

https://review.openstack.org/#/c/16221/
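
To give a rough idea of what that enables, a cell filter should end up
looking much like a host filter does today.  Here is a minimal
host-filter-style sketch; the "ha_enabled" keys are made up purely for
illustration, and the cells base class in that review may differ slightly:

    from nova.scheduler import filters

    class HAClusterFilter(filters.BaseHostFilter):
        """Only send HA-flagged flavors to HA-capable hosts/clusters."""

        def host_passes(self, host_state, filter_properties):
            instance_type = filter_properties.get('instance_type') or {}
            extra_specs = instance_type.get('extra_specs') or {}
            wants_ha = extra_specs.get('ha_enabled') == 'true'
            # Assumes the driver reports an 'ha_enabled' capability;
            # purely illustrative.
            has_ha = (host_state.capabilities or {}).get('ha_enabled', False)
            return has_ha or not wants_ha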

> - there is a lot of nova code used in the virt-driver model that is
> still needed when implementing Nova with a system like vCenter.  This
> isn't just the API WSGI + scheduling logic, it includes code to talk to
> quantum, glance, cinder, etc.  There is also data modeled in the Nova
> DB that is likely not modeled in the back-end system.  Perhaps with
> significant refactoring this shared functionality could be put in proper
> libraries that could be re-used in cells of different types, but my
> guess is that it would be a significant shake-up of the Nova codebase.  

I certainly don't suggest it's a trivial effort.

Some systems will want to interact with glance/cinder/quantum/etc
directly, in their own way.  That's what oVirt is already doing.
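
As a trivial illustration of what "directly" means here, listing images
straight out of glance with python-glanceclient is only a couple of lines,
with no Nova involved at any point.  The endpoint and token below are
placeholders:

    from glanceclient import Client

    glance = Client('1', 'http://glance.example.com:9292',
                    token='<keystone token>')
    image_ids = [img.id for img in glance.images.list()]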

> As I read the blueprint, it seems like the idea is to make some more
> targeted changes.  In particular: 
> 1) remove the hard coupling in the scheduler logic between a device
> being scheduled to, and the queue that the scheduling message is placed
> into.  
> 2) On the nova-compute side, do not limit a nova-compute to creating a
> single "host" record, but allow it to dynamically update the set of
> available hosts based on its own mechanism for discovering available
> hosts.  
> 
> To me these seem like fairly clean separations of duty: the scheduler is
> still in charge of deciding where workloads should run, and the set of
> nova-computes are still responsible for exposing the set of available
> resources, and implementing requests to place a workload on a particular
> host resource.  Maybe it's more complicated than that, but at a high
> level this strikes me as reasonable.

Nova has a concept of scheduling to a (host, node) pair.  In most cases
they are the same.  In baremetal, they are different (the nova-compute host
vs. the actual bare metal node).  The blueprint is just saying the driver
would use this breakdown to present multiple clusters from one
nova-compute service.  This isn't significantly different from what is
already there, so that's fine.
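
Concretely, the driver side of that blueprint amounts to something like
the sketch below.  The helper methods and cluster names are placeholders
for whatever the real driver uses to talk to vCenter:

    from nova.virt import driver

    class MultiClusterDriver(driver.ComputeDriver):
        """Sketch: expose several vCenter clusters as separate 'nodes'."""

        def get_available_nodes(self):
            # One entry per cluster; the resource tracker creates a
            # compute_node record for each one.
            return self._list_cluster_names()

        def get_available_resource(self, nodename):
            # Capacity/usage for the single cluster named by 'nodename'.
            return self._cluster_stats(nodename)

        def _list_cluster_names(self):
            # Placeholder: would query vCenter for the managed clusters.
            return ['cluster-ha', 'cluster-std']

        def _cluster_stats(self, nodename):
            # Placeholder: would return the usual dict of vcpus,
            # memory_mb, local_gb, usage counters, etc. for that cluster.
            return {}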

Like I mentioned before, my intent is to look beyond this blueprint.  We
need to identify where we want to go so that the line can be drawn
appropriately in the virt driver layer for future proposals.

There are other blueprints that perhaps do a better job at demonstrating
the problems with putting these systems at the existing virt layer.  As
an example, check out the vCenter blueprint related to volume support [2].

To make things work, this blueprint proposes that Nova gets volume
connection info *for every possible host* that the volume may end up
getting connected to.  Presumably this is because the system wants to
move things around on its own.  This feels very, very wrong to me.
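
For contrast, here is roughly all that today's attach path needs; the
wrapper function is just for illustration:

    # Simplified from today's attach path in the compute manager: one
    # connector, for the one host/node the instance actually lives on.
    def attach_volume_today(manager, context, instance, volume_id):
        connector = manager.driver.get_volume_connector(instance)
        return manager.volume_api.initialize_connection(
            context, volume_id, connector)

    # The blueprint would instead have Nova gather connection info for
    # every host the cluster might move the instance to, because vCenter
    # can relocate it behind Nova's back -- which is the part that feels
    # wrong.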

If another system wants to manage the VMs, that's fine, but I'd rather
Nova not *also* think it's in charge, which is what we have right now.

[1]
https://blueprints.launchpad.net/nova/+spec/multiple-clusters-managed-by-one-service
[2]
https://blueprints.launchpad.net/nova/+spec/fc-support-for-vcenter-driver

-- 
Russell Bryant


