[openstack-dev] [nova][neutron][cinder] Averting the Nova crisis by splitting out virt drivers

Daniel P. Berrange berrange at redhat.com
Thu Sep 11 09:18:40 UTC 2014


On Thu, Sep 11, 2014 at 09:23:34AM +1000, Michael Still wrote:
> On Thu, Sep 11, 2014 at 8:11 AM, Jay Pipes <jaypipes at gmail.com> wrote:
> 
> > a) Sorting out the common code is already accounted for in Dan B's original
> > proposal -- it's a prerequisite for the split.
> 
> It's a big prerequisite though. I think we're talking about a release
> worth of work to get that right. I don't object to us doing that work,
> but I think we need to be honest about how long it's going to take. It
> will also make the core of nova less agile, as we'll find it hard to
> change the hypervisor driver interface over time. Do we really think
> it's ready to be stable?

Yes, in my proposal I explicitly said we'd need all of Kilo
for the prep work to clean up the virt API, and would only
do the split in Lxxxxx.

The actual nova/virt/driver.py has been more stable over the
past few releases than I thought it would be. In terms of APIs
we've not really modified existing ones, mostly added new ones.
Where we did modify existing APIs, we could easily have taken
the approach of adding a new API in parallel and deprecating
the old entry point to maintain compat.
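
As a rough illustration of the "add a new API in parallel and deprecate
the old entry point" approach, here is a minimal sketch. The class and
method names (ComputeDriverSketch, attach_nic, attach_nic_v2) are
hypothetical and are not taken from nova/virt/driver.py.

```python
import warnings


class ComputeDriverSketch:
    """Hypothetical stand-in for a virt driver base class."""

    def attach_nic_v2(self, instance, nic, *, hotplug=True):
        # New entry point carrying the extended signature (hypothetical).
        return {"instance": instance, "nic": nic, "hotplug": hotplug}

    def attach_nic(self, instance, nic):
        # Old entry point kept for compat: warn, then forward to the new
        # one so that existing callers keep working unchanged.
        warnings.warn(
            "attach_nic() is deprecated; use attach_nic_v2()",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.attach_nic_v2(instance, nic)
```

Callers of the old method see a DeprecationWarning but get the same
result, so the old entry point can be removed a release or two later.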

The big change which isn't visible directly is the conversion
of internal nova code to use objects. Finishing this conversion
is clearly a pre-requisite to any such split, since we'd need
to make sure all data passed into the nova virt APIs as parameters
is stable & well defined. 

> As an alternative approach...
> 
> What if we pushed most of the code for a driver into a library?
> Imagine a library which controls the low level operations of a
> hypervisor -- create a vm, attach a NIC, etc. Then the driver would
> become a shim around that which was relatively thin, but owned the
> interface into the nova core. The driver handles the nova specific
> things like knowing how to create a config drive, or how to
> orchestrate with cinder, but hands over all the hypervisor operations
> to the library. If we found a bug in the library we just pin our
> dependency on the version we know works whilst we fix things.
> 
> In fact, the driver inside nova could be a relatively generic "library
> driver", and we could have multiple implementations of the library,
> one for each hypervisor.

I don't think that really solves the problem, particularly the
one you are most concerned about above, API stability. The
naive impl of any "library" for the virt driver would pretty much
mirror the nova virt API. The virt driver impls would thus have to
do the job of taking the Nova objects passed in as parameters and
turning them into something "stable" to pass to the library. Except
now instead of us only having to figure out a stable API in one
place, every single driver has to reinvent the wheel, defining its
own stable interface & objects. I'd also be concerned that ongoing
work on drivers is still going to require a lot of patches to Nova
to update the shims all the time, so we're still going to have a
high degree of contention for resources.
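
To make the translation burden concrete, here is a minimal sketch of
what one such shim might have to do. Everything in it is hypothetical:
GuestSpec stands in for a library-defined "stable" type, and
FakeNovaInstance stands in for whatever Nova object the driver is
handed; neither is a real Nova or hypervisor library class.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GuestSpec:
    """Hypothetical 'stable' type owned by a hypervisor library."""
    name: str
    vcpus: int
    memory_mb: int


class FakeNovaInstance:
    """Hypothetical stand-in for a Nova object passed to the driver."""

    def __init__(self, display_name, vcpus, memory_mb, extra=None):
        self.display_name = display_name
        self.vcpus = vcpus
        self.memory_mb = memory_mb
        self.extra = extra or {}


def to_guest_spec(instance):
    # The shim copies only the fields the library has promised to keep
    # stable; any change in the Nova object has to be absorbed here.
    return GuestSpec(
        name=instance.display_name,
        vcpus=int(instance.vcpus),
        memory_mb=int(instance.memory_mb),
    )
```

Each driver would need its own equivalent of to_guest_spec() and
GuestSpec, which is the duplicated effort described above.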

> > b) The conflict Dan is speaking of is around the current situation where we
> > have a limited core review team bandwidth and we have to pick and choose
> > which virt driver-specific features we will review. This leads to bad
> > feelings and conflict.
> 
> The way this worked in the past is we had cores who were subject
> matter experts in various parts of the code -- there is a clear set of
> cores who "get" xen or libvirt for example and I feel like those
> drivers get reasonable review times. What's happened though is that
> we've added a bunch of drivers without adding subject matter experts
> to core to cover those drivers. Those newer drivers therefore have a
> harder time getting things reviewed and approved.

FYI, for Juno at least I really don't consider that even the libvirt
driver got acceptable review times in any sense. The pain of waiting
for reviews in libvirt code I've submitted this cycle is what prompted
me to start this thread. All the virt drivers are suffering way more
than they should be, but those without core team representation suffer
to an even greater degree.  And this is ignoring the point Jay & I
were making about how the use of a single team means that there is
always contention for feature approval, so much work gets cut right
at the start even if maintainers of that area felt it was valuable
and worth taking.

> > c) It's the impact to the CI and testing load that I see being the biggest
> > benefit to the split-out driver repos. Patches proposed to the XenAPI driver
> > shouldn't have the Hyper-V CI tests run against the patch. Likewise, running
> > libvirt unit tests in the VMware driver repo doesn't make a whole lot of
> > sense, and all of these tests add a not-insignificant load to the overall
> > upstream and external CI systems. The long wait time for tests to come back
> > means contributors get frustrated, since many reviewers tend to wait until
> > Jenkins returns some result before they review. All of this leads to
> > increased conflict that would be somewhat ameliorated by having separate
> > code repos for the virt drivers.
> 
> It is already possible to filter CI runs to specific paths in the
> code. We just didn't choose to do that for policy reasons. We could
> change that right now with a trivial tweak to each CI system's zuul
> config.

We have to jump through far more hoops to do so, even as developers
running things locally. E.g. want to run pep8 locally to test your
work? You have to wait 3 minutes while it checks the entire
nova codebase. So we had to invent a mode where it only checks
the files in the current git HEAD. Likewise for unit tests: if you
invoke them you have to pass args to filter to just the area of the
repo you are working on. These kinds of problems simply go away
completely if we have separate repos, without having to do special
setup tasks. Smaller modules would also be far less daunting for new
contributors looking to get involved in Nova development, which
I think is an important factor.
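
For illustration, the kind of special setup needed in a single large
repo can be sketched as below. This is not the actual Nova tooling; the
function names and the default subtree are assumptions, and only the
git command itself (git diff-tree) is a real interface.

```python
import subprocess


def changed_files_in_head():
    # List the files touched by the HEAD commit. Assumes the current
    # directory is inside a git checkout and HEAD is a non-merge,
    # non-root commit.
    out = subprocess.check_output(
        ["git", "diff-tree", "--no-commit-id", "--name-only", "-r", "HEAD"],
        text=True,
    )
    return [line for line in out.splitlines() if line]


def select_for_check(paths, subtree="nova/virt/libvirt/"):
    # Keep only Python files under the area being worked on, so that
    # style checks or unit tests can be limited to that subtree.
    return [p for p in paths if p.startswith(subtree) and p.endswith(".py")]
```

With one repo per driver, neither helper would be needed: checking the
whole repo is already checking just the driver.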

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|


