[openstack-dev] [nova][neutron][cinder] Averting the Nova crisis by splitting out virt drivers

Andrew Laski andrew.laski at rackspace.com
Thu Sep 11 13:15:54 UTC 2014


On 09/10/2014 07:23 PM, Michael Still wrote:
> On Thu, Sep 11, 2014 at 8:11 AM, Jay Pipes <jaypipes at gmail.com> wrote:
>
>> a) Sorting out the common code is already accounted for in Dan B's original
>> proposal -- it's a prerequisite for the split.
> It's a big prerequisite though. I think we're talking about a release's
> worth of work to get that right. I don't object to us doing that work,
> but I think we need to be honest about how long it's going to take. It
> will also make the core of nova less agile, as we'll find it hard to
> change the hypervisor driver interface over time. Do we really think
> it's ready to be stable?

I don't.  For a long time now I've wanted to split the gigantic spawn()
method in the virt API into more discrete steps.  I think there's an
opportunity to run some steps in parallel, and the potential to have
failures reported earlier and handled better.  I've been sitting on
it because I wanted to use 'tasks' to address the parallelization, and
that work hasn't happened yet.  That said, this work would introduce new
calls, used based on some sort of capability query to the driver, so I
don't think it is necessarily hindered by stabilizing the interface.
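
As a rough sketch of the capability-query idea: every name below
(supports_stepped_spawn, prepare_image, plug_network, boot) is made up
for illustration and is not part of the current virt API, and a plain
thread pool stands in for the 'tasks' work that doesn't exist yet.

```python
from concurrent.futures import ThreadPoolExecutor


class LegacyDriver:
    """Driver that only implements the monolithic spawn()."""

    # Hypothetical capability map; an empty map means "old interface only".
    capabilities = {}

    def spawn(self, instance):
        return ["spawn:%s" % instance]


class SteppedDriver(LegacyDriver):
    """Driver that also advertises the hypothetical discrete steps."""

    capabilities = {"supports_stepped_spawn": True}

    def prepare_image(self, instance):
        return "image:%s" % instance

    def plug_network(self, instance):
        return "network:%s" % instance

    def boot(self, instance):
        return "boot:%s" % instance


def build_instance(driver, instance):
    """Use the discrete steps when advertised, else fall back to spawn()."""
    if not driver.capabilities.get("supports_stepped_spawn", False):
        return driver.spawn(instance)

    # Image and network preparation are assumed to be independent, so they
    # run in parallel; a failure in either is raised before boot is tried.
    with ThreadPoolExecutor(max_workers=2) as pool:
        image = pool.submit(driver.prepare_image, instance)
        network = pool.submit(driver.plug_network, instance)
        results = [image.result(), network.result()]

    results.append(driver.boot(instance))
    return results
```

Because the new calls are only used when a driver advertises them,
existing drivers keep working through spawn() unchanged.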

I also think the migration/resize methods could use some analysis before 
making a determination that they are what we want in a stable interface.

>
> As an alternative approach...
>
> What if we pushed most of the code for a driver into a library?
> Imagine a library which controls the low level operations of a
> hypervisor -- create a vm, attach a NIC, etc. Then the driver would
> become a shim around that which was relatively thin, but owned the
> interface into the nova core. The driver handles the nova specific
> things like knowing how to create a config drive, or how to
> orchestrate with cinder, but hands over all the hypervisor operations
> to the library. If we found a bug in the library we just pin our
> dependency on the version we know works whilst we fix things.
>
> In fact, the driver inside nova could be a relatively generic "library
> driver", and we could have multiple implementations of the library,
> one for each hypervisor.
>
> This would make testing nova easier too, because we know how to mock
> libraries already.
>
> Now, that's kind of what we have in the hypervisor driver API now.
> What I'm proposing is that the point where we break out of the nova
> code base should be closer to the hypervisor than what that API
> presents.
>
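
To make the shim-plus-library split concrete, here is a minimal sketch.
HypervisorLibrary and LibraryDriver are hypothetical names, not existing
nova classes, and the two operations are just the examples from the
text above (create a vm, attach a NIC).

```python
from unittest import mock


class HypervisorLibrary:
    """Hypothetical out-of-tree library: low-level hypervisor operations only."""

    def create_vm(self, name):
        raise NotImplementedError

    def attach_nic(self, name, mac):
        raise NotImplementedError


class LibraryDriver:
    """Thin, generic shim inside nova that owns the nova-facing interface."""

    def __init__(self, library):
        self.library = library

    def spawn(self, name, mac):
        # Nova-specific work (config drive, cinder orchestration) would
        # stay here; hypervisor operations are handed to the library.
        self.library.create_vm(name)
        self.library.attach_nic(name, mac)
        return name


def demo():
    """Exercise the shim against a mocked library and return the calls made."""
    fake = mock.Mock(spec=HypervisorLibrary)
    driver = LibraryDriver(fake)
    driver.spawn("vm1", "fa:16:3e:00:00:01")
    return fake.mock_calls
```

Under this assumption the nova-side tests only need to check which
library calls the shim makes, which is the "we know how to mock
libraries already" point.
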
>> b) The conflict Dan is speaking of is around the current situation where we
>> have a limited core review team bandwidth and we have to pick and choose
>> which virt driver-specific features we will review. This leads to bad
>> feelings and conflict.
> The way this worked in the past is we had cores who were subject
> matter experts in various parts of the code -- there is a clear set of
> cores who "get" xen or libvirt for example and I feel like those
> drivers get reasonable review times. What's happened though is that
> we've added a bunch of drivers without adding subject matter experts
> to core to cover those drivers. Those newer drivers therefore have a
> harder time getting things reviewed and approved.
>
> That said, a heap of cores have spent time reviewing vmware driver
> code this release, so it's obviously not as simple as I describe above.
>
>> c) It's the impact to the CI and testing load that I see being the biggest
>> benefit to the split-out driver repos. Patches proposed to the XenAPI driver
>> shouldn't have the Hyper-V CI tests run against the patch. Likewise, running
>> libvirt unit tests in the VMware driver repo doesn't make a whole lot of
>> sense, and all of these tests add a not-insignificant load to the overall
>> upstream and external CI systems. The long wait time for tests to come back
>> means contributors get frustrated, since many reviewers tend to wait until
>> Jenkins returns some result before they review. All of this leads to
>> increased conflict that would be somewhat ameliorated by having separate
>> code repos for the virt drivers.
> It is already possible to filter CI runs to specific paths in the
> code. We just didn't choose to do that for policy reasons. We could
> change that right now with a trivial tweak to each CI system's zuul
> config.
>
> Michael
>
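
For illustration only, the path-based filtering described above amounts
to something like the following. This is not zuul's configuration
format; the job names and the rule that common code triggers every job
are assumptions made for the sketch.

```python
import re

# Hypothetical mapping from CI job name to the driver paths that trigger it.
JOB_PATH_FILTERS = {
    "xenapi-ci": [r"^nova/virt/xenapi/"],
    "hyperv-ci": [r"^nova/virt/hyperv/"],
    "libvirt-ci": [r"^nova/virt/libvirt/"],
}

# Anything under nova/virt/<driver>/ counts as driver-specific code.
DRIVER_TREE = re.compile(r"^nova/virt/[^/]+/")


def jobs_for_change(changed_files):
    """Return the sorted job names that should run for a set of changed files."""
    selected = set()
    for job, patterns in JOB_PATH_FILTERS.items():
        for path in changed_files:
            is_common = not DRIVER_TREE.match(path)
            # Common code affects every driver, so it triggers every job;
            # driver code only triggers the jobs whose patterns match it.
            if is_common or any(re.match(p, path) for p in patterns):
                selected.add(job)
                break
    return sorted(selected)
```

With these assumptions, a change touching only nova/virt/xenapi/ runs
just the XenAPI job, while a change to shared code still runs all of
them.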



