[openstack-dev] [nova] Averting the Nova crisis by splitting out virt drivers

Sean Dague sean at dague.net
Fri Sep 5 11:00:44 UTC 2014


On 09/05/2014 06:22 AM, Daniel P. Berrange wrote:
> On Fri, Sep 05, 2014 at 07:31:50PM +0930, Christopher Yeoh wrote:
>> On Thu, 4 Sep 2014 11:24:29 +0100
>> "Daniel P. Berrange" <berrange at redhat.com> wrote:
>>>
>>>  - A fairly significant amount of nova code would need to be
>>>    considered semi-stable API. Certainly everything under nova/virt
>>>    and any object which is passed in/out of the virt driver API.
>>>    Changes to such APIs would have to be done in a backwards
>>>    compatible manner, since it is no longer possible to lock-step
>>>    change all the virt driver impls. In some ways I think this would
>>>    be a good thing as it will encourage people to put more thought
>>>    into the long term maintainability of nova internal code instead
>>>    of relying on being able to rip it apart later, at will.
>>>
>>>  - The nova/virt/driver.py class would need to be much better
>>>    specified. All parameters / return values which are opaque dicts
>>>    must be replaced with objects + attributes. Completion of the
>>>    objectification work is mandatory, so there is cleaner separation
>>>    between virt driver impls & the rest of Nova.
>>
>> I think for this to work well with multiple repositories and drivers
>> having different priorities over implementing changes in the API it
>> would not just need to be semi-stable, but stable with versioning built
>> in from the start to allow for backwards incompatible changes. And
>> the interface would have to be very well documented including things
>> such as what exceptions are allowed to be raised through the API.
>> Hopefully this would be enforced through code as well. But as long as
>> driver maintainers are willing to commit to this extra overhead I can
>> see it working. 
> 
> With our primary REST or RPC APIs we're under quite strict rules about
> what we can & can't change - almost impossible to remove an existing
> API from the REST API for example. With the internal virt driver API
> we would probably have a little more freedom. For example, I think
> if we found an existing virt driver API that was insufficient for a
> new bit of work, we could add a new API in parallel with it, give the
> virt drivers 1 dev cycle to convert, and then permanently delete the
> original virt driver API. So a combination of that kind of API
> replacement, versioning for some data structures/objects, and use of
> the capabilities flags would probably be sufficient. That's what I mean
> by semi-stable here - no need to maintain existing virt driver APIs
> indefinitely - we can remove & replace them in reasonably short time
> scales as long as we avoid any lock-step updates.
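
For illustration only, the parallel-API replacement scheme quoted above
might look roughly like the sketch below. Every name in it (the
ComputeDriver stand-in, attach_volume_v2, the supports_attach_v2 flag) is
hypothetical and is not Nova's actual virt driver interface; it only
shows a new method added alongside an old one, with a capabilities flag
telling the caller which to use during the one-cycle conversion window.

```python
import warnings


class ComputeDriver:
    """Hypothetical stand-in, NOT the real nova.virt.driver.ComputeDriver."""

    # Capability flags; a driver overrides these once it has converted.
    capabilities = {"supports_attach_v2": False}

    def attach_volume(self, instance_id, volume_id):
        """Old entry point, to be deleted after one dev cycle."""
        raise NotImplementedError

    def attach_volume_v2(self, instance_id, volume_id, *, read_only=False):
        """New entry point, added in parallel with the old one."""
        raise NotImplementedError


class LegacyDriver(ComputeDriver):
    """An out-of-tree driver that has not converted yet."""

    def attach_volume(self, instance_id, volume_id):
        return f"legacy attach {volume_id} -> {instance_id}"


class ConvertedDriver(ComputeDriver):
    """A driver that implements the new API and advertises it."""

    capabilities = {"supports_attach_v2": True}

    def attach_volume_v2(self, instance_id, volume_id, *, read_only=False):
        mode = "ro" if read_only else "rw"
        return f"v2 attach {volume_id} -> {instance_id} ({mode})"


def attach(driver, instance_id, volume_id, read_only=False):
    """Caller side: prefer the new API, fall back to the old one."""
    if driver.capabilities.get("supports_attach_v2"):
        return driver.attach_volume_v2(
            instance_id, volume_id, read_only=read_only)
    # No lock-step update needed: unconverted drivers keep working,
    # but get a warning until the old API is removed.
    warnings.warn(
        "attach_volume is deprecated; implement attach_volume_v2",
        DeprecationWarning,
        stacklevel=2,
    )
    return driver.attach_volume(instance_id, volume_id)
```

Note that the fallback silently drops read_only for unconverted drivers,
which is exactly the kind of behavioural gap that has to be documented
somewhere during the overlap period.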

I have spent a lot of time over the last year working on things that
require coordinated code landings between projects... it's much more
friction than you give it credit for.

Every added git tree adds a nonlinear cost in mental overhead, and a
nonlinear integration cost. Realistically, the state the gate is in has
a ton to do with the fact that it's integrating 40 git trees. Because
virt drivers run in the process space of Nova Compute, they can pretty
much do whatever they want, and the impacts are going to be somewhat
hard to figure out.

Also, if spinning these out seems like the right idea, I think nova-core
needs to retain core rights over the drivers as well, because there does
need to be veto authority on some of the worst craziness.

If the VMware team stopped trying to build a distributed lock manager
inside their compute driver, or the Hyper-V team didn't wait until J2 to
start pushing patches, I think there would be more trust in some of
these teams. But I am seriously concerned in both those cases, and the
slow review there is a function of a historic lack of trust in judgment.
I also personally went on a moratorium a year ago in reviewing either
driver because entities at both places were complaining to my
management chain through back channels that I was -1ing their code...
when I was one of the few people actually trying to provide constructive
feedback (basically only Russell and I were reviewing that code in
Grizzly, everyone else was ignoring it). Things may have changed since
then, at least I see a ton of good work from tjones in making Nova
overall better, but that was a pretty bitter pill. (Sorry for the
tangent, but honestly if we are going to fix what's broken we probably
have to expose all related brokens.)


If the concern is that we are keeping out too many contributors by the
CI requirements: let's let Class C back in tree. I believe in the
FreeBSD case you were one of the original opponents of a top level
driver, arguing that they should go through libvirt instead. But I'm cool
with them just showing up as a Class C.

But I honestly don't think the virt driver split is going to make any of
this easier, when you account for the additional overhead it's going to
create, and the work required to get there.

	-Sean

-- 
Sean Dague
http://dague.net


