[openstack-dev] [nova] Averting the Nova crisis by splitting out virt drivers

Nikola Đipanov ndipanov at redhat.com
Fri Sep 5 10:40:59 UTC 2014


On 09/04/2014 12:24 PM, Daniel P. Berrange wrote:
> Position statement
> ==================
> 
> Over the past year I've increasingly come to the conclusion that
> Nova is heading for (or probably already at) a major crisis. If
> steps are not taken to avert this, the project is likely to lose
> a non-trivial amount of talent, both regular code contributors and
> core team members. That includes myself. This is not good for
> Nova's long term health and so should be of concern to anyone
> involved in Nova and OpenStack.
> 
> For those who don't want to read the whole mail, the executive
> summary is that the nova-core team is an unfixable bottleneck
> in our development process with our current project structure.
> The only way I see to remove the bottleneck is to split the virt
> drivers out of tree and let them all have their own core teams
> in their area of code, leaving current nova core to focus on
> all the common code outside the virt driver impls. I nonetheless
> urge people to read the whole mail.
> 
> 
> Background information
> ======================
> 
> I see many factors coming together to form the crisis
> 
>  - Burn out of core team members from over work 
>  - Difficulty bringing new talent into the core team
>  - Long delay in getting code reviewed & merged
>  - Marginalization of code areas which aren't popular
>  - Increasing size of nova code through new drivers
>  - Exclusion of developers without corporate backing
> 
> Each item on its own may not seem too bad, but combined they
> add up to a big problem.
> 

Like many others, I cannot +1 this enough. Below are some technical
comments that we may want to consider before going ahead, but to sum
them up: this will be a TON OF WORK! We had better make sure we really
want to do this first.

However, please don't read this as FUD. Read it rather as me pointing
out that the devil is in the details, and maybe getting ahead of myself
with too deep of a dive.

> 
>  - A fairly significant amount of nova code would need to be
>    considered semi-stable API. Certainly everything under nova/virt
>    and any object which is passed in/out of the virt driver API.
>    Changes to such APIs would have to be done in a backwards
>    compatible manner, since it is no longer possible to lock-step
>    change all the virt driver impls. In some ways I think this would
>    be a good thing as it will encourage people to put more thought
>    into the long term maintainability of nova internal code instead
>    of relying on being able to rip it apart later, at will.
> 

I think we should not underestimate how big of a job this will be. We
have been treating that API as internal for a long time and a lot of
abstractions are just broken and need to be redesigned and then
refactored. A lot of the stuff is implementation specific (live
migration is a good example of this). What makes it more difficult is
that we need to get this as right as possible before we do the split.

Now I am not saying this cannot be done or that we shouldn't do it,
however I _am_ saying that we should not take lightly how much work
there will be and how fiddly the work itself is.
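
To make the backwards-compatibility constraint a bit more concrete, here
is a minimal sketch of how common code might add an optional argument to
a driver method without breaking drivers that live in other repos. All
names here (get_console, console_type, the two driver classes) are
hypothetical and are not the actual nova/virt/driver.py interface.

```python
import inspect


class OldDriver:
    """Hypothetical out-of-tree driver written against the older API."""

    def get_console(self, instance):
        return "console for %s" % instance


class NewDriver:
    """Hypothetical driver that already accepts the newer argument."""

    def get_console(self, instance, console_type=None):
        return "%s console for %s" % (console_type, instance)


def call_get_console(driver, instance, console_type=None):
    """Call get_console(), passing only what the driver accepts."""
    params = inspect.signature(driver.get_console).parameters
    if "console_type" in params:
        return driver.get_console(instance, console_type=console_type)
    # Older driver: fall back to the original call signature.
    return driver.get_console(instance)
```

Signature inspection is only one option. An explicit API version
advertised by each driver would do the same job, but either way the
common code ends up carrying this kind of shim for every change.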

On top of that - there are some other serious issues with nova common
code that we need to take care of ASAP, and this will definitely
increase the churn and make that more difficult. We should take this
into account and make sure we are focusing efforts on the right things.
Making sure we do is the biggest challenge nova core faces in addition
to all the others mentioned above.

>  - The nova/virt/driver.py class would need to be much better
>    specified. All parameters / return values which are opaque dicts
>    must be replaced with objects + attributes. Completion of the
>    objectification work is mandatory, so there is cleaner separation
>    between virt driver impls & the rest of Nova.
> 

Not only that - currently nova-objects do their versioning magic only
over RPC, while they would have to do it over library boundaries. This
in itself will require work, and is likely going to influence how we
stabilize the API.

However - splitting out the scheduler is likely to require objects to be
able to do similar things, and there are other things that we may want
to do (e.g. using properly versioned data for the extensible resources)
that will benefit from this.
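
As a rough illustration of what versioning across a library boundary
could involve, the sketch below downgrades an object to the version a
driver says it understands before handing it over. This is not the
nova-objects implementation; the class, its fields and the version
history are assumptions made up for the example.

```python
def version_tuple(version):
    """Turn a 'major.minor' string into a comparable tuple of ints."""
    return tuple(int(part) for part in version.split("."))


class ConsoleInfo:
    """Hypothetical versioned object passed to a virt driver."""

    VERSION = "1.1"
    # Version in which each field first appeared (assumed history).
    FIELDS = {"host": "1.0", "port": "1.1"}

    def __init__(self, host, port=None):
        self.host = host
        self.port = port

    def to_primitive(self, target_version=None):
        """Serialize, dropping fields newer than target_version."""
        target = target_version or self.VERSION
        if version_tuple(target) > version_tuple(self.VERSION):
            raise ValueError("cannot produce version %s" % target)
        data = {
            name: getattr(self, name)
            for name, added in self.FIELDS.items()
            if version_tuple(added) <= version_tuple(target)
        }
        return {"version": target, "data": data}
```

A driver that only knows version 1.0 would be handed
ConsoleInfo(...).to_primitive("1.0") and never see the 'port' field. The
same negotiation happens over RPC today; the open question is where it
would happen when the caller and the driver are separate libraries in
one process.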

>  - If changes are required to common code, the virt driver developer
>    would first have to get the necessary pieces merged into Nova
>    common. Then the follow up virt driver specific changes could be
>    proposed to their repo. This implies that some changes to virt
>    drivers will still contend for resource in the common nova repo 
>    and team. This contention should be lower than it is today though
>    since the current nova core team should have less code to look 
>    after per-person on aggregate.
> 

A handy example of this is the FFE currently granted for serial
consoles. Consider how much of the code went into the common part vs.
the libvirt-specific part: I would say the ratio is very close to 1, if
not in favour of the common part (the 4 currently outstanding patches
are all for core, and of the 5 merged only one was purely
libvirt-specific, assuming virt/ will live in nova-common).

Joe asked a similar question elsewhere on the thread.

Once again, I am not against doing it. What I am saying is that we need
to look into this more closely, as it may not be as big a win in terms
of the number of changes needed per feature as we may think.

Just some things to think about with regards to the whole idea, by no
means exhaustive.

Thanks,
N.


