[openstack-dev] [nova] Averting the Nova crisis by splitting out virt drivers
Alessandro Pilotti
apilotti at cloudbasesolutions.com
Thu Sep 4 17:27:06 UTC 2014
Hi all,
This is an issue that has been discussed quite a few times. As I feared, the
bottleneck effect is getting worse with each release.
Nova simply grew too much, and even though features like networking and block
storage were spun off at some point, it still lacks the cohesion
necessary for a successful long-term lifecycle. In other words, it’s just too big to
be properly maintained by a handful of amazing and overworked people.
Compute drivers are easy to identify as decoupled sub-projects and are among
those that suffer most from the lack of an independent development
process.
Nova is a mature project (at least relative to the OpenStack context) and as
such new features and bug fixes need to go through a very thorough screening and
review before being approved and merged. This does not work well with
sub-projects that need to grow faster, especially when introduced later in the
lifecycle (e.g. the current Hyper-V driver, introduced in Folsom) or when
pushed by more aggressive market requirements.
Just as an example, only 3 out of 8 Hyper-V blueprint specs have been approved
and implemented in Juno. The rest will simply get bumped to Kilo, which means
that additional new specs will need to be bumped to L and so on, introducing
further delays. We ended up prioritizing feature parity blueprints, delaying
almost anything else.
The time bug fixes take to land in stable releases is another issue for the user
base, since merging in master takes a long time and backporting requires another
long review process, e.g. more than four months in some cases [1].
As a result, we ended up releasing the fixes in a project fork that became our de
facto stable release in place of upstream while waiting for the upstream merge.
We never experienced similar issues in smaller projects like Neutron, Cinder,
Ceilometer or Horizon where we are involved as well, which can be a practical
example of the potential benefits of splitting Nova.
OpenStack has a clear process for incubation, letting new projects grow as fast
as they need during their youth and integrating them into core only when a
mature stage is reached [2]. Unfortunately this process applies to projects but
not to subprojects (the Hyper-V and VMware drivers in particular, but not only),
resulting in a much slower development pace than a project led by an
independent team could have achieved. On the other hand, Docker is an example of
a driver going the StackForge way, but its ultimate potential inclusion in Nova
will just increase the current pain points.
From a Hyper-V team perspective, in the late Havana cycle the same reasons
highlighted in this thread almost led us to ask for removal of the driver from
Nova in order to improve our development process, even at the cost of the
subsequent fall from (core) grace and a StackForge incubation Purgatory period, so
I’m definitely happy that the conversation has been resumed with a bigger
consensus.
The main factor that blocked the Hyper-V driver’s exit from Nova was the
introduction of the Hyper-V CI during the same cycle. Regressions are a very
sensitive topic when you run OpenStack components on an operating system which
is not Linux and the CI helped a lot in blocking or discovering issues in a
timely fashion. Besides that, the size of the Hyper-V team increased considerably
during Icehouse and Juno [3], so the Hyper-V CI became a mandatory and almost
irreplaceable tool in our review process, leading us to reach an excellent level
of stability of the driver on every supported version of Hyper-V (and
progressive CI voting stability as well, but that’s another topic [4]).
This means that if we reach a point in which we agree to spin off the drivers into
separate core projects, we need to consider how driver-related CIs will still be
included in the Nova review process, possibly with voting rights when the
individual CI stability allows it. Having each third-party CI vote only on
its spin-off driver project is not an option IMO, as it won’t catch regressions
introduced in Nova that affect the drivers, including race conditions [5].
An interesting area of discussion is who is going to be part of the initial core
teams for each new subproject. I truly appreciated the experience and help of
the Nova core guys, so in order to allow a smoother transition I’d suggest that
each new project (e.g. nova-compute-hyperv, nova-compute-vmware, etc.) have
an initial core team consisting of one or two members of the current Nova
sub-team and one Nova core, with ideally each patch reviewed by both the domain
experts and the Nova core. The team could then go its own way, voting in its own
members as any other OpenStack project does.
Another point of discussion is the stabilization and documentation of the driver
interface. There are simply too many areas where the behavior between drivers
differs, and looking at some other driver’s behavior was in too many cases the
only source of documentation we had.
As an alternative, I’m personally not against the option of keeping Nova the way
it is, giving more autonomy (aka +2 rights) on selected subtrees to sub-teams,
but that sounds to me like a half-baked solution that will lead to problems around
blueprint spec approval rights and other issues.
One of the additional very positive side effects that I can foresee is that by
freeing up our resources from the continuous "submit-code ->
wait-long-time-for-review -> rebase -> repeat" cycle that defines the current
Nova development cycle, we’ll be able to assign more people to review patches in
Nova itself, hopefully resulting in a long-term benefit for everybody.
As another clear advantage, separate projects would allow growth and new
features, while allowing Nova to focus on stability and concentrate on reducing
its technical debt, another related and recurring topic of discussion.
To recap this post, if this proposal goes ahead:
1) As the Hyper-V Nova sub-team, we are definitely in favour of the spin-off of Nova
core sub-projects, including the compute drivers.
2) We need to agree on how to handle the driver CIs reporting in Nova.
3) We need to define the members of the new projects’ core teams.
4) We need to better document the compute driver interface.
We are fully aware of the great efforts made by the Nova core team, and the
points above do not imply in any way that they are not doing their best. It’s
just a matter of removing unnecessary bottlenecks for the benefit of the entire
OpenStack community.
Thanks Dan for raising this topic!
Alessandro
[1] https://review.openstack.org/#/c/83143/
[2] http://robhirschfeld.com/2014/08/18/ugly-baby/
[3] http://stackalytics.com/?company=cloudbase%20solutions&module=nova&metric=commits&release=all
[4] http://ci-stats.cloudbase.it
[5] https://bugs.launchpad.net/nova/+bug/1276772
On 04 Sep 2014, at 15:59, Thierry Carrez <thierry at openstack.org> wrote:
> Like I mentioned before, I think the only way out of the Nova death
> spiral is to split code and give control over it to smaller dedicated
> review teams. This is one way to do it. Thanks Dan for pulling this
> together :)
>
> A couple comments inline:
>
> Daniel P. Berrange wrote:
>> [...]
>> This is a crisis. A large crisis. In fact, if you got a moment, it's
>> a twelve-storey crisis with a magnificent entrance hall, carpeting
>> throughout, 24-hour portage, and an enormous sign on the roof,
>> saying 'This Is a Large Crisis'. A large crisis requires a large
>> plan.
>> [...]
>
> I totally agree. We need a plan now, because we can't go through another
> cycle without a solution in sight.
>
>> [...]
>> This has quite a few implications for the way development would
>> operate.
>>
>> - The Nova core team at least, would be voluntarily giving up a big
>> amount of responsibility over the evolution of virt drivers. Due
>> to human nature, people are not good at giving up power, so this
>> may be painful to swallow. Realistically current nova core are
>> not experts in most of the virt drivers to start with, and more
>> important we clearly do not have sufficient time to do a good job
>> of review with everything submitted. Much of the current need
>> for core review of virt drivers is to prevent the mis-use of a
>> poorly defined virt driver API...which can be mitigated - See
>> later point(s)
>>
>> - Nova core would/should not have automatic +2 over the virt driver
>> repositories since it is unreasonable to assume they have the
>> suitable domain knowledge for all virt drivers out there. People
>> would of course be able to be members of multiple core teams. For
>> example John G would naturally be nova-core and nova-xen-core. I
>> would aim for nova-core and nova-libvirt-core, and so on. I do not
>> want any +2 responsibility over VMWare/HyperV/Docker drivers since
>> they're not my area of expertize - I only look at them today because
>> they have no other nova-core representation.
>>
>> - Not sure if it implies the Nova PTL would be solely focused on
>> Nova common. eg would there continue to be one PTL over all virt
>> driver implementation projects, or would each project have its
>> own PTL. Maybe this is irrelevant if a Czars approach is chosen
>> by virt driver projects for their work. I'd be inclined to say
>> that a single PTL should stay as a figurehead to represent all
>> the virt driver projects, acting as a point of contact to ensure
>> we keep communication / co-operation between the drivers in sync.
>> [...]
>
> At this point it may look like our current structure (programs, one PTL,
> single core teams...) prevents us from implementing that solution. I
> just want to say that in OpenStack, organizational structure reflects
> how we work, not the other way around. If we need to reorganize
> "official" project structure to work in smarter and long-term healthy
> ways, that's a really small price to pay.
>
> --
> Thierry Carrez (ttx)