[openstack-dev] [Nova] What's holding nova development back?

Jay Pipes jaypipes at gmail.com
Mon Sep 15 21:59:10 UTC 2014


On 09/15/2014 05:30 PM, Michael Still wrote:
> On Tue, Sep 16, 2014 at 12:30 AM, Russell Bryant <rbryant at redhat.com> wrote:
>> On 09/15/2014 05:42 AM, Daniel P. Berrange wrote:
>>> On Sun, Sep 14, 2014 at 07:07:13AM +1000, Michael Still wrote:
>>>> Just an observation from the last week or so...
>>>>
>>>> The biggest problem nova faces at the moment isn't code review latency. Our
>>>> biggest problem is failing to fix our bugs so that the gate is reliable.
>>>> The number of rechecks we've done in the last week to try and land code is
>>>> truly startling.
>>>
>>> I consider both problems to be pretty much equally important. I don't
>>> think solving review latency or test reliability in isolation is enough to
>>> save Nova. We need to tackle both problems as a priority. I tried to avoid
>>> getting into my concerns about testing in my mail on review team bottlenecks
>>> since I think we should address the problems independently / in parallel.
>>
>> Agreed with this.  I don't think we can afford to ignore either one of them.
>
> Yes, that was my point. I don't mind us debating how to rearrange
> hypervisor drivers. However, if we think that will solve all our
> problems we are confused.
>
> So, how do we get people to start taking bugs / gate failures more seriously?

A few suggestions:

1) Bug bounties

Money talks. I know it sounds silly, but lots of developers get paid to
work on features. Not as many have a financial incentive to fix bugs.

It doesn't need to be a huge amount. And I think the "wall of fame 
respect" reward for top bug fixers or gate unblockers would be a good 
incentive as well.

The foundation has a budget. I can't think of a better way to effect 
positive change than allocating $10-20K to paying bug bounties.

2) Videos discussing gate tools and diagnostics techniques

I hope I'm not bursting Sean Dague's bubble, but one thing we've
been discussing, together with Dan Smith, is having a weekly or
bi-weekly YouTube show where we discuss Nova development topics, with
deep dives into common but hairy parts of the Nova codebase. The idea is 
to grow Nova contributors' knowledge of more parts of Nova than just one 
particular area they might be paid to work on.

I think a weekly or bi-weekly show that focuses on bug and gate issues 
would be a really great idea, and I'd be happy to play a role in this. 
The Chef+OpenStack community does weekly YouTube recordings of their
status meetings and AFAICT, it's pretty successful.

3) Provide a clearer way to understand what is a gate/CI/infra issue and 
what is a project bug

Sometimes it's pretty hard to determine whether something in the E-R 
check page is due to something in the infra scripts, some transient 
issue in the upstream CI platform (or part of it), or actually a bug in 
one or more of the OpenStack projects.

Perhaps there is a way to identify/categorize gate failures (in the form 
of E-R recheck queries) on some "meta status" page, that would either be 
populated manually or through some clever analysis to better direct 
would-be gate block fixers to where they need to focus?
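
As a rough illustration of the "clever analysis" option, here is a
minimal sketch in Python. Everything in it is an assumption made for
illustration: the category names, the keyword rules, and the function
names are hypothetical and not part of E-R or any existing tool. It
only shows the shape of the idea, which is to map each failure
signature to a category and count failures per category.

```python
from collections import Counter

# Hypothetical keyword rules; a real status page would need curated signatures.
RULES = [
    ("infra", ("devstack", "pip install", "mirror")),
    ("ci-platform", ("jenkins", "node offline")),
]


def categorize(signature):
    """Return a category name for one failure signature (a log message)."""
    text = signature.lower()
    for category, keywords in RULES:
        if any(keyword in text for keyword in keywords):
            return category
    # Assumption: anything unmatched is treated as a project bug until
    # someone triages it by hand.
    return "project-bug"


def summarize(signatures):
    """Count failures per category, e.g. for display on a status page."""
    return Counter(categorize(s) for s in signatures)
```

The manual option would amount to replacing the keyword rules with a
hand-maintained mapping from each recheck query to a category.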

Anyway, just a few ideas,
-jay


