[Openstack] Running for Nova PTL

Ed_Conzel at DELL.com
Fri Feb 24 15:35:25 UTC 2012


Soren,

I like most of what you say, but the "no new features at all" policy for trunk for Folsom causes me concern. I absolutely agree that Nova needs to be more stable and predictable, and to have more work done around operations. This is especially true for service provider usage. If Nova were closer to feature complete, a "no new features at all" policy for the Folsom trunk would be fine, and it is probably doable for Folsom if the focus is on private cloud functionality. But for service providers, Nova is not feature complete, and there is still much needed from the operations standpoint.

I like the way you have put out your positions in public so everyone can understand your viewpoint. It would be good to see more of this from other PTL nominees.

Thanks!

Ed

-----Original Message-----
From: openstack-bounces+ed_conzel=dell.com at lists.launchpad.net [mailto:openstack-bounces+ed_conzel=dell.com at lists.launchpad.net] On Behalf Of Soren Hansen
Sent: Friday, February 24, 2012 6:08 AM
To: Jesse Andrews
Cc: openstack at lists.launchpad.net
Subject: Re: [Openstack] Running for Nova PTL

2012/2/23 Jesse Andrews <anotherjesse at gmail.com>:
> I'd love to hear more specifics about what needs more focus.  These 
> issues are large and have been the major focus of the core team for a 
> while.

>> * Nova is too big.
>>   Very few (if any) core developers are comfortable reviewing every
>>   part of the code base. In itself, this isn't necessarily a problem,
>>   but I think it would be valuable to try to somehow acknowledge that
>>   the average focus is much narrower than "all of nova".
>
> As for services, a major amount of work has been done to improve the
> situation, such as:
>
>  - volumes: once a name is agreed upon (cindr was vish's proposal)
>    volumes can be abstracted during folsom - the internals are now
>    separated and during essex you can deploy as separate endpoints
>  - network: nova-network will be deprecated in folsom assuming
>    successful integration of quantum (as was discussed at the last PPB
>    meeting)
>  - identity: nova's user system was deprecated during diablo and is
>    being removed in essex - a migration path exists
>  - ec2 compat: during essex ec2 access/secret was moved to keystone,
>    cert management was decoupled from the API
>
> Are there additional areas to make nova smaller?

The goal isn't really to make Nova smaller per se. It's really more about trying to group developers around areas of expertise instead of expecting every nova-core member to be an expert on >100K lines of code.
If that means splitting things out, that's ok, but I'd be more than happy to explore other avenues as well. Specifically, I'd like to explore subsystem-specific branches that a) don't follow the release cycle (features can mature there for as long as they need, and don't get merged back into trunk until they're ready) and b) are managed entirely by the relevant subteams. As PTL I might have final say (in the sense that I'd settle disputes), but I would encourage the subteams to feel empowered to make any and all decisions about things that affect only their subsystem.

> For instance, a topic for folsom is how we can move drivers out of core.

I'm not opposed at all to properly splitting things out. The more separate things are, the clearer the interfaces will have to be. This is a good thing.

>> * Lots of things in Nova that should be orthogonal are not.
>>   This problem is especially prevalent in the virtualisation layer. The
>>   layout and number of disks you get attached to instances shouldn't
>>   depend on the hypervisor you've chosen, but it does. There is lots
>>   and lots and lots of logic embedded in both the libvirt and XenServer
>>   drivers that isn't related to the hypervisor, but is a result of the
>>   origin of these drivers.
> There was a major push to fix many of the identified issues around 
> "parity" in Essex by Rackspace Public Cloud, Cloud Builders, and 
> Citrix.  For instance the disk configuration issue you mentioned was 
> blueprinted at the last summit and fixed in Essex.

Indeed. However, the fact that it could happen at all is symptomatic of a deeper problem. A problem that still exists, even. For example, the "public" method the virt drivers expose for spawning a new instance is a "spawn" method. What each driver decides to do when being told to "spawn" a new instance, is entirely up to the driver.  They may have gotten aligned more this cycle, but the core problem remains: Developers need to go out of their way to make sure these things are aligned. It should be the other way around: We should need to go out of our way to deviate from one another.
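To make the idea concrete, here's a rough sketch (hypothetical code, not actual Nova internals) of the structure I have in mind: the hypervisor-agnostic logic, such as the disk layout, lives in a shared base class, so a driver would have to deliberately override it to deviate. The class and method names here are mine, chosen for illustration:

```python
# Hypothetical sketch, not actual Nova code: shared spawn logic lives in
# the base class, and each driver only fills in the hypervisor-specific
# steps. Deviating from the common behaviour then requires an explicit
# override, rather than being the default.

class BaseVirtDriver:
    def spawn(self, instance):
        """Template method: every driver gets the same disk layout logic."""
        disks = self._default_disk_layout(instance)   # shared, not per-driver
        self._create_domain(instance, disks)          # hypervisor-specific

    def _default_disk_layout(self, instance):
        # One canonical layout: a root disk plus an optional ephemeral disk.
        layout = [("root", instance["root_gb"])]
        if instance.get("ephemeral_gb"):
            layout.append(("ephemeral", instance["ephemeral_gb"]))
        return layout

    def _create_domain(self, instance, disks):
        raise NotImplementedError


class FakeLibvirtDriver(BaseVirtDriver):
    def _create_domain(self, instance, disks):
        # Only the hypervisor-facing part differs between drivers.
        print("defining libvirt domain with disks:", disks)


FakeLibvirtDriver().spawn({"root_gb": 10, "ephemeral_gb": 20})
```

With this shape, a second driver (say, a XenServer one) would inherit the same disk layout for free instead of re-implementing it, which is the alignment-by-default I'm arguing for.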

> Are there specific bugs/blueprints that should be prioritized in folsom?

I'll take a look at the bp list and get back to you on that.

>> * The overall quality is decreasing
>>   There's an almost unilateral focus on features across the board. The
>>   topic of almost every session at the summit is some new feature.
>>   There is very little focus on stability, predictability and
>>   operation. Personally, I think that shows very clearly in the final
>>   product.
> I think that your statement is harsh and over-reaching.

Fair enough. Reviewing the schedule from the last summit, I guess it's more of a 50/50-ish split.

> Unlike previous releases, we've tried to design the milestone
> structure to have a focus on quality and uniform experience regardless
> of deployment choices.

Also fair enough. I just think we need an even narrower focus on these issues. I'm tempted to go with a "no new features at all" policy for trunk for the Folsom release.

> While there are things that can be improved, we've taken an iterative
> approach to improving the situation (both during essex and then in
> discussions at the next summit)

Sure. I don't expect to be able to wave a magic wand, flip the "be stable" bit and everything will suddenly be awesome and solid.

> The work done by mtaylor & jblair on gating merges has led to a much
> saner trunk.  During diablo our team would routinely spend a few hours
> a day fixing trunk.  During Essex, having a broken trunk was the
> exception!

We've certainly made great strides on that front. However, we need better and more tests to ensure that different combinations of configuration options behave as expected. It's unfair to defer that to the deployers who care about a given combination of configuration options and require that they offer resources to test it. There are plenty of things we can do to verify that e.g. a volume driver upholds the contract, and that, say, the virt drivers are happy as long as said contract is upheld.
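As a rough illustration of what I mean by verifying the contract (hypothetical code, not actual Nova tests), one common pattern is a shared contract test mix-in that is run against every driver implementation. The driver and the contract here are invented for the sketch:

```python
# Hypothetical sketch: a shared "contract" test that every volume driver
# must pass, so the rest of the system can rely on the same behaviour
# from all of them. The driver below is a stand-in; real drivers would
# talk to iSCSI, LVM, etc.
import unittest


class LoopbackVolumeDriver:
    """Stand-in driver used to demonstrate the contract tests."""

    def __init__(self):
        self._volumes = {}

    def create_volume(self, name, size_gb):
        # Part of the contract: sizes must be positive.
        if size_gb <= 0:
            raise ValueError("size must be positive")
        self._volumes[name] = size_gb
        return {"name": name, "size_gb": size_gb}

    def delete_volume(self, name):
        del self._volumes[name]


class VolumeDriverContract:
    """Mix-in asserting the contract; subclass once per driver."""

    driver_class = None  # set by each concrete test case

    def test_create_returns_name_and_size(self):
        vol = self.driver_class().create_volume("v1", 1)
        self.assertEqual(vol["name"], "v1")
        self.assertEqual(vol["size_gb"], 1)

    def test_rejects_nonpositive_size(self):
        with self.assertRaises(ValueError):
            self.driver_class().create_volume("v1", 0)


class TestLoopbackDriver(VolumeDriverContract, unittest.TestCase):
    driver_class = LoopbackVolumeDriver


if __name__ == "__main__":
    unittest.main()
```

Adding a new driver then means adding one small test case class; the contract tests themselves are written once, and any driver that can't pass them is visibly out of spec.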

I know it's an iterative process, but that doesn't mean the problems are any less real.

> I look forward to further discussions about improving openstack 
> regardless of who is PTL.

As do I.

--
Soren Hansen             | http://linux2go.dk/
Senior Software Engineer | http://www.cisco.com/
Ubuntu Developer         | http://www.ubuntu.com/
OpenStack Developer      | http://www.openstack.org/

_______________________________________________
Mailing list: https://launchpad.net/~openstack
Post to     : openstack at lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp

