[Openstack-operators] A Hypervisor supporting containers
Fox, Kevin M
Kevin.Fox at pnnl.gov
Fri May 2 20:26:52 UTC 2014
There is only one other programming project I know of on the same scale as OpenStack, the Linux kernel, and it has gone through the exact same situation: the push/pull problem between developers and users. It's best to learn from them.
A number of years back, the Kernel guys solved it largely by clearly delineating the situation:
* Mainline is not for users! It's for developers. The developers have to take the long game into account and plan on maintaining the software for many, many years. This means sometimes functionality must suffer in order to keep the whole thing maintainable. If this is not done, though, the whole thing falls apart. In a decade, the users won't want to use the code if the maintainability suffers too much. Users usually don't think through the long game, though. Don't force the developers to accept bad patch after bad patch because "it makes our system just work". Operators will pay the ultimate cost eventually.
* Distros take what's in mainline and add pragmatic, often dirty, patches in order to produce a system operators want to use. Then the distros work with the upstream developers to produce proper, maintainable solutions.
I know some folks in the community have been recommending that operators use mainline (particularly the TripleO project), and that is, IMHO, a huge mistake.
Decisions like where the code for the docker plugin is stored are purely a developer concern if you are doing the proper thing and getting your packages from a distro. It's up to the distro to figure out where to pull the code from and how to get it to you. It's your choice which distro to use that has all the features you desire.
If, on the other hand, you want to do all the work of a distro and roll your own from trunk, expect it to be hard. That's why the distro guys make the big bucks. ;)
Just my 2 cents.
From: Narayan Desai [narayan.desai at gmail.com]
Sent: Friday, May 02, 2014 12:56 PM
To: Stefano Maffulli
Cc: openstack-operators at lists.openstack.org
Subject: Re: [Openstack-operators] A Hypervisor supporting containers
Sorry, I took some metaphorical license. There was an old tv show called "when animals attack" that I was drawing a parallel to. I think there is literally no malicious intent in this situation; I said as much in my mail.
The problem is that there is a severe culture clash. This isn't new, and my talking about it isn't new. I sent several treatises to the foundation list about 18 months ago talking about just this sort of problem. The responses ranged from talks about project infrastructure as code to surprise that anyone was running the openstack code base.
In my mind, the project has been heading in a wrong direction. Here is my diagnosis. The project started from an extremely practical base. NASA had problems with eucalyptus (which mirrored the issues we had at scale on our system Magellan). Rackspace needed swift. In both cases, working code trumped design, testing, etc. There were machines to make work, and that was the number one priority. When we first deployed (around 12/2010) the system was a revelation compared with its competitors.
Over time, the project has become more of a developer focused project. There has been an enormous influx of developer resources between the project getting hot (essentially becoming a job mill for developers), and a pageant for companies to compete in. This growth and success has clearly been difficult to deal with in terms of project infrastructure. I'm still amazed at the fact that Vish managed to hold nova together through that kind of growth. (as a friend said, it isn't that the bear dances well, it is that it dances at all). As I said in my mail, openstack has developed an immune system to maintain aggregate code and design quality. While this immune system helps to keep bad stuff out, it keeps a lot of good stuff out as well. This immune system manifests itself as a bunch of things; the code integration process is one example, but there are plenty of others as well.
Over time, the incentives have pushed toward creating more projects and more advanced functionality. This has resulted in projects that could be incubated over time without a hard requirement that they be usable from day one. The usability problems with neutron are a good example of this class of issue, IMO.
The core misunderstanding seems to be expecting operators to work under the same incentive structure as developers. Operators have one incentive: to make their systems work well. They don't have explicit incentives to participate in blueprints, code reviews, or coding. Expecting them to integrate with the project in similar ways to developers seems to be a mistake the project continues to make, even with some of the new efforts to pay attention to their feedback more explicitly. The core ethos of the project is a developer ethos, and this will likely continue.
Don't get me wrong, getting better feedback is better than not, but bolting on operator focused activities won't change this fundamental characteristic.
It isn't clear that operators can contribute in ways that are valued by the openstack community, or likely to result in leverage for their organizations. We've tried a variety of methods, and they have all fallen short.
My major question is how to build a productive configuration that integrates operators (and their expertise) without requiring them to interact like developers do. I'm not sure quite how to do that.
On Fri, May 2, 2014 at 11:18 AM, Stefano Maffulli <stefano at openstack.org> wrote:
On 05/02/2014 05:47 AM, Narayan Desai wrote:
> tl;dr: openstack is starting to feel like a tv show called "when
> developers attack"
I respect your opinion but I strongly disagree with it: there is no
attack, there is no "fight" between developers and operators. There is
friction but no deliberate attempt to harm (as the term 'attack'
implies). Quite the contrary is true instead: I can see deliberate
attempts to oil and reduce friction at different spots in our community.
There is a strong and concerted effort to make sure that operators'
opinions are taken in proper consideration by OpenStack developers. The
OpenStack Foundation started regular meetings with operators to collect
feedback and identify issues. The first happened a couple months ago,
another one is being scheduled, at 6-month intervals. For quite some
time, during the Summits, the Foundation had dedicated Operators tracks
and we keep introducing more operators-specific events (see Tom's
Outside of the Foundation, the developers community sent quite strong
signals recently with the election of Michael Still, whose platform as
PTL is all about listening to people and "producing reliable production
grade code". Russell Bryant's effort to change how Nova blueprints
are discussed and approved also is a strong signal from the developers
that they listen to operators and want to have their involvement.
I can see why you're upset though: change is slow to happen, this effort
may be too little, too late. Objectively though, look at the numbers:
thousands of occasional developers, hundreds of committed developers
need to be steered while running.
> We've seen features proposed for removal (not just in nova) because of
> lack of testing coverage. Features that have been integrated for years,
> that we've been using in production *for years without any problems*.
> Getting new code integrated is a nightmare. Take a look at this:
That's an unfortunate case but I read this differently from you. To me
this is a typical case of a blueprint approved without a proper
discussion and planning phase. Russell and Mark McClain+Kyle Mestery put
forward a fix for this sort of issue already for Nova and
Neutron; let's see how these work out in Juno.
> The feedback loops from users/ops continue to be broken. Tim's efforts
> on behalf of the user committee are important steps in the right
> direction, but the developer culture is openstack culture in a deep way.
> Operators continue to be on the outside.
What else would you suggest we can all do besides what is already being
put in place?
Ask and answer questions on https://ask.openstack.org