[openstack-dev] [tc] [all] TC Report 18-26
Zane Bitter
zbitter at redhat.com
Mon Jul 2 19:31:13 UTC 2018
On 28/06/18 15:09, Fox, Kevin M wrote:
> I'll weigh in a bit with my operator hat on, as my recent experience pertains to the current conversation....
>
> Kubernetes has largely succeeded in building common distribution tools where OpenStack has not been able to.
> kubeadm was created as a way to centralize deployment best practices, config, and upgrade handling into a common code base that other deployment tools can build on.
>
> I think this has been successful for a few reasons:
> * Kubernetes followed a philosophy of using k8s to deploy/enhance k8s (eating its own dogfood).
This is also TripleO's philosophy :)
> * was willing to make its API robust enough to handle that self-enhancement (secrets are a thing, orchestration is not optional, etc.)
I don't even think that self-upgrading was the most important
consequence of that. Fundamentally, they understood how applications
would use it and made sure that the batteries were included. I think the
fact that they conceived it explicitly as an application operation
technology made this an obvious choice. I suspect that the reason we've
lagged in standardising those things in OpenStack is that there are so
many other ways to think of OpenStack before you get to that one.
> * they decided to produce a reference product (very important to adoption IMO. You don't have to "build from source" to kick the tires.)
> * made the barrier to testing/development as low as 'curl http://......minikube; minikube start' (this spurs adoption and contribution)
That's not so different from devstack though.
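For the record, the devstack equivalent is also only a handful of lines.
A sketch from memory (the repo URL and local.conf details may have
changed, so check the devstack docs rather than trusting this verbatim):

git clone https://git.openstack.org/openstack-dev/devstack
cd devstack
cat > local.conf <<EOF
[[local|localrc]]
ADMIN_PASSWORD=secret
DATABASE_PASSWORD=secret
RABBIT_PASSWORD=secret
SERVICE_PASSWORD=secret
EOF
./stack.sh

Both are single-node development environments rather than production
deployments, so I think the comparison is a fair one.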
> * not having large silos in deployment projects allowed better communication on common tooling.
> * Operator-focused architecture, not project-based architecture. This simplifies the deployment situation greatly.
> * try whenever possible to focus on just the commons and push vendor-specific needs to plugins, so vendors can deal with vendor issues directly and not corrupt the core.
I agree with all of those, but to be fair to OpenStack, you're leaving
out arguably the most important one:
* Installation instructions start with "assume a working datacenter"
They have that luxury; we do not. (To be clear, they are 100% right to
take full advantage of that luxury. Although if there are still folks
who go around saying that it's a trivial problem and OpenStackers must
all be idiots for making it look so difficult, they should really stop
embarrassing themselves.)
> I've upgraded many OpenStacks since Essex and usually it is multiple weeks of prep and a 1-2 day outage to perform the deed. In about 50% of the upgrades, something breaks only on the production system and needs hot patching on the spot. About 10% of the time, I've had to write the patch personally.
>
> I had to upgrade a k8s cluster yesterday from 1.9.6 to 1.10.5. For comparison, what did I have to do? A couple hours of looking at release notes and trying to dig up examples of where things broke for others. Nothing popped up. Then:
>
> on the controller, I ran:
> yum install -y kubeadm #get the newest kubeadm
> kubeadm upgrade plan #check things out
>
> It told me I had 2 choices. I could run:
> * kubeadm upgrade apply v1.9.8
> * kubeadm upgrade apply v1.10.5
>
> I ran:
> kubeadm upgrade apply v1.10.5
>
> The control plane was down for under 60 seconds and then the cluster was upgraded. The rest of the services did a rolling upgrade live and took a few more minutes.
>
> I can take my time to upgrade kubelets, as mixed kubelet versions work well.
>
> Upgrading kubelet is about as easy.
>
> Done.
>
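(For anyone who hasn't done a kubelet upgrade, it's roughly the
following on a yum-based distro - this is a sketch from the upstream
upgrade docs rather than a verbatim recipe, so check the notes for your
versions:

kubectl drain <node> --ignore-daemonsets    # move workloads off the node
yum install -y kubelet-1.10.5 kubectl-1.10.5
systemctl daemon-reload && systemctl restart kubelet
kubectl uncordon <node>                     # let workloads schedule again

One node at a time, live, with the rest of the cluster still serving.)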
> There are a lot of things to learn from the governance / architecture of Kubernetes.
+1
> Fundamentally, there aren't huge differences in what Kubernetes and OpenStack try to provide users. Scheduling a VM or a container via an API with some kind of networking and storage is the same kind of thing in either case.
Yes, from a user perspective that is (very) broadly accurate. But again,
Kubernetes assumes that somebody else has provided the bottom few layers
of implementation, while OpenStack *is* the somebody else.
> How you get the software (OpenStack or k8s) running, though, is about as polar opposite as you can get.
>
> I think if OpenStack wants to gain back some of the steam it had before, it needs to adjust to the new world it is living in. This means:
> * Consider abolishing the project walls. They are driving bad architecture (not intentionally, but as a side effect of structure).
In the spirit of cdent's blog post about random ideas: one idea I keep
coming back to (it's been around for a while, and I don't remember who
first suggested it) is to start treating the compute node as a single
project (I guess the k8s equivalent would be a kubelet). Have a single
API - commands go in, events come out.
Note that this would not include just the compute-node functionality of
Nova, Neutron and Cinder, but ultimately also that of Ceilometer,
Watcher, Freezer, Masakari (and possibly Congress and Vitrage?) as well.
Some of those projects only exist at all because of boundaries between
stuff on the compute node, while others are just unnecessarily
complicated to add to a deployment because of those boundaries. (See
https://julien.danjou.info/lessons-from-openstack-telemetry-incubation/
for some insightful observations on that topic - note that you don't
have to agree with all of it to appreciate the point that the
balkanisation of the compute node architecture leads to bad design
decisions.)
In theory doing that should make it easier to build e.g. a cut-down
compute API of the kind that Jay was talking about upthread.
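To make "commands go in, events come out" slightly more concrete, here
is a purely hypothetical sketch (these endpoints don't exist anywhere;
the names and port are invented purely for illustration):

# imperative commands go in...
curl -X POST http://compute-node:8999/v1/commands \
     -d '{"op": "attach-volume", "server": "<server-uuid>", "volume": "<volume-uuid>"}'
# ...and a single stream of state-change events comes back out
curl http://compute-node:8999/v1/events

The point is that the control plane services would all talk to that one
node-local interface, rather than each project shipping its own agent.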
I know that the short-term costs of making a change like this are going
to be high - we aren't even yet at a point where making a stable API for
compute drivers has been judged to meet a cost/benefit analysis. But
maybe if we can do a comprehensive job of articulating the long-term
benefits, we might find that it's still the right thing to do.
> * focus on the commons first.
> * simplify the architecture for ops:
> * make as much as possible stateless and centralize remaining state.
> * stop moving config options around with every release. Make them promote automatically and persist them somewhere.
> * improve serial performance before sharding. K8s can do 5000 nodes on one control plane. There's no reason to do Nova cells and make ops deal with them except for the very largest clouds.
> * consider a reference product (think of the vanilla Linux kernel; distros can provide their own variants, and that's OK)
> * come up with an architecture team for the whole, not the subsystem. The whole thing needs to work well.
We probably actually need two groups: one to think about the
architecture of the user experience of OpenStack, and one to think about
the internal architecture as a whole.
I'd be very enthusiastic about the TC chartering some group to work on
this. It has worried me for a long time that there is nobody designing
OpenStack as a whole; design is done at the level of individual
projects, and OpenStack is an ad-hoc collection of what they produce.
Unfortunately we did have an Architecture Working Group for a while (in
the sense of the second definition above), but it fizzled out because
there weren't enough people with enough time to work on it. Until we can
identify at least a theoretical reason why a new effort would be more
successful, I don't think there is going to be any appetite for trying
again.
cheers,
Zane.
> * encourage current OpenStack devs to test/deploy Kubernetes. It has some very good ideas that OpenStack could benefit from. If you don't know what they are, you can't adopt them.
>
> And I know it's hard to talk about, but consider just adopting k8s as the commons and building on top of it. OpenStack's APIs are good. The implementations right now are very, very heavy for ops. You could tie in K8s's pod scheduler with VM stuff running in containers and get a vastly simpler architecture for operators to deal with. Yes, this would be a major disruptive change to OpenStack. But long term, I think it would make for a much healthier OpenStack.
>
> Thanks,
> Kevin
> ________________________________________
> From: Zane Bitter [zbitter at redhat.com]
> Sent: Wednesday, June 27, 2018 4:23 PM
> To: openstack-dev at lists.openstack.org
> Subject: Re: [openstack-dev] [tc] [all] TC Report 18-26
>
> On 27/06/18 07:55, Jay Pipes wrote:
>> WARNING:
>>
>> Danger, Will Robinson! Strong opinions ahead!
>
> I'd have been disappointed with anything less :)
>
>> On 06/26/2018 10:00 PM, Zane Bitter wrote:
>>> On 26/06/18 09:12, Jay Pipes wrote:
>>>> Is (one of) the problem(s) with our community that we have too small
>>>> of a scope/footprint? No. Not in the slightest.
>>>
>>> Incidentally, this is an interesting/amusing example of what we talked
>>> about this morning on IRC[1]: you say your concern is that the scope
>>> of *Nova* is too big and that you'd be happy to have *more* services
>>> in OpenStack if they took the orchestration load off Nova and left it
>>> just to handle the 'plumbing' part (which I agree with, while noting
>>> that nobody knows how to get there from here); but here you're
>>> implying that Kata Containers (something that will clearly have no
>>> effect either way on the simplicity or otherwise of Nova) shouldn't be
>>> part of the Foundation because it will take focus away from
>>> Nova/OpenStack.
>>
>> Above, I was saying that the scope of the *OpenStack* community is
>> already too broad (IMHO). An example of projects that have made the
>> *OpenStack* community too broad are purpose-built telco applications
>> like Tacker [1] and Service Function Chaining. [2]
>>
>> I've also argued in the past that all distro- or vendor-specific
>> deployment tools (Fuel, TripleO, etc. [3]) should live outside of
>> OpenStack because these projects are more like products, and the relentless
>> drive of vendor product management (rightfully) pushes the scope of
>> these applications to gobble up more and more feature space that may or
>> may not have anything to do with the core OpenStack mission (and have
>> more to do with those companies' product roadmap).
>
> I'm still sad that we've never managed to come up with a single way to
> install OpenStack. The amount of duplicated effort expended on that
> problem is mind-boggling. At least we tried though. Excluding those
> projects from the community would have just meant giving up from the
> beginning.
>
> I think Thierry's new map, which collects installer services in a
> separate bucket (one that may eventually come with a separate git
> namespace), is a helpful way of communicating to users what's happening
> without forcing those projects outside of the community.
>
>> On the other hand, my statement that the OpenStack Foundation having 4
>> different focus areas leads to a lack of, well, focus, is a general
>> statement on the OpenStack *Foundation* simultaneously expanding its
>> sphere of influence while at the same time losing sight of OpenStack
>> itself -- and thus the push to create an Open Infrastructure Foundation
>> that would be able to compete with the larger mission of the Linux
>> Foundation.
>>
>> [1] This is nothing against Tacker itself. I just don't believe that
>> *applications* that are specially built for one particular industry
>> belong in the OpenStack set of projects. I had repeatedly stated this on
>> Tacker's application to become an OpenStack project, FWIW:
>>
>> https://review.openstack.org/#/c/276417/
>>
>> [2] There is also nothing wrong with service function chains. I just
>> don't believe they belong in *OpenStack*. They more appropriately belong
>> in the (Open)NFV community because they just are not applicable outside
>> of that community's scope and mission.
>>
>> [3] It's interesting to note that Airship was put into its own
>> playground outside the bounds of the OpenStack community (but inside the
>> bounds of the OpenStack Foundation).
>
> I wouldn't say it's inside the bounds of the Foundation, and in fact
> confusion about that is a large part of why I wrote the blog post. It is
> a 100% unofficial project that just happens to be hosted on our infra.
> Saying it's inside the bounds of the Foundation is like saying
> Kubernetes is inside the bounds of GitHub.
>
>> Airship is AT&T's specific
>> deployment tooling for "the edge!". I actually think this was the
>> correct move for this vendor-opinionated deployment tool.
>>
>>> So to answer your question:
>>>
>>> <jaypipes> zaneb: yeah... nobody I know who argues for a small stable
>>> core (in Nova) has ever said there should be fewer higher layer services.
>>> <jaypipes> zaneb: I'm not entirely sure where you got that idea from.
>>
>> Note the emphasis on *Nova* above?
>>
>> Also note that when I've said that *OpenStack* should have a smaller
>> mission and scope, that doesn't mean that higher-level services aren't
>> necessary or wanted.
>
> Thank you for saying this, and could I please ask you to repeat this
> disclaimer whenever you talk about a smaller scope for OpenStack.
> Because for those of us working on higher-level services it feels like
> there has been a non-stop chorus (both inside and outside the project)
> of people wanting to redefine OpenStack as something that doesn't
> include us.
>
> The reason I haven't dropped this discussion is that I really want to
> know if _all_ of those people were actually talking about something else
> (e.g. a smaller scope for Nova), or if it's just you. Because you and I
> are in complete agreement that Nova has grown a lot of obscure
> capabilities that make it fiendishly difficult to maintain, and that in
> many cases might never have been requested if we'd had higher-level
> tools that could meet the same use cases by composing simpler operations.
>
> IMHO some of the contributing factors to that were:
>
> * The aforementioned hostility from some quarters to the existence of
> higher-level projects in OpenStack.
> * The ongoing hostility of operators to deploying any projects outside
> of Keystone/Nova/Glance/Neutron/Cinder (*still* seen playing out in the
> Barbican vs. Castellan debate, where we can't even correct one of
> OpenStack's original sins and bake in a secret store - something k8s
> managed from day one - because people don't want to install another REST
> API even over a backend that they'll already have to install anyway).
> * The illegibility of public Nova interfaces to potential higher-level
> tools.
>
>> It's just that Nova has been a dumping ground over the past 7+ years for
>> features that, looking back, should never have been added to Nova (or at
>> least, never added to the Compute API) [4].
>>
>> What we were discussing yesterday on IRC was this:
>>
>> "Which parts of the Compute API should have been implemented in other
>> services?"
>>
>> What we are discussing here is this:
>>
>> "Which projects in the OpenStack community expanded the scope of the
>> OpenStack mission beyond infrastructure-as-a-service?"
>>
>> and, following that:
>>
>> "What should we do about projects that expanded the scope of the
>> OpenStack mission beyond infrastructure-as-a-service?"
>>
>> Note that, clearly, my opinion is that OpenStack's mission should be to
>> provide infrastructure-as-a-service projects (both plumbing and porcelain).
>>
>> This is MHO only. The actual OpenStack mission statement [5] is
>> sufficiently vague as to provide no meaningful filtering value for
>> determining new entrants to the project ecosystem.
>
> I think this is inevitable, in that if you want to define cloud
> computing in a single sentence it will necessarily be very vague.
>
> That's the reason for pursuing a technical vision statement
> (brainstorming for which is how this discussion started), so we can
> spell it out in a longer form.
>
> cheers,
> Zane.
>
>> I *personally* believe that should change in order for the *OpenStack*
>> community to have some meaningful definition and differentiation from
>> the broader cloud computing, application development, and network
>> orchestration ecosystems.
>>
>> All the best,
>> -jay
>>
>> [4] ... or never brought into the Compute API to begin with. You know,
>> vestigial tail and all that.
>>
>> [5] for reference: "The OpenStack Mission is to produce a ubiquitous
>> Open Source Cloud Computing platform that is easy to use, simple to
>> implement, interoperable between deployments, works well at all scales,
>> and meets the needs of users and operators of both public and private
>> clouds."
>>
>>> I guess from all the people who keep saying it ;)
>>>
>>> Apparently somebody was saying it a year ago too :D
>>> https://twitter.com/zerobanana/status/883052105791156225
>>>
>>> cheers,
>>> Zane.
>>>
>>> [1]
>>> http://eavesdrop.openstack.org/irclogs/%23openstack-tc/%23openstack-tc.2018-06-26.log.html#t2018-06-26T15:30:33