[openstack-dev] [tc] [all] TC Report 18-26

Zane Bitter zbitter at redhat.com
Fri Jul 6 16:58:02 UTC 2018


On 02/07/18 19:13, Jay Pipes wrote:
>>> Also note that when I've said that *OpenStack* should have a smaller 
>>> mission and scope, that doesn't mean that higher-level services 
>>> aren't necessary or wanted.
>>
>> Thank you for saying this, and could I please ask you to repeat this 
>> disclaimer whenever you talk about a smaller scope for OpenStack.
> 
> Yes. I shall shout it from the highest mountains. [1]

Thanks. Appreciate it :)

> [1] I live in Florida, though, which has no mountains. But, when I 
> visit, say, North Carolina, I shall certainly shout it from their 
> mountains.

That's where I live, so I'll keep an eye out for you if I hear shouting.

>> Because for those of us working on higher-level services it feels like 
>> there has been a non-stop chorus (both inside and outside the project) 
>> of people wanting to redefine OpenStack as something that doesn't 
>> include us.
> 
> I've said in the past (on Twitter, can't find the link right now, but 
> it's out there somewhere) something to the effect of "at some point, 
> someone just needs to come out and say that OpenStack is, at its core, 
> Nova, Neutron, Keystone, Glance and Cinder".

https://twitter.com/jaypipes/status/875377520224460800 for anyone who 
was curious.

Interestingly, that and my equally off-the-cuff reply 
https://twitter.com/zerobanana/status/875559517731381249 are actually 
pretty close to the minimal descriptions of the two broad camps we were 
talking about in the technical vision etherpad. (Noting for the record 
that cdent disputes that views can be distilled into two camps.)

> Perhaps this is what you were recollecting. I would use a different 
> phrase nowadays to describe what I was thinking with the above.

I don't think I was recalling anything in particular that *you* had 
said. Complaining about the non-core projects (presumably on the logic 
that if we kicked them out of OpenStack, all their developers would go 
to work on radically simplifying the remaining projects 
instead?</sarcasm>) was a widespread and popular pastime for at least 
the four years from 2013 to 2016.

> I would say instead "Nova, Neutron, Cinder, Keystone and Glance [2] are 
> a definitive lower level of an OpenStack deployment. They represent a 
> set of required integrated services that supply the most basic 
> infrastructure for datacenter resource management when deploying 
> OpenStack."
> 
> Note the difference in wording. Instead of saying "OpenStack is X", I'm 
> saying "These particular services represent a specific layer of an 
> OpenStack deployment".

OK great. So this is wrong :) and I will attempt to explain why I think 
that in a second. But first I want to acknowledge what is attractive 
about this viewpoint (even to me). This is a genuinely useful 
observation that leads to a real insight.

The insight, I think, is the same one we all just agreed on in another 
part of the thread: OpenStack is the only open source project 
concentrating on the gap between a rack full of unconfigured equipment 
and somewhere that you could, say, install Kubernetes. We write the bit 
where the rubber meets the road, and if we don't get it done there's 
nobody else to do it! There's an almost infinite variety of different 
applications and they'll all need different parts of the higher layers, 
but ultimately they'll all need to be reified in a physical data center 
and when they do, we'll be there: that's the core of what we're building.

It's honestly only the tiniest of leaps from seeing that idea as 
attractive, useful, and genuinely insightful to seeing it as correct, 
and I don't really blame anybody who made that leap.

I'm going to gloss over the fact that we punted the actual process of 
setting up the data center to a bunch of what turned out to be 
vendor-specific installer projects that you suggest should be punted out 
of OpenStack altogether, because that isn't the biggest problem I have 
with this view.

Back in the '70s there was this idea about AI: even a 2-year-old human 
can e.g. recognise images with a high degree of accuracy, but doing e.g. 
calculus is extremely hard in comparison and takes years of training. 
But computers can already do calculus! Ergo, we've solved the hardest 
part already and building the rest out of that will be trivial, AGI is 
just around the corner, &c. &c. (I believe I cribbed this explanation 
from an outdated memory of Marvin Minsky's 1982 paper "Why People Think 
Computers Can't" - specifically the section "Could a Computer Have 
Common Sense?" - so that's a better source if you actually want to learn 
something about AI.) The popularity of this idea arguably helped create 
the AI bubble, and the inevitable collision with the reality of its 
fundamental wrongness led to the AI Winter. In fact, just because you 
can build logic out of many layers of heuristics (as human brains do), 
it absolutely does not follow that it's trivial to go the other way and 
build other things that also require many layers of heuristics once you 
have some basic logic building blocks. (This is my conclusion, not 
Minsky's, and probably more influenced by reading summaries of Kahneman. 
But suffice it to say the AI technology of the present, which is showing 
more promise, is called Deep Learning because it consists literally of 
many layers of heuristics. It's also still considerably worse at image 
recognition than any 2-year-old human.)

I see the problem with the OpenStack-as-layers model as being analogous. 
(I don't think there's going to be a full-on OpenStack Winter, but we've 
certainly hit the Trough of Disillusionment.) With Nova, Neutron, 
Cinder, Keystone and Glance you can build a pretty good VPS hosting 
service. But it's a mistake to think that cloud is something you get by 
layering stuff on top of VPS hosting. It's comparatively easy to build a 
VPS on top of a cloud, just like teaching a child arithmetic. But it's 
enormously difficult to build a cloud on top of VPS (it would involve a 
lot of wasteful layers of abstraction, similar to building artificial 
neurons in software).

Speaking of abstraction, let's try to pull this back to something 
concrete. Kubernetes is event-driven at a very fundamental level: when a 
pod dies, k8s gets a notification immediately and that prompts it to 
reschedule the workload. In contrast, Nova/Cinder/&c. are a black hole. 
You can't even build a sane dashboard for your VPS - let alone 
cloud-style orchestration - on top of them, because clients have to 
spend all their time polling the API to find out if anything happened. 
There's an 
entire separate project (Masakari) that ~nobody has installed, basically 
dedicated to spelunking in the compute node without Nova's knowledge to 
try to surface this information. I am definitely not disrespecting the 
Masakari team, who are doing something that desperately needs doing in 
the only way that's really open to them, but that's an embarrassingly 
bad architecture for OpenStack as a whole.
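
To make that concrete, here's a minimal sketch of the difference, 
assuming openstacksdk and the official kubernetes Python client (the 
cloud name and server ID are placeholders, not anything from a real 
deployment):

    import time

    import openstack                          # openstacksdk
    from kubernetes import client, config, watch

    def poll_nova():
        # Nova offers no event stream to subscribe to, so a dashboard
        # is reduced to polling the API on a timer and diffing results.
        conn = openstack.connect(cloud='mycloud')      # placeholder
        last_status = None
        while True:
            server = conn.compute.get_server('SERVER_ID')  # placeholder
            if server.status != last_status:
                print('status changed:', server.status)
                last_status = server.status
            time.sleep(10)  # and hope nothing happened in between

    def watch_pods():
        # Kubernetes pushes changes to the client: the watch blocks
        # until something actually happens; no polling loop required.
        config.load_kube_config()
        v1 = client.CoreV1Api()
        for event in watch.Watch().stream(v1.list_namespaced_pod,
                                          namespace='default'):
            obj = event['object']
            print(event['type'], obj.metadata.name, obj.status.phase)

The first function is what everything built over Nova ends up 
reimplementing; the second is what k8s hands its clients for free.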

So yeah, it's sometimes helpful to think about the fact that there's a 
group of components that own the low level interaction with outside 
systems (hardware, or IdM in the case of Keystone), and that almost 
every application will end up touching those directly or indirectly, 
while each uses a different subset of the other functionality... *but* 
only in the awareness that those things also need to be built from the 
ground up to occupy a space in a larger puzzle.

When folks say stuff like these projects represent a "definitive lower 
level of an OpenStack deployment" they invite the listener to ignore the 
bigger picture; to imagine that if those lower-level services just take 
care of their own needs then everything else can just build on top. 
That's a mistake, unless you believe (and I know *you* don't believe 
this, Jay) that OpenStack needs only to provide enough building blocks to 
build VPS hosting out of, because support for all of those higher-level 
things doesn't just fall out like that. You have to consciously work at it.

Imagine for a moment that, knowing everything we know now, we had 
designed OpenStack around a system of event sources and sinks that's 
reliable in the face of network partitions &c., with components 
connecting into it to provide services to the user and to each other. 
That's what Kubernetes did. That's the key to its success. We need to 
enable something similar, because OpenStack is still necessary for all 
of the reasons above and more.

In particular, I think one place where OpenStack provides value is that 
we are less opinionated and can allow application developers to choose 
how the event sources and sinks are connected together. That means that 
users can e.g. customise their own failovers in 'userspace' rather than 
the more one-size-fits-all approach of handling everything automatically 
inside k8s. This is theoretically the advantage of having separate 
projects instead of a monolithic design, and one reason why I don't 
think that destroying all of the boundaries between projects is the way 
forward for OpenStack. (I do still think it'd be a great thing for the 
compute node, which is entirely internal to OpenStack and definitely 
does not benefit from fragmentation.)
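
As a sketch of what that 'userspace' could look like, assuming the 
deployment emits Nova notifications on the conventional 'notifications' 
topic (the transport URL is a placeholder and the failover handler is 
hypothetical):

    from oslo_config import cfg
    import oslo_messaging

    def trigger_my_failover(payload):
        # Hypothetical policy: in practice this might kick off a Heat
        # stack update or a Mistral workflow.
        print('instance gone, recovering:', payload.get('instance_id'))

    class FailoverEndpoint(object):
        """Application-defined reaction to infrastructure events."""
        def info(self, ctxt, publisher_id, event_type, payload, metadata):
            if event_type == 'compute.instance.delete.end':
                trigger_my_failover(payload)

    # The transport URL and topic are assumptions about the deployment.
    transport = oslo_messaging.get_notification_transport(
        cfg.CONF, url='rabbit://guest:guest@controller:5672/')
    targets = [oslo_messaging.Target(topic='notifications')]
    listener = oslo_messaging.get_notification_listener(
        transport, targets, [FailoverEndpoint()], executor='threading')
    listener.start()
    listener.wait()

The point being that the policy lives in the application, not baked 
into the services.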

> Nowadays, I would further add something to the effect of "Depending on 
> the particular use cases and workloads the OpenStack deployer wishes to 
> promote, an additional layer of services provides workload orchestration 
> and workflow management capabilities. This layer of services include 
> Heat, Mistral, Tacker, Service Function Chaining, Murano, etc".

That makes sense, but the key point I want to make is that you can't 
(usefully) provide the porcelain unless the plumbing for it is in place. 
Right now information only flows one way - we have drains connected to 
the porcelain but no running water. Application developers are fetching 
water in buckets and heating it over an open fire medieval-style, while 
there are still (some) people who go around saying 'we have too much 
porcelain, we should just concentrate on making better drains'. Somebody 
dare me to stretch this metaphor even further.

> Does that provide you with some closure on this feeling of "non-stop 
> chorus" of exclusion that you mentioned above?

I'm never letting this go ;)

>> The reason I haven't dropped this discussion is because I really want 
>> to know if _all_ of those people were actually talking about something 
>> else (e.g. a smaller scope for Nova), or if it's just you. Because you 
>> and I are in complete agreement that Nova has grown a lot of obscure 
>> capabilities that make it fiendishly difficult to maintain, and that 
>> in many cases might never have been requested if we'd had higher-level 
>> tools that could meet the same use cases by composing simpler operations.
>>
>> IMHO some of the contributing factors to that were:
>>
>> * The aforementioned hostility from some quarters to the existence of 
>> higher-level projects in OpenStack.
>> * The ongoing hostility of operators to deploying any projects outside 
>> of Keystone/Nova/Glance/Neutron/Cinder (*still* seen playing out in 
>> the Barbican vs. Castellan debate, where we can't even correct one of 
>> OpenStack's original sins and bake in a secret store - something k8s 
>> managed from day one - because people don't want to install another 
>> ReST API even over a backend that they'll already have to install 
>> anyway).
>> * The illegibility of public Nova interfaces to potential higher-level 
>> tools.
> 
> I would like to point something else out here. Something that may not be 
> pleasant to confront.
> 
> Heat's competition (for resources and mindshare) is Kubernetes, plain 
> and simple.

For resources, that's undoubtedly true. For mindshare, that seems a bit 
like saying "Horses' competition for mindshare is cars". I mean, yes, 
but the _competition_ part was over a while back, cars won, and horses 
now fulfil a niche role.

That's actually OK by me. When we first started Heat, it was a project 
to make *OpenStack* resources orchestratable. Once it was up and 
running, a bunch of people came to us (at the Havana summit in Portland 
in early 2013) and said that we needed to build a software orchestration 
system. Personally, I was pretty reluctant at first. Eventually they 
convinced me. But in retrospect, while they were right that we needed 
better ways to deploy software via Heat than baking it into the image 
and passing minimal configuration in the user_data (and Heat's Software 
Deployments delivered those improvements), the thing they really needed 
was Kubernetes. Once that existed, those folks melted away from the Heat 
community, and that's not a terrible outcome. Turning Heat into k8s 
would be hard and distracting; it's better to integrate the two together 
so people can get all the functionality they need from the projects in 
the best position to provide it.

There are still plenty of folks who need to do orchestration across all 
of their virtual infrastructure, and Heat is here to meet their needs. 
The project was always about trying to make OpenStack better and more 
consumable for a certain audience, and users tell us it has succeeded at 
that.

> Heat's competition is not other OpenStack projects.

In practical terms, Heat's competition is Horizon, shell scripts and 
apathy. Not necessarily in that order. Arguably Ansible as well, but 
mostly because we don't have any real integration with it, so people are 
sometimes forced to pick one or the other when they need both.

> Nova's competition is not Kubernetes (despite various people continuing 
> to say that it is).
> 
> Nova is not an orchestration system. Never was and (as long as I'm 
> kicking and screaming) never will be.
> 
> Nova's primary competition is:
> 
> * Stand-alone Ironic
> * oVirt and stand-alone virsh callers
> * Parts of VMWare vCenter [3]
> * MaaS in some respects

Do you see KubeVirt or Kata or Virtlet or RancherVM ending up on this 
list at any point? Because a lot of people* do. And Nova is absolutely 
competing for resources with those projects. Having your VM provisioning 
thing embedded in the user's orchestration system of choice is a serious 
competitive advantage. (BTW I'm currently trying to save your bacon 
here: 
http://lists.openstack.org/pipermail/openstack-dev/2018-June/131183.html)

* https://news.ycombinator.com/item?id=17013779

> * The *compute provisioning* parts of EC2, Azure, and GCP

I agree this is true in practice, but would like to note that the 
compute provisioning parts of those services are tied in to the rest of 
the cloud in ways that Nova is not tied in to the rest of OpenStack, and 
that is a *major* missed opportunity because it largely limits our 
market to the subset of people who need only the compute provisioning bits.

We chose to add features to Nova to compete with vCenter/oVirt, and not 
to add features that would have enabled OpenStack as a whole to compete 
with more than just the compute provisioning subset of EC2/Azure/GCP. 
Meanwhile, the other projects in OpenStack were working on building the 
other parts of an AWS/Azure/GCP competitor. And our vague one-sentence 
mission statement allowed us all to maintain the delusion that we were 
all working on the same thing and pulling in the same direction, when in 
truth we haven't been at all.

We can decide that we want to be one, or the other, or both. But if we 
don't all decide together then a lot of us are going to continue wasting 
our time working at cross-purposes.

> This is why there is a Kubernetes OpenStack cloud provider plugin [4].
> 
> This plugin uses Nova [5] (which can potentially use Ironic), Cinder, 
> Keystone and Neutron to deploy kubelets to act as nodes in a Kubernetes 
> cluster and load balancer objects to act as the proxies that k8s itself 
> uses when deploying Pods and Services.
> 
> Heat's architecture, template language and object constructs are in 
> direct competition with Kubernetes' API and architecture, with the 
> primary difference being a VM-centric [6] vs. a container-centric object 
> model.

Mmmm, I wouldn't really call Heat VM-centric. It's 
infrastructure-centric with a sideline in managing software, whereas K8s 
is software-centric. Here's a blog post I wrote back when people 
thought Heat's competition was Puppet(!):

https://www.zerobanana.com/archive/2014/05/08#heat-configuration-management

It's aged pretty well except for the fact that k8s largely owns the 
'Software Orchestration' space now. (Although, really, k8s itself 
doesn't do 'orchestration' as such. It just starts everything up, and 
the application does its own co-ordination using etcd. Helm does do 
orchestration in the traditional sense AIUI.)
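
(For the curious, 'the application does its own co-ordination using 
etcd' means something like this, sketched with the python etcd3 client; 
the endpoint and lock name are illustrative:)

    import etcd3

    etcd = etcd3.client(host='127.0.0.1', port=2379)  # placeholder

    # Each replica races for the same lock; whichever one wins acts as
    # the leader and the rest stand by. No external orchestrator is
    # involved -- k8s just keeps the replicas running.
    with etcd.lock('my-app-leader', ttl=10):
        print('I am the leader; doing the coordinated work')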

> Heat's template language is similar to Helm's chart template YAML 
> structure [7], and with Heat's evolution to the "convergence model", 
> Heat's architecture actually got closer to Kubernetes' architecture: 
> that of continually attempting to converge an observed state with a 
> desired state.

It's important to note that that model change never happened, and likely 
never will. More specifically, the set of changes labelled 'convergence' 
can be grouped into three different buckets, only one of which exists:

1) Feed all resource actions into a task queue for workers to pick up, 
enabling Heat to scale out arbitrarily (limited only by the centralised 
DB); and allow users to update stacks without waiting for previous 
operations to complete. This absolutely happened, has been the default 
since Newton, is used even by TripleO since Queens, and is working 
great. This is what most people mean when they refer to 'convergence' 
now (for a while we used to call it 'phase 1').
2) Update resources by comparing their observed state to the desired 
state and making an incremental change to converge them, then repeat 
(there's a sketch of this pattern after the list). This can itself be 
divided into several different implementation phases, 
and the first one (comparing to observed rather than last-recorded 
state) actually sort-of exists as an experimental option. That said, 
this is probably never going to be completed for a number of reasons: 
lack of developers/reviewers; the need to write new resource 
implementations, thus throwing away years' worth of corner-case fixes and 
resulting stability; and an inability to get events in an efficient 
(i.e. filtered at source) and reliable way.
3) Doing this constantly all the time ('continuous convergence') even 
when a stack update is not in progress. We agreed not to ever do this 
because I argued that the user needed control over the process - it's 
enough that Heat could recognise that something had changed and fix it 
during a stack update (bucket #2); after that, it's better to let the 
application decide when to run a stack update, either on a timer or in 
response to events (probably via Mistral in both cases). Maybe if we 
could (efficiently) get event notifications for everything it might be a 
different story, but there's no way we can justify constant polling of 
every resource in every stack whether the user needs it or not.
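
For clarity, the pattern in bucket #2 is the classic reconcile loop. A 
toy sketch of its shape (none of these names are Heat's real 
interfaces):

    # Toy reconcile loop: compare observed state to desired state and
    # make one incremental change at a time. Illustrative only.

    def reconcile(observe, converge_step, desired):
        """Drive observed state toward desired, one step per pass."""
        while True:
            observed = observe()
            diff = {k: v for k, v in desired.items()
                    if observed.get(k) != v}
            if not diff:
                return observed          # converged
            converge_step(diff)          # apply one incremental change

    # Toy 'resource': a dict standing in for real infrastructure.
    state = {'size': 1, 'image': 'old'}

    def observe():
        return dict(state)

    def converge_step(diff):
        key, value = next(iter(diff.items()))
        print('converging %s -> %r' % (key, value))
        state[key] = value

    print(reconcile(observe, converge_step, {'size': 3, 'image': 'new'}))

Bucket #3 amounts to running that loop forever for every resource in 
every stack, which is exactly where the objection about constant 
polling comes in.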

> So, what is Heat to do?

One thing we need to do is to integrate with other ways of deploying 
software (especially Kubernetes and Ansible), to build better bridges 
between the infrastructure and the software running on it.

The challenging part is authenticating to those other systems. 
Unfortunately Keystone has never become a standard way of authenticating 
services outside the immediate OpenStack ecosystem. One option I want to 
explore more is just having the user put credentials for those other 
systems into Barbican. That's not an especially elegant solution, and it 
requires operators to actually install Barbican, but at least it's 
something, and Heat wouldn't have to store the user's credentials itself. 
We're working on adding support for creating stacks in a remote OpenStack 
cloud using this method, so that should help provide a model we can reuse.
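
The mechanics of that are at least straightforward. A sketch using 
python-barbicanclient and keystoneauth1 (auth URL, credentials and 
secret name are all placeholders):

    from keystoneauth1 import identity, session
    from barbicanclient import client as barbican_client

    # Authenticate to Keystone as the user (placeholder credentials).
    auth = identity.Password(auth_url='http://keystone:5000/v3',
                             username='demo', password='secret',
                             project_name='demo',
                             user_domain_name='Default',
                             project_domain_name='Default')
    barbican = barbican_client.Client(session=session.Session(auth=auth))

    # The user stores a credential for the remote system; Heat would
    # keep only the secret reference, never the credential itself.
    secret = barbican.secrets.create(name='remote-cloud-credential',
                                     payload='<token for remote system>')
    secret_ref = secret.store()

    # Later, a service with access resolves the reference on demand.
    print(barbican.secrets.get(secret_ref).payload)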

> The hype and marketing machine is never-ending, I'm afraid. [8]
> 
> I'm not sure there's actually anything that can be done about this. 
> Perhaps it is a fait accomplis that Kubernetes/Helm will/has become 
> synonymous with "orchestration of things". Perhaps not. I'm not an 
> oracle, unfortunately.

Me neither. There are even folks who think that the Zun model of 
container deployment is going to take over the world:

https://medium.com/@steve.yegge/honestly-i-cant-stand-k8s-48c9a600e405

Who knows? He was right about Javascript. We're going to find out.

> Maybe the only thing that Heat can do to fend off the coming doom is to 
> make a case that Heat's performance, reliability, feature set or 
> integration with OpenStack's other services make it a better candidate 
> for orchestrating virtual machine or baremetal workloads on an OpenStack 
> deployment than Kubernetes is.
> 
> Sorry to be the bearer of bad news,

I assure you that, contrary to popular opinion, I have not been living 
under a rock ;)

cheers,
Zane.


