[openstack-dev] [tc] [all] OpenStack moving both too fast and too slow at the same time

Matt Riedemann mriedemos at gmail.com
Tue May 9 02:55:16 UTC 2017

On 5/8/2017 1:10 PM, Octave J. Orgeron wrote:
> I do agree that scalability and high-availability are definitely issues
> for OpenStack when you dig deeper into the sub-components. There is a
> lot of re-inventing of the wheel when you look at how distributed
> services are implemented inside of OpenStack and their deficiencies. For some
> services you have a scheduler that can scale-out, but the conductor or
> worker process doesn't. A good example is cinder, where cinder-volume
> doesn't scale-out in a distributed manner and doesn't have a good
> mechanism for recovering when an instance fails. All across the services
> you see different methods for coordinating requests and tasks such as
> rabbitmq, redis, memcached, tooz, mysql, etc. So for an operator, you
> have to sift through those choices and configure the prerequisite
> infrastructure. This is a good example of a problem that should be
> solved with a single architecturally sound solution that all services
> can standardize on.

There was an architecture working group formed specifically to
understand past architectural decisions in OpenStack, how the projects
differ, and how to address some of those issues, but for lack of
participation the group dissolved shortly after the Barcelona summit.
This is, again, another example: if you want to make these kinds of
massive changes, it's going to take massive involvement and leadership.

> The problem in a lot of those cases comes down to development being
> detached from the actual use cases customers and operators are going to
> use in the real world. Having a distributed control plane with multiple
> instances of the api, scheduler, coordinator, and other processes is
> typically not testable without a larger hardware setup. When you get to
> large scale deployments, you need an active/active setup for the control
> plane. It's definitely not something you could develop for or test
> against on a single laptop with devstack. Especially, if you want to use
> more than a handful of the OpenStack services.

I think we can all agree with this. Developers don't have a lab with
1000 nodes lying around to hack on. There was OSIC but that's gone. I've
been asking companies to help Nova by doing scale testing, identifying
the major issues, and reporting them back in a form we can act on.
People will report that there are issues, but not do the profiling, or
at least not report the results of profiling upstream to help us out.
So again, this is really up to companies that have the resources to do
this kind of scale testing, report back, and help fix the issues
upstream in the community. That doesn't require OpenStack 2.0.



