[openstack-dev] [all] [tc] Multi-clouds integration by OpenStack cascading

henry hly henry4hly at gmail.com
Thu Oct 9 06:49:07 UTC 2014


Hi Joshua,

Absolutely, improving a single OpenStack instance internally is the first
important thing, and there is already much effort on that in the community.
For example, without the security group optimizations, a 200-node cluster
would have severe performance problems; with several patches in Juno it is
now quite easy to scale the node count up to 500 or more.

Cascading does not conflict with that work; in fact the hierarchical scale
grows roughly with the square of the single-child scale. If a single child
can handle hundreds to thousands of nodes, cascading over such children can
reach tens of thousands. Besides ultra-high scalability, Cascading also
cares about geographic distribution, zone fault isolation, modular
plug-and-play, and software maintenance isolation.

Conceptually, Cascading does not introduce extra consistency problems,
because its topology is a tree rather than a peer mesh. The parent
OpenStack is the central processing point for all user-facing,
request-driven events, just as in a single OpenStack. From the top-level
view, each child OpenStack is just an agent running on a "big host", and
since agent-side state is already naturally asynchronous with the
controller-side database, cascading adds no consistency problems beyond
what a single-layer OpenStack has today.
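
To make the "child as agent" idea concrete, here is a minimal sketch (my
own illustration, not code from the tricircle PoC) of how a Nova proxy in
the parent could forward a boot request to a child OpenStack over its
public API with python-novaclient; the endpoint and credentials below are
placeholders:

    from novaclient import client as nova_client

    # The parent treats the whole child OpenStack as one "big host".
    # Endpoint and credentials are placeholders, not values from the PoC.
    CHILD_AUTH_URL = "http://child-openstack:5000/v2.0"

    def boot_on_child(name, image_id, flavor_id):
        # Talk to the child's Nova API, much as an ordinary agent/driver
        # talks to a hypervisor or an external controller.
        nova = nova_client.Client('2', 'proxy-user', 'proxy-password',
                                  'proxy-tenant', CHILD_AUTH_URL)
        # The parent keeps the logical server record; the child owns the
        # physical one. The returned UUID is stored for later mapping.
        server = nova.servers.create(name=name, image=image_id,
                                     flavor=flavor_id)
        return server.id

The parent-side record and the child-side UUID then stay loosely coupled,
exactly like today's compute-node state versus the controller database.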

Best Regards
Wu Hongning


On Thu, Oct 9, 2014 at 12:27 PM, Joshua Harlow <harlowja at outlook.com> wrote:
> On Oct 7, 2014, at 6:24 AM, joehuang <joehuang at huawei.com> wrote:
>
>> Hello, Joshua,
>>
>> Thank you for your concerns on OpenStack cascading. I am afraid I am not the right person to comment on cells, but I would like to say a little about cascading, since you mentioned "with its own set of consistency warts I'm sure".
>>
>> 1. For a small-scale cloud, or a cloud within one data center, a single OpenStack instance (including cells) can work without the cascading feature just as it works today. OpenStack cascading only introduces Nova-proxy, Cinder-proxy, L2/L3 proxy..., much like other vendor-specific agents/drivers (for example the vCenter driver, Hyper-V driver, Linux agent, OVS agent). It does not change the current architecture of Nova/Cinder/Neutron..., and does not affect already developed features or deployment capability. Cloud operators can simply ignore OpenStack cascading if they don't want to use it, just as they can ignore a particular hypervisor or SDN controller.
>
> Sure, I understand the niceness that you can just connect clouds into other clouds and so on (the prettiness of the fractal that results from this). That's a neat approach and it's cool that OpenStack can do this (so +1 for that). The bigger question I have, though, is whether we *should* do this. This introduces a bunch of proxies that, from what I can tell, just make it so that Nova, Cinder, and Neutron can scale by plugging more little cascading components together. Connecting them together this way is what I'd call an 'external' scaling mechanism, one that plugs into the external APIs of one service from the internals of another (and repeat). The question I have is why an 'external' solution in the first place? Why not first work on scaling the projects internally, and only when that turns out not to be good enough switch to an 'external' scaling solution? Let's take an analogy: your queries to MySQL are slow; do you first add X more MySQL servers, or do you instead try to tune your existing MySQL server and queries before scaling out? I just want to make sure we are not prematurely adding X more layers when we can gain scalability in a more solvable & manageable way first...
>
>>
>> 2. Could you provide the concrete inconsistency issues you are worried about in OpenStack cascading? Although we did not completely implement inconsistency checks in the PoC source code, the logical VM/Volume/Port/Network... objects are stored in the cascading OpenStack, the physical objects are stored in the cascaded OpenStacks, and a UUID mapping between each logical object and its physical object has been built, so it is possible and straightforward to resolve inconsistency issues. Even for flavors and host aggregates, we have a method to resolve the inconsistency issue.
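
(To illustrate the kind of reconciliation this UUID mapping makes
possible, here is a hedged sketch of my own; it is not the PoC
implementation, and the mapping structure and helper names are
hypothetical:)

    # Hedged sketch: uuid_mapping and list_child_uuids() stand in for
    # whatever the cascading layer actually stores and queries.

    def find_inconsistencies(uuid_mapping, list_child_uuids):
        """uuid_mapping: dict of logical UUID -> (child name, physical UUID).
        list_child_uuids: callable(child_name) returning the physical UUIDs
        currently present in that cascaded OpenStack."""
        child_uuids = {}        # cache one listing per cascaded OpenStack
        orphaned_logical = []   # logical object whose physical object is gone
        expected = {}           # child name -> physical UUIDs we expect there

        for logical_id, (child, physical_id) in uuid_mapping.items():
            if child not in child_uuids:
                child_uuids[child] = set(list_child_uuids(child))
            expected.setdefault(child, set()).add(physical_id)
            if physical_id not in child_uuids[child]:
                orphaned_logical.append(logical_id)

        # Physical objects with no logical counterpart (e.g. leaked VMs).
        unmapped_physical = {child: child_uuids[child] - expected[child]
                             for child in child_uuids}
        return orphaned_logical, unmapped_physical
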
>
> When you add more levels/layers, by their very nature the number of potential failure points increases (there is probably a theorem or proof somewhere in the literature about this). If you want to see inconsistencies that already exist, just watch the gate issues and bugs and so on for a while; you will eventually see why it may not be the right time to add more potential failure points instead of fixing the existing ones we already have. I (and I think others) would rather see effort focused on those existing failure points than on adding a set of new ones (make what exists reliable and scalable *first*, then move on to scaling things out via something like cascading, cells, or other approaches). Those same existing inconsistencies also make cascading inconsistent, since the cascading model is just a combination of connected components (aka your fractal), and it is typically very hard to build something consistent & stable out of components that are themselves not consistent and stable...
>
>>
>> Best Regards
>>
>> Chaoyi Huang ( joehuang )
>> ________________________________________
>> From: Joshua Harlow [harlowja at outlook.com]
>> Sent: 07 October 2014 12:21
>> To: OpenStack Development Mailing List (not for usage questions)
>> Subject: Re: [openstack-dev] [all] [tc] Multi-clouds integration by OpenStack cascading
>>
>> On Oct 3, 2014, at 2:44 PM, Monty Taylor <mordred at inaugust.com> wrote:
>>
>>> On 09/30/2014 12:07 PM, Tim Bell wrote:
>>>>> -----Original Message-----
>>>>> From: John Garbutt [mailto:john at johngarbutt.com]
>>>>> Sent: 30 September 2014 15:35
>>>>> To: OpenStack Development Mailing List (not for usage questions)
>>>>> Subject: Re: [openstack-dev] [all] [tc] Multi-clouds integration by OpenStack
>>>>> cascading
>>>>>
>>>>> On 30 September 2014 14:04, joehuang <joehuang at huawei.com> wrote:
>>>>>> Hello, Dear TC and all,
>>>>>>
>>>>>> Large cloud operators prefer to deploy multiple OpenStack instances (as
>>>>> different zones), rather than a single monolithic OpenStack instance, for
>>>>> these reasons:
>>>>>>
>>>>>> 1) Multiple data centers distributed geographically;
>>>>>> 2) Multi-vendor business policy;
>>>>>> 3) Server node counts scale up modularly, from hundreds to millions;
>>>>>> 4) Fault and maintenance isolation between zones (only REST
>>>>>> interface);
>>>>>>
>>>>>> At the same time, they also want to integrate these OpenStack instances into
>>>>> one cloud. Instead of a proprietary orchestration layer, they want to use the standard
>>>>> OpenStack framework for northbound API compatibility with Heat/Horizon or
>>>>> other third-party ecosystem apps.
>>>>>>
>>>>>> We call this pattern "OpenStack Cascading"; the proposal is described in
>>>>> [1][2]. PoC live demo videos can be found at [3][4].
>>>>>>
>>>>>> Nova, Cinder, Neutron, Ceilometer and Glance (optional) are involved in the
>>>>> OpenStack cascading.
>>>>>>
>>>>>> We kindly ask for a cross-program design summit session to discuss OpenStack
>>>>> cascading and the contribution to Kilo.
>>>>>>
>>>>>> We kindly invite those who are interested in OpenStack cascading to work
>>>>> together and contribute it to OpenStack.
>>>>>>
>>>>>> (I applied for the “other projects” track [5], but it would be better to
>>>>>> have the discussion as a formal cross-program session, because many core
>>>>>> programs are involved.)
>>>>>>
>>>>>>
>>>>>> [1] wiki: https://wiki.openstack.org/wiki/OpenStack_cascading_solution
>>>>>> [2] PoC source code: https://github.com/stackforge/tricircle
>>>>>> [3] Live demo video at YouTube:
>>>>>> https://www.youtube.com/watch?v=OSU6PYRz5qY
>>>>>> [4] Live demo video at Youku (low quality, for those who can't access
>>>>>> YouTube): http://v.youku.com/v_show/id_XNzkzNDQ3MDg4.html
>>>>>> [5] http://www.mail-archive.com/openstack-dev@lists.openstack.org/msg36395.html
>>>>>
>>>>> There are etherpads for suggesting cross project sessions here:
>>>>> https://wiki.openstack.org/wiki/Summit/Planning
>>>>> https://etherpad.openstack.org/p/kilo-crossproject-summit-topics
>>>>>
>>>>> I am interested in comparing this to Nova's cells concept:
>>>>> http://docs.openstack.org/trunk/config-reference/content/section_compute-
>>>>> cells.html
>>>>>
>>>>> Cells basically scales out a single datacenter region by aggregating multiple child
>>>>> Nova installations with an API cell.
>>>>>
>>>>> Each child cell can be tested in isolation, via its own API, before being joined up to
>>>>> an API cell, which adds it into the region. Each cell logically has its own database
>>>>> and message queue, which helps get more independent failure domains. You can
>>>>> use cell level scheduling to restrict people or types of instances to particular
>>>>> subsets of the cloud, if required.
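
(A side note for readers less familiar with cells: in the cells v1 model
each Nova installation is marked in nova.conf as either the API cell or a
compute cell, roughly as sketched below. This is from my memory of the
cells documentation, not a tested configuration, and the cell names are
placeholders.)

    # nova.conf of the API-level Nova (the "api" cell) -- illustrative only
    [cells]
    enable = True
    name = api
    cell_type = api

    # nova.conf of a child Nova installation (a "compute" cell)
    [cells]
    enable = True
    name = cell1
    cell_type = compute
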
>>>>>
>>>>> It doesn't attempt to aggregate between regions; they are kept independent,
>>>>> except for the usual assumption that you have a common identity service between all
>>>>> regions.
>>>>>
>>>>> It also keeps a single Cinder, Glance, Neutron deployment per region.
>>>>>
>>>>> It would be great to get some help hardening, testing, and building out more of
>>>>> the cells vision. I suspect we may form a new Nova subteam to try and drive
>>>>> this work forward in Kilo, if we can build up enough people wanting to work on
>>>>> improving cells.
>>>>>
>>>>
>>>> At CERN, we've deployed cells at scale but are finding a number of architectural issues that need resolution in the short term to attain feature parity. A vision of "we all run cells but some of us have only one" is not there yet. Typical examples are flavors, security groups and server groups, all of which are not yet implemented to the necessary levels for cell parent/child.
>>>>
>>>> We would be very keen to agree the strategy in Paris so that we can ensure the gap is closed, that it is tested in the gate, and that future features cannot simply relegate cell support to the 'wishlist'.
>>>
>>> I agree with this. I know that there are folks who don't like cells -
>>> but I think that ship has sailed. It's there - which means we need to
>>> make it first class.
>>
>> Just out of curiosity, would you prioritize cells over split-out unified quotas, or a split-out scheduler, or split-out virt drivers, or a split-out ..., or upgrades that work reliably, or DB quota consistency ([2]), or the other X things that need to be done to keep the 'openstack' ship afloat (Neutron integration/migrations... the list can go on and on)?
>>
>> To me that's the part that has always bugged me about cells: it seems like we have bigger 'fish to fry' to get the whole system working well, instead of adding yet another fish into the already overwhelmed fishery (this is an analogy, not reality, ha). It somehow didn't/doesn't feel right that we have so many other pieces of the puzzle that need work just to operate at a small scale, and that we should add another component that makes the system as a whole that much harder to get right (with lovely commits such as [1]). Aka, focus on building a solid 'small house' first before trying to build a mansion, if you will, because if you can't build a stable small house you'll likely build a mansion that will crumble in on itself...
>>
>> It also doesn't help that cells are a Nova-specific concept that other projects don't seem to have adopted (for better or worse), although this cascading design could be an alternative that spans across projects (with its own set of consistency warts, I'm sure).
>>
>> [1] https://github.com/openstack/nova/blob/master/nova/compute/api.py#L1553 (what, you wanted quota consistency?)
>> [2] https://review.openstack.org/#/c/125181/
>>
>>>
>>>> Resources can be made available if we can agree on the direction, but current reviews are not progressing (such as https://bugs.launchpad.net/nova/+bug/1211011).
>>>>
>>>> Tim
>>>>
>>>>> Thanks,
>>>>> John
>>>>>