[openstack-dev] [all][massively distributed][architecture]Coordination between actions/WGs

joehuang joehuang at huawei.com
Thu Sep 1 02:24:21 UTC 2016


I just pointed out the issues with RPC as it is used between the API cell and child cells when child cells are deployed in edge clouds. Since this thread is about massively distributed clouds, the RPC issues inside the current Nova/Cinder/Neutron implementations are not the main focus (that could be another important and interesting topic), for example how to guarantee the reliability of RPC messages. The points I raised were (a sketch of the inter-cell RPC dependency follows the quoted text below):

> Cells is a good enhancement for Nova scalability, but there are some issues
> in deploying Cells for massively distributed edge clouds:
>
> 1) Using RPC for inter-data-center communication makes inter-DC
> troubleshooting and maintenance difficult and introduces critical issues in
> operation. There is no CLI, RESTful API, or other tool to manage a child cell
> directly. If the link between the API cell and a child cell is broken, the
> child cell in the remote edge cloud becomes unmanageable, either locally or
> remotely.
>
> 2) The challenge of security management for inter-site RPC communication.
> Please refer to the slides[1], challenge 3: securing OpenStack over the
> Internet required over 500 pin holes to be opened in the firewall to allow
> this to work, including ports for VNC and SSH for CLIs. Using RPC in cells
> for edge clouds will face the same security challenges.
>
> 3) Only Nova supports cells, but Nova is not the only project that needs to
> support edge clouds; Neutron and Cinder should be taken into account too. How
> would Neutron support service function chaining in edge clouds? Using RPC?
> How would the challenges mentioned above be addressed? And Cinder?
>
> 4) Using RPC for production integration of hundreds of edge clouds is quite
> a challenging idea; it is a basic requirement that these edge clouds may be
> bought from multiple vendors, covering hardware, software, or both.
> That means using cells in production for massively distributed edge clouds
> is quite a bad idea. If Cells provided a RESTful interface between the API
> cell and child cells it would be much more acceptable, but still not enough;
> the same applies to Cinder and Neutron. Alternatively, deploy a lightweight
> OpenStack instance in each edge cloud, for example one rack. The question
> then is how to manage the large number of OpenStack instances and provision
> services.
>
> [1]https://www.openstack.org/assets/presentation-media/OpenStack-2016-Austin-D-NFV-vM.pdf
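
To make point 1) above concrete, here is a minimal sketch of the kind of oslo.messaging RPC call an API cell makes towards a child cell. The broker URL, topic, method name and arguments are illustrative placeholders, not Nova's actual RPC interface; the point is that the call only works while a shared message bus spans the WAN link between the two sites.

    # Sketch only: illustrative topic/method names, not Nova's real RPC API.
    from oslo_config import cfg
    import oslo_messaging as messaging

    # The API cell must be able to reach the broker the child cell listens on.
    transport = messaging.get_transport(
        cfg.CONF, url='rabbit://user:pass@child-cell-broker.example.com:5672/')

    target = messaging.Target(topic='compute', server='edge-node-1')
    client = messaging.RPCClient(transport, target, timeout=30)

    # A blocking call over the bus; if the inter-DC link is down this simply
    # times out -- there is no REST endpoint to fall back to for managing the cell.
    result = client.call({}, 'ping', payload='health-check')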

My suggestion is also to collect all candidate proposals, then discuss them and compare their pros and cons at the Barcelona summit.

I propose using the Nova/Cinder/Neutron RESTful APIs for inter-site communication with the edge clouds, and providing the Nova/Cinder/Neutron APIs as the umbrella for all edge clouds. This is the pattern used by Tricircle: https://github.com/openstack/tricircle/
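As a rough sketch of what the RESTful umbrella pattern looks like (the endpoint URL and token below are placeholders, and this is not Tricircle's actual code, just the standard Nova compute API exercised over HTTPS across the WAN):

    # Sketch only: plain Nova REST calls against one edge cloud's own endpoint.
    # The URL and token are placeholders; in a real setup the token comes from Keystone.
    import requests

    EDGE_NOVA = 'https://edge-site-1.example.com:8774/v2.1'
    HEADERS = {'X-Auth-Token': '<keystone-token>',
               'Content-Type': 'application/json'}

    # Listing servers in the edge cloud is an ordinary HTTPS request that needs
    # only one firewall pin hole and can be retried, proxied and audited like
    # any other web traffic.
    resp = requests.get(EDGE_NOVA + '/servers', headers=HEADERS)
    resp.raise_for_status()
    for server in resp.json().get('servers', []):
        print(server['id'], server['name'])

The same single-REST-endpoint pattern applies to Cinder and Neutron, which is why the umbrella approach generalizes beyond Nova.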

If there are other proposals, please don't hesitate to share them so that we can compare.

Best Regards
Chaoyi Huang(joehuang)

________________________________
From: Duncan Thomas [duncan.thomas at gmail.com]
Sent: 01 September 2016 2:03
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [all][massively distributed][architecture]Coordination between actions/WGs

On 31 August 2016 at 18:54, Joshua Harlow <harlowja at fastmail.com> wrote:
Duncan Thomas wrote:
On 31 August 2016 at 11:57, Bogdan Dobrelya <bdobrelia at mirantis.com> wrote:

    I agree that the RPC design pattern, as it is implemented now, is a major
    blocker for OpenStack in general. It requires a major redesign,
    including handling of corner cases, on both sides, *especially* RPC call
    clients. Or maybe it just has to be abandoned and replaced by a more
    cloud-friendly pattern.



Is there a writeup anywhere on what these issues are? I've heard this
sentiment expressed multiple times now, but without a writeup of the
issues and the design goals of the replacement, we're unlikely to make
progress on a replacement - even if somebody takes the heroic approach
and writes a full replacement themselves, the odds of getting community
buy-in are very low.

+2 to that. There are a bunch of technologies that could replace rabbit+RPC, e.g. gRPC, and then there is HTTP/2 and Thrift and ... so a writeup IMHO would help at least clear the waters a little bit, explain the blockers of the current RPC design pattern (which is multidimensional, because most people are probably thinking RPC == rabbit when it's actually more than that now, i.e. zeromq and amqp1.0 and ...), and try to converge on a better replacement.


Is anybody who dislikes the current pattern(s) and implementation(s) volunteering to start this documentation? I really am not aware of the issues, and I'd like to begin to understand them.