[openstack-dev] [all][massively distributed][architecture]Coordination between actions/WGs

joehuang joehuang at huawei.com
Fri Sep 2 01:51:56 UTC 2016


Hi, Adrien,

+1, 

and the comments in https://etherpad.openstack.org/p/massively-distributed_WG_description have been updated.

- Security management over the WAN: how to manage inter-site communication and edge clouds securely.

- Fault tolerance: each edge cloud should be able to run independently; a crash or isolation of one (or several) sites should not impact other DCs.

- Maintainability: the installation/upgrade/patching of each edge cloud should be manageable independently; there should be no need to upgrade all edge clouds at the same time.
ad_rien_: why not? I would rather reformulate it as: appropriate/automatic mechanisms should enable the upgrade of the different sites in a consistent way (considering that upgrading the complete infrastructure can take a significant amount of time while facing crash and disconnection issues).

- Service operability: resources like VMs, containers, volumes, etc. in each edge cloud can still be manipulated locally even if the link to the other clouds is temporarily broken.
ad_rien: could you please clarify/reword the above sentence? It is not clear to me whether there is (or not) a difference from the maintainability aspect described above.
joehuang: updated as above

- Easy integration: need to support easy multi-vendor integration for hundreds or thousands of edge clouds.
ad_rien: same here, could you please clarify what you mean by multiple vendors? Do you mean being able to ''merge/federate'' DCs from Huawei, Orange and Rackspace, for example? This looks like a peering-agreement challenge; I'm not sure whether it is a technical challenge?
joehuang: in the telecom industry, multi-vendor interoperability is a basic requirement, even for edge clouds. So the interface between edge clouds should be interoperable, comparatively stable, and easy to certify and integrate. Binary RPC, whose format varies a lot from version to version, is not good for multi-vendor certification and integration; that's why standards are required in the telecom industry (see the first sketch after this list).

- Consistency: eventually consistent information (stable state) should be achieved for the distributed system.
ad_rien: let's reword it as: Consistency: the state of the system should be globally consistent. This means that if one project/VM/... is created on one site, the states of the other sites should be consistent, to avoid for instance double assignment of IDs/IPs/... (see the second sketch below).
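
To make the multi-vendor point above concrete, here is a minimal sketch (illustration only, not Tricircle code; the endpoint URL and token are placeholders): the same versioned Nova REST call can be certified once and then issued against any vendor's edge cloud, whereas a binary RPC payload that changes per release cannot.

    import requests

    def list_servers(edge_endpoint, token):
        # GET /v2.1/servers is a stable, documented HTTP contract that
        # any vendor's Nova can implement; certification can target
        # this interface instead of a version-specific RPC format.
        resp = requests.get(edge_endpoint + "/v2.1/servers",
                            headers={"X-Auth-Token": token})
        resp.raise_for_status()
        return resp.json()["servers"]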
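
And for the consistency point, a toy sketch (an assumption-laden illustration, not an actual design): identifiers can be made globally unique without any coordination, while scarce resources such as IP addresses still need one atomic allocation authority that all sites consult; eventual consistency alone cannot prevent two sites from handing out the same address before their states converge.

    import uuid

    def new_resource_id():
        # UUIDs avoid double assignment of IDs with no cross-site
        # coordination at all.
        return str(uuid.uuid4())

    def allocate_ip(allocator, subnet):
        # 'allocator' stands for a hypothetical shared service; the
        # pop must be atomic in exactly one place so that two sites
        # can never receive the same free IP.
        return allocator.atomically_pop_free_ip(subnet)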

Best Regards
Chaoyi Huang(joehuang)

________________________________________
From: lebre.adrien at free.fr [lebre.adrien at free.fr]
Sent: 01 September 2016 20:47
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [all][massively distributed][architecture]Coordination between actions/WGs

May I suggest opening one (or more) thread(s) with the correct subject(s)?

There are at least three points discussed here:

- one related to the proposal of the Massively distributed group
- one related to the Architecture WG, with the communication issue between services (RPC + RabbitMQ, REST API, ...)
- one that mainly focuses on TriCircle.

While all of them are interesting, it is a bit tedious to follow them in one large thread.

Regarding the Massively distributed WG (which was the initial topic ;)), I replied to some comments that had been made in the pad and I added a new action to discuss the single vs. multi-endpoint question.

Finally, regarding the comparison between proposals (the link that was added at the end), I think it is a good idea, but it should be done after (or at least while) analyzing the current OpenStack ecosystem. As has been written in some comments on the TriCircle Big Tent application, it is important to first identify the pros/cons of the federation proposal before we go ahead holus-bolus.

My two cents
Ad_rien_

----- Original Message -----
> From: "joehuang" <joehuang at huawei.com>
> To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org>
> Sent: Thursday, 1 September 2016 11:18:17
> Subject: Re: [openstack-dev] [all][massively distributed][architecture]Coordination between actions/WGs
>
> > What is the REST API for tricircle?
> > When looking at the GitHub I see:
> > ''Documentation: TBD''
> > Getting a feel for its REST API would really be helpful in
> > determining how much of a proxy/request router it is vs. being an
> > actual API. I don't really want/like a proxy/request router (if that
> > wasn't obvious, ha).
>
> For the Nova API-GW/Cinder API-GW, Nova API/Cinder API requests are
> accepted and forwarded.
> For Neutron with the Tricircle Neutron plugin, it's the Neutron API:
> just like any other Neutron plugin, it doesn't change the Neutron API.
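>
> (Illustration only, not the actual API-GW code: "accepted and
> forwarded" roughly means the gateway replays the unchanged Nova REST
> request against the bottom OpenStack that owns the resource; the
> pod_registry helper here is hypothetical.)
>
>     import requests
>
>     def forward_request(method, path, headers, body, pod_registry):
>         # Pick the bottom OpenStack ("pod") that should serve this
>         # request, then re-issue the same Nova API call there.
>         bottom = pod_registry.lookup(path)
>         return requests.request(method, bottom.nova_url + path,
>                                 headers=headers, data=body)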
>
> Tricircle reuses Tempest test cases to ensure that the accepted APIs
> stay consistent with Nova/Cinder/Neutron. So there is no separate
> documentation for these APIs (if we provided it, documentation
> inconsistency would be introduced).
>
> Besides that, the Tricircle Admin API provides its own API to manage
> the bottom OpenStack instances; the documentation is in review:
> https://review.openstack.org/#/c/356291/
>
> > Looking at, say:
> > https://github.com/openstack/tricircle/blob/master/tricircle/nova_apigw/controllers/server.py
> > That doesn't inspire me so much, since that appears to be more of a
> > fork/join across many different clients, creating a Nova-like API
> > out of the joined results of those clients (which feels sort of
> > ummm, wrong). This is where I start to wonder about what the right
> > API is here, and trying to map 1 `create_server` top-level API onto
> > M child calls feels a little off (because that mapping will likely
> > never be correct due to the nature of the child clouds, i.e. you
> > have to assume a very strict homogeneous nature to even get close
> > to this working).
>
> > Were there other alternative ways of doing this that were
> > discussed?
>
> > Perhaps even a new API that doesn't try to map 1:1 onto child
> > calls; something along the lines of making an API that more
> > directly suits what this project is trying to do (vs. trying to
> > completely hide that there are M child calls being made
> > underneath).
>
> > I get the idea of becoming an uber-OpenStack-API and trying to
> > unify X other OpenStacks under this uber-API, but it just feels
> > like the wrong way to tackle this.
>
> > -Josh
>
> There is an interesting phenomenon here: cloud operators and end users
> often ask for a single endpoint for a multi-site cloud, while
> technology people often think multi-region mode (each region with a
> separate endpoint) is not an issue for the end user.
> During the Tricircle big-tent application
> https://review.openstack.org/#/c/338796/
> , Anne Gentle commented: "Rackspace public cloud has had multiple
> endpoints for regions for years now. I know from supporting end users
> for years we had to document it, and explain it often, but end-users
> worked with it." In another comment: "I want to be sure I'm clear
> that I want many of the problems solved that you mention in your
> application. In my view, Tricircle has so far been a bit of an
> isolated effort that I hadn't heard of until now. Hence the amount of
> discussion and further work we may need to get to the end goal, which
> isn't accepting another project but making sure we clearly write down
> what an acceptable solution means for many, many clouds and teams."
>
> I can't say who is correct, or whether the requirement is true or
> false (or fake?). But a single endpoint for a multi-site cloud
> provides another choice for end users and cloud operators.
>
> Besides that, Tricircle tries to provide quota management across
> regions and global resource management for objects like flavors and
> volume types, so that end users and cloud operators do not see
> separate/disjoint flavors and volume types per region.
>
> Moreover, Tricircle wants to support large-scale clouds too.
> In AWS, one AZ (Availability Zone) includes one or more DCs, one DC
> typically has more than 50 thousand physical servers, and one region
> has two or more AZs, so the size of one region is >= 50k physical
> servers:
> http://www.slideshare.net/AmazonWebServices/spot301-aws-innovation-at-scale-aws-reinvent-2014
>
> So Tricircle provides a model in which one AZ can include multiple
> bottom OpenStack instances (the scalability of one OpenStack instance
> is limited by many factors). Through this model, Tricircle can support
> one region with many AZs, and many OpenStack instances, to build a
> large-scale cloud. Here is the spec for dynamic pod binding (one
> bottom OpenStack instance is called a pod in Tricircle):
> https://github.com/openstack/tricircle/blob/master/specs/dynamic-pod-binding.rst
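>
> (A toy illustration of the model, not the spec's schema: one AZ fans
> out to several pods, and a binding table keeps a tenant on the same
> pod once it has been placed there; the scheduling policy is elided.)
>
>     # az_name -> list of pods; each pod is one bottom OpenStack.
>     pods_by_az = {"az1": ["pod-1", "pod-2", "pod-3"], "az2": ["pod-4"]}
>
>     def bind_pod(az_name, tenant_id, bindings):
>         # Reuse the tenant's existing binding if present; otherwise
>         # pick a pod (here simply the first one).
>         key = (az_name, tenant_id)
>         if key not in bindings:
>             bindings[key] = pods_by_az[az_name][0]
>         return bindings[key]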
>
> Nova/Cinder API routing by itself has no value to the OpenStack
> community, to end users, or to cloud operators.
>
> But through Nova/Cinder API routing we can add value on top: quota
> management and global objects like flavors, volume types and
> ultra-large AZs, extending the current Nova/Cinder/Neutron APIs with
> the capability to reach Amazon-level scalability without ruining the
> already-built OpenStack API ecosystem.
>
> That means a single model can address both multi-site and large-scale
> cloud needs: multi-site often means large-scale too.
>
> We proposed adding a plugin mechanism in the Nova/Cinder API layer to
> remove the inconsistency worry, but it'll take a long time to reach
> community-wide consensus. So Tricircle will be divided into two
> independent and decoupled projects; only the project that deals with
> networking automation will try to become a big-tent project. The
> Nova/Cinder API-GW will be removed from the scope of the big-tent
> project application and put into another project:
> https://docs.google.com/presentation/d/1kpVo5rsL6p_rq9TvkuczjommJSsisDiKJiurbhaQg7E
>
> TricircleNetworking: dedicated to cross-Neutron networking automation
> in multi-region OpenStack deployments; runs with or without
> TricircleGateway. It will try to become a big-tent project in the
> current application:
> https://review.openstack.org/#/c/338796/.
>
> TricircleGateway: dedicated to providing an API gateway for those who
> need a single Nova/Cinder API endpoint in multi-region OpenStack
> deployments; runs with or without TricircleNetworking. It will live
> as a non-big-tent, non-official OpenStack project, just like
> Tricircle's status today, and will pursue big-tent status only if
> consensus can be achieved in the OpenStack community, including the
> Arch WG and the TC; then we can decide how to get it on board in
> OpenStack. A new repository will need to be requested for this
> project.
>
> If we want to use other APIs to manage edge clouds, in the end we
> have to support all operations and attributes provided in OpenStack,
> and it will grow into a collection of API sets that includes
> everything in the OpenStack APIs. Can we simplify and ignore some
> features that are already supported in Nova/Cinder/Neutron? That is a
> question.
>
> Best Regards
> Chaoyi Huang (joehuang)
> ________________________________________
> From: Joshua Harlow [harlowja at fastmail.com]
> Sent: 01 September 2016 12:17
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [all][massively
> distributed][architecture]Coordination between actions/WGs
>
> joehuang wrote:
> > I just pointed out the issues with the RPC that is used between the
> > API cell and child cells if we deploy child cells in edge clouds.
> > Since this thread is about massively distributed clouds, the RPC
> > issues inside the current Nova/Cinder/Neutron are not the main
> > focus (it could be another important and interesting topic), for
> > example how to guarantee reliability for RPC messages:
>
> +1 although I'd like to also discuss this, but so be it, perhaps a
> different topic :)
>
> >
> >      > Cells is a good enhancement for Nova scalability, but there
> >      > are some issues in deploying Cells for massively distributed
> >      > edge clouds:
> >      >
> >      > 1) Using RPC for inter-data-center communication will bring
> >      > difficulty in inter-DC troubleshooting and maintenance, and
> >      > some critical issues in operation. There is no CLI, RESTful
> >      > API or other tool to manage a child cell directly. If the
> >      > link between the API cell and a child cell is broken, then
> >      > the child cell in the remote edge cloud is unmanageable,
> >      > whether locally or remotely.
> >      >
> >      > 2) The challenge of security management for inter-site RPC
> >      > communication. Please refer to the slides[1] for challenge 3,
> >      > Securing OpenStack over the Internet: over 500 pinholes had
> >      > to be opened in the firewall to allow this to work, including
> >      > ports for VNC and SSH for CLIs. Using RPC in cells for edge
> >      > clouds will face the same security challenges.
> >      >
> >      > 3) Only Nova supports cells, but Nova is not the only project
> >      > that needs to support edge clouds; Neutron and Cinder should
> >      > be taken into account too. How would Neutron support service
> >      > function chaining in edge clouds? Using RPC? How to address
> >      > the challenges mentioned above? And Cinder?
> >      >
> >      > 4) Using RPC for the production integration of hundreds of
> >      > edge clouds is quite a challenging idea; it is a basic
> >      > requirement that these edge clouds may be bought from
> >      > multiple vendors, hardware, software or both.
> >      > That means using cells in production for massively
> >      > distributed edge clouds is quite a bad idea. If Cells
> >      > provided a RESTful interface between the API cell and child
> >      > cells, it would be much more acceptable, but still not
> >      > enough; the same applies to Cinder and Neutron. Or just
> >      > deploy a lightweight OpenStack instance in each edge cloud,
> >      > for example one rack. The question is then how to manage the
> >      > large number of OpenStack instances and provision services.
> >      >
> >      >
> >     [1]https://www.openstack.org/assets/presentation-media/OpenStack-2016-Austin-D-NFV-vM.pdf
> >
> >
> > That's also my suggestion: collect all candidate proposals, then
> > discuss them and compare their pros and cons at the Barcelona
> > summit.
> >
> > I propose using the Nova/Cinder/Neutron RESTful APIs for inter-site
> > communication between edge clouds, and providing the
> > Nova/Cinder/Neutron APIs as the umbrella for all edge clouds. This
> > is the pattern of Tricircle:
> > https://github.com/openstack/tricircle/
> >
>
> What is the REST API for tricircle?
>
> When looking at the GitHub I see:
>
> ''Documentation: TBD''
>
> Getting a feel for its REST API would really be helpful in
> determining how much of a proxy/request router it is vs. being an
> actual API. I don't really want/like a proxy/request router (if that
> wasn't obvious, ha).
>
> Looking at, say:
>
> https://github.com/openstack/tricircle/blob/master/tricircle/nova_apigw/controllers/server.py
>
> That doesn't inspire me so much, since that appears to be more of a
> fork/join across many different clients, creating a Nova-like API
> out of the joined results of those clients (which feels sort of ummm,
> wrong). This is where I start to wonder about what the right API is
> here, and trying to map 1 `create_server` top-level API onto M child
> calls feels a little off (because that mapping will likely never be
> correct due to the nature of the child clouds, i.e. you have to
> assume a very strict homogeneous nature to even get close to this
> working).
>
> Were there other alternative ways of doing this that were discussed?
>
> Perhaps even a new API that doesn't try to map 1:1 onto child calls;
> something along the lines of making an API that more directly suits
> what this project is trying to do (vs. trying to completely hide that
> there are M child calls being made underneath).
>
> I get the idea of becoming an uber-OpenStack-API and trying to unify
> X other OpenStacks under this uber-API, but it just feels like the
> wrong way to tackle this.
>
> -Josh
>

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


