[openstack-dev] Re: [neutron] Neutron scaling datapoints?

joehuang joehuang at huawei.com
Wed Apr 15 11:32:38 UTC 2015


Hi, Joshua,

This is a long discussion thread; may we come back to the scalability topic?

As you confirmed, Tooz only addresses agent status management; it does not solve the impact of concurrent dynamic load at large scale (for example, 100k managed nodes with dynamic load such as security group rule updates, routers_updated, etc.).
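To make that load concrete, here is a rough back-of-envelope sketch in Python. The numbers are assumptions for illustration, not measurements: a 30-second agent report interval (a common default), a hypothetical 500-host fan-out for one project's security group update, and 10 concurrent updates.

```python
# Back-of-envelope arithmetic (assumed numbers, not measurements).

agents = 100_000
report_interval_s = 30          # assumed heartbeat period per agent
hosts_per_project = 500         # hypothetical fan-out for one project's VMs

# Steady-state heartbeat writes hitting the server/DB:
heartbeat_writes_per_s = agents / report_interval_s
print(f"steady-state heartbeat writes/s: {heartbeat_writes_per_s:.0f}")

# One security-group rule update means one RPC per affected host:
print(f"messages per SG update: {hosts_per_project}")

# Ten tenants updating concurrently multiplies the burst:
concurrent_updates = 10
print(f"burst messages: {concurrent_updates * hosts_per_project}")
```

Even before any dynamic load, the control plane is absorbing thousands of status writes per second; the fan-out bursts land on top of that.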

So even if Tooz is implemented in Neutron, that does not mean the scalability issue is fully addressed.

So what is the goal, and what is the whole picture, for addressing Neutron scalability? Tooz can then help complete that picture.
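For reference, the db-style liveness check that a Tooz/ZooKeeper backend would replace can be sketched as below. This is a minimal illustration with hypothetical names, not Neutron's actual code; the 75-second down-time threshold mirrors a typical agent_down_time setting.

```python
import time

# Hypothetical sketch: each agent writes a heartbeat timestamp, and the
# server considers an agent down once the timestamp is older than a
# configured threshold. This is the pattern a group-membership backend
# (Tooz over ZooKeeper, etc.) is meant to replace.

class AgentLiveness:
    def __init__(self, down_time_s=75.0):   # assumed threshold
        self.down_time_s = down_time_s
        self.last_seen = {}                  # agent_id -> last heartbeat time

    def heartbeat(self, agent_id, now=None):
        # Record the most recent heartbeat for this agent.
        self.last_seen[agent_id] = now if now is not None else time.time()

    def is_alive(self, agent_id, now=None):
        # Alive iff we have seen a heartbeat within the threshold.
        now = now if now is not None else time.time()
        seen = self.last_seen.get(agent_id)
        return seen is not None and (now - seen) <= self.down_time_s

tracker = AgentLiveness()
tracker.heartbeat("ovs-agent-host1", now=1000.0)
print(tracker.is_alive("ovs-agent-host1", now=1030.0))  # within 75s
print(tracker.is_alive("ovs-agent-host1", now=1200.0))  # stale
```

A membership backend changes the mechanism, not the question: with ZooKeeper ephemeral nodes, session expiry is the liveness signal, so the server stops polling a table, but the dynamic-load fan-out discussed above remains.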
 
Best Regards
Chaoyi Huang ( Joe Huang )

(Sending again, as I did not see the mail on the list.)

________________________________________
From: Joshua Harlow [harlowja at outlook.com]
Sent: 14 April 2015 23:33
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] Re: [neutron] Neutron scaling datapoints?

Daniel Comnea wrote:
> Joshua,
>
> those are old and have been fixed/documented on the Consul side.
> As for ZK, I have nothing against it; I just wish you good luck running it
> in a multi cross-DC setup :)

Totally fair, although I start to question a cross-DC setup of things,
and why that's needed in this (and/or any) architecture, but to each
their own ;)

>
> Dani
>
> On Mon, Apr 13, 2015 at 11:37 PM, Joshua Harlow
> <harlowja at outlook.com> wrote:
>
>     Did the following get addressed?
>
>     https://aphyr.com/posts/316-call-me-maybe-etcd-and-consul
>
>     Seems like quite a few things got raised in that post about etcd/consul.
>
>     Maybe they are fixed, idk...
>
>     https://aphyr.com/posts/291-call-me-maybe-zookeeper though worked
>     as expected (and without issue)...
>
>     I quote:
>
>     '''
>     Recommendations
>
>     Use Zookeeper. It’s mature, well-designed, and battle-tested.
>     Because the consequences of its connection model and linearizability
>     properties are subtle, you should, wherever possible, take advantage
>     of tested recipes and client libraries like Curator, which do their
>     best to correctly handle the complex state transitions associated
>     with session and connection loss.
>     '''
>
>     Daniel Comnea wrote:
>
>         My $2 cents:
>
>         I like the 3rd-party backend idea; however, instead of ZK, wouldn't
>         Consul [1]
>         fit better due to its lighter, out-of-the-box multi-DC awareness?
>
>         Dani
>
>         [1] Consul - https://www.consul.io/
>
>
>         On Mon, Apr 13, 2015 at 9:51 AM, Wangbibo
>         <wangbibo at huawei.com> wrote:
>
>              Hi Kevin,
>
>              Totally agree with you that the heartbeat from each agent is
>         something
>              that we cannot eliminate currently. Agent status depends on
>         it, and
>              the scheduler and HA further depend on agent status.
>
>              I proposed a Liberty spec for introducing an open
>         framework/pluggable
>              agent status drivers.[1][2]  It allows us to use some other
>         3rd-party
>              backend to monitor agent status, such as zookeeper or
>         memcached.
>              Meanwhile, it guarantees backward compatibility, so that
>         users can
>              still use the db-based status monitoring mechanism as their
>              default choice.
>
>              Based on that, we may do further optimization on the issues
>         Attila and
>              you mentioned. Thanks.
>
>              [1] BP  -
>         https://blueprints.launchpad.net/neutron/+spec/agent-group-and-status-drivers
>
>              [2] Liberty Spec proposed -
>         https://review.openstack.org/#/c/168921/
>
>              Best,
>
>              Robin
>
>              *From:* Kevin Benton [mailto:blak111 at gmail.com]
>              *Sent:* April 11, 2015, 12:35
>              *To:* OpenStack Development Mailing List (not for usage
>         questions)
>              *Subject:* Re: [openstack-dev] [neutron] Neutron scaling
>         datapoints?
>
>              Which periodic updates did you have in mind to eliminate?
>         One of the
>              few remaining ones I can think of is sync_routers, but it
>         would be
>              great if you could enumerate the ones you observed, because
>         eliminating
>              overhead in agents is something I've been working on as
>         well.
>
>              One of the most common is the heartbeat from each agent.
>         However, I
>              don't think we can eliminate them, because they are used to
>              determine whether the agents are still alive for scheduling
>         purposes. Did
>              you have something else in mind to determine if an agent is
>         alive?
>
>              On Fri, Apr 10, 2015 at 2:18 AM, Attila Fazekas
>         <afazekas at redhat.com> wrote:
>
>              I'm 99.9% sure that, for scaling above 100k managed nodes,
>              we do not really need to split OpenStack into multiple
>         smaller
>              OpenStacks,
>              or use a significant number of extra controller machines.
>
>              The problem is that OpenStack uses the right tools (SQL/AMQP/(zk)),
>              but in the wrong way.
>
>              For example:
>              Periodic updates can be avoided in almost all cases.
>
>              The new data can be pushed to the agent just when it is needed.
>              The agent can know when the AMQP connection becomes
>         unreliable (queue
>              or connection loss),
>              and needs to do a full sync.
>         https://bugs.launchpad.net/neutron/+bug/1438159
>
>              Also, when the agents get a notification, they start
>         asking for
>              details via
>              AMQP -> SQL. Why do they not know it already, or get it with the
>              notification?
>
>
>              ----- Original Message -----
>         >   From: "Neil Jerram" <Neil.Jerram at metaswitch.com>
>
>         >   To: "OpenStack Development Mailing List (not for usage
>         questions)"
>         <openstack-dev at lists.openstack.org>
>
>          >  Sent: Thursday, April 9, 2015 5:01:45 PM
>          >  Subject: Re: [openstack-dev] [neutron] Neutron scaling
>         datapoints?
>          >
>          >  Hi Joe,
>          >
>          >  Many thanks for your reply!
>          >
>          >  On 09/04/15 03:34, joehuang wrote:
>          > > Hi, Neil,
>          > >
>          > >  In theory, Neutron is like a "broadcast" domain; for
>         example,
>          > >  enforcement of DVR and security groups has to touch every
>              host
>          > >  where a VM of the project resides. Even using an SDN
>              controller, the
>          > > "touch" to each such host is inevitable. If there are plenty of
>              physical
>          > >  hosts, for example 10k, inside one Neutron, it's very hard to
>              overcome
>          > >  the "broadcast storm" issue under concurrent operation;
>         that's the
>          > >  bottleneck for scalability of Neutron.
>          >
>          >  I think I understand that in general terms - but can you be more
>          >  specific about the broadcast storm?  Is there one particular
>         message
>          >  exchange that involves broadcasting?  Is it only from the
>         server to
>          >  agents, or are there 'broadcasts' in other directions as well?
>          >
>          >  (I presume you are talking about control plane messages
>         here, i.e.
>          >  between Neutron components.  Is that right?  Obviously there can
>              also be
>          >  broadcast storm problems in the data plane - but I don't
>         think that's
>          >  what you are talking about here.)
>          >
>          > > We need layered architecture in Neutron to solve the "broadcast
>              domain"
>          > > bottleneck of scalability. The test report from OpenStack
>              cascading shows
>          > > that through the layered architecture "Neutron cascading",
>         Neutron can
>          > > support up to a million ports and 100k physical
>              hosts. You can
>          > > find the report here:
>          > >
>         http://www.slideshare.net/JoeHuang7/test-report-for-open-stack-cascading-solution-to-support-1-million-v-ms-in-100-data-centers
>          >
>          >  Many thanks, I will take a look at this.
>          >
>          > > "Neutron cascading" also brings extra benefit: One cascading
>              Neutron can
>          > > have many cascaded Neutrons, and different cascaded Neutrons can
>              leverage
>          > > different SDN controllers; maybe one is ODL and the other is
>              OpenContrail.
>          > >
>          > > ----------------Cascading Neutron-------------------
>          > >              /         \
>          > > --cascaded Neutron--   --cascaded Neutron-----
>          > >         |                  |
>          > > ---------ODL------       ----OpenContrail--------
>          > >
>          > >
>          > > And furthermore, if using Neutron cascading in multiple data
>              centers, the
>          > > DCI controller (Data center inter-connection controller) can
>              also be used
>          > > under cascading Neutron, to provide NaaS ( network as a service
>              ) across
>          > > data centers.
>          > >
>          > > ---------------------------Cascading Neutron--------------------------
>          > >              /            |          \
>          > > --cascaded Neutron--  -DCI controller-  --cascaded Neutron-----
>          > >         |                 |            |
>          > > ---------ODL------           |         ----OpenContrail--------
>          > >                           |
>          > > --(Data center 1)--   --(DCI networking)--  --(Data center 2)--
>          > >
>          > > Is it possible for us to discuss this in OpenStack
>         Vancouver summit?
>          >
>          >  Most certainly, yes.  I will be there from mid Monday afternoon
>              through
>          >  to end Friday.  But it will be my first summit, so I have no
>         idea
>              yet as
>          >  to how I might run into you - please can you suggest!
>          >
>          > > Best Regards
>          > > Chaoyi Huang ( Joe Huang )
>          >
>          >  Regards,
>          >        Neil
>          >
>          >
>
>         __________________________________________________________________________
>          >  OpenStack Development Mailing List (not for usage questions)
>          >  Unsubscribe:
>         OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>          >  http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
>
>
>
>              --
>
>              Kevin Benton
>
>
>
>
>
>
>
>
>

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

