[openstack-dev] [neutron] Neutron scaling datapoints?

joehuang joehuang at huawei.com
Mon Apr 13 09:28:29 UTC 2015


-----Original Message-----
From: Attila Fazekas [mailto:afazekas at redhat.com] 
Sent: Monday, April 13, 2015 3:19 PM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?


----- Original Message -----
> From: "joehuang" <joehuang at huawei.com>
> To: "OpenStack Development Mailing List (not for usage questions)" 
> <openstack-dev at lists.openstack.org>
> Sent: Sunday, April 12, 2015 1:20:48 PM
> Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?
> 
> 
> 
> Hi, Kevin,
> 
> 
> 
> I assumed that all agents connect to the same RabbitMQ IP address, in 
> which case the number of connections would exceed the port range limit.
> 
https://news.ycombinator.com/item?id=1571300

"TCP connections are identified by the (src ip, src port, dest ip, dest port) tuple."

"The server doesn't need multiple IPs to handle > 65535 connections. All the server connections to a given IP are to the same port. For a given client, the unique key for an http connection is (client-ip, PORT, server-ip, 80). The only number that can vary is PORT, and that's a value on the client. So, the client is limited to 65535 connections to the server. But, a second client could also have another 65K connections to the same server-ip:port."


[[joehuang]] Sorry, it has been a long time since I wrote a socket-based app. I may have been mistaken about how an HTTP server spawns a thread to handle a new connection. I'll check again.
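
A minimal sketch of the point in the explanation quoted above, in Python with only the standard library on the loopback interface. Several clients connect to one listening port, and each accepted connection is distinguished by the client's (ip, port) pair while the server-side port stays the same. The function name `demo_four_tuples` and the client count are illustrative, not taken from any project code.

```python
import socket


def demo_four_tuples(n_clients=5):
    """Connect n_clients sockets to one listening port and return the
    (src ip, src port, dest ip, dest port) tuple of each connection."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(n_clients)
    server_ip, server_port = server.getsockname()

    clients, accepted, tuples = [], [], []
    try:
        for _ in range(n_clients):
            clients.append(socket.create_connection((server_ip, server_port)))
            conn, peer = server.accept()
            accepted.append(conn)
            # peer is the client's (ip, port); getsockname() is the server side
            tuples.append(peer + conn.getsockname())
    finally:
        for sock in clients + accepted + [server]:
            sock.close()
    return tuples


if __name__ == "__main__":
    for four_tuple in demo_four_tuples():
        print(four_tuple)
```

Every tuple shares the same destination port and differs only in the client's source port, so the 64k limit applies per client address, not to the server as a whole.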

> 
> For a RabbitMQ cluster, the client can certainly connect to any 
> member of the cluster, but in that case the client has to be designed 
> in a fail-safe manner: it should detect the failure of a cluster 
> member and reconnect to another surviving member. No such mechanism 
> has been implemented yet.
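
A rough sketch of the fail-safe client behaviour described above, in Python. It is not the API of any existing messaging library: `connect_with_failover`, its parameters, and the caller-supplied `connect` callable are all hypothetical, and a real client would also need backoff and would have to re-declare its queues after reconnecting.

```python
import itertools


def connect_with_failover(hosts, connect, max_attempts=None):
    """Try cluster members in turn until one accepts the connection.

    hosts: list of broker addresses (the cluster members).
    connect: caller-supplied callable that takes a host and returns a
        connection, raising OSError (e.g. ConnectionError) on failure.
    Returns (host, connection) for the first member that works.
    """
    if not hosts:
        raise ValueError("no hosts given")
    if max_attempts is None:
        max_attempts = 2 * len(hosts)  # two passes over the cluster
    last_error = None
    for host in itertools.islice(itertools.cycle(hosts), max_attempts):
        try:
            return host, connect(host)
        except OSError as exc:
            last_error = exc  # remember the failure and try the next member
    raise ConnectionError(f"all brokers failed, last error: {last_error}")
```

The same function can be called again when an established connection drops, which gives the "reconnect to another surviving member" behaviour.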
> 
> 
> 
> Another way is to use a load balancer, LVS or DNS based, or something similar.
> If you put one load balancer in front of a cluster, we have to take 
> care of the port number limitation: a very large number of agents, at the 
> 100k level, will need concurrent connections, and the requests cannot be rejected.
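
The port limit mentioned above can be put into numbers with a small Python sketch. It assumes a load balancer that opens its own connections to the backends (proxy or source-NAT mode), so that each pair of load-balancer source IP and backend ip:port is capped by the number of usable source ports. The function name `min_source_ips` and the default port range are assumptions for illustration. A load balancer that preserves the agents' own source addresses does not hit this cap.

```python
import math


def min_source_ips(connections, backends=1, port_low=1024, port_high=65535):
    """Smallest number of load-balancer source IPs needed to hold
    `connections` concurrent backend connections.

    Each (source ip, backend ip:port) pair can carry at most one
    connection per usable source port.
    """
    ports_per_pair = port_high - port_low + 1
    per_source_ip = ports_per_pair * backends
    return math.ceil(connections / per_source_ip)


if __name__ == "__main__":
    # 100k agents behind one load balancer talking to a single broker
    print(min_source_ips(100_000))
    # the same load spread over three brokers
    print(min_source_ips(100_000, backends=3))
```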
> 
> 
> 
> Best Regards
> 
> 
> 
> Chaoyi Huang ( joehuang )
> 
> 
> 
> From: Kevin Benton [blak111 at gmail.com]
> Sent: 12 April 2015 9:59
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?
> 
> The TCP/IP stack keeps track of connections as a combination of IP + 
> TCP port. The two byte port limit doesn't matter unless all of the 
> agents are connecting from the same IP address, which shouldn't be the 
> case unless compute nodes connect to the rabbitmq server via one IP 
> address running port address translation.
> 
> Either way, the agents don't connect directly to the Neutron server, 
> they connect to the rabbit MQ cluster. Since as many Neutron server 
> processes can be launched as necessary, the bottlenecks will likely 
> show up at the messaging or DB layer.
> 
> On Sat, Apr 11, 2015 at 6:46 PM, joehuang < joehuang at huawei.com > wrote:
> 
> 
> 
> 
> 
> As Kevin is talking about agents, I want to point out that in the TCP/IP 
> stack a port (not a Neutron port) is a two-byte field, i.e. port numbers 
> range from 0 to 65535, for a maximum of 64k ports.
> 
> 
> 
> "above 100k managed node" means that more than 100k L2 agents/L3 
> agents... will be alive under Neutron.
> 
> 
> 
> I would like to know the detailed design behind the 99.9% confidence in 
> scaling Neutron this way; a PoC and tests would be good support for this idea.
> 
> 
> 
> "I'm 99.9% sure, for scaling above 100k managed node, we do not really 
> need to split the openstack to multiple smaller openstack, or use 
> significant number of extra controller machine."
> 
> 
> 
> Best Regards
> 
> 
> 
> Chaoyi Huang ( joehuang )
> 
> 
> 
> From: Kevin Benton [ blak111 at gmail.com ]
> Sent: 11 April 2015 12:34
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?
> 
> Which periodic updates did you have in mind to eliminate? One of the 
> few remaining ones I can think of is sync_routers but it would be 
> great if you can enumerate the ones you observed because eliminating 
> overhead in agents is something I've been working on as well.
> 
> One of the most common is the heartbeat from each agent. However, I 
> don't think we can eliminate them because they are used to determine 
> if the agents are still alive for scheduling purposes. Did you have 
> something else in mind to determine if an agent is alive?
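
To make the heartbeat-based liveness check concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not Neutron's implementation: the function names, the plain float timestamps in seconds, and the 75-second threshold are all hypothetical.

```python
def is_agent_alive(last_heartbeat, now, down_time=75.0):
    """An agent counts as alive if its last heartbeat is recent enough."""
    return (now - last_heartbeat) <= down_time


def schedulable_agents(heartbeats, now, down_time=75.0):
    """Return the sorted ids of agents that may receive new work.

    heartbeats: dict mapping agent id -> timestamp of its last heartbeat.
    """
    return sorted(
        agent for agent, seen in heartbeats.items()
        if is_agent_alive(seen, now, down_time)
    )


if __name__ == "__main__":
    seen = {"l3-agent-a": 1000.0, "l3-agent-b": 900.0}
    print(schedulable_agents(seen, now=1010.0))
```

Without some periodic signal like this, the scheduler cannot tell a silent but healthy agent from a dead one, which is the concern raised above.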
> 
> On Fri, Apr 10, 2015 at 2:18 AM, Attila Fazekas < afazekas at redhat.com > wrote:
> 
> 
> I'm 99.9% sure, for scaling above 100k managed node, we do not really 
> need to split the openstack to multiple smaller openstack, or use 
> significant number of extra controller machine.
> 
> The problem is that OpenStack uses the right tools, SQL/AMQP/(zk), but in 
> the wrong way.
> 
> For example:
> Periodic updates can be avoided in almost all cases.
> 
> New data can be pushed to the agent only when it is needed.
> The agent can detect when the AMQP connection becomes unreliable (queue 
> or connection loss) and that it needs to do a full sync.
> https://bugs.launchpad.net/neutron/+bug/1438159
> 
> Also, when the agents get a notification, they start asking for 
> details via AMQP -> SQL. Why do they not know it already, or get it 
> with the notification?
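
The push model suggested above could look roughly like the following Python sketch. It rests on assumptions that are not in the thread: each notification carries its full payload plus a monotonically increasing sequence number, and a gap in the sequence stands in for "the connection became unreliable". The class `PortCache` and the `fetch_snapshot` callable are hypothetical names.

```python
class PortCache:
    """Agent-side cache kept current by pushed notifications."""

    def __init__(self, fetch_snapshot):
        # fetch_snapshot() -> (sequence_number, {port_id: details})
        # It stands in for the expensive AMQP -> SQL full sync.
        self._fetch_snapshot = fetch_snapshot
        self.full_syncs = 0
        self._resync()

    def _resync(self):
        self.seq, ports = self._fetch_snapshot()
        self.ports = dict(ports)
        self.full_syncs += 1

    def on_notification(self, seq, port_id, details):
        """Apply one pushed update; details=None means the port was deleted."""
        if seq <= self.seq:
            return  # duplicate or stale message, already covered
        if seq != self.seq + 1:
            self._resync()  # a message was missed: fall back to a full sync
            return
        if details is None:
            self.ports.pop(port_id, None)
        else:
            self.ports[port_id] = details
        self.seq = seq
```

In the normal case the agent never queries the server, because the notification already contains the details. The full sync runs only at startup and after a detected loss.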
> 
> 
> ----- Original Message -----
> > From: "Neil Jerram" < Neil.Jerram at metaswitch.com >
> > To: "OpenStack Development Mailing List (not for usage questions)" < 
> > openstack-dev at lists.openstack.org >
> > Sent: Thursday, April 9, 2015 5:01:45 PM
> > Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?
> > 
> > Hi Joe,
> > 
> > Many thanks for your reply!
> > 
> > On 09/04/15 03:34, joehuang wrote:
> > > Hi, Neil,
> > > 
> > > In theory, Neutron is like a "broadcast" domain: for example, 
> > > enforcing DVR and security groups has to touch every host where 
> > > a VM of the project resides. Even with an SDN controller, this 
> > > "touch" of the relevant hosts is inevitable. If there are many 
> > > physical hosts, for example 10k, inside one Neutron, it is very 
> > > hard to overcome the "broadcast storm" issue under concurrent 
> > > operations; that is the bottleneck for Neutron scalability.
> > 
> > I think I understand that in general terms - but can you be more 
> > specific about the broadcast storm? Is there one particular message 
> > exchange that involves broadcasting? Is it only from the server to 
> > agents, or are there 'broadcasts' in other directions as well?
> > 
> > (I presume you are talking about control plane messages here, i.e.
> > between Neutron components. Is that right? Obviously there can also 
> > be broadcast storm problems in the data plane - but I don't think 
> > that's what you are talking about here.)
> > 
> > > We need a layered architecture in Neutron to solve the "broadcast domain"
> > > scalability bottleneck. The test report from OpenStack 
> > > cascading shows that with the layered architecture "Neutron 
> > > cascading", Neutron can support up to a million ports and 
> > > 100k physical hosts. You can find the report here:
> > > http://www.slideshare.net/JoeHuang7/test-report-for-open-stack-cascading-solution-to-support-1-million-v-ms-in-100-data-centers
> > 
> > Many thanks, I will take a look at this.
> > 
> > > "Neutron cascading" also brings extra benefit: One cascading 
> > > Neutron can have many cascaded Neutrons, and different cascaded 
> > > Neutrons can use different SDN controllers, for example one with 
> > > ODL and another with OpenContrail.
> > > 
> > > ----------------Cascading Neutron-------------------
> > >          /                           \
> > > --cascaded Neutron--       --cascaded Neutron-----
> > >          |                           |
> > > ---------ODL------         ----OpenContrail--------
> > > 
> > > 
> > > And furthermore, if using Neutron cascading in multiple data 
> > > centers, the DCI controller (Data center inter-connection 
> > > controller) can also be used under cascading Neutron, to provide 
> > > NaaS ( network as a service ) across data centers.
> > > 
> > > ---------------------------Cascading Neutron--------------------------
> > >          /                      |                      \
> > > --cascaded Neutron--     -DCI controller-     --cascaded Neutron-----
> > >          |                      |                      |
> > > ---------ODL------              |             ----OpenContrail--------
> > >                                 |
> > > --(Data center 1)--    --(DCI networking)--   --(Data center 2)--
> > > 
> > > Is it possible for us to discuss this at the OpenStack Vancouver summit?
> > 
> > Most certainly, yes. I will be there from mid Monday afternoon 
> > through to end Friday. But it will be my first summit, so I have no 
> > idea yet as to how I might run into you - please can you suggest!
> > 
> > > Best Regards
> > > Chaoyi Huang ( Joe Huang )
> > 
> > Regards,
> > Neil
> > 
> > ____________________________________________________________________
> > ______ OpenStack Development Mailing List (not for usage questions)
> > Unsubscribe: 
> > OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> > 
> 
> 
> 
> 
> --
> Kevin Benton
> 
> 
> 
> 
> 
> --
> Kevin Benton
> 
> 



