<div dir="ltr"><p dir="ltr">>I assumed that all agents are connected to same IP address of RabbitMQ, then the connection will exceed the port ranges limitation.</p>
<p dir="ltr">Only if the clients are all using the same IP address. If connections weren't scoped by source IP, busy servers would be completely unreliable because clients would keep having source port collisions. </p><p dir="ltr">For example, the following is a netstat output from a server with two connections to a service running on port 4000 with both clients using source port 50000: <a href="http://paste.openstack.org/show/203211/">http://paste.openstack.org/show/203211/</a></p>
<p dir="ltr">>the client should be aware of the cluster member failure, and reconnect to other survive member. No such mechnism has been implemented yet.</p>
<p dir="ltr">If I understand what you are suggesting, it already has been implemented that way. The neutron agents and servers can be configured with multiple rabbitmq servers and they will cycle through the list whenever there is a failure. </p><p dir="ltr">The only downside to that approach is that every neutron agent and server has to be configured with every rabbitmq server address. This gets tedious to manage if you want to add cluster members dynamically so using a load balancer can help relieve that.</p>
<div style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div>
<div style="direction:ltr;font-family:Tahoma;color:rgb(0,0,0);font-size:10pt">
<p>Hi, Kevin,</p>
<p> </p>
<p>I assumed that all agents connect to the same RabbitMQ IP address, in which case the connections would exceed the port range limitation.</p>
<p> </p>
<p>For a RabbitMQ cluster, the client can of course connect to any member of the cluster, but in that case the client has to be designed in a fail-safe manner: the client should be aware of cluster member failures and reconnect to another surviving member. No such mechanism has been implemented yet.</p>
<p> </p>
<p>The other way is to use LVS, a DNS-based load balancer, or something similar. If you put one load balancer in front of the cluster, then we have to take care of the port number limitation: a very large number of agents, on the order of 100k, will require connections concurrently, and the requests cannot be rejected.</p>
<p> </p>
<p>Best Regards</p>
<p> </p>
<p>Chaoyi Huang ( joehuang )</p>
<p> </p>
<div style="font-family:'Times New Roman';color:rgb(0,0,0);font-size:16px">
<hr>
<div style="direction:ltr"><font color="#000000" size="2" face="Tahoma"><b>From:</b> Kevin Benton [<a href="mailto:blak111@gmail.com" target="_blank">blak111@gmail.com</a>]<br>
<b>Sent:</b> 12 April 2015 9:59<br>
<b>To:</b> OpenStack Development Mailing List (not for usage questions)<br>
<b>Subject:</b> Re: [openstack-dev] [neutron] Neutron scaling datapoints?<br>
</font><br>
</div>
<div></div>
<div>
<div dir="ltr">The TCP/IP stack keeps track of connections as a combination of IP + TCP port. The two byte port limit doesn't matter unless all of the agents are connecting from the same IP address, which shouldn't be the case unless compute nodes connect to
the rabbitmq server via one IP address running port address translation.
<div><br>
</div>
<div>Either way, the agents don't connect directly to the Neutron server; they connect to the RabbitMQ cluster. Since as many Neutron server processes can be launched as necessary, the bottlenecks will likely show up at the messaging or DB layer.</div>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Sat, Apr 11, 2015 at 6:46 PM, joehuang <span dir="ltr">
<<a href="mailto:joehuang@huawei.com" target="_blank">joehuang@huawei.com</a>></span> wrote:<br>
<blockquote style="border-left-color:rgb(204,204,204);border-left-width:1px;border-left-style:solid;margin:0px 0px 0px 0.8ex;padding-left:1ex" class="gmail_quote">
<div>
<div style="font-family:Tahoma;direction:ltr;color:rgb(0,0,0);font-size:10pt">
<p>Since Kevin is talking about agents, I want to point out that in the TCP/IP stack a port (not a Neutron port) is a two-byte field, i.e. ports range from 0 to 65535, so at most 64k port numbers are available.</p>
<p> </p>
<p>" above 100k managed node " means more than 100k L2 agents/L3 agents... will be alive under Neutron.</p>
<p> </p>
<p>I would like to know the detailed design for how Neutron could scale this way with 99.9% confidence; a PoC and test results would be good support for this idea.</p>
<span>
<p> </p>
<p>"I'm 99.9% sure, for scaling above 100k managed node,<br>
we do not really need to split the openstack to multiple smaller openstack,<br>
or use significant number of extra controller machine."</p>
<p> </p>
</span>
<p>Best Regards</p>
<p> </p>
<p>Chaoyi Huang ( joehuang )</p>
<p> </p>
<div style="font-family:'Times New Roman';color:rgb(0,0,0);font-size:16px">
<hr>
<div style="direction:ltr"><font color="#000000" size="2" face="Tahoma"><b>From:</b> Kevin Benton [<a href="mailto:blak111@gmail.com" target="_blank">blak111@gmail.com</a>]<br>
<b>Sent:</b> 11 April 2015 12:34<span><br>
<b>To:</b> OpenStack Development Mailing List (not for usage questions)<br>
</span>
<div>
<div><b>Subject:</b> Re: [openstack-dev] [neutron] Neutron scaling datapoints?<br>
</div>
</div>
</font><br>
</div>
<div>
<div>
<div></div>
<div>
<div dir="ltr">Which periodic updates did you have in mind to eliminate? One of the few remaining ones I can think of is sync_routers but it would be great if you can enumerate the ones you observed because eliminating overhead in agents is something I've been
working on as well.
<div><br>
</div>
<div>One of the most common is the heartbeat from each agent. However, I don't think we can eliminate those, because they are used to determine whether the agents are still alive for scheduling purposes. Did you have something else in mind to determine if an agent is alive?</div>
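<div><br>
</div>
<div>Roughly, that heartbeat amounts to a loop like the following in each agent (a sketch only; report_state and report_interval are illustrative names rather than the exact Neutron RPC API):</div>
<pre>
import time

def heartbeat_loop(state_rpc, context, agent_state, report_interval=30):
    """Periodically tell the server this agent is still alive."""
    while True:
        # The server records when each report arrives; agents that stop
        # reporting are marked dead and excluded from scheduling.
        state_rpc.report_state(context, agent_state)
        time.sleep(report_interval)
</pre>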
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Fri, Apr 10, 2015 at 2:18 AM, Attila Fazekas <span dir="ltr">
<<a href="mailto:afazekas@redhat.com" target="_blank">afazekas@redhat.com</a>></span> wrote:<br>
<blockquote style="border-left-color:rgb(204,204,204);border-left-width:1px;border-left-style:solid;margin:0px 0px 0px 0.8ex;padding-left:1ex" class="gmail_quote">
I'm 99.9% sure that for scaling above 100k managed nodes<br>
we do not really need to split OpenStack into multiple smaller OpenStacks,<br>
or use a significant number of extra controller machines.<br>
<br>
The problem is that OpenStack is using the right tools (SQL/AMQP/(zk)),<br>
but in the wrong way.<br>
<br>
For example:<br>
Periodic updates can be avoided in almost all cases.<br>
<br>
The new data can be pushed to the agent only when it is needed.<br>
The agent can know when the AMQP connection becomes unreliable (queue or connection loss),<br>
and then it needs to do a full sync.<br>
<a href="https://bugs.launchpad.net/neutron/+bug/1438159" target="_blank">https://bugs.launchpad.net/neutron/+bug/1438159</a><br>
<br>
Also, when the agents get a notification, they start asking for details via<br>
AMQP -> SQL. Why don't they already know it, or get it with the notification?<br>
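<br>
A rough sketch of the difference (the function and field names below are<br>
made up for illustration, not an existing Neutron API):<br>
<pre>
# Today (simplified): the notification only names the resource, so the
# agent calls back over AMQP and the server hits SQL again for details.
def on_port_updated_today(agent, notification):
    details = agent.plugin_rpc.get_device_details(notification["port_id"])
    agent.apply(details)

# Suggested: ship the details in the notification payload itself, so no
# follow-up AMQP -> SQL round trip is needed.
def on_port_updated_pushed(agent, notification):
    agent.apply(notification["port_details"])
</pre>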
<span><br>
<br>
----- Original Message -----<br>
> From: "Neil Jerram" <<a href="mailto:Neil.Jerram@metaswitch.com" target="_blank">Neil.Jerram@metaswitch.com</a>><br>
</span>
<div>
<div>> To: "OpenStack Development Mailing List (not for usage questions)" <<a href="mailto:openstack-dev@lists.openstack.org" target="_blank">openstack-dev@lists.openstack.org</a>><br>
> Sent: Thursday, April 9, 2015 5:01:45 PM<br>
> Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?<br>
><br>
> Hi Joe,<br>
><br>
> Many thanks for your reply!<br>
><br>
> On 09/04/15 03:34, joehuang wrote:<br>
> > Hi, Neil,<br>
> ><br>
> > In theory, Neutron is like a "broadcast" domain: for example,<br>
> > enforcement of DVR and security groups has to touch every host<br>
> > where a VM of the project resides. Even with an SDN controller, the<br>
> > "touch" to the relevant hosts is inevitable. If there are many physical<br>
> > hosts, for example 10k, inside one Neutron, it's very hard to overcome<br>
> > the "broadcast storm" issue under concurrent operations; that's the<br>
> > bottleneck for the scalability of Neutron.<br>
><br>
> I think I understand that in general terms - but can you be more<br>
> specific about the broadcast storm? Is there one particular message<br>
> exchange that involves broadcasting? Is it only from the server to<br>
> agents, or are there 'broadcasts' in other directions as well?<br>
><br>
> (I presume you are talking about control plane messages here, i.e.<br>
> between Neutron components. Is that right? Obviously there can also be<br>
> broadcast storm problems in the data plane - but I don't think that's<br>
> what you are talking about here.)<br>
><br>
> > We need a layered architecture in Neutron to solve the "broadcast domain"<br>
> > scalability bottleneck. The test report from OpenStack cascading shows<br>
> > that through the layered architecture "Neutron cascading", Neutron can<br>
> > support up to a million ports and 100k physical hosts. You can<br>
> > find the report here:<br>
> > <a href="http://www.slideshare.net/JoeHuang7/test-report-for-open-stack-cascading-solution-to-support-1-million-v-ms-in-100-data-centers" target="_blank">
http://www.slideshare.net/JoeHuang7/test-report-for-open-stack-cascading-solution-to-support-1-million-v-ms-in-100-data-centers</a><br>
><br>
> Many thanks, I will take a look at this.<br>
><br>
> > "Neutron cascading" also brings extra benefit: One cascading Neutron can<br>
> > have many cascaded Neutrons, and different cascaded Neutron can leverage<br>
> > different SDN controller, maybe one is ODL, the other one is OpenContrail.<br>
> ><br>
> > ----------------Cascading Neutron-------------------<br>
> > / \<br>
> > --cascaded Neutron-- --cascaded Neutron-----<br>
> > | |<br>
> > ---------ODL------ ----OpenContrail--------<br>
> ><br>
> ><br>
> > Furthermore, if Neutron cascading is used across multiple data centers, the<br>
> > DCI controller (data center inter-connection controller) can also be used<br>
> > under the cascading Neutron to provide NaaS (network as a service) across<br>
> > data centers.<br>
> ><br>
> > ---------------------------Cascading Neutron--------------------------<br>
> > / | \<br>
> > --cascaded Neutron-- -DCI controller- --cascaded Neutron-----<br>
> > | | |<br>
> > ---------ODL------ | ----OpenContrail--------<br>
> > |<br>
> > --(Data center 1)-- --(DCI networking)-- --(Data center 2)--<br>
> ><br>
> > Is it possible for us to discuss this at the OpenStack Vancouver summit?<br>
><br>
> Most certainly, yes. I will be there from mid-Monday afternoon through<br>
> to the end of Friday. But it will be my first summit, so I have no idea yet as<br>
> to how I might run into you - please can you suggest something?<br>
><br>
> > Best Regards<br>
> > Chaoyi Huang ( Joe Huang )<br>
><br>
> Regards,<br>
> Neil<br>
><br>
<br>
</div>
</div>
</blockquote>
</div>
<br>
<br clear="all">
<div><br>
</div>
-- <br>
<div>
<div>Kevin Benton</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
<br>
<br>
</blockquote>
</div>
<br>
<br clear="all">
<div><br>
</div>
-- <br>
<div>
<div>Kevin Benton</div>
</div>
</div>
</div>
</div>
</div>
</div>
<br>__________________________________________________________________________<br>
OpenStack Development Mailing List (not for usage questions)<br>
Unsubscribe: <a href="http://OpenStack-dev-request@lists.openstack.org?subject:unsubscribe" target="_blank">OpenStack-dev-request@lists.openstack.org?subject:unsubscribe</a><br>
<a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev</a><br>
<br></div>
</div>