[openstack-dev] [neutron] Neutron scaling datapoints?

joehuang joehuang at huawei.com
Tue Apr 14 03:02:40 UTC 2015


Tooz provides a mechanism for grouping agents and for agent status/liveness management, but multiple coordinator services may be required in a large-scale deployment, especially at the 100k-node level. We can't assume that a single coordinator service is enough to manage all nodes, which means tooz may need to support multiple coordination backends.

And Nova already supports several segregation concepts, for example Cells, Availability Zones and Host Aggregates. Where will the coordination backend reside? How should agents be grouped? It would be odd to put the coordinator in availability zone (AZ) 1 while all of its managed agents are in AZ 2: if AZ 1 is powered off, all agents in AZ 2 lose management. Do we need a separate segregation concept for agents, should we reuse the Nova concepts, or should we build a mapping between them? This matters especially if multiple coordination backends have to work under one Neutron.
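To make the question concrete, here is a rough sketch of what per-AZ partitioning of the coordination backend could look like from the agent side. The AZ-to-backend mapping is purely hypothetical (nothing in tooz or Neutron provides it today); only the tooz calls themselves are the existing coordination API:

    # Sketch only: partition agents across coordination backends by AZ.
    from tooz import coordination

    # Hypothetical mapping maintained by the deployer / Neutron configuration.
    AZ_BACKENDS = {
        'az-1': 'zookeeper://zk-az1.example.org:2181',
        'az-2': 'zookeeper://zk-az2.example.org:2181',
    }

    def start_agent_coordinator(agent_id, availability_zone):
        # Each agent joins the coordinator serving its own AZ, so losing the
        # AZ-1 backend does not take away liveness tracking for AZ-2 agents.
        url = AZ_BACKENDS[availability_zone]
        coord = coordination.get_coordinator(url, agent_id.encode())
        coord.start()
        group = availability_zone.encode()
        try:
            coord.create_group(group).get()
        except coordination.GroupAlreadyExist:
            pass
        coord.join_group(group).get()
        return coord

Even with a sketch like this, the open question remains which component owns that mapping and how it relates to the Nova segregation concepts above.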

Best Regards
Chaoyi Huang ( Joe Huang )

-----Original Message-----
From: Joshua Harlow [mailto:harlowja at outlook.com] 
Sent: Monday, April 13, 2015 11:11 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?

joehuang wrote:
> Hi, Kevin and Joshua,
>
> As I understand it, Tooz only addresses the issue of agent status
> management, but how do we handle the concurrent dynamic load impact at
> large scale (for example 100k managed nodes with dynamic load such as
> security group rule updates, routers_updated, etc.)?

Yes, that is correct; let's not confuse status/liveness management with updates... since IMHO they are two very different things (the latter can be eventually consistent IMHO, while the liveness 'question' probably should not be...).

>
> And one more question: if we have 100k managed nodes, how do we do the
> partitioning? Or will all nodes be managed by one Tooz service, like
> Zookeeper? Can Zookeeper manage the status of 100k nodes?

I can get you some data/numbers from studies I've seen, but what you are talking about is highly specific to what you are doing with zookeeper... There is no one solution for all the things IMHO; choose what's best from your tool-belt for each problem...

>
> Best Regards
>
> Chaoyi Huang ( Joe Huang )
>
> From: Kevin Benton [mailto:blak111 at gmail.com]
> Sent: Monday, April 13, 2015 3:52 AM
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [neutron] Neutron scaling datapoints?
>
>> Timestamps are just one way (and likely the most primitive), using
>> redis (or memcache) key/value and expiry are another (and letting
>> memcache or redis expire using its own internal algorithms), using
>> zookeeper ephemeral nodes[1] are another... The point being that it's
>> backend specific and tooz supports varying backends.
>
> Very cool. Is the backend completely transparent, so a deployer could
> choose a service they are comfortable maintaining, or will that change
> the properties with respect to resiliency of state on node restarts,
> partitions, etc.?
>
> The Nova implementation of Tooz seemed pretty straightforward, although
> it looked like it had pluggable drivers for service management already.
> Before I dig into it much further, I'll file a spec on the Neutron side
> to see if I can get some other cores on board to do the review work if I
> push a change to tooz.
>
> On Sun, Apr 12, 2015 at 9:38 AM, Joshua Harlow <harlowja at outlook.com> wrote:
>
> Kevin Benton wrote:
>
> So IIUC tooz would be handling the liveness detection for the agents.
> It would be nice to get rid of that logic in Neutron and just register
> callbacks for rescheduling the dead.
>
> Where does it store that state, does it persist timestamps to the DB 
> like Neutron does? If so, how would that scale better? If not, who 
> does a given node ask to know if an agent is online or offline when 
> making a scheduling decision?
>
>
> Timestamps are just one way (and likely the most primitive), using
> redis (or memcache) key/value and expiry are another (and letting
> memcache or redis expire using its own internal algorithms), using
> zookeeper ephemeral nodes[1] are another... The point being that it's
> backend specific and tooz supports varying backends.
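> For illustration, a minimal sketch of the agent side against the tooz
> coordination API (the group name is made up, and this is not Neutron
> code); the agent-side code stays the same no matter which backend URL
> is plugged in, since the liveness mechanics (ephemeral nodes, key
> expiry, ...) are the backend's problem:
>
>     import time
>     from tooz import coordination
>
>     # Any supported backend URL works here: zookeeper://, redis://,
>     # memcached://, ... pick whichever expiry semantics you trust.
>     coord = coordination.get_coordinator('zookeeper://127.0.0.1:2181',
>                                          b'l3-agent-host-1')
>     coord.start()
>     try:
>         coord.create_group(b'neutron-l3-agents').get()
>     except coordination.GroupAlreadyExist:
>         pass
>     coord.join_group(b'neutron-l3-agents').get()
>
>     while True:
>         # Heartbeating keeps the membership alive; if the agent dies,
>         # the backend drops it automatically after its timeout.
>         coord.heartbeat()
>         time.sleep(1)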
>
>
> However, before (what I assume is) the large code change to implement
> tooz, I would like to verify that the heartbeats are actually a
> bottleneck. When I was doing some profiling of them on the master
> branch a few months ago, processing a heartbeat took an order of
> magnitude less time (<50ms) than the 'sync routers' task of the l3
> agent (~300ms). A few query optimizations might buy us a lot more
> headroom before we have to fall back to large refactors.
>
>
> Sure, always good to avoid prematurely optimizing things...
>
> Although this is relevant for you, I think, anyway:
>
> https://review.openstack.org/#/c/138607/ (same thing/nearly same in nova)...
>
> https://review.openstack.org/#/c/172502/ (a WIP implementation of the 
> latter).
>
> [1] https://zookeeper.apache.org/doc/trunk/zookeeperProgrammers.html#Ephemeral+Nodes
>
>
> Kevin Benton wrote:
>
>
> One of the most common is the heartbeat from each agent. However, I
> don't think we can eliminate them, because they are used to determine
> whether the agents are still alive for scheduling purposes. Did you have
> something else in mind to determine if an agent is alive?
>
>
> Put each agent in a tooz[1] group; have each agent periodically 
> heartbeat[2], have whoever needs to schedule read the active members 
> of that group (or use [3] to get notified via a callback), profit...
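> As a rough sketch of the scheduler side (again assuming the tooz
> coordination API and a made-up group name that the agents joined):
>
>     from tooz import coordination
>
>     coord = coordination.get_coordinator('zookeeper://127.0.0.1:2181',
>                                          b'neutron-scheduler')
>     coord.start()
>
>     # Read the currently-alive agents when making a scheduling decision...
>     alive = coord.get_members(b'neutron-l3-agents').get()
>
>     # ...or register a callback and get told when members disappear.
>     def on_leave(event):
>         # event.member_id is the agent that stopped heartbeating;
>         # reschedule its routers here.
>         print('agent gone:', event.member_id)
>
>     coord.watch_leave_group(b'neutron-l3-agents', on_leave)
>     coord.run_watchers()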
>
> Pick from your favorite (supporting) driver at:
>
> http://docs.openstack.org/developer/tooz/compatibility.html
>
> [1] http://docs.openstack.org/developer/tooz/compatibility.html#grouping
> [2] https://github.com/openstack/tooz/blob/0.13.1/tooz/coordination.py#L315
> [3] http://docs.openstack.org/developer/tooz/tutorial/group_membership.html#watching-group-changes
>
>
> --
>
> Kevin Benton
>

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


