[openstack-dev] [neutron] Neutron scaling datapoints?

Kevin Benton blak111 at gmail.com
Sun Apr 12 19:51:30 UTC 2015


> Timestamps are just one way (and likely the most primitive); using redis
> (or memcache) key/value and expiry is another (letting memcache or redis
> expire entries using its own internal algorithms), and using zookeeper
> ephemeral nodes[1] is yet another... The point being that it's
> backend-specific, and tooz supports varying backends.

Very cool. Is the backend completely transparent, so a deployer could choose
a service they are comfortable maintaining, or will that choice change the
properties WRT resiliency of state on node restarts, partitions, etc.?
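
From a quick look at the tooz docs, it appears the backend is selected
purely by the connection URL handed to get_coordinator, so the calling code
stays identical while the operational properties change underneath. A rough,
untested sketch (the endpoints and member ids here are made up):

    from tooz import coordination

    # zookeeper: liveness comes from ephemeral nodes tied to the client
    # session, so members drop out automatically on crash or partition
    coord = coordination.get_coordinator(
        'zookeeper://127.0.0.1:2181', b'l3-agent-1')

    # memcached: liveness comes from key expiry instead, and membership
    # state does not survive a restart of the memcached node itself
    # coord = coordination.get_coordinator(
    #     'memcached://127.0.0.1:11211', b'l3-agent-1')

    coord.start()

So presumably the API is uniform but the resiliency semantics are whatever
the chosen backend provides, which is really what I'm asking about.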

The Nova implementation of tooz seemed pretty straightforward, although it
looked like Nova already had pluggable drivers for service management. Before
I dig into it much further, I'll file a spec on the Neutron side to see if I
can get some other cores on board to do the review work if I push a change
to tooz.


On Sun, Apr 12, 2015 at 9:38 AM, Joshua Harlow <harlowja at outlook.com> wrote:

> Kevin Benton wrote:
>
>> So IIUC tooz would be handling the liveness detection for the agents.
>> It would be nice to get rid of that logic in Neutron and just
>> register callbacks for rescheduling the dead agents.
>>
>> Where does it store that state, does it persist timestamps to the DB
>> like Neutron does? If so, how would that scale better? If not, who does
>> a given node ask to know if an agent is online or offline when making a
>> scheduling decision?
>>
>
> Timestamps are just one way (and likely the most primitive); using redis
> (or memcache) key/value and expiry is another (letting memcache or redis
> expire entries using its own internal algorithms), and using zookeeper
> ephemeral nodes[1] is yet another... The point being that it's
> backend-specific, and tooz supports varying backends.
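
That would answer the "who does a given node ask" part of my question:
whoever schedules reads the live membership from the tooz backend instead of
comparing timestamp columns in the Neutron DB. Roughly (backend URL, member
id, and group name are made up; untested):

    from tooz import coordination

    coord = coordination.get_coordinator(
        'zookeeper://127.0.0.1:2181', b'neutron-server-1')
    coord.start()

    # The currently-live members of the group; with the zookeeper driver
    # these map to ephemeral nodes, so dead agents have already dropped out.
    alive_agents = coord.get_members(b'neutron-agents').get()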
>
>
>> However, before (what I assume is) the large code change to implement
>> tooz, I would like to quantify that the heartbeats are actually a
>> bottleneck. When I was doing some profiling of them on the master branch
>> a few months ago, processing a heartbeat took an order of magnitude less
>> time (<50ms) than the 'sync routers' task of the l3 agent (~300ms). A
>> few query optimizations might buy us a lot more headroom before we have
>> to fall back to large refactors.
>>
>
> Sure, always good to avoid prematurely optimizing things...
>
> Although I think this is relevant for you anyway:
>
> https://review.openstack.org/#/c/138607/ (the same / nearly the same
> thing in nova)...
>
> https://review.openstack.org/#/c/172502/ (a WIP implementation of the
> above).
>
> [1] https://zookeeper.apache.org/doc/trunk/zookeeperProgrammers.html#Ephemeral+Nodes
>
>
>> Kevin Benton wrote:
>>
>>
>>     One of the most common is the heartbeat from each agent. However, I
>>     don't think we can eliminate them, because they are used to determine
>>     if the agents are still alive for scheduling purposes. Did you have
>>     something else in mind to determine if an agent is alive?
>>
>>
>> Put each agent in a tooz[1] group; have each agent periodically
>> heartbeat[2], have whoever needs to schedule read the active members of
>> that group (or use [3] to get notified via a callback), profit...
>>
>> Pick from your favorite (supporting) driver at:
>>
>> http://docs.openstack.org/developer/tooz/compatibility.html
>>
>> [1] http://docs.openstack.org/developer/tooz/compatibility.html#grouping
>> [2] https://github.com/openstack/tooz/blob/0.13.1/tooz/coordination.py#L315
>> [3] http://docs.openstack.org/developer/tooz/tutorial/group_membership.html#watching-group-changes
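
To check my understanding of that recipe, here is the whole thing stitched
together as a rough, untested sketch (group/member names are invented, and
the agent and scheduler sides are collapsed into one process for brevity):

    import time

    from tooz import coordination

    GROUP = b'neutron-agents'

    # Agent side: join the group once, then heartbeat periodically.
    coord = coordination.get_coordinator(
        'zookeeper://127.0.0.1:2181', b'dhcp-agent-7')
    coord.start()
    try:
        coord.create_group(GROUP).get()
    except coordination.GroupAlreadyExist:
        pass
    coord.join_group(GROUP).get()

    # Scheduler side: register a callback instead of polling timestamps.
    def on_leave(event):
        # event.member_id just dropped out of the group; this is where
        # Neutron would reschedule that agent's routers/networks.
        print('lost agent %s' % event.member_id)

    coord.watch_leave_group(GROUP, on_leave)

    while True:
        coord.heartbeat()      # keep our own membership alive
        coord.run_watchers()   # fire any pending join/leave callbacks
        time.sleep(1)

If that's roughly right, the rescheduling logic Neutron currently drives off
stale heartbeat timestamps becomes just the body of on_leave.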



-- 
Kevin Benton

