[openstack-dev] [neutron] Some findings while profiling instances boot

Kevin Benton kevin at benton.pub
Thu Feb 16 07:23:33 UTC 2017


Thanks for the stats and the nice diagram. I did some profiling and I'm
sure it's the RPC handler on the Neutron server side behaving like garbage.

There are several causes, mainly stemming from the fact that l2pop
requires multiple port status updates to function correctly, and I have a
string of patches up to address them:

* The DHCP notifier will trigger a notification to the DHCP agents on the
network on a port status update. This wouldn't be too problematic on its
own, but it does several queries for networks and segments to determine
which agents it should talk to. Patch to address it here:
https://review.openstack.org/#/c/434677/

* The OVO notifier will also generate a notification on any port data model
change, including the status. This is ultimately the desired behavior, but
until we eliminate the frivolous status flipping, it's going to incur a
performance hit. Patch here to push it into the background asynchronously
so it doesn't block the port update process:
https://review.openstack.org/#/c/434678/

* A wasteful DB query in the ML2 PortContext:
https://review.openstack.org/#/c/434679/

* More unnecessary queries for the status update case in the ML2
PortContext: https://review.openstack.org/#/c/434680/

* Bulking up the DB queries rather than retrieving port details one by one.
https://review.openstack.org/#/c/434681/
https://review.openstack.org/#/c/434682/
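
To make that last point concrete, here's a toy sketch of the one-by-one
vs. bulk retrieval pattern (FAKE_DB and the query counter are stand-ins,
not the actual Neutron DB layer):

```python
# Illustrative only: FAKE_DB simulates the database and QUERIES counts
# round trips. The real patches change SQLAlchemy queries in Neutron.

FAKE_DB = {"port-%d" % i: {"id": "port-%d" % i, "status": "ACTIVE"}
           for i in range(1, 9)}
QUERIES = {"count": 0}


def db_fetch(where_ids):
    """Pretend round trip to the database for the given port IDs."""
    QUERIES["count"] += 1
    return [FAKE_DB[i] for i in where_ids]


def get_details_one_by_one(port_ids):
    # N round trips: one query per port.
    return [db_fetch([pid])[0] for pid in port_ids]


def get_details_bulk(port_ids):
    # One round trip: a single IN-style query for all ports.
    return db_fetch(port_ids)


ids = sorted(FAKE_DB)

QUERIES["count"] = 0
one_by_one = get_details_one_by_one(ids)
n_single = QUERIES["count"]

QUERIES["count"] = 0
bulk = get_details_bulk(ids)
n_bulk = QUERIES["count"]
```

With 8 ports the one-by-one path costs 8 round trips and the bulk path
costs 1, which is where the savings come from.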

The top two accounted for more than 60% of the overhead in my profiling,
and they are pretty simple, so we may be able to get them into Ocata for
the RC depending on how other cores feel. If not, they should be good
candidates for back-porting later. Some of the others start to get more
invasive, so we may be stuck.
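
For reference, the "push it into the background" idea from the OVO
notifier patch looks roughly like this (a simplified sketch: Neutron
itself uses eventlet greenthreads rather than a stdlib thread pool, and
every name below is illustrative):

```python
# Illustrative sketch of deferring a notification so the port update
# path does not block on it. All names here are hypothetical.
from concurrent.futures import ThreadPoolExecutor

NOTIFIER_POOL = ThreadPoolExecutor(max_workers=4)
SENT = []


def send_ovo_notification(port_id, status):
    # Pretend RPC cast; just records what would have been sent.
    SENT.append((port_id, status))


def update_port_status_blocking(port_id, status):
    # Before: the update waits for the notification to go out.
    send_ovo_notification(port_id, status)
    return status


def update_port_status_async(port_id, status):
    # After: the notification is queued in the background and the
    # update returns immediately.
    future = NOTIFIER_POOL.submit(send_ovo_notification, port_id, status)
    return status, future


status, fut = update_port_status_async("port-1", "ACTIVE")
fut.result()  # only the demo waits here, so we can inspect SENT
NOTIFIER_POOL.shutdown(wait=True)
```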

Cheers,
Kevin Benton

On Wed, Feb 15, 2017 at 12:25 PM, Jay Pipes <jaypipes at gmail.com> wrote:

> On 02/15/2017 12:46 PM, Daniel Alvarez Sanchez wrote:
>
>> Hi there,
>>
>> We're trying to figure out why, sometimes, rpc_loop takes over 10
>> seconds to process an iteration when booting instances. So we deployed
>> devstack on a 8GB, 4vCPU VM and did some profiling on the following
>> command:
>>
>> nova boot --flavor m1.nano --image cirros-0.3.4-x86_64-uec --nic
>> net-name=private --min-count 8 instance
>>
>
> Hi Daniel, thanks for posting the information here. Quick request of you,
> though... can you try re-running the test but doing 8 separate calls to
> nova boot instead of using the --min-count 8 parameter? I'm curious to see
> if you notice any difference in contention/performance.
>
> Best,
> -jay
>
>