[neutron] security group list regression

James Denton james.denton at rackspace.com
Mon Mar 2 22:25:12 UTC 2020


Thanks for continuing to push this on the ML and in the bug report.

Happy to report that the client and SDK patches you provided have drastically reduced the SG list time from ~90-120s to ~12-14s within Stein and Train lab environments.

One last thing: when you perform an 'openstack security group delete <name>', the initial lookup by name fails. In Train, the client falls back to a server-side 'name' filter (/security-groups?name=<name>); this lookup is quick, and the security group is found and deleted. However, on Rocky/Stein (e.g. client 3.18.1), instead of filtering by parameter, the client appears to perform a GET /security-groups without limiting the fields, which takes a long time.
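The two fallback behaviors can be sketched roughly like this (illustrative only, not the actual openstacksdk code; the in-memory "API" and helper names are invented for the example):

```python
# Fake in-memory stand-in for the Neutron security-groups API.
SECURITY_GROUPS = [
    {"id": "5f2a", "name": "train-test-1755"},
    {"id": "9c41", "name": "default"},
]

def get_by_id(sg_id):
    """GET /v2.0/security-groups/<id> -- returns None (a 404) for a name."""
    for sg in SECURITY_GROUPS:
        if sg["id"] == sg_id:
            return sg
    return None

def list_groups(name=None):
    """GET /v2.0/security-groups[?name=<name>]."""
    if name is None:
        return list(SECURITY_GROUPS)  # full, slow listing
    return [sg for sg in SECURITY_GROUPS if sg["name"] == name]

def find_new_client(name_or_id):
    # Train-era behavior: try by ID, then ask the *server* to filter by name.
    return get_by_id(name_or_id) or next(iter(list_groups(name=name_or_id)), None)

def find_old_client(name_or_id):
    # Rocky/Stein-era behavior: try by ID, then fetch everything and filter
    # client-side -- the slow path described above.
    sg = get_by_id(name_or_id)
    if sg is not None:
        return sg
    matches = [g for g in list_groups() if g["name"] == name_or_id]
    return matches[0] if matches else None

assert find_new_client("train-test-1755")["id"] == "5f2a"
assert find_old_client("train-test-1755")["id"] == "5f2a"
```

Both paths find the group; the difference is only how many rows the server has to serialize before the client can filter.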

'openstack security group list' with patch:
REQ: curl -g -i -X GET "" -H "Accept: application/json" -H "User-Agent: openstacksdk/0.27.0 keystoneauth1/3.13.1 python-requests/2.21.0 CPython/2.7.17" -H "X-Auth-Token: {SHA256}3e747da939e8c4befe72d5ca7105971508bd56cdf36208ba6b960d1aee6d19b6"

'openstack security group delete <name>':

Train (notice the name param):
REQ: curl -g -i -X GET -H "User-Agent: openstacksdk/0.36.0 keystoneauth1/3.17.1 python-requests/2.22.0 CPython/3.6.7" -H "X-Auth-Token: {SHA256}bf291d5f12903876fc69151db37d295da961ba684a575e77fb6f4829b55df1bf" "GET /v2.0/security-groups/train-test-1755 HTTP/1.1" 404 125
REQ: curl -g -i -X GET "" -H "Accept: application/json" -H "User-Agent: openstacksdk/0.36.0 keystoneauth1/3.17.1 python-requests/2.22.0 CPython/3.6.7" -H "X-Auth-Token: {SHA256}bf291d5f12903876fc69151db37d295da961ba684a575e77fb6f4829b55df1bf" "GET /v2.0/security-groups?name=train-test-1755 HTTP/1.1" 200 1365

Stein & below (notice lack of fields):
REQ: curl -g -i -X GET -H "User-Agent: openstacksdk/0.27.0 keystoneauth1/3.13.1 python-requests/2.21.0 CPython/2.7.17" -H "X-Auth-Token: {SHA256}e9f87afe851ff5380d8402ee81199c466be9c84fe67ed0302e8b178f33aa1fc2" "GET /v2.0/security-groups/stein-test-5189 HTTP/1.1" 404 125
REQ: curl -g -i -X GET -H "Accept: application/json" -H "User-Agent: openstacksdk/0.27.0 keystoneauth1/3.13.1 python-requests/2.21.0 CPython/2.7.17" -H "X-Auth-Token: {SHA256}e9f87afe851ff5380d8402ee81199c466be9c84fe67ed0302e8b178f33aa1fc2"
<wait a while as it compiles and returns the full list, then the single SG object is deleted>

I haven't quite figured out where 'fields' could be passed to speed up the delete process on the older client, or whether the newer client's behavior is backwards-compatible (and how far back).
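Since Neutron accepts both a 'name' filter and a repeatable 'fields' query parameter on list calls, a field-limited lookup URL could in principle be built by hand; a small sketch (the helper name is invented for illustration):

```python
from urllib.parse import urlencode

def sg_query(name, fields=("id", "name")):
    """Build a filtered, field-limited security-group query string.

    Neutron accepts a repeated 'fields' parameter, so the parameters are
    passed as a list of tuples to emit fields=id&fields=name rather than
    one joined value.
    """
    params = [("name", name)] + [("fields", f) for f in fields]
    return "/v2.0/security-groups?" + urlencode(params)

q = sg_query("stein-test-5189")
assert q == "/v2.0/security-groups?name=stein-test-5189&fields=id&fields=name"
```

Restricting the fields keeps the server from serializing every rule of every group just to resolve one name to an ID.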


On 3/2/20, 9:31 AM, "James Denton" <james.denton at rackspace.com> wrote:

    Thanks, Rodolfo. I'll take a look at each of these after coffee and clarify my position (if needed).
    On 3/2/20, 6:27 AM, "Rodolfo Alonso" <ralonsoh at redhat.com> wrote:
        Hello James:
        Just to give a quick summary of the status of the bugs/regressions discussed:
        1) https://bugs.launchpad.net/neutron/+bug/1810563: adding rules to security groups is slow
        That was addressed in https://review.opendev.org/#/c/633145/ and
        https://review.opendev.org/#/c/637407/, removing the O(n^2) check and using lazy loading.
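For illustration only (this is not Neutron's actual code), the shape of the fix is the classic move from a pairwise duplicate check to a set-based one:

```python
def has_duplicates_quadratic(rules):
    # O(n^2): every rule is compared against every other rule, so each
    # added rule makes the next addition slower.
    for i, a in enumerate(rules):
        for b in rules[i + 1:]:
            if a == b:
                return True
    return False

def has_duplicates_linear(rules):
    # O(n): track previously seen rules in a set; each membership test
    # is constant time.
    seen = set()
    for r in rules:
        if r in seen:
            return True
        seen.add(r)
    return False

rules = [("tcp", 22), ("tcp", 80), ("udp", 53)]
assert not has_duplicates_quadratic(rules)
assert has_duplicates_linear(rules + [("tcp", 22)])
```

Both return the same answers; only the growth rate with the number of rules differs.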
        2) https://bugzilla.redhat.com/show_bug.cgi?id=1788749: Neutron List networks API regression
        The last reply was marked as private. I've undone this, so you can now read comment #2. Testing
        with a similar scenario, I don't see any performance degradation between Queens and Train.
        3) https://bugzilla.redhat.com/show_bug.cgi?id=1721273: Neutron API List Ports Performance
        That problem was solved in https://review.opendev.org/#/c/667981/ and
        https://review.opendev.org/#/c/667998/, by refactoring how the port QoS extension was reading and
        applying the QoS info in the port dict.
        4) https://bugs.launchpad.net/neutron/+bug/1865223: regression for security group list between
        Newton and Rocky+
        This is similar to https://bugs.launchpad.net/neutron/+bug/1863201. In this case, the regression
        was detected from Rocky to Stein: the response time grew from 3 seconds to 110 seconds (~36x).
        That issue was addressed by https://review.opendev.org/#/c/708695/.
        But while 1865223 is about the *SG list*, 1863201 is about the *SG rule list*. I would like to
        make this distinction, because the two retrieval commands are not related.
        In this bug (1863201), the initial response time roughly triples (Newton -> Queens). This could
        be caused by the OVO integration (Ocata -> Pike: https://review.opendev.org/#/c/284738/). Instead
        of using the DB object directly, we now make this call through the OVO object wrapping the DB
        record (something like a DB view). That's something I still need to check.
        To be precise: patch 708695 improves *SG rule* retrieval, not the SG list command. Another
        clarification: this patch helps when the SG rules are spread across projects, because it
        retrieves from the DB only those SG rules belonging to the project. If, as you state in
        https://bugs.launchpad.net/neutron/+bug/1865223/comments/4, most of those SG rules belong to
        the same project, there is little improvement there.
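That point can be illustrated with a quick sketch (invented data, not Neutron code): a project_id filter only shrinks the result set when the rules are spread across projects.

```python
def filter_by_project(rules, project_id):
    # Server-side filtering: only rules owned by the given project
    # are fetched and serialized.
    return [r for r in rules if r["project_id"] == project_id]

# 1000 rules spread evenly over 10 projects vs. 1000 rules in one project.
spread = [{"project_id": f"p{i % 10}"} for i in range(1000)]
concentrated = [{"project_id": "p0"} for _ in range(1000)]

# Balanced case: the filter cuts the rows returned by 10x.
assert len(filter_by_project(spread, "p0")) == 100
# Concentrated case: nearly everything belongs to one project, so the
# filter barely helps -- the situation described in bug 1865223.
assert len(filter_by_project(concentrated, "p0")) == 1000
```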
        As mentioned, I'm still looking into improving the SG OVO performance.
        On Mon, 2020-03-02 at 03:03 +0000, Erik Olof Gunnar Andersson wrote:
        > When we went from Mitaka to Rocky in August last year, we saw an exponential increase in API
        > times for listing security group rules.
        > I think I last commented on this bug https://bugs.launchpad.net/neutron/+bug/1810563, but I have
        > brought it up on a few other occasions as well.
        > Bug #1810563 “adding rules to security groups is slow” : Bugs : neutron
        > Sometime between Liberty and Pike, adding rules to SGs got slow, and slower with every rule
        > added. Gerrit review with fixes is incoming. You can repro with a vanilla devstack install on
        > master, and this script:
        > #!/bin/bash
        > OPENSTACK_TOKEN=$(openstack token issue | grep '| id' | awk '{print $4}')
        > export OPENSTACK_TOKEN
        > CCN1=
        > CCN3=
        > export ENDPOINT=localhost
        > make_rules() {
        >     iter=$1 prefix=$2 file="$3"
        >     echo "generating rules"
        >     cat >$file <<EOF
        > {... [link preview truncated; full script at bugs.launchpad.net]
        > From: Slawek Kaplonski <skaplons at redhat.com>
        > Sent: Saturday, February 29, 2020 12:44 AM
        > To: James Denton <james.denton at rackspace.com>
        > Cc: openstack-discuss <openstack-discuss at lists.openstack.org>
        > Subject: Re: [neutron] security group list regression
        > Hi,
        > I just replied in Your bug report. Can You try to apply patch
        > https://review.opendev.org/#/c/708695/
        >   to see if that will help with this problem?
        > > On 29 Feb 2020, at 02:41, James Denton <james.denton at rackspace.com> wrote:
        > >
        > > Hello all,
        > >
        > > We recently upgraded an environment from Newton -> Rocky, and have noticed a pretty severe
        > regression in the time it takes the API to return the list of security groups. This environment
        > has roughly 8,000+ security groups, and it takes nearly 75 seconds for the ‘openstack security
        > group list’ command to complete. I don’t have actual data from the same environment running
        > Newton, but was able to replicate this behavior with the following lab environments running a mix
        > of virtual and baremetal machines:
        > >
        > > Newton (VM)
        > > Rocky (BM)
        > > Stein (VM)
        > > Train (BM)
        > >
        > > Number of sec grps vs time in seconds:
        > >
        > > #     Newton  Rocky  Stein  Train
        > > 200      4.1    3.7    5.4    5.2
        > > 500      5.3    7.0   11.0    9.4
        > > 1000     7.2   12.4   19.2   16.0
        > > 2000     9.2   24.2   35.3   30.7
        > > 3000    12.1   36.5   52.0   44.0
        > > 4000    16.1   47.2   73.0   58.9
        > > 5000    18.4   55.0   90.0   69.0
        > >
        > > As you can see (hopefully), the response time increased significantly between Newton and Rocky,
        > and has grown slightly ever since. We don't know, yet, if this behavior can be seen with other
        > 'list' commands or is limited to secgroups. We're currently verifying on some intermediate
        > releases to see where things went wonky.
        > >
        > > There are some similar recent reports out in the wild with little feedback:
        > >
        > >
        > https://bugzilla.redhat.com/show_bug.cgi?id=1788749
        > >
        > https://bugzilla.redhat.com/show_bug.cgi?id=1721273
        > >
        > > I opened a bug here, too:
        > >
        > >
        > https://bugs.launchpad.net/neutron/+bug/1865223
        > >
        > > Bottom line: Has anyone else experienced similar regressions in recent releases? If so, were you
        > able to address them with any sort of tuning?
        > >
        > > Thanks in advance,
        > > James
        > >
        > —
        > Slawek Kaplonski
        > Senior software engineer
        > Red Hat
