[openstack-dev] [QA][gate][all] dsvm gate stability and scenario tests

Morales, Victor victor.morales at intel.com
Fri Mar 17 15:31:00 UTC 2017


Well, my crazy idea was the addition [10] of an extra argument (--mem-trace) to the pbr binary creation. The idea is to be able to use it from any OpenStack binary and print the methods that make a difference in memory consumption [11].
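
For illustration, a minimal sketch of the idea using only Python's stdlib tracemalloc; the names here (run_service, the exact flag handling) are hypothetical and not taken from the actual pbr patch in [10]:

import argparse
import tracemalloc


def run_service():
    # stand-in for whatever the generated console script would normally call
    data = [bytearray(1024) for _ in range(1000)]
    return len(data)


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('--mem-trace', action='store_true',
                        help='print the top memory-allocating call sites on exit')
    args, _unknown = parser.parse_known_args()

    if args.mem_trace:
        tracemalloc.start()

    run_service()

    if args.mem_trace:
        snapshot = tracemalloc.take_snapshot()
        # report the ten call sites that allocated the most memory
        for stat in snapshot.statistics('lineno')[:10]:
            print(stat)


if __name__ == '__main__':
    main()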

Regards/Saludos
Victor Morales
irc: electrocucaracha

[10] https://review.openstack.org/#/c/433947/
[11] http://paste.openstack.org/show/599087/



From:  Jordan Pittier <jordan.pittier at scality.com>
Reply-To:  "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org>
Date:  Friday, March 17, 2017 at 7:27 AM
To:  "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org>
Subject:  Re: [openstack-dev] [QA][gate][all] dsvm gate stability and scenario tests


The patch that reduced the number of Tempest scenario tests we run in every job and also reduced the test run concurrency [0] was merged 13 days ago. Since then, the situation (i.e. the high number of false-negative job results) has not improved significantly. We need to keep looking at this collectively.


There seems to be agreement that we are hitting some memory limit. Several of our most frequent failures are memory-related [1]. So we should either reduce our memory usage or ask for bigger VMs, with more than 8 GB of RAM.


There have been several attempts to reduce our memory usage: reducing the MySQL memory consumption ([2], but quickly reverted [3]), reducing the number of Apache workers ([4], [5]), and more apache2 tuning [6]. If you have any crazy idea to help in this regard, please share it. This is a high priority for the whole OpenStack project, because the issue is plaguing many projects.


We have some tools to investigate memory consumption, like regular "dstat" output [7], a home-made memory tracker [8], and stackviz [9].
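
For example, here is a rough sketch (not one of our official tools) of how one could pull the peak of the memory "used" column out of a dstat CSV log like [7]. The column layout depends on the dstat options used, so the header lookup below is an assumption; adjust it to the actual log:

import csv
import sys


def peak_memory_used(path):
    with open(path) as f:
        rows = list(csv.reader(f))

    # Find the header row that names the per-column fields; we assume the
    # first column called "used" belongs to the "memory usage" group.
    header_idx = next(i for i, row in enumerate(rows) if 'used' in row)
    used_col = rows[header_idx].index('used')

    peak = 0.0
    for row in rows[header_idx + 1:]:
        try:
            peak = max(peak, float(row[used_col]))
        except (IndexError, ValueError):
            continue  # skip repeated headers or truncated lines
    return peak


if __name__ == '__main__':
    # dstat's CSV output is typically raw bytes
    print('peak memory used: %.0f' % peak_memory_used(sys.argv[1]))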


Best,

Jordan

[0]: https://review.openstack.org/#/c/439698/
[1]: http://status.openstack.org/elastic-recheck/gate.html
[2]: https://review.openstack.org/#/c/438668/
[3]: https://review.openstack.org/#/c/446196/
[4]: https://review.openstack.org/#/c/426264/
[5]: https://review.openstack.org/#/c/445910/
[6]: https://review.openstack.org/#/c/446741/
[7]: http://logs.openstack.org/96/446196/1/check/gate-tempest-dsvm-neutron-full-ubuntu-xenial/b5c362f/logs/dstat-csv_log.txt.gz
[8]: http://logs.openstack.org/96/446196/1/check/gate-tempest-dsvm-neutron-full-ubuntu-xenial/b5c362f/logs/screen-peakmem_tracker.txt.gz
[9]: http://logs.openstack.org/41/446741/1/check/gate-tempest-dsvm-neutron-full-ubuntu-xenial/fa4d2e6/logs/stackviz/#/stdin/timeline







On Sat, Mar 4, 2017 at 4:19 PM, Andrea Frittoli <andrea.frittoli at gmail.com> wrote:

Quick update on this: the change is now merged, so we now have a smaller number of scenario tests running serially after the API test run.
We'll monitor gate stability for the next week or so and decide whether further actions are required.
Please keep categorizing failures via elastic recheck as usual.
thank you
andrea

On Fri, 3 Mar 2017, 8:02 a.m. Ghanshyam Mann, <ghanshyammann at gmail.com> wrote:


Thanks, +1. I added my list to the ethercalc.

The left-out scenario tests can be run in periodic and experimental jobs. IMO both (periodic and experimental): that way we can monitor their status periodically as well as on a particular patch when we need to.

-gmann






On Fri, Mar 3, 2017 at 4:28 PM, Andrea Frittoli <andrea.frittoli at gmail.com> wrote:




Hello folks,

we have discussed issues with gate stability a lot since the PTG; we need a stable and reliable gate to ensure smooth progress in Pike.

One of the issues that stands out is that most of the time during test runs our test VMs are under heavy load.
This can be the common cause behind several failures we've seen in the gate, so we agreed during the QA meeting yesterday [0] that we're going to try reducing the load and see whether that improves stability.


Next steps are:

- select a subset of scenario tests to be executed in the gate, based on [1], and run them serially only
- the patch for this is [2], and we will approve it by the end of the day
- we will monitor stability for a week; if needed we may reduce concurrency a bit on API tests as well, and identify "heavy" tests as candidates for removal / refactoring
- the QA team won't approve any new test (scenario or heavy resource-consuming API) until gate stability is ensured

Thanks for your patience and collaboration!

Andrea

---
irc: andreaf

[0] http://eavesdrop.openstack.org/meetings/qa/2017/qa.2017-03-02-17.00.txt
[1] https://ethercalc.openstack.org/nu56u2wrfb2b
[2] https://review.openstack.org/#/c/439698/







