<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Fri, Mar 17, 2017 at 3:11 PM, Sean Dague <span dir="ltr"><<a href="mailto:sean@dague.net" target="_blank">sean@dague.net</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><span class="m_-2116658493498778639gmail-">On 03/17/2017 09:24 AM, Jordan Pittier wrote:<br>

><br>

><br>

> On Fri, Mar 17, 2017 at 1:58 PM, Sean Dague <<a href="mailto:sean@dague.net" target="_blank">sean@dague.net</a><br>

</span><div><div class="m_-2116658493498778639gmail-h5">> <mailto:<a href="mailto:sean@dague.net" target="_blank">sean@dague.net</a>>> wrote:<br>

><br>

>     On 03/17/2017 08:27 AM, Jordan Pittier wrote:<br>

>     > The patch that reduced the number of Tempest Scenarios we run in every<br>

>     > job and also reduced the test run concurrency [0] was merged 13 days ago.<br>

>     > Since then, the situation (i.e. the high number of false negative job results)<br>

>     > has not improved significantly. We need to keep looking collectively at<br>

>     > this.<br>

><br>

>     While the situation hasn't completely cleared out -<br>

>     <a href="http://tinyurl.com/mdmdxlk" rel="noreferrer" target="_blank">http://tinyurl.com/mdmdxlk</a> - since we've merged this we've not seen that<br>

>     job go over 25% failure rate in the gate, which it was regularly<br>

>     crossing in the prior 2 week period. That does feel like progress. In<br>

>     spot checking, we are also rarely failing in scenario tests now, but<br>

>     the fails tend to end up inside heavy API tests running in parallel.<br>

><br>

><br>

>     > There seems to be an agreement that we are hitting some memory limit.<br>

>     > Several of our most frequent failures are memory related [1]. So we<br>

>     > should either reduce our memory usage or ask for bigger VMs, with more<br>

>     > than 8GB of RAM.<br>

>     ><br>

>     > There have been several attempts to reduce our memory usage, by reducing<br>

>     > the Mysql memory consumption ([2] but quickly reverted [3]), reducing<br>

>     > the number of Apache workers ([4], [5]), more apache2 tuning [6]. If you<br>

>     > have any crazy idea to help in this regard, please help. This is high<br>

>     > priority for the whole openstack project, because it's plaguing many<br>

>     > projects.<br>

><br>

>     Interesting, I hadn't seen the revert. It is also curious that it was<br>

>     largely limited to the neutron-api test job. It's also notable that the<br>

>     sort buffers seem to have been set to the minimum allowed limit of mysql<br>

>     -<br>

>     <a href="https://dev.mysql.com/doc/refman/5.6/en/innodb-parameters.html#sysvar_innodb_sort_buffer_size" rel="noreferrer" target="_blank">https://dev.mysql.com/doc/ref<wbr>man/5.6/en/innodb-parameters.<wbr>html#sysvar_innodb_sort_<wbr>buffer_size</a><br>


>     - and is over an order of magnitude decrease from the existing default.<br>

><br>

>     I wonder about redoing the change with everything except the sort buffer setting and seeing<br>

>     how that impacts the neutron-api job.<br>

><br>

> Yes, that would be great because mysql is by far our biggest memory<br>

> consumer so we should target this first.<br>

<br>

</div></div>While it is the single biggest process, weighing in at 500 MB, the<br>

python services are really our biggest memory consumers. They are<br>

collectively far outweighing either mysql or rabbit, and are the reason<br>

that even with 64MB guests we're running out of memory. So we want to<br>

keep that in perspective.<br></blockquote><div>Absolutely. I have <a href="https://review.openstack.org/#/c/446986/" target="_blank">https://review.openstack.org/#<wbr>/c/446986/</a> in that vein. And if someone wants to start the work of not running the Swift *auditor*, *updater*, *reaper*, and *replicator* services when the Swift replication factor is set to 1, that would also be a good memory saving.<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

<div class="m_-2116658493498778639gmail-HOEnZb"><div class="m_-2116658493498778639gmail-h5"><br>

        -Sean<br>

<br>

--<br>

Sean Dague<br>

<a href="http://dague.net" rel="noreferrer" target="_blank">http://dague.net</a><br>

<br>


</div></div></blockquote></div><br></div></div>