<div dir="ltr"><div><div>Hi, Derek<br><br></div>I suspect Sahara may be the cause; it started running on the overcloud once my patch was merged: <a href="https://review.openstack.org/#/c/352598/">https://review.openstack.org/#/c/352598/</a><br></div>I don't think it ever ran in the jobs before, because it was either improperly configured or disabled. And according to reports it's the most memory-consuming service on the overcloud controllers.<br><br></div><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Aug 19, 2016 at 12:41 PM, Derek Higgins <span dir="ltr"><<a href="mailto:derekh@redhat.com" target="_blank">derekh@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On 19 August 2016 at 00:07, Sagi Shnaidman <<a href="mailto:sshnaidm@redhat.com">sshnaidm@redhat.com</a>> wrote:<br>

> Hi,<br>

><br>

> we have a problem again with not enough memory in HA jobs, all of them<br>

> constantly fail in CI: <a href="http://status-tripleoci.rhcloud.com/" rel="noreferrer" target="_blank">http://status-tripleoci.<wbr>rhcloud.com/</a><br>

<br>

</span>Have we any idea why we need more memory all of a sudden? For months<br>

the overcloud nodes have had 5G of RAM, then last week[1] we bumped it<br>

to 5.5G, and now we need it bumped to 6G.<br>

<br>

If a new service has been added that is needed on the overcloud then<br>

bumping to 6G is expected and probably the correct answer but I'd like<br>

to see us avoiding blindly increasing the resources each time we see<br>

out of memory errors without investigating if there was a regression<br>

causing something to start hogging memory.<br>

<br>

Sorry if it seems like I'm being picky about this (I seem to resist<br>

these bumps every time they come up) but there are two good reasons to<br>

avoid this if possible:<br>

o at peak we are currently configured to run 75 simultaneous jobs<br>

(although we probably don't reach that at the moment), and each HA job<br>

has 5 baremetal nodes so bumping from 5G to 6G increases the amount<br>

of RAM CI can use at peak by 375G<br>

o When we bump the RAM usage of baremetal nodes from 5G to 6G, what<br>

we're actually doing is increasing the minimum requirements for<br>

developers from 28G (or whatever the number is now) to 32G<br>

<br>
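As a rough sketch of the arithmetic behind the first point: the figures come from the message above, the variable names are illustrative, and it assumes every one of the 75 simultaneous jobs is an HA job with all 5 nodes bumped by the same amount.<br>

```python
# Peak extra RAM if every baremetal node grows from 5G to 6G.
# Figures are taken from the message above; names are illustrative.
# Assumption: all simultaneous jobs are HA jobs with 5 nodes each.
max_simultaneous_jobs = 75   # configured peak, probably not reached in practice
nodes_per_ha_job = 5         # baremetal nodes in each HA job
ram_before_gb = 5
ram_after_gb = 6

extra_ram_gb = max_simultaneous_jobs * nodes_per_ha_job * (ram_after_gb - ram_before_gb)
print(f"Extra RAM at peak: {extra_ram_gb}G")
```

<br>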

So before we bump the number can we just check first if it's justified,<br>

as I've watched this number increase from 2G since we started running<br>

tripleo-ci<br>

<br>

thanks,<br>

Derek.<br>

<br>

[1] - <a href="https://review.openstack.org/#/c/353655/" rel="noreferrer" target="_blank">https://review.openstack.org/#<wbr>/c/353655/</a><br>

<div class="HOEnZb"><div class="h5"><br>

> I've created a patch that will increase it[1], but we need to increase it<br>

> right now on rh1.<br>

> I can't do it now, because unfortunately I won't be able to watch whether<br>

> it works and no problems appear.<br>

> TripleO CI cloud admins, please increase the memory for the baremetal flavor on<br>

> rh1 tomorrow (to 6144?).<br>

><br>

> Thanks<br>

><br>

> [1] <a href="https://review.openstack.org/#/c/357532/" rel="noreferrer" target="_blank">https://review.openstack.org/#<wbr>/c/357532/</a><br>

> --<br>

> Best regards<br>

> Sagi Shnaidman<br>

</div></div></blockquote></div><br><br clear="all"><br>-- <br><div class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div>Best regards<br></div>Sagi Shnaidman<br></div></div>

</div>