[openstack-dev] [TripleO][CI] Memory shortage in HA jobs, please increase it

Sagi Shnaidman sshnaidm at redhat.com
Fri Aug 19 14:04:15 UTC 2016


Hi, Derek

I suspect Sahara is the cause. It started running on the overcloud once my
patch was merged: https://review.openstack.org/#/c/352598/
I don't think it ever ran in the jobs before, because it was either
improperly configured or disabled. And according to reports, it's the most
memory-consuming service on the overcloud controllers.


On Fri, Aug 19, 2016 at 12:41 PM, Derek Higgins <derekh at redhat.com> wrote:

> On 19 August 2016 at 00:07, Sagi Shnaidman <sshnaidm at redhat.com> wrote:
> > Hi,
> >
> > We again have a problem with insufficient memory in HA jobs; all of them
> > constantly fail in CI: http://status-tripleoci.rhcloud.com/
>
> Have we any idea why we need more memory all of a sudden? For months
> the overcloud nodes have had 5G of RAM, then last week[1] we bumped it
> to 5.5G, and now we need it bumped to 6G.
>
> If a new service has been added that is needed on the overcloud then
> bumping to 6G is expected and probably the correct answer but I'd like
> to see us avoiding blindly increasing the resources each time we see
> out of memory errors without investigating if there was a regression
> causing something to start hogging memory.
>
> Sorry if it seems like I'm being picky about this (I seem to resist
> these bumps every time they come up) but there are two good reasons to
> avoid this if possible
> o At peak we are currently configured to run 75 simultaneous jobs
> (although we probably don't reach that at the moment), and each HA job
> has 5 baremetal nodes, so bumping from 5G to 6G increases the amount
> of RAM CI can use at peak by 375G
> o When we bump the RAM usage of baremetal nodes from 5G to 6G, what
> we're actually doing is increasing the minimum requirements for
> developers from 28G (or whatever the number is now) to 32G
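
The 375G figure in the first point can be checked with a short sketch. It
uses only the numbers quoted above (75 simultaneous jobs, 5 baremetal nodes
per HA job) and assumes the worst case where every job running at peak is an
HA job. The function and constant names are hypothetical, chosen for
illustration only.

```python
# Figures quoted in the message above.
MAX_SIMULTANEOUS_JOBS = 75   # configured peak number of CI jobs
NODES_PER_HA_JOB = 5         # baremetal nodes in each HA job


def extra_peak_ram_gb(old_gb, new_gb,
                      jobs=MAX_SIMULTANEOUS_JOBS,
                      nodes=NODES_PER_HA_JOB):
    """Extra RAM (in GB) CI can use at peak after a per-node bump.

    Assumption: every job running at peak is an HA job, which is the
    worst case described in the message.
    """
    return (new_gb - old_gb) * jobs * nodes


if __name__ == "__main__":
    # Bump from 5G to 6G per node: 1 * 75 * 5
    print(extra_peak_ram_gb(5, 6))  # 375
```

The same function can be reused for any proposed per-node value, so the
peak cost of a bump is visible before the flavor is changed.
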
>
> So before we bump the number, can we just check first if it's justified,
> as I've watched this number increase from 2G since we started running
> tripleo-ci
>
> thanks,
> Derek.
>
> [1] - https://review.openstack.org/#/c/353655/
>
> > I've created a patch that will increase it[1], but we need to increase it
> > right now on rh1.
> > I can't do it now because, unfortunately, I won't be able to watch
> > whether it works and no problems appear.
> > TripleO CI cloud admins, please increase the memory for the baremetal
> > flavor on rh1 tomorrow (to 6144?).
> >
> > Thanks
> >
> > [1] https://review.openstack.org/#/c/357532/
> > --
> > Best regards
> > Sagi Shnaidman
>



-- 
Best regards
Sagi Shnaidman