[openstack-dev] [TripleO][CI] Memory shortage in HA jobs, please increase it

Emilien Macchi emilien at redhat.com
Tue Aug 23 12:50:27 UTC 2016


On Fri, Aug 19, 2016 at 10:04 AM, Sagi Shnaidman <sshnaidm at redhat.com> wrote:
> Hi, Derek
>
> I suspect Sahara is causing it. It started to run on the overcloud after my
> patch was merged: https://review.openstack.org/#/c/352598/
> I don't think it ever ran in the jobs before, because it was either
> improperly configured or disabled. And according to reports, it's the most
> memory-consuming service on the overcloud controllers.

I have a patch to disable Sahara by default in upstream CI:
https://review.openstack.org/#/c/352886/

Though it will make upgrades fail, because Sahara was installed by
default before.
Should we consider this patch or should I abandon it?

>
> On Fri, Aug 19, 2016 at 12:41 PM, Derek Higgins <derekh at redhat.com> wrote:
>>
>> On 19 August 2016 at 00:07, Sagi Shnaidman <sshnaidm at redhat.com> wrote:
>> > Hi,
>> >
>> > we again have a problem with not enough memory in HA jobs; all of them
>> > constantly fail in CI: http://status-tripleoci.rhcloud.com/
>>
>> Have we any idea why we need more memory all of a sudden? For months
>> the overcloud nodes have had 5G of RAM, then last week[1] we bumped it
>> to 5.5G, and now we need it bumped to 6G.
>>
>> If a new service has been added that is needed on the overcloud then
>> bumping to 6G is expected and probably the correct answer but I'd like
>> to see us avoiding blindly increasing the resources each time we see
>> out of memory errors without investigating if there was a regression
>> causing something to start hogging memory.
>>
>> Sorry if it seems like I'm being picky about this (I seem to resist
>> these bumps every time they come up) but there are two good reasons to
>> avoid this if possible
>> o At peak we are currently configured to run 75 simultaneous jobs
>> (although we probably don't reach that at the moment), and each HA job
>> has 5 baremetal nodes, so bumping from 5G to 6G increases the amount
>> of RAM CI can use at peak by 375G
>> o When we bump the RAM usage of baremetal nodes from 5G to 6G, what
>> we're actually doing is increasing the minimum requirements for
>> developers from 28G (or whatever the number is now) to 32G
>>
>> So before we bump the number can we just check first if it's justified,
>> as I've watched this number increase from 2G since we started running
>> tripleo-ci
>>
>> thanks,
>> Derek.
>>
>> [1] - https://review.openstack.org/#/c/353655/
>>
>> > I've created a patch that will increase it[1], but we need to increase it
>> > right now on rh1.
>> > I can't do it now because, unfortunately, I won't be able to watch whether
>> > it works and no problems appear.
>> > TripleO CI cloud admins, please increase the memory for the baremetal
>> > flavor on rh1 tomorrow (to 6144?).
>> >
>> > Thanks
>> >
>> > [1] https://review.openstack.org/#/c/357532/
>> > --
>> > Best regards
>> > Sagi Shnaidman
>
>
>
>
> --
> Best regards
> Sagi Shnaidman
>
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>

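For reference, the peak-usage figure quoted above (75 simultaneous jobs, 5
baremetal nodes per HA job, 5G to 6G per node) can be checked with a few lines
of Python. This is only a rough sketch: the function and constant names are
made up for illustration and do not come from tripleo-ci.

```python
# Figures as quoted in the thread; the names here are hypothetical.
MAX_SIMULTANEOUS_JOBS = 75   # configured peak number of CI jobs
NODES_PER_HA_JOB = 5         # baremetal nodes used by each HA job


def peak_ram_increase_gb(old_gb, new_gb,
                         jobs=MAX_SIMULTANEOUS_JOBS,
                         nodes=NODES_PER_HA_JOB):
    """Extra RAM (in GB) needed at peak if every node grows from old_gb to new_gb."""
    return (new_gb - old_gb) * jobs * nodes


if __name__ == "__main__":
    # Bump discussed in the thread: 5G -> 6G per node.
    print(peak_ram_increase_gb(5, 6))
```

This assumes every job running at peak is an HA job, which is the worst case
the quoted estimate describes.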


-- 
Emilien Macchi


