[Openstack-operators] Compute nodes reboot periodically by their own

Juan José Pavlik Salles jjpavlik at gmail.com
Thu Jul 24 17:25:27 UTC 2014


No problem, Arne. I checked the IPMI config:

root@cebolla:~# grep -v "^#" /etc/default/openipmi
IPMI_SI=yes
DEV_IPMI=yes
IPMI_WATCHDOG=no
IPMI_WATCHDOG_OPTIONS="timeout=60"
IPMI_POWEROFF=no
IPMI_POWERCYCLE=no
IPMI_IMB=no
root@cebolla:~#

Even though the IPMI interface is on, the watchdog is disabled. I'd like to
try different hardware just to check, but I don't have any spare right now.
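
For anyone who wants to run the same check on several nodes, here is a
minimal Python sketch that parses a shell-style defaults file and reports the
watchdog setting. The function names (parse_defaults, watchdog_enabled) are
made up for illustration, and the sample text simply mirrors the settings
listed above. It only reflects what the file says, not the state of the BMC
itself.

```python
def parse_defaults(text):
    """Parse KEY=value lines from a shell-style defaults file.

    Blank lines, comment lines and lines without '=' are skipped.
    Surrounding single or double quotes are removed from values.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


def watchdog_enabled(settings):
    """Return True only if IPMI_WATCHDOG is set to 'yes' (case-insensitive)."""
    return settings.get("IPMI_WATCHDOG", "no").lower() == "yes"


# Sample input copied from the settings in this message (assumption: the
# real file on a node would be read from /etc/default/openipmi instead).
SAMPLE = """\
IPMI_SI=yes
DEV_IPMI=yes
IPMI_WATCHDOG=no
IPMI_WATCHDOG_OPTIONS="timeout=60"
IPMI_POWEROFF=no
IPMI_POWERCYCLE=no
IPMI_IMB=no
"""

if __name__ == "__main__":
    settings = parse_defaults(SAMPLE)
    print("watchdog enabled:", watchdog_enabled(settings))
    print("watchdog options:", settings.get("IPMI_WATCHDOG_OPTIONS"))
```

On a real node you would pass the contents of the defaults file instead of
SAMPLE, for example open("/etc/default/openipmi").read().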


2014-07-24 13:30 GMT-03:00 Arne Wiebalck <Arne.Wiebalck at cern.ch>:

>  Oops, I apparently wasn't reading carefully enough and mixed your issue
> with hosts and mine with guests.
>
> Sorry for the noise!
> Arne
> On 24.07.2014 17:42, Tim Bell <Tim.Bell at cern.ch> wrote:
>
> If it is the hypervisors rebooting, one possible cause is a BMC with its
> watchdog enabled. The watchdog reboots the server if the host does not
> check in with the BMC every 'n' seconds.
>
> If you have a very busy hypervisor, you may need to tune the watchdog
> timeout.
>
> I suspect something would be logged in the BMC's IPMI SEL (System Event
> Log), but I'm not sure.
>
> Tim
>
> > -----Original Message-----
> > From: Arne Wiebalck [mailto:Arne.Wiebalck at cern.ch]
> > Sent: 24 July 2014 17:10
> > To: Juan José Pavlik Salles
> > Cc: openstack-operators at lists.openstack.org
> > Subject: Re: [Openstack-operators] Compute nodes reboot periodically by their own
> >
> > Hi,
> >
> > Do your compute nodes reboot, or are they shut off?
> >
> > I am currently looking at some cases where VMs seem to spontaneously shut
> > themselves off. From the nova logs' perspective, at least, these look no
> > different from a normal shutdown, but the VM owners confirm that they did
> > not touch their VMs. So far I have been unable to explain this.
> >
> > This is with Havana on a RHEL6 derivative, though.
> >
> > Cheers,
> >  Arne
> >
> > --
> > Arne Wiebalck
> > CERN IT
> >
> > On 24 Jul 2014, at 16:46, Juan José Pavlik Salles <jjpavlik at gmail.com> wrote:
> >
> > > > Hello guys, we have had a small Grizzly cloud running on Ubuntu 12.04
> > > > since the beginning of 2013: 2 compute nodes, a storage node and a
> > > > controller, nothing too fancy. Everything works just fine, but... the
> > > > compute nodes reboot themselves periodically, sometimes every 2 weeks,
> > > > sometimes once a month. I've done almost everything I can think of:
> > > > memory checks, analysed the logs, moved all the VMs to one node, and I
> > > > just can't find the problem.
> > >
> > > > Have you ever heard of this kind of behaviour on compute nodes? Any
> > > > ideas where I should look for the problem?
> > >
> > > Thanks in advance.
> > >
> > > --
> > > Pavlik Salles Juan José
> > > Blog - http://viviendolared.blogspot.com
> > > _______________________________________________
> > > OpenStack-operators mailing list
> > > OpenStack-operators at lists.openstack.org
> > > > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
> >
> >
> > _______________________________________________
> > OpenStack-operators mailing list
> > OpenStack-operators at lists.openstack.org
> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>



-- 
Pavlik Salles Juan José
Blog - http://viviendolared.blogspot.com