[Openstack-operators] Remote pacemaker on compute nodes

Ignazio Cassano ignaziocassano at gmail.com
Sun May 14 09:59:58 UTC 2017


Hello, this morning I connected remotely to my office to send the information
you requested.

Attached are:
status: output of the command "pcs status"
resources: output of the command "pcs resource"
hosts: the controllers' /etc/hosts, where I added a new alias for a compute
node each time I rebooted it to simulate an unexpected reboot

As you can see, the last simulation was on compute-node1. In fact it is marked
offline, but its pacemaker_remote service is online:

[root at compute-1 ~]# systemctl status pacemaker_remote.service
● pacemaker_remote.service - Pacemaker Remote Service
   Loaded: loaded (/usr/lib/systemd/system/pacemaker_remote.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2017-05-12 09:30:08 EDT; 1 day 20h ago
     Docs: man:pacemaker_remoted
           http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html/Pacemaker_Remote/index.html
 Main PID: 3756 (pacemaker_remot)
   CGroup: /system.slice/pacemaker_remote.service
           └─3756 /usr/sbin/pacemaker_remoted

May 12 09:30:08 compute-1 systemd[1]: Started Pacemaker Remote Service.
May 12 09:30:08 compute-1 systemd[1]: Starting Pacemaker Remote Service...
May 12 09:30:08 compute-1 pacemaker_remoted[3756]:   notice: Additional loggi...
May 12 09:30:08 compute-1 pacemaker_remoted[3756]:   notice: Starting a tls l...
May 12 09:30:08 compute-1 pacemaker_remoted[3756]:   notice: Listening on add...
Hint: Some lines were ellipsized, use -l to show in full.
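
For reference, the remote connection can also be inspected from a controller
with commands like these (a sketch; compute-1 stands for the connection
resource name):

pcs status                                # cluster state, remote nodes included
pcs resource failcount show compute-1     # failures recorded for the connection resource
crm_mon -1 --show-detail                  # detailed one-shot view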

Regards,
Ignazio


2017-05-13 16:55 GMT+02:00 Sam P <sam47priya at gmail.com>:

> Hi,
>
>  This might not be exactly what you are looking for, but you may be able
> to extend it.
>  In Masakari [0], we use pacemaker-remote in masakari-monitors [1] to
> monitor node failures.
>  In [1] there is hostmonitor.sh, which is going to be deprecated in the
> next cycle, but it is a straightforward way to do this.
>  [0] https://wiki.openstack.org/wiki/Masakari
>  [1] https://github.com/openstack/masakari-monitors/tree/master/masakarimonitors/hostmonitor
>
>  Then there are the pacemaker resource agents for OpenStack services:
>  https://github.com/openstack/openstack-resource-agents/tree/master/ocf
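>
>  As a minimal (untested) sketch, one of those agents could be wired in like
>  this; the agent name NovaCompute comes from that repo, while the
>  authentication values here are placeholders:
>
>  pcs resource create nova-compute ocf:openstack:NovaCompute \
>      auth_url=http://controller:5000/v2.0 username=admin \
>      password=secret tenant_name=admin \
>      --clone interleave=true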
>
> > I have already tried "pcs resource cleanup"; it cleans all resources
> > fine, but not the remote nodes.
> > In any case, on Monday I'll send what you requested.
> Hope we can get more details on Monday.
>
> --- Regards,
> Sampath
>
>
>
> On Sat, May 13, 2017 at 9:52 PM, Ignazio Cassano
> <ignaziocassano at gmail.com> wrote:
> > Thanks Curtis.
> > I have already tried "pcs resource cleanup"; it cleans all resources
> > fine, but not the remote nodes.
> > In any case, on Monday I'll send what you requested.
> > Regards
> > Ignazio
> >
> > On 13/May/2017 14:27, "Curtis" <serverascode at gmail.com> wrote:
> >
> > On Fri, May 12, 2017 at 10:23 PM, Ignazio Cassano
> > <ignaziocassano at gmail.com> wrote:
> >> Hi Curtis, at this time I am using remote pacemaker only for controlling
> >> openstack services on compute nodes (neutron openvswitch-agent,
> >> nova-compute, ceilometer compute). I wrote my own ansible playbooks to
> >> install and configure all components.
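> >>
> >> (Roughly, per compute node the playbooks do the equivalent of the
> >> following; this is a sketch of the standard pacemaker_remote setup
> >> steps, not the exact tasks:
> >>
> >> yum install -y pacemaker-remote resource-agents pcs
> >> mkdir -p /etc/pacemaker
> >> # copy the cluster's /etc/pacemaker/authkey from a controller
> >> systemctl enable pacemaker_remote
> >> systemctl start pacemaker_remote
> >> firewall-cmd --permanent --add-port=3121/tcp   # pacemaker_remote listens on 3121
> >> firewall-cmd --reload
> >> )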
> >> A second step could be to expand it for VM high availability.
> >> I did not find any procedure for cleaning up a compute node after
> >> rebooting, and I googled a lot without luck.
> >
> > Can you paste some output of something like "pcs status" and I can try
> > to take a look?
> >
> > I've only used pacemaker a little, but I'm fairly sure it's going to
> > be something like "pcs resource cleanup <resource_id>"
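> >
> > For a remote node the cleanup probably has to target the
> > ocf:pacemaker:remote connection resource itself, something like this
> > (a sketch; compute-1 stands for the connection resource name):
> >
> > pcs resource cleanup compute-1
> > # if it still shows offline, bouncing the connection resource may help:
> > pcs resource disable compute-1
> > pcs resource enable compute-1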
> >
> > Thanks,
> > Curtis.
> >
> >> Regards
> >> Ignazio
> >>
> >> On 13/May/2017 00:32, "Curtis" <serverascode at gmail.com> wrote:
> >>
> >> On Fri, May 12, 2017 at 8:51 AM, Ignazio Cassano
> >> <ignaziocassano at gmail.com> wrote:
> >>> Hello All,
> >>> I installed openstack newton with a pacemaker cluster made up of 3
> >>> controllers and 2 compute nodes.
> >>> All computers have CentOS 7.3.
> >>> Compute nodes are managed through the pacemaker remote OCF resource.
> >>> If, before shutting down a compute node, I disable its resource in the
> >>> cluster and enable it again when the node comes back up, it works fine
> >>> and the cluster shows it online.
> >>> If the compute node goes down before I disable its resource in the
> >>> cluster, it remains offline even after it is powered up again.
> >>> The only solution I found is to remove the compute node resource from
> >>> the cluster and add it again with a different name (adding this new
> >>> name to the /etc/hosts file on all controllers).
> >>> With the above workaround it returns online for the cluster, and all
> >>> its resources (openstack-nova-compute etc.) return to working order.
> >>> Please, does anyone know a better solution?
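> >>>
> >>> For reference, this is roughly how I created the remote node resources
> >>> (a sketch from memory; compute-1 is an example name, and
> >>> reconnect_interval is, as far as I understand, the ocf:pacemaker:remote
> >>> parameter that makes the cluster retry a failed connection):
> >>>
> >>> pcs resource create compute-1 ocf:pacemaker:remote server=compute-1 \
> >>>     reconnect_interval=60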
> >>
> >> What are you using pacemaker for on the compute nodes? I have not done
> >> that personally, but my impression is that sometimes people do that in
> >> order to have virtual machines restarted somewhere else should the
> >> compute node go down outside of a maintenance window (i.e. "instance
> >> high availability"). Is that your use case? If so, I would imagine
> >> there is some kind of clean up procedure to put the compute node back
> >> into use when pacemaker thinks it has failed. Did you use some kind of
> >> openstack distribution or follow a particular installation document to
> >> enable this pacemaker setup?
> >>
> >> It sounds like everything is working as expected (if my guess is
> >> right) and you just need the right steps to bring the node back into
> >> the cluster.
> >>
> >> Thanks,
> >> Curtis.
> >>
> >>
> >>> Regards
> >>> Ignazio
> >>>
> >>
> >> --
> >> Blog: serverascode.com
> >>
> >>
> >
> >
> >
> > --
> > Blog: serverascode.com
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: hosts
Type: application/octet-stream
Size: 754 bytes
Desc: not available
URL: <http://lists.openstack.org/pipermail/openstack-operators/attachments/20170514/fd53c0d6/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: status
Type: application/octet-stream
Size: 6851 bytes
Desc: not available
URL: <http://lists.openstack.org/pipermail/openstack-operators/attachments/20170514/fd53c0d6/attachment-0001.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: resources
Type: application/octet-stream
Size: 9449 bytes
Desc: not available
URL: <http://lists.openstack.org/pipermail/openstack-operators/attachments/20170514/fd53c0d6/attachment-0002.obj>

