<div dir="ltr"><div><div><div><div><div><div>Hello, this morning I connected to my office by remote to send information you requested.<br><br></div>Attached here there are:<br></div>status: results of command pcs status<br></div>resources: results of command pcs resources<br></div>hosts: controllers /etc/hosts where I added aliases for compute nodes every time I rebooted a compute node simulating an unexpected reboot<br></div>As you can see last simulation was on compute-node1. Infact it is marked offline but its remote pacemaker service is online.<br><br>[root@compute-1 ~]# systemctl status pacemaker_remote.service<br>● pacemaker_remote.service - Pacemaker Remote Service<br> Loaded: loaded (/usr/lib/systemd/system/pacemaker_remote.service; enabled; vendor preset: disabled)<br> Active: active (running) since ven 2017-05-12 09:30:08 EDT; 1 day 20h ago<br> Docs: man:pacemaker_remoted<br> <a href="http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html/Pacemaker_Remote/index.html">http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html/Pacemaker_Remote/index.html</a><br> Main PID: 3756 (pacemaker_remot)<br> CGroup: /system.slice/pacemaker_remote.service<br> └─3756 /usr/sbin/pacemaker_remoted<br><br>mag 12 09:30:08 compute-1 systemd[1]: Started Pacemaker Remote Service.<br>mag 12 09:30:08 compute-1 systemd[1]: Starting Pacemaker Remote Service...<br>mag 12 09:30:08 compute-1 pacemaker_remoted[3756]: notice: Additional loggi...<br>mag 12 09:30:08 compute-1 pacemaker_remoted[3756]: notice: Starting a tls l...<br>mag 12 09:30:08 compute-1 pacemaker_remoted[3756]: notice: Listening on add...<br>Hint: Some lines were ellipsized, use -l to show in full.<br><br></div>Regards Ignazio<br><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">2017-05-13 16:55 GMT+02:00 Sam P <span dir="ltr"><<a href="mailto:sam47priya@gmail.com" target="_blank">sam47priya@gmail.com</a>></span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi,<br>
<br>
This might not be exactly what you are looking for... but you may be able to extend it.<br>
In Masakari [0], we use pacemaker-remote in masakari-monitors [1] to<br>
monitor node failures.<br>
In [1], there is hostmonitor.sh, which is going to be deprecated in the<br>
next cycle, but it is a straightforward way to do this; a rough sketch of<br>
the idea follows the links below.<br>
[0] <a href="https://wiki.openstack.org/wiki/Masakari" rel="noreferrer" target="_blank">https://wiki.openstack.org/<wbr>wiki/Masakari</a><br>
[1] <a href="https://github.com/openstack/masakari-monitors/tree/master/masakarimonitors/hostmonitor" rel="noreferrer" target="_blank">https://github.com/openstack/<wbr>masakari-monitors/tree/master/<wbr>masakarimonitors/hostmonitor</a><br>
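<br>
The basic idea in hostmonitor.sh is just to check the cluster's view of<br>
each host periodically and notify the masakari API when one goes down.<br>
A very rough sketch of that kind of check (not the actual hostmonitor.sh<br>
code, just an illustration):<br>
<br>
# show the cluster's current view of all nodes, including remote nodes<br>
crm_mon -1<br>
# a simple monitor loop can grep that output for offline nodes, e.g.:<br>
crm_mon -1 | grep -i offline<br>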
<br>
Then there are also the OpenStack resource agents for pacemaker:<br>
<a href="https://github.com/openstack/openstack-resource-agents/tree/master/ocf" rel="noreferrer" target="_blank">https://github.com/openstack/<wbr>openstack-resource-agents/<wbr>tree/master/ocf</a><br>
<span class=""><br>
> I have already tried "pcs resource cleanup"; it cleans all the resources<br>
> fine, but not the remote nodes.<br>
> In any case, on Monday I'll send what you requested.<br>
</span>Hope we can get more details on Monday.<br>
<br>
--- Regards,<br>
Sampath<br>
<br>
<br>
<br>
On Sat, May 13, 2017 at 9:52 PM, Ignazio Cassano<br>
<div class="HOEnZb"><div class="h5"><<a href="mailto:ignaziocassano@gmail.com">ignaziocassano@gmail.com</a>> wrote:<br>
> Thanks Curtis.<br>
> I have already tried "pcs resource cleanup"; it cleans all the resources<br>
> fine, but not the remote nodes.<br>
> In any case, on Monday I'll send what you requested.<br>
> Regards<br>
> Ignazio<br>
><br>
> On 13/May/2017 14:27, "Curtis" <<a href="mailto:serverascode@gmail.com">serverascode@gmail.com</a>> wrote:<br>
><br>
> On Fri, May 12, 2017 at 10:23 PM, Ignazio Cassano<br>
> <<a href="mailto:ignaziocassano@gmail.com">ignaziocassano@gmail.com</a>> wrote:<br>
>> Hi Curtis, at this time I am using remote pacemaker only for controlling<br>
>> openstack services on compute nodes (neutron openvswitch-agent,<br>
>> nova-compute, ceilometer compute). I wrote my own ansible playbooks to<br>
>> install and configure all components.<br>
>> A second step could be to expand it for VM high availability.<br>
>> I did not find any procedure for cleaning up a compute node after<br>
>> rebooting, and I googled a lot without luck.<br>
><br>
> Can you paste some output of something like "pcs status" and I can try<br>
> to take a look?<br>
><br>
> I've only used pacemaker a little, but I'm fairly sure it's going to<br>
> be something like "pcs resource cleanup <resource_id>"<br>
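><br>
> For example, if the remote node resource for that compute node is named<br>
> compute-1 (just a guess at the name, based on your hosts file), I would<br>
> try something like:<br>
><br>
> # clear the failure state for that one resource<br>
> pcs resource cleanup compute-1<br>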
><br>
> Thanks,<br>
> Curtis.<br>
><br>
>> Regards<br>
>> Ignazio<br>
>><br>
>> On 13/May/2017 00:32, "Curtis" <<a href="mailto:serverascode@gmail.com">serverascode@gmail.com</a>> wrote:<br>
>><br>
>> On Fri, May 12, 2017 at 8:51 AM, Ignazio Cassano<br>
>> <<a href="mailto:ignaziocassano@gmail.com">ignaziocassano@gmail.com</a>> wrote:<br>
>>> Hello All,<br>
>>> I installed openstack newton<br>
>>> with a pacemaker cluster made up of 3 controllers and 2 compute nodes.<br>
>>> All machines run CentOS 7.3.<br>
>>> The compute nodes are managed with the pacemaker remote OCF resource.<br>
>>> If, before shutting down a compute node, I disable its compute node<br>
>>> resource in the cluster and enable it again when the node comes back up,<br>
>>> it works fine and the cluster shows it online.<br>
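>>><br>
>>> That is, for a planned shutdown I do something like the following<br>
>>> (compute-1 is just an example resource name):<br>
>>><br>
>>> # before shutting the node down:<br>
>>> pcs resource disable compute-1<br>
>>> # ...reboot or service the node, and once it is back up:<br>
>>> pcs resource enable compute-1<br>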
>>> If the compute node goes down before I disable its resource in the<br>
>>> cluster, it remains offline even after it is powered back up.<br>
>>> The only solution I found is removing the compute node resource from<br>
>>> the cluster and adding it again with a different name (and adding this<br>
>>> new name to the /etc/hosts file on all controllers).<br>
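>>><br>
>>> Roughly, the workaround is something like this (the resource names are<br>
>>> only examples; the new name is an alias for the same host):<br>
>>><br>
>>> # remove the stale remote node resource<br>
>>> pcs resource delete compute-1<br>
>>> # add it back under the new alias<br>
>>> pcs resource create compute-1b ocf:pacemaker:remote<br>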
>>> With the above workaround it comes back online in the cluster, and all<br>
>>> its resources (openstack-nova-compute etc.) go back to working fine.<br>
>>> Please, does anyone know a better solution?<br>
>><br>
>> What are you using pacemaker for on the compute nodes? I have not done<br>
>> that personally, but my impression is that sometimes people do that in<br>
>> order to have virtual machines restarted somewhere else should the<br>
>> compute node go down outside of a maintenance window (i.e. "instance<br>
>> high availability"). Is that your use case? If so, I would imagine<br>
>> there is some kind of clean-up procedure to put the compute node back<br>
>> into use when pacemaker thinks it has failed. Did you use some kind of<br>
>> openstack distribution or follow a particular installation document to<br>
>> enable this pacemaker setup?<br>
>><br>
>> It sounds like everything is working as expected (if my guess is<br>
>> right) and you just need the right steps to bring the node back into<br>
>> the cluster.<br>
>><br>
>> Thanks,<br>
>> Curtis.<br>
>><br>
>><br>
>>> Regards<br>
>>> Ignazio<br>
>>><br>
>>><br>
>>><br>
>><br>
>><br>
>><br>
>> --<br>
>> Blog: <a href="http://serverascode.com" rel="noreferrer" target="_blank">serverascode.com</a><br>
>><br>
>><br>
><br>
><br>
><br>
> --<br>
> Blog: <a href="http://serverascode.com" rel="noreferrer" target="_blank">serverascode.com</a><br>
><br>
><br>
><br>
><br>
</div></div></blockquote></div><br></div>