VMs shut themselves down
Hello Team, I have an OpenStack environment with 4 compute nodes. Yesterday, out of the blue, the instances on one of the compute nodes started shutting themselves down. I then noticed in Horizon that the compute service on that node was disabled. I enabled it and restarted the instances, and all was well, but I can't figure out what caused it. I have attached the nova_compute logs here in case someone would like to help: https://paste.openstack.org/show/bxqpgjvqT3DlOYWoeejc/ Regards, Tony Karera
This looks exactly the same as the message seen on the list yesterday, and I see Eugen already gave you his insights. https://lists.openstack.org/archives/list/openstack-discuss@lists.openstack.... What is the difference from the previous mail? Could you explain what you additionally need now? I'd strongly suggest that you do not resend the same email to get more responses. If you need further help, being specific about your current problem will make it easier for people to help you. I'd also ask you to be more patient: responses on community-maintained mailing lists are generally best effort.
Adding a few points. I agree with Eugen and what he said in his reply to your other email. The nova-compute log contains EHOSTUNREACH on the connection to RabbitMQ, which indicates that there might be a problem with network connectivity. I see you asked a few questions about Masakari on the list. If you have Masakari or any other technology automating instance HA in that deployment, I'd suggest you look into those components to check whether an automated evacuation was triggered.
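To follow up on the EHOSTUNREACH hint, one way to see when the errors occurred is to pull the matching lines out of the nova-compute log. The snippet below is only a minimal sketch: the function names `find_unreachable` and `scan_file` are made up for illustration, and it assumes the log is a plain text file that mentions either the symbolic errno name or the operating system's message for it.

```python
import errno
import os
from typing import Iterable, List, Tuple

# Symbolic errno name mentioned in the thread, plus the OS message text for it,
# since a log may contain either form.
ERRNO_NAME = "EHOSTUNREACH"
ERRNO_TEXT = os.strerror(errno.EHOSTUNREACH)


def find_unreachable(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, line) pairs that mention the host-unreachable error."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if ERRNO_NAME in line or ERRNO_TEXT in line:
            hits.append((number, line.rstrip("\n")))
    return hits


def scan_file(path: str) -> List[Tuple[int, str]]:
    """Scan one log file; undecodable bytes are replaced instead of raising."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_unreachable(handle)
```

Calling `scan_file` with the path to the nova-compute log on the affected node (the location depends on how the deployment was installed) returns the matching lines, whose timestamps can then be compared with the time the instances went down.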
From the log, three VMs were evacuated away from this compute node, while the remaining VMs stayed on it and were rebooted when nova-compute started again. IMO the compute node experienced a host failure. You can dig into it in the system log. Did you enable HA (Masakari)? Did the compute host power reset or reboot? Did the host system crash? ... More clues would help.
Hello Sam, Yes, Masakari is enabled. My main issue is what caused the VMs to shut down, as there was no power reset or reboot. I checked in Horizon and found that compute 1 was disabled, and all I did was re-enable it. But I don't know what caused it to be disabled. Regards, Tony Karera
If there were no instance power reset or reboot actions, it was probably shut down by libvirt/QEMU (the instance's qemu process could have exited abnormally). There is a periodic task in nova-compute that syncs instance state with libvirt. https://opendev.org/openstack/nova/src/branch/master/nova/compute/manager.py... As for HA (Masakari): if a compute node suffers a host failure, Masakari disables its nova-compute service. An administrator needs to enable the nova-compute service again after the host failure has been completely fixed.
Hi Tony, If there were no instance power reset or reboot actions, it was probably shut down by libvirt/QEMU (the instance's qemu process could have exited abnormally; if so, you could find clues in the libvirt or qemu logs). There is a periodic task in nova-compute that syncs instance state with libvirt. https://opendev.org/openstack/nova/src/branch/master/nova/compute/manager.py... As for HA (Masakari): if a compute node suffers a host failure, Masakari disables its nova-compute service. An administrator needs to enable the nova-compute service again after the host failure has been completely fixed.
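The state sync mentioned above can be pictured with a toy model. This is not nova's code: `PowerState` and `reconcile` are hypothetical names, and the sketch only assumes that a periodic job compares the recorded power state of each instance with what the hypervisor reports and updates the record when they differ, which is how an instance whose qemu process exited would end up shown as shut down.

```python
from enum import Enum
from typing import Dict, List


class PowerState(Enum):
    RUNNING = "running"
    SHUTDOWN = "shutdown"


def reconcile(db_states: Dict[str, PowerState],
              hypervisor_states: Dict[str, PowerState]) -> List[str]:
    """Update db_states in place to match the hypervisor; return changed IDs."""
    changed = []
    for instance_id, recorded in db_states.items():
        actual = hypervisor_states.get(instance_id)
        if actual is None:
            # The hypervisor does not know this instance (for example, it was
            # evacuated), so this toy model leaves the record untouched.
            continue
        if actual != recorded:
            # Only existing keys are reassigned, so iterating stays safe.
            db_states[instance_id] = actual
            changed.append(instance_id)
    return changed
```

In this model, an instance recorded as running whose hypervisor state is `SHUTDOWN` is flipped to `SHUTDOWN` on the next pass, without any user-initiated stop or reboot having been requested.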
participants (3)
- Karera Tony
- Sam Su (苏正伟)
- Takashi Kajinami