Folks, I'm running into an issue where instances appear to be running even though they shouldn't be. The environment is a virtualized OpenStack Kolla (2023.1) cluster with three compute nodes and three control plane nodes. I'm creating and terminating instances to figure out the failover behavior. The three computes are in a single Masakari segment.

I created four instances:

+--------------------+--------+---------------------------+
| Name               | Status | Host                      |
+--------------------+--------+---------------------------+
| rj_testinstance_04 | ACTIVE | devtest-libvirt1-muon1005 |
| rj_testinstance_03 | ACTIVE | devtest-libvirt1-muon1004 |
| rj_testinstance_02 | ACTIVE | devtest-libvirt1-muon1006 |
| rj_testinstance    | ACTIVE | devtest-libvirt1-muon1005 |
+--------------------+--------+---------------------------+

I decided to see what would happen if I shut down devtest-libvirt1-muon1005 (which has two instances running). So I ssh'd into the machine and, as root, entered:

`echo c > /proc/sysrq-trigger`

to force a kernel crash, which duly occurred. I then made sure the host was seen as down:

+---------------------------+-------+
| Hypervisor Hostname       | State |
+---------------------------+-------+
| devtest-libvirt1-muon1005 | down  |
| devtest-libvirt1-muon1004 | up    |
| devtest-libvirt1-muon1006 | up    |
+---------------------------+-------+

But even after this, the instances still show as ACTIVE, including the two on the now-crashed machine:

+--------------------+--------+---------------------------+
| Name               | Status | Host                      |
+--------------------+--------+---------------------------+
| rj_testinstance_04 | ACTIVE | devtest-libvirt1-muon1005 |
| rj_testinstance_03 | ACTIVE | devtest-libvirt1-muon1004 |
| rj_testinstance_02 | ACTIVE | devtest-libvirt1-muon1006 |
| rj_testinstance    | ACTIVE | devtest-libvirt1-muon1005 |
+--------------------+--------+---------------------------+

I would have expected the instances to be evacuated to other compute hosts, or at least to go into some sort of error state. Since the hosts are in the same segment, Masakari should have at least attempted to move the instances from the failed host to the other computes. But checking the instance details suggests that nothing has happened. The instances are just... lost.

The Masakari host monitor log does reflect reality:

2023-12-11 19:21:49.953 8 INFO masakarimonitors.hostmonitor.host_handler.handle_host [-] 'devtest-libvirt1-muon1005' is 'offline' (current: 'offline').

Needless to say, I'm very confused here and am not sure what's going on. Any advice would help.

Thanks,
Rob
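
P.S. For reference, the tables above came from commands along these lines (reconstructed from memory, so the exact column flags may differ slightly):

`openstack server list --long -c Name -c Status -c Host`
`openstack hypervisor list -c "Hypervisor Hostname" -c State`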
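
P.P.S. If it helps with diagnosis, I can post the output of the Masakari segment and notification commands as well. Assuming I've remembered the masakariclient CLI correctly, I'd run something like:

`openstack segment list`
`openstack segment host list <segment>` (to confirm all three computes are actually registered as hosts in the segment; `<segment>` being my segment's ID)
`openstack notification list` (to see whether a host-failure notification ever reached the Masakari API)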