I also tried switching from the Database ServiceGroup driver to the Memcache ServiceGroup driver, per the doc at https://docs.openstack.org/nova/rocky/admin/service-groups.html, by modifying the following entries in nova.conf on all hosts, including the controller host:

    servicegroup_driver = "mc"
    memcached_servers = <None>
    service_down_time = 60

But it failed: the controller host could not see that any host's nova-compute service was actually up. So I am stuck with the database servicegroup driver.

On Fri, Sep 10, 2021 at 10:53 AM hai wu <haiwu.us@gmail.com> wrote:
This is the OpenStack Train release. Nova live migration is extremely painful to deal with. Every time we complete a live migration, the nova-compute service on the source host is still up and running, but it stops incrementing the report_count field in the nova database, which serves as the host heartbeat. The controller then thinks the hypervisor host has failed, since its is_up() function checks the report_count and updated_at fields, iirc. So we end up having to manually migrate one VM at a time, restart the service, then manually migrate the next VM.
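To make the heartbeat check concrete, here is a minimal sketch of how I understand it. This is not nova's actual code: the function signature, the constant name, and the use of a single "last heartbeat" timestamp are simplifying assumptions for illustration. The idea is that a service counts as up only if its last heartbeat update is no older than service_down_time seconds.

```python
from datetime import datetime, timedelta, timezone

# Assumption: mirrors service_down_time = 60 from nova.conf.
SERVICE_DOWN_TIME = 60  # seconds


def is_up(last_heartbeat, now=None, service_down_time=SERVICE_DOWN_TIME):
    """Return True if the last heartbeat is recent enough.

    Hypothetical stand-in for the servicegroup liveness check:
    last_heartbeat plays the role of the timestamp that gets
    refreshed whenever report_count is incremented.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    elapsed = (now - last_heartbeat).total_seconds()
    # abs() tolerates small clock skew between hosts.
    return abs(elapsed) <= service_down_time


now = datetime(2021, 9, 10, 15, 0, 0, tzinfo=timezone.utc)
fresh = is_up(now - timedelta(seconds=30), now)  # heartbeat still arriving
stale = is_up(now - timedelta(seconds=90), now)  # heartbeat stopped
print(fresh, stale)
```

Under this model, a nova-compute process that is still running but no longer updating its heartbeat row looks exactly like a dead host once service_down_time has elapsed, which matches what we see after each live migration.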
Any ideas? I already tried setting debug=True in nova.conf, even for database tracing, but so far I could not find any obvious error message. Each time, the live migration (no shared storage) succeeds, but each time we have to restart the nova-compute service afterwards. This is so bad.
Any suggestions on this would be highly appreciated.
Thanks,
Hai