<div dir="ltr">Hi Vladimir,<div><br></div><div>The packages are available on launchpad here: </div><div><a href="https://launchpad.net/ubuntu/+source/qemu/1:2.5+dfsg-5ubuntu10.5/+build/10938755">https://launchpad.net/ubuntu/+source/qemu/1:2.5+dfsg-5ubuntu10.5/+build/10938755</a></div></div><br><div class="gmail_quote"><div dir="ltr">On Tue, 20 Dec 2016 at 12:49 Vladimir Prokofev <<a href="mailto:v@prokofev.me">v@prokofev.me</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr" class="gmail_msg"><div class="gmail_msg">Using</div><div class="gmail_msg">compute1:~$ dpkg-query -W qemu-system-x86</div><div class="gmail_msg">qemu-system-x86 1:2.5+dfsg-5ubuntu10.6</div><div class="gmail_msg">compute1:~$ qemu-system-x86_64 --version</div><div class="gmail_msg">QEMU emulator version 2.5.0 (Debian 1:2.5+dfsg-5ubuntu10.6), Copyright (c) 2003-2008 Fabrice Bellard</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">So I guess you may be right. Now I have to confirm this.</div><div class="gmail_msg">Did you build your package from source? <span style="font-size:12.8px" class="gmail_msg">1:2.5+dfsg-5ubuntu10.5 is no longer available in official repository, and I can't find it anywhere except for the source code.</span></div><div class="gmail_msg"><br class="gmail_msg"></div></div><div class="gmail_extra gmail_msg"><br class="gmail_msg"><div class="gmail_quote gmail_msg">2016-12-20 12:17 GMT+03:00 Sean Redmond <span dir="ltr" class="gmail_msg"><<a href="mailto:sean.redmond1@gmail.com" class="gmail_msg" target="_blank">sean.redmond1@gmail.com</a>></span>:<br class="gmail_msg"><blockquote class="gmail_quote gmail_msg" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr" class="gmail_msg">What version of qemu are you running? I was hit by this bug[1] that seems to give off the same faults you are reporting and had to downgrade qemu packages to version '1:2.5+dfsg-5ubuntu10.5'<div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">[1] <a href="https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1647389" class="gmail_msg" target="_blank">https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1647389</a> </div></div><div class="gmail_extra gmail_msg"><br class="gmail_msg"><div class="gmail_quote gmail_msg"><div class="gmail_msg"><div class="m_-8702364127425024875h5 gmail_msg">On Tue, Dec 20, 2016 at 12:10 AM, Vladimir Prokofev <span dir="ltr" class="gmail_msg"><<a href="mailto:v@prokofev.me" class="gmail_msg" target="_blank">v@prokofev.me</a>></span> wrote:<br class="gmail_msg"></div></div><blockquote class="gmail_quote gmail_msg" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="gmail_msg"><div class="m_-8702364127425024875h5 gmail_msg"><div dir="ltr" class="gmail_msg">Hello Ops.<div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">I want to pick your brains on a live migration issue, cause I'm kinda stuck atm.</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">I'm running a small HA Ubuntu Openstack cloud - 3 controllers(VMs on Ubuntu/KVM) with ha-proxy LB, corosync/pacemaker manages VIP, Galera DB, 3 compute nodes(using KVM hypervisor), 3 network nodes(VMs on Ubuntu/KVM), CEPH cluster(4 OSD nodes, 3 MON nodes(VMs on Ubuntu/KVM)).</div><div class="gmail_msg">Nova, cinder, and glance use CEPH as backend. Neutron uses linux-bridge.</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">We were running Ubuntu Trusty 14.04/Mitaka, and decided to upgrade to Ubuntu Xenial 16.04/Newton. The issue appeared sometime after upgrade of all Mitaka packages to the latest version prior to Xenial upgrade, and stayed even after upgrade to Xenial/Newton. It's hard to tell for sure as we didn't use LM for a while. It worked fine under Trusty/Mitaka, and broke under Xenial/latest-Mitaka.</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">With a chance of 20-30% live migration will fail. The instance will pause on a source node, but will not be resumed on target node. Target node will destroy instance assuming LM failed, and on source node instance will stay paused. On source node no new messages appear in nova-compute.log, and commands such as "virsh list" won't provide any output, or even exit for that matter. nova-compute can be restarted, but after normal startup messages it once again doesn't provide any new log entries. Nova considers compute node as up and running.</div><div class="gmail_msg">In dashboard you can see that instance is now residing at a new host, and is in shutoff state.</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">If I restart libvirtd on target or source node then whole system "unfreezes". Source host unpauses instance, and it is live again. But it now resides on a source node, while database thinks it's on a target node and is shutdown. Warning messages will appear on source host:</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg"><div class="gmail_msg">2016-12-20 01:13:37.095 20025 WARNING nova.compute.manager [req-b3879539-989f-4075-9ef0-d23ef8868102 - - - - -] While synchronizing instance power states, found 3 instances in the database and 4 instances on the hypervisor.</div></div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">Currenly I'm stopping/destroying/undefining instance on a source node, and launching it via standart Openstack means, but this leads to an instance reboot.</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">Last messages in nova-compute.log on source node after LM start:</div><div class="gmail_msg"><div class="gmail_msg">2016-12-20 00:09:35.961 16834 INFO nova.virt.libvirt.migration [req-51801b5e-b77a-4d76-ad87-176326ac910e 84498aa7e26443c4908d973f3e86d530 ecee1197f46e453dba25669554226ce5 - - -] [instance: cd8cb1db-dca3-4b0f-a03e-c0befbbd7b53] Increasing downtime to 1251 ms after 0 sec elapsed time</div><div class="gmail_msg">2016-12-20 00:09:36.127 16834 INFO nova.virt.libvirt.driver [req-51801b5e-b77a-4d76-ad87-176326ac910e 84498aa7e26443c4908d973f3e86d530 ecee1197f46e453dba25669554226ce5 - - -] [instance: cd8cb1db-dca3-4b0f-a03e-c0befbbd7b53] Migration running for 0 secs, memory 100% remaining; (bytes processed=0, remaining=0, total=0)</div><div class="gmail_msg">2016-12-20 00:09:36.894 16834 INFO nova.compute.manager [req-3a2a828f-8b76-4a65-b49b-ea8d232a3de5 - - - - -] [instance: cd8cb1db-dca3-4b0f-a03e-c0befbbd7b53] VM Paused (Lifecycle Event)</div><div class="gmail_msg">2016-12-20 00:09:37.046 16834 INFO nova.compute.manager [req-3a2a828f-8b76-4a65-b49b-ea8d232a3de5 - - - - -] [instance: cd8cb1db-dca3-4b0f-a03e-c0befbbd7b53] During sync_power_state the instance has a pending task (migrating). Skip.</div><div class="gmail_msg">2016-12-20 00:09:37.300 16834 INFO nova.virt.libvirt.driver [req-51801b5e-b77a-4d76-ad87-176326ac910e 84498aa7e26443c4908d973f3e86d530 ecee1197f46e453dba25669554226ce5 - - -] [instance: cd8cb1db-dca3-4b0f-a03e-c0befbbd7b53] Migration operation has completed</div><div class="gmail_msg">2016-12-20 00:09:37.301 16834 INFO nova.compute.manager [req-51801b5e-b77a-4d76-ad87-176326ac910e 84498aa7e26443c4908d973f3e86d530 ecee1197f46e453dba25669554226ce5 - - -] [instance: cd8cb1db-dca3-4b0f-a03e-c0befbbd7b53] _post_live_migration() is started..</div><div class="gmail_msg">2016-12-20 00:09:39.718 16834 INFO os_vif [req-51801b5e-b77a-4d76-ad87-176326ac910e 84498aa7e26443c4908d973f3e86d530 ecee1197f46e453dba25669554226ce5 - - -] Successfully unplugged vif VIFBridge(active=False,address=fa:16:3e:e3:c4:c7,bridge_name='brq0b7e75ee-b6',has_traffic_filtering=True,id=2a141727-77a7-4b43-b4cd-21fb415a6a55,network=Network(0b7e75ee-b691-4851-9b29-08c3c62a96e0),plugin='linux_bridge',port_profile=<?>,preserve_on_delete=False,vif_name='tap2a141727-77')</div><div class="gmail_msg">2016-12-20 00:09:39.777 16834 INFO nova.virt.libvirt.driver [req-51801b5e-b77a-4d76-ad87-176326ac910e 84498aa7e26443c4908d973f3e86d530 ecee1197f46e453dba25669554226ce5 - - -] [instance: cd8cb1db-dca3-4b0f-a03e-c0befbbd7b53] Deleting instance files /var/lib/nova/instances/cd8cb1db-dca3-4b0f-a03e-c0befbbd7b53_del</div><div class="gmail_msg">2016-12-20 00:09:39.778 16834 INFO nova.virt.libvirt.driver [req-51801b5e-b77a-4d76-ad87-176326ac910e 84498aa7e26443c4908d973f3e86d530 ecee1197f46e453dba25669554226ce5 - - -] [instance: cd8cb1db-dca3-4b0f-a03e-c0befbbd7b53] Deletion of /var/lib/nova/instances/cd8cb1db-dca3-4b0f-a03e-c0befbbd7b53_del complete</div><div class="gmail_msg">2016-12-20 00:10:33.644 16834 INFO nova.compute.resource_tracker [req-00c4c403-d039-4347-b7ad-3098834a891c - - - - -] Auditing locally available compute resources for node compute1.cloud1.local</div></div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">No new messages will appear until libvirtd restart.</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">Messages during LM in nova-compute.log on target node:</div><div class="gmail_msg"><div class="gmail_msg">2016-12-20 00:09:32.234 29438 INFO nova.virt.libvirt.driver [req-51801b5e-b77a-4d76-ad87-176326ac910e 84498aa7e26443c4908d973f3e86d530 ecee1197f46e453dba25669554226ce5 - - -] Instance launched has CPU info: {"vendor": "Intel", "model": "Nehalem", "arch": "x86_64", "features": ["pge", "clflush", "sep", "syscall", "vme", "dtes64", "msr", "vmx", "xtpr", "cmov", "ssse3", "est", "pat", "monitor", "smx", "pbe", "lm", "tsc", "nx", "fxsr", "tm", "sse4.1", "pae", "sse4.2", "acpi", "mmx", "cx8", "mce", "de", "rdtscp", "ht", "dca", "lahf_lm", "pdcm", "mca", "pdpe1gb", "apic", "sse", "pse", "ds", "invtsc", "pni", "tm2", "sse2", "ss", "ds_cpl", "pcid", "fpu", "cx16", "pse36", "mtrr", "popcnt"], "topology": {"cores": 6, "cells": 2, "threads": 2, "sockets": 1}}</div><div class="gmail_msg">2016-12-20 00:09:34.476 29438 INFO os_vif [req-51801b5e-b77a-4d76-ad87-176326ac910e 84498aa7e26443c4908d973f3e86d530 ecee1197f46e453dba25669554226ce5 - - -] Successfully plugged vif VIFBridge(active=True,address=fa:16:3e:e3:c4:c7,bridge_name='brq0b7e75ee-b6',has_traffic_filtering=True,id=2a141727-77a7-4b43-b4cd-21fb415a6a55,network=Network(0b7e75ee-b691-4851-9b29-08c3c62a96e0),plugin='linux_bridge',port_profile=<?>,preserve_on_delete=False,vif_name='tap2a141727-77')</div><div class="gmail_msg">2016-12-20 00:09:36.711 29438 INFO nova.compute.manager [req-75abcda4-701a-4b84-8f28-c11f6ad6f933 - - - - -] [instance: cd8cb1db-dca3-4b0f-a03e-c0befbbd7b53] VM Started (Lifecycle Event)</div><div class="gmail_msg">2016-12-20 00:09:39.777 29438 INFO nova.compute.manager [req-51801b5e-b77a-4d76-ad87-176326ac910e 84498aa7e26443c4908d973f3e86d530 ecee1197f46e453dba25669554226ce5 - - -] [instance: cd8cb1db-dca3-4b0f-a03e-c0befbbd7b53] Post operation of migration started</div><div class="gmail_msg">2016-12-20 00:09:40.381 29438 INFO nova.network.neutronv2.api [req-51801b5e-b77a-4d76-ad87-176326ac910e 84498aa7e26443c4908d973f3e86d530 ecee1197f46e453dba25669554226ce5 - - -] [instance: cd8cb1db-dca3-4b0f-a03e-c0befbbd7b53] Updating port 2a141727-77a7-4b43-b4cd-21fb415a6a55 with attributes {'binding:profile': {}, 'binding:host_id': 'compute3'}</div><div class="gmail_msg">2016-12-20 00:09:52.589 29438 INFO nova.compute.manager [-] [instance: cd8cb1db-dca3-4b0f-a03e-c0befbbd7b53] VM Stopped (Lifecycle Event)</div><div class="gmail_msg">2016-12-20 00:09:52.747 29438 WARNING nova.compute.manager [req-e945fe0d-dea0-47ee-8b87-b88991b9796e - - - - -] [instance: cd8cb1db-dca3-4b0f-a03e-c0befbbd7b53] Instance shutdown by itself. Calling the stop API. Current vm_state: active, current task_state: None, original DB power_state: 4, current VM power_state: 4</div><div class="gmail_msg">2016-12-20 00:09:52.964 29438 INFO nova.compute.manager [req-e945fe0d-dea0-47ee-8b87-b88991b9796e - - - - -] [instance: cd8cb1db-dca3-4b0f-a03e-c0befbbd7b53] Instance is already powered off in the hypervisor when stop is called.</div><div class="gmail_msg">2016-12-20 00:09:53.039 29438 INFO nova.virt.libvirt.driver [req-e945fe0d-dea0-47ee-8b87-b88991b9796e - - - - -] [instance: cd8cb1db-dca3-4b0f-a03e-c0befbbd7b53] Instance already shutdown.</div><div class="gmail_msg">2016-12-20 00:09:53.043 29438 INFO nova.virt.libvirt.driver [-] [instance: cd8cb1db-dca3-4b0f-a03e-c0befbbd7b53] Instance destroyed successfully.</div><div class="gmail_msg">2016-12-20 00:09:56.696 29438 INFO nova.compute.resource_tracker [req-53e5b7a1-aba9-4b26-a3fb-b9b293bf2f17 - - - - -] Auditing locally available compute resources for node compute3.cloud1.local</div><div class="gmail_msg">2016-12-20 00:09:57.295 29438 INFO nova.compute.resource_tracker [req-53e5b7a1-aba9-4b26-a3fb-b9b293bf2f17 - - - - -] Total usable vcpus: 24, total allocated vcpus: 39</div><div class="gmail_msg">2016-12-20 00:09:57.295 29438 INFO nova.compute.resource_tracker [req-53e5b7a1-aba9-4b26-a3fb-b9b293bf2f17 - - - - -] Final resource view: name=compute3.cloud1.local phys_ram=145060MB used_ram=94720MB phys_disk=14710GB used_disk=45GB total_vcpus=24 used_vcpus=39 pci_stats=[]</div></div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">Target node will proceed normal operations.</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">Nothing interesting appears in libvirtd log or syslog on any compute node. I've tried running debug log for libvirt, but it's unreadable for me - all I can figure out is that libvirt is still running and other instances are unaffected.</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">So, my questions.</div><div class="gmail_msg">- Is there any workaround to sync the state of DB and compute node? I wish at least to avoid instance reboot, and for that I need some way to tell DB that instance is on a source node and running, without service interruption. Is it ok to update database directly via SQL?</div><div class="gmail_msg">- How to further troubleshoot this issue? As I said. I'm kinda stuck and have no idea where to look further. If the issue appeared 100% of the time it would be a lot easier, but there's no pattern I can figure out - it happens randomly with heavy loaded instances, and idle instances alike.</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">Any suggestions appreciated. I need a boot in a right direction.</div><div class="gmail_msg">I will provide any logs and/or config files per request.</div></div>
<br class="gmail_msg"></div></div>_______________________________________________<br class="gmail_msg">
OpenStack-operators mailing list<br class="gmail_msg">
<a href="mailto:OpenStack-operators@lists.openstack.org" class="gmail_msg" target="_blank">OpenStack-operators@lists.openstack.org</a><br class="gmail_msg">
<a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators" rel="noreferrer" class="gmail_msg" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators</a><br class="gmail_msg">
<br class="gmail_msg"></blockquote></div><br class="gmail_msg"></div>
</blockquote></div><br class="gmail_msg"></div>
_______________________________________________<br class="gmail_msg">
OpenStack-operators mailing list<br class="gmail_msg">
<a href="mailto:OpenStack-operators@lists.openstack.org" class="gmail_msg" target="_blank">OpenStack-operators@lists.openstack.org</a><br class="gmail_msg">
<a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators" rel="noreferrer" class="gmail_msg" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators</a><br class="gmail_msg">
</blockquote></div>