[openstack-dev] [ironic] [infra] Nested KVM + the gate

Jay Faulkner jay at jvf.cc
Tue Jan 17 23:41:31 UTC 2017


Hi all,

Back in late October, Vasyl wrote support for devstack to auto detect, and when possible, use kvm to power Ironic gate jobs (0036d83b330d98e64d656b156001dd2209ab1903). This has reduced job run time when it works, but it has also caused failures. How many? It’s hard to quantify, as the log messages that show the error don’t appear to be indexed by Elasticsearch. It’s seen often enough that the issue has become a permanent staple on our gate whiteboard, and the failures don’t appear to be decreasing.

I pushed up a patch, https://review.openstack.org/#/c/421581, which keeps the auto detection behavior, but defaults devstack to use qemu emulation instead of kvm.

I have two questions:
1) Is there any way I’m not aware of to quantify the number of failures this is causing? The key log message, "KVM: entry failed, hardware error 0x0", shows up in logs/libvirt/qemu/node-*.txt.gz.
2) Are these failures avoidable or visible in any way?
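
For question 1, one rough way to get a count would be to scan locally downloaded job logs for the message. The following is only a minimal sketch: it assumes the log trees for a set of job runs have already been fetched under a single local directory, and the function name and directory argument are hypothetical, not part of any existing tooling.

```python
import glob
import gzip
import os
import sys

# The key log message quoted above.
KVM_ERROR = "KVM: entry failed, hardware error 0x0"


def find_kvm_failures(root):
    """Return the node log files under `root` that contain KVM_ERROR.

    Assumes (hypothetically) that each downloaded job run keeps the
    logs/libvirt/qemu/node-*.txt.gz layout somewhere below `root`.
    """
    pattern = os.path.join(
        root, "**", "logs", "libvirt", "qemu", "node-*.txt.gz"
    )
    hits = []
    for path in glob.glob(pattern, recursive=True):
        # Read as text; replace undecodable bytes rather than failing.
        with gzip.open(path, "rt", errors="replace") as fh:
            if any(KVM_ERROR in line for line in fh):
                hits.append(path)
    return sorted(hits)


if __name__ == "__main__":
    # Usage: python find_kvm_failures.py <directory-of-downloaded-logs>
    if len(sys.argv) > 1:
        for hit in find_kvm_failures(sys.argv[1]):
            print(hit)
```

This only counts failures in runs that were downloaded, so it gives a sample rather than a gate-wide number.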

If we can’t fix these failures, then in my opinion we have to make a change to avoid using nested KVM altogether. Lower reliability for our jobs is not worth a small decrease in job run time.

Thanks,
Jay Faulkner

