<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">
<div class="">Blair,</div>
<div class=""><br class="">
</div>
<div class="">We’ve seen these errors in our deployment as well, on CentOS 7.3 with 3.10 kernels, when looking into instance</div>
<div class="">issues. So far we’ve always discarded them as not relevant to the problems observed, so I’d be very interested if</div>
<div class="">it turns out that these should not be ignored after all.</div>
<div class=""><br class="">
</div>
<div class="">Cheers,</div>
<div class=""> Arne</div>
<div class=""><br class="">
</div>
<br class="">
<div>
<blockquote type="cite" class="">
<div class="">On 17 Oct 2017, at 18:52, George Mihaiescu &lt;<a href="mailto:lmihaiescu@gmail.com" class="">lmihaiescu@gmail.com</a>&gt; wrote:</div>
<br class="Apple-interchange-newline">
<div class="">
<div dir="ltr" class="">
<div class="">
<div class="">
<div class="">
<div class="">
<div class="">
<div class="">Hi Blair,<br class="">
<br class="">
</div>
We had a few cases of compute nodes hanging and requiring hard reboots, with the last entry in syslog being related to "rdmsr":<br class="">
kvm [29216]: vcpu0 unhandled rdmsr: 0x345<br class="">
<br class="">
</div>
The workloads are probably similar to yours (SGE workers doing genomics) with CPU mode host-passthrough, on top of Ubuntu 16.04 and kernel 4.4.0-96-generic.<br class="">
<br class="">
</div>
I'm not sure the "rdmsr" logs are relevant though, because we see them on other compute nodes that have no issues.<br class="">
<br class="">
</div>
Did you find anything that might indicate what the root cause is?<br class="">
<br class="">
</div>
Cheers,<br class="">
</div>
George<br class="">
<div class="">
<div class="">
<div class="">
<div class=""><br class="">
</div>
</div>
</div>
</div>
</div>
<div class="gmail_extra"><br class="">
<div class="gmail_quote">On Thu, Oct 12, 2017 at 5:26 PM, Blair Bethwaite <span dir="ltr" class="">
&lt;<a href="mailto:blair.bethwaite@gmail.com" target="_blank" class="">blair.bethwaite@gmail.com</a>&gt;</span> wrote:<br class="">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="auto" class="">Hi all,
<div dir="auto" class=""><br class="">
</div>
<div dir="auto" class="">Has anyone seen guest crashes/freezes associated with KVM unhandled rdmsr messages in dmesg on the hypervisor?</div>
<div dir="auto" class=""><br class="">
</div>
<div dir="auto" class="">We have seen these messages before, but never with a strong correlation to guest problems. However, over the past couple of weeks this has been happening almost daily, with consistent correlation, for a set of hosts dedicated to a particular
HPC workload. As far as I know the workload has not changed, but we have just recently moved the hypervisors to Ubuntu Xenial (though they were already on the Xenial kernel previously) and done minor guest (CentOS 7) updates. CPU mode is host-passthrough. Currently
trying to figure out if the CPU flags in the guest have changed since the host upgrade...</div>
<div dir="auto" class=""><br class="">
</div>
<div dir="auto" class="">Cheers,</div>
</div>
<br class="">
______________________________<wbr class="">_________________<br class="">
OpenStack-operators mailing list<br class="">
<a href="mailto:OpenStack-operators@lists.openstack.org" class="">OpenStack-operators@lists.<wbr class="">openstack.org</a><br class="">
<a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators" rel="noreferrer" target="_blank" class="">http://lists.openstack.org/<wbr class="">cgi-bin/mailman/listinfo/<wbr class="">openstack-operators</a><br class="">
<br class="">
</blockquote>
</div>
<br class="">
</div>
</div>
</blockquote>
</div>
<br class="">
<div class="">--<br class="">
Arne Wiebalck<br class="">
CERN IT </div>
<br class="">
</body>
</html>