[Openstack-operators] well tested distro/kernel combination for production

George Shuklin george.shuklin at gmail.com
Sun Feb 9 15:51:54 UTC 2014


Host.

I usually don't bother with guest problems.

I'm not sure, but I think I hit that problem for the second time (the first 
time was during snapshot creation too, but I did not dig deep enough).
The most obvious symptom is '100% disk utilization' in atop regardless of 
actual IO; the second is 'stalled' messages in dmesg after 120 sec.
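
For anyone who wants to check for the same two symptoms on a host, here is a minimal sketch (not a tested diagnostic tool). It assumes the standard Linux /proc/diskstats layout, where the 12th field is the number of I/Os in flight and the 13th is the time spent doing I/O in ms, and it assumes that a device which stays busy for the whole interval with requests in flight but completes nothing is the "100% utilization regardless of actual IO" case. The function names, the 95% threshold, and the 5-second interval are my own choices for illustration.

```python
import os
import time


def parse_diskstats(text):
    """Return {device: (ios_in_flight, io_ticks_ms, ios_completed)}."""
    stats = {}
    for line in text.splitlines():
        f = line.split()
        if len(f) < 14:
            continue  # skip short or malformed lines
        completed = int(f[3]) + int(f[7])  # reads completed + writes completed
        stats[f[2]] = (int(f[11]), int(f[12]), completed)
    return stats


def stuck_devices(before, after, interval_ms, busy_threshold=95.0):
    """Devices that look busy but completed no I/O during the interval."""
    suspects = []
    for dev, (in_flight, ticks, done) in after.items():
        if dev not in before:
            continue
        _, ticks0, done0 = before[dev]
        busy = 100.0 * (ticks - ticks0) / interval_ms
        if busy >= busy_threshold and done == done0 and in_flight > 0:
            suspects.append((dev, in_flight, busy))
    return suspects


def proc_state(stat_text):
    """Extract the state letter from the contents of /proc/<pid>/stat."""
    # The command name may contain spaces or ')', so split after the last ')'.
    return stat_text[stat_text.rfind(")") + 1:].split()[0]


def d_state_pids(proc_root="/proc"):
    """PIDs currently in uninterruptible sleep (state D)."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                if proc_state(fh.read()) == "D":
                    pids.append(int(entry))
        except OSError:
            continue  # the process exited while we were scanning
    return pids


if __name__ == "__main__":
    interval = 5.0  # seconds between the two samples (arbitrary)
    with open("/proc/diskstats") as fh:
        first = parse_diskstats(fh.read())
    time.sleep(interval)
    with open("/proc/diskstats") as fh:
        second = parse_diskstats(fh.read())
    for dev, in_flight, busy in stuck_devices(first, second, interval * 1000):
        print("%s: %.0f%% busy, %d in flight, nothing completed"
              % (dev, busy, in_flight))
    print("PIDs in D state:", d_state_pids())
```

A single sample can give false positives (a process is briefly in D state during any normal disk wait), so it is only interesting when the same device and the same PIDs show up across several runs.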

On 02/09/2014 05:31 AM, Narayan Desai wrote:
> Host or guest?
>  -nld
>
>
> On Sat, Feb 8, 2014 at 6:40 PM, George Shuklin 
> <george.shuklin at gmail.com> wrote:
>
>     I can't say which kernel is stable, but just yesterday I got a
>     rather unfunny error on my lab stand with 3.8.0-35-generic
>     (x86_64): vim went into IO and did not come back (it stayed in D+
>     state). The disk was fine and other software was fine, but in_flight
>     time was 100% for the disk and the kernel started to report 'stall'
>     for the hung vim. I played around for some time, but none of my
>     tricks was able to 'free' vim (neither disk reinitialization nor a
>     PCI bus rescan).
>
>     In my case it happened after a rather brutal test: 'snapshot
>     creation during 32 concurrent read/write operations from an instance'.
>
>
>     On 08.02.2014 06:02, sylecn wrote:
>
>         Hi,
>
>         I have experienced "rcu_sched detected stalls on CPUs/tasks"
>         in Ubuntu VMs, which results in a dead VM that can't be
>         rebooted/deleted, and I believe it's because of a bug in either
>         the hypervisor kernel or the guest kernel.
>
>         I'd like to know which OS version and kernel version you
>         use in production. Both public and private clouds are welcome.
>         My company plans to run a small (to medium) private cloud.
>         The hypervisor runs Ubuntu 12.04 and the first guest OSes will be
>         Ubuntu 12.04 and CentOS 6, so kernel versions for those would be
>         much appreciated.
>
>         Is there a wiki page about this?
>
>         PS. Here is a combination that has the above-mentioned error:
>
>         hypervisor os: ubuntu 12.04.3
>         hypervisor kernel: 3.8.0-35-generic
>         vm os: ubuntu 12.04
>         vm kernel: 3.2.0-56-virtual
>         openstack: havana
>         libvirt: 1.1.1-0ubuntu8~cloud2
>
>         Relevant old bugs on similar issues:
>         rhel5.5 running as kvm guest hangs randomly
>         https://bugzilla.redhat.com/show_bug.cgi?id=619798
>
>         Bug #503138 "Lucid & Natty, KVM, After kernel message hrtimer:
>         ..." : Bugs : "kvm" package : Ubuntu
>         https://bugs.launchpad.net/ubuntu/+source/kvm/+bug/503138
>
>         I don't have a 100% reliable way to reproduce the problem, but
>         it happens quite often, whether the VM is idle or loaded,
>         which is not acceptable in production.
>
>
>
>
>
>
