[Openstack-operators] well tested distro/kernel combination for production

George Shuklin george.shuklin at gmail.com
Sun Feb 9 00:40:37 UTC 2014


I can't say which kernel is stable, but just yesterday I got a rather
unpleasant error on my lab stand with 3.8.0-35-generic (x86_64): vim went
into IO wait and did not come back (D+ state). The disk was fine and
other software was fine, but in_flight time was 100% for the disk and
the kernel started to report a 'stall' for the hung vim. I played around
for some time, but none of the tricks I tried was able to 'free' vim
(neither disk reinitialization nor a PCI bus rescan).
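
For anyone trying to spot the same symptom, here is a minimal Python
sketch that lists processes sitting in uninterruptible sleep (state D).
It assumes a Linux /proc filesystem; the names parse_stat and
d_state_tasks are made up for this illustration and do not come from any
tool mentioned in this thread.

```python
import os


def parse_stat(text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    left = text.index("(")
    right = text.rindex(")")
    pid = int(text[:left].strip())
    comm = text[left + 1:right]
    state = text[right + 1:].split()[0]
    return pid, comm, state


def d_state_tasks(proc_root="/proc"):
    """List (pid, comm) for every process currently in state D."""
    found = []
    for entry in sorted(os.listdir(proc_root)):
        if not entry.isdigit():
            continue  # not a per-process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or stat was unreadable mid-scan
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in d_state_tasks():
        print(pid, comm)
```

A task that shows up here once is normal (it is just waiting on IO); one
that stays in the list across repeated runs matches the hang described
above.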

In my case it happened after a rather brutal test: creating a snapshot
while the instance was running 32 concurrent read/write operations.
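
The tooling used for that test is not described here, so the following
is only an assumed approximation of the load side: a Python sketch that
runs 32 concurrent write-then-read workers inside the instance while a
snapshot is triggered from outside. The function names, file names, and
sizes are hypothetical.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def write_then_read(path, size_bytes, block=64 * 1024):
    """Write size_bytes to path, fsync, read it back; return bytes read."""
    chunk = b"\xa5" * block
    written = 0
    with open(path, "wb") as f:
        while written < size_bytes:
            n = min(block, size_bytes - written)
            f.write(chunk[:n])
            written += n
        f.flush()
        os.fsync(f.fileno())  # force the data out of the page cache to disk
    total = 0
    with open(path, "rb") as f:
        while True:
            data = f.read(block)  # may be served from the page cache
            if not data:
                break
            total += len(data)
    return total


def run_load(directory, workers=32, size_bytes=16 * 1024 * 1024):
    """Run one write_then_read worker per file, all at the same time."""
    paths = [os.path.join(directory, "load-%02d.bin" % i) for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda p: write_then_read(p, size_bytes), paths))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        results = run_load(d)
        print(sum(results), "bytes read back")
```

To keep the disk busy for the whole duration of a snapshot, run_load
would need to be called in a loop; a single pass may finish too quickly.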

On 08.02.2014 06:02, sylecn wrote:
> Hi,
>
> I have experienced "rcu_sched detected stalls on CPUs/tasks" in Ubuntu 
> VMs, which results in a dead VM that can't be rebooted or deleted, and I 
> believe it's caused by a bug in either the hypervisor kernel or the 
> guest kernel.
>
> I'd like to know which OS version and kernel version you use in 
> production. Both public and private clouds are welcome. My company 
> plans to run a small (to medium) private cloud. The hypervisor runs 
> Ubuntu 12.04 and the first guest OSes will be Ubuntu 12.04 and CentOS 6, 
> so kernel versions for those would be much appreciated.
>
> Is there a wiki page about this?
>
> PS. Here is a combination that has the above-mentioned error:
>
> hypervisor os: ubuntu 12.04.3
> hypervisor kernel: 3.8.0-35-generic
> vm os: ubuntu 12.04
> vm kernel: 3.2.0-56-virtual
> openstack: havana
> libvirt: 1.1.1-0ubuntu8~cloud2
>
> Relevant old bugs on similar issues:
> rhel5.5 running as kvm guest hangs randomly
> https://bugzilla.redhat.com/show_bug.cgi?id=619798
>
> Bug #503138 “Lucid & Natty, KVM, After kernel message hrtimer: ...” : 
> Bugs : “kvm” package : Ubuntu
> https://bugs.launchpad.net/ubuntu/+source/kvm/+bug/503138
>
> I don't have a 100% reliable way to reproduce the problem, but it 
> happens quite often, whether the VM is idle or loaded, which is not 
> acceptable in production.
>
>



