[Openstack-operators] well tested distro/kernel combination for production
george.shuklin at gmail.com
Sun Feb 9 00:40:37 UTC 2014
I can't say which kernel is stable, but just yesterday I got a rather
unfunny error on my lab stand with 3.8.0-35-generic (x86_64): vim went
into I/O and did not come back (stuck in D+ state). The disk was fine and
other software was fine, but in_flight time for the disk was 100% and the
kernel started to report a 'stall' for the hung vim. I played around for
some time, but none of the tricks managed to 'free' vim (neither disk
reinitialization nor a PCI bus rescan).
In my case it happened after a rather brutal test: creating a snapshot
during 32 concurrent read/write operations from an instance.
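
For anyone who wants to check for the same symptoms, here is a minimal
sketch of how the D-state tasks and the per-disk in-flight counters could
be read from /proc on a Linux host. The function names are mine, not from
any tool, and I am assuming the usual /proc/<pid>/stat and /proc/diskstats
layouts (I/Os in progress as the 9th statistics field, milliseconds spent
doing I/O as the 10th). It only reports state, it does not un-stick
anything.

```python
import os


def parse_stat_state(stat_text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so locate the first '(' and the last ')' instead of splitting on spaces.
    open_idx = stat_text.index("(")
    close_idx = stat_text.rindex(")")
    pid = int(stat_text[:open_idx].strip())
    comm = stat_text[open_idx + 1:close_idx]
    state = stat_text[close_idx + 1:].split()[0]
    return pid, comm, state


def parse_diskstats(text):
    """Map device name -> (ios_in_progress, ms_doing_io) from /proc/diskstats."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue  # skip short or malformed lines
        # fields[0:3] are major, minor, name; statistics start at fields[3].
        result[fields[2]] = (int(fields[11]), int(fields[12]))
    return result


def blocked_tasks(proc_root="/proc"):
    """List (pid, comm) for tasks currently in uninterruptible sleep (D)."""
    tasks = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat_state(f.read())
        except (OSError, ValueError):
            continue  # the process exited or its stat file was unreadable
        if state == "D":
            tasks.append((pid, comm))
    return tasks


def report(proc_root="/proc"):
    """Print D-state tasks and disks that have I/O in flight."""
    for pid, comm in blocked_tasks(proc_root):
        print("blocked: pid=%d comm=%s" % (pid, comm))
    with open(os.path.join(proc_root, "diskstats")) as f:
        for name, (in_flight, ms_io) in sorted(parse_diskstats(f.read()).items()):
            if in_flight:
                print("disk %s: %d in flight, %d ms doing I/O" % (name, in_flight, ms_io))
```

Calling report() twice a few seconds apart shows whether the in-flight
count for a disk ever drains. A task that stays in D while the count never
drops matches what I saw with vim.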
On 08.02.2014 06:02, sylecn wrote:
> I have experienced "rcu_sched detected stalls on CPUs/tasks" in Ubuntu
> VMs, which results in a dead VM that can't be rebooted or deleted, and I
> believe it's caused by a bug in either the hypervisor kernel or the
> guest kernel.
> I'd like to know which OS version and kernel version you use in
> production. Both public and private clouds are welcome. My company
> plans to run a small (to medium) private cloud. The hypervisor runs
> Ubuntu 12.04 and the first guest OSes will be Ubuntu 12.04 and CentOS 6,
> so kernel versions for those would be much appreciated.
> Is there a wiki page about this?
> PS. Here is a combination that has the above-mentioned error:
> hypervisor os: ubuntu 12.04.3
> hypervisor kernel: 3.8.0-35-generic
> vm os: ubuntu 12.04
> vm kernel: 3.2.0-56-virtual
> openstack: havana
> libvirt: 1.1.1-0ubuntu8~cloud2
> Relevant old bugs on similar issues:
> rhel5.5 running as kvm guest hangs randomly
> Bug #503138 “Lucid & Natty, KVM, After kernel message hrtimer: ...” :
> Bugs : “kvm” package : Ubuntu
> I don't have a 100% reliable way to reproduce the problem, but it
> happens quite often, whether the VM is idle or loaded, which is not
> acceptable in production.