[Openstack-operators] CTKI error, was Re: instances stuck in paused state with new hypervisors running 2013.2.3

Jonathan Proulx jon at jonproulx.com
Wed May 21 15:30:41 UTC 2014


Turns out this was a chair to keyboard interface error...

The new nodes were in a new host aggregate with new instance types
associated with it.  I mentally screwed up the units of memory
allocation while creating the new instance types: 1G would have been
plenty to run on, but 1M won't come close to fitting a Linux kernel
these days, and that is what caused the emulation failure.
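
For anyone who wants to guard against the same slip, here is a minimal
sketch of a sanity check. Nova takes flavor RAM in megabytes, so a "1"
meant as 1G ends up as 1M. The flavor names and the 64 MB threshold
below are made-up assumptions for illustration, not values from my
deployment, and the list is hand-typed rather than pulled from the API.

```python
# Hypothetical sanity check for flavor definitions. Nova flavor RAM is
# given in megabytes, so a value like 1 almost certainly means a unit slip.

MIN_SANE_RAM_MB = 64  # assumed lower bound; adjust for your own images


def suspicious_flavors(flavors, min_ram_mb=MIN_SANE_RAM_MB):
    """Return the names of flavors whose RAM is below min_ram_mb.

    `flavors` is an iterable of (name, ram_mb) pairs.
    """
    return [name for name, ram_mb in flavors if ram_mb < min_ram_mb]


if __name__ == "__main__":
    # Illustrative flavor list: (name, RAM in MB). Names are hypothetical.
    flavors = [("new.tiny", 1), ("new.small", 1024), ("new.large", 8192)]
    for name in suspicious_flavors(flavors):
        print("flavor %s has suspiciously little RAM" % name)
```

Running a check like this over the flavor list before tying the flavors
to an aggregate would have flagged the 1M entries right away.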

D'Oh!
-Jon

On Wed, May 21, 2014 at 9:22 AM, Jonathan Proulx <jon at jonproulx.com> wrote:
> Hi All,
>
> I'm running Ubuntu 12.04 cloudarchive Havana
>
> I just installed a few new hypervisors.  Instances launch, run far
> enough to show "booting from disk" in the VNC console, then fall into
> a paused state and will not resume.
>
> I have identical hardware running identically configured 2013.2.2 and
> all is well (I double checked /proc/cpuinfo to be sure they are
> really the same CPUs with the same options enabled).
>
> the libvirt log for the instance says:
>
> KVM internal error. Suberror: 1
> emulation failure
> EAX=2ad1dcff EBX=022dd200 ECX=a644bc82 EDX=ffffffff
> ESI=00007c05 EDI=d8918c96 EBP=0007fff0 ESP=0007ffec
> EIP=0000e34f EFL=00010013 [----A-C] CPL=0 II=0 A20=1 SMM=0 HLT=0
> ES =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
> CS =0008 00000000 ffffffff 00c09b00 DPL=0 CS32 [-RA]
> SS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
> DS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
> FS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
> GS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
> LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
> TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
> GDT=     00008280 00000027
> IDT=     00000000 00000000
> CR0=00000011 CR2=00000000 CR3=00000000 CR4=00000000
> DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000
> DR3=0000000000000000
> DR6=00000000ffff0ff0 DR7=0000000000000400
> EFER=0000000000000000
> Code=2d 02 b9 82 bc 44 a6 d6 86 9d 78 82 28 ad 68 63 de 5b 46 54 <de>
> 87 d4 a9 71 43 04 21 5a 48 a0 dd a2 b8 e9 80 70 00 d3 37 1d 61 ac 11
> 63 76 d3 33 ca f1
>
> The nova-compute log does show the instance attempting to be resumed
> and failing:
>
> 2014-05-20 21:32:59.165 2289 TRACE nova.openstack.common.rpc.amqp
> libvirtError: internal error: unable to execute QEMU command 'cont':
> Resetting the Virtual Machine is required
>
> Using 'virsh reset' then 'virsh resume' on the instance, I can very
> briefly get it back into a running state, but it almost immediately
> falls back to the paused state with another failure and register dump
> in the libvirt log.
>
> Obviously my day is now filled with diffing installed package versions
> on the working and broken systems and selectively force-downgrading
> them to see what is responsible. I wonder if anyone has seen this and
> could possibly shorten my search.
>
> Thanks,
> -Jon


