[ci] Kernel panics in the guest vm
skaplons at redhat.com
Sun Dec 6 09:42:57 UTC 2020
For some time I have noticed that scenario jobs quite often fail because of
problems with SSH to the guest VM. When I checked the reason for the SSH
failure, it turned out to be a kernel panic in the guest VM, e.g.:
[ 0.000000] Console: colour VGA+ 80x25
[ 0.000000] printk: console [tty1] enabled
[ 0.000000] printk: console [ttyS0] enabled
[ 0.000000] ACPI: Core revision 20190703
[ 0.000000] clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604467 ns
[ 0.000000] APIC: Switch to symmetric I/O mode setup
[ 0.000000] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
[ 0.000000] ..MP-BIOS bug: 8254 timer not connected to IO-APIC
[ 0.000000] ...trying to set up timer (IRQ0) through the 8259A ...
[ 0.000000] ..... (found apic 0 pin 2) ...
[ 0.000000] ....... failed.
[ 0.000000] ...trying to set up timer as Virtual Wire IRQ...
[ 0.000000] ..... failed.
[ 0.000000] ...trying to set up timer as ExtINT IRQ...
[ 0.000000] ..... failed :(.
[ 0.000000] Kernel panic - not syncing: IO-APIC + timer doesn't work! Boot with apic=debug and send a report. Then try booting with the 'noapic' option.
[ 0.000000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.3.0-26-generic #28~18.04.1-Ubuntu
[ 0.000000] Hardware name: OpenStack Foundation OpenStack Nova, BIOS 1.13.0-1ubuntu1 04/01/2014
[ 0.000000] Call Trace:
[ 0.000000] dump_stack+0x6d/0x95
[ 0.000000] panic+0xfe/0x2d4
[ 0.000000] check_timer+0x5e8/0x685
[ 0.000000] ? radix_tree_lookup+0xd/0x10
[ 0.000000] setup_IO_APIC+0x182/0x1ca
[ 0.000000] apic_intr_mode_init+0x1f5/0x1f8
[ 0.000000] x86_late_time_init+0x1b/0x22
[ 0.000000] start_kernel+0x4cb/0x58b
[ 0.000000] x86_64_start_reservations+0x24/0x26
[ 0.000000] x86_64_start_kernel+0x74/0x77
[ 0.000000] secondary_startup_64+0xa4/0xb0
[ 0.000000] ---[ end Kernel panic - not syncing: IO-APIC + timer doesn't work! Boot with apic=debug and send a report. Then try booting with the 'noapic' option. ]---
Logstash tells me that the problem is not limited to neutron-related jobs.
Has anyone already tried to investigate this issue, or does anyone have ideas
about what we can do about it?
In the specific example above, the Cirros 0.5.1 image was used, but I didn't
check whether that is the case in the other failures, TBH.
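To check whether other failed jobs show the same panic and the same guest
kernel, something like the following could be run over saved guest console
logs. This is only a rough sketch: the function name find_panic and both
regular expressions are my own assumptions based on the log format shown
above, and it only handles the "Not tainted" form of the CPU line.

```python
import re

# First line of a panic, e.g.
# "[ 0.000000] Kernel panic - not syncing: IO-APIC + timer doesn't work! ..."
# The trailing "---[ end Kernel panic ..." line is deliberately not matched.
PANIC_RE = re.compile(
    r"^\[\s*\d+\.\d+\]\s+Kernel panic - not syncing: (?P<reason>.*)$"
)

# Kernel release on the "CPU: ..." line that follows the panic line.
# Assumption: the kernel is reported as "Not tainted".
KERNEL_RE = re.compile(r"Not tainted\s+(?P<release>\S+)\s+#")


def find_panic(console_log: str):
    """Return (reason, kernel_release) for the first kernel panic, or None.

    kernel_release is None if no matching "CPU: ..." line follows the panic.
    """
    reason = None
    for line in console_log.splitlines():
        if reason is None:
            match = PANIC_RE.match(line.strip())
            if match:
                reason = match.group("reason")
            continue
        kernel = KERNEL_RE.search(line)
        if kernel:
            return reason, kernel.group("release")
    return (reason, None) if reason is not None else None
```

The input could be the text printed by `openstack console log show <server>`
or the guest console dump that the failed test writes into the job log.
Grouping the results by kernel release would show whether all occurrences
come from the same guest image.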
Principal Software Engineer