[Openstack] kernel panic on compute node

Shinobu Kinjo shinobu.kj at gmail.com
Tue Jun 2 12:19:04 UTC 2015


Hi,

When did the node crash?
How about the other two nodes?
Is there any difference between the crashed node and the others, in terms
of not only hardware but also software?

 - Kinjo

On Tue, Jun 2, 2015 at 3:31 PM, Priyanka <ppnaik at cse.iitb.ac.in> wrote:

> Sir,
>
> uname -r
>
> 3.10.0-123.13.2.el7.x86_64
>
> I have kept the iptables service stopped since the start. So, should I
> still follow what is mentioned in the 5th comment?
>
> Thanks,
>
> Priyanka
>
>
> On Tuesday 02 June 2015 11:49 AM, Matt Taylor wrote:
>
>> Hi Priyanka,
>>
>> Are you using an old kernel ('uname -r' please)? I've seen issues with
>> unlink_anon_vmas in Fedora, so this is rather interesting.
>>
>> With regard to the 2nd kernel panic, the log indicates that it's due to
>> iptables and get_counters.
>>
>> It's possibly related to this:
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=1089569
>>
>> Might be worth trying what was mentioned in the 5th comment.
>>
>> Regards,
>> Matt.
>>
>> On 2/06/2015 15:14, Priyanka wrote:
>>
>>> Hi,
>>>
>>> I have an OpenStack Juno setup with one controller node and 3 compute
>>> nodes. I installed it using packstack on CentOS 7. One of the compute
>>> nodes crashed twice. The first crash was two days ago and the second one
>>> today. It automatically restarts after the crash. The crash logs in
>>> /var/crash were different in the two instances.
>>>
>>> Crash 1 log:
>>>
>>> |CPU:  5  PID:  16617  Comm:  pickup Not tainted
>>> 3.10.0-123.13.2.el7.x86_64 #1
>>> [11971.208387]  Hardware  name:  Intel  Corporation S2600CP/S2600CP,
>>> BIOS SE5C600.86B.02.02.0002.122320131210 12/23/2013
>>> [11971.208389]   ffff8807de2b1cc8 00000000a2deb902 ffff8807de2b1c80
>>> ffffffff815e232c
>>> [11971.208392]   ffff8807de2b1cb8 ffffffff8105dee1 ffff8807f8e85f50
>>> ffff8807f8e85f40
>>> [11971.208394]   ffff8807f90db0c0 ffff8807f8e85f50 ffff8807f90db0c0
>>> ffff8807de2b1d20
>>> [11971.208396]  Call  Trace:
>>> [11971.208401]   [<ffffffff815e232c>] dump_stack+0x19/0x1b
>>> [11971.208405]   [<ffffffff8105dee1>] warn_slowpath_common+0x61/0x80
>>> [11971.208408]   [<ffffffff8105df5c>] warn_slowpath_fmt+0x5c/0x80
>>> [11971.208410]   [<ffffffff812cff82>] __list_del_entry+0x82/0xd0
>>> [11971.208412]   [<ffffffff812cffdd>]  list_del+0xd/0x30
>>> [11971.208415]   [<ffffffff811776a3>] unlink_anon_vmas+0x93/0x180
>>> [11971.208418]   [<ffffffff81168b88>] free_pgtables+0xa8/0x120
>>> [11971.208420]   [<ffffffff81173556>] exit_mmap+0xc6/0x1a0
>>> [11971.208422]   [<ffffffff8105b187>]  mmput+0x67/0xf0
>>> [11971.208424]   [<ffffffff81063dac>]  do_exit+0x28c/0xa60
>>> [11971.208426]   [<ffffffff810645ff>] do_group_exit+0x3f/0xa0
>>> [11971.208428]   [<ffffffff81064674>] SyS_exit_group+0x14/0x20
>>> [11971.208431]   [<ffffffff815f2a19>] system_call_fastpath+0x16/0x1b
>>> [11971.208432]  ---[  end  trace ebed116bce4ce8eb]---
>>> [11971.208437]  BUG:  unable to handle kernel NULL pointer dereference
>>> at (null)
>>> [11971.208471]  IP:  [<ffffffff81177663>] unlink_anon_vmas+0x53/0x180
>>> [11971.208493]  PGD 0
>>> [11971.208502]  Oops:  0000  [#1] SMP|
>>>
>>> Crash 2 log:
>>>
>>> |[321808.123092]  BUG:  unable to handle kernel paging request at
>>> ffffc90017456008
>>> [321808.123122]  IP:  [<ffffffffa03a8521>] get_counters+0x91/0xd0
>>> [ip_tables]
>>> [321808.123146]  PGD 81d437067  PUD 81d4a4067  PMD 7ec309067  PTE 0
>>> [321808.123167]  Oops:  0002  [#1] SMP
>>> [321808.123179]  Modules linked in:  dummy vhost_net macvtap macvlan tun
>>> iptable_nat nf_nat_ipv4 nf_nat iptable_raw iptable_filter ip_tables
>>> nf_conntrack_ipv6 nf_defrag_ipv6 xt_mac xt_physdev xt_set ip_set_hash_ip
>>> ip_set nfnetlink veth ip6table_filter ip6_tables ebtable_nat ebtables
>>> openvswitch vxlan ip_tunnel gre sg ipt_REJECT xt_comment xt_conntrack
>>> xt_multiport nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack coretemp
>>> kvm_intel kvm crct10dif_pclmul iTCO_wdt iTCO_vendor_support crc32_pclmul
>>> crc32c_intel ghash_clmulni_intel sb_edac edac_core pcspkr lpc_ich ioatdma
>>> i2c_i801 mfd_core aesni_intel lrw gf128mul glue_helper ablk_helper cryptd
>>> mei_me mei shpchp wmi acpi_cpufreq mperf nfsd auth_rpcgss nfs_acl lockd
>>> sunrpc bridge stp llc xfs libcrc32c sd_mod crc_t10dif crct10dif_common
>>> mgag200 syscopyarea sysfillrect
>>> [321808.123452]   sysimgblt drm_kms_helper ttm isci igb ahci drm libsas
>>> libahci ptp scsi_transport_sas pps_core dca libata i2c_algo_bit i2c_core
>>> dm_mirror dm_region_hash dm_log dm_mod [last unloaded: ip_tables]
>>> [321808.123521]  CPU:  10  PID:  110268  Comm: iptables-save Not tainted
>>> 3.10.0-123.13.2.el7.x86_64 #1
>>> [321808.123547]  Hardware  name:  Intel  Corporation S2600CP/S2600CP,
>>> BIOS SE5C600.86B.02.02.0002.122320131210 12/23/2013
>>> [321808.123577]  task:  ffff88080a8f38e0 ti:  ffff8807ed398000 task.ti:
>>> ffff8807ed398000
>>> [321808.123599]  RIP:  0010:[<ffffffffa03a8521>] [<ffffffffa03a8521>]
>>> get_counters+0x91/0xd0  [ip_tables]
>>> [321808.123627]  RSP:  0018:ffff8807ed399da8  EFLAGS:  00010286
>>> [321808.123643]  RAX:  ffffc90014ab72e8 RBX:  0000000000010380 RCX:
>>> ffffc90017456000
>>> [321808.123664]  RDX:  0000000000000054  RSI:  0000000000000000 RDI:
>>> ffff88081e290380
>>> [321808.123685]  RBP:  ffff8807ed399dc8 R08:  0000000000000101 R09:
>>> ffff8807ed96caa0
>>> [321808.123706]  R10:  0000000000000000  R11:  ffffffff8117b2e8 R12:
>>> ffffffff819e4aa0
>>> [321808.123727]  R13:  ffff8807ed96c800 R14:  ffffc90017455000 R15:
>>> ffff88080c961ba0
>>> [321808.123748]  FS:   00007fdcc2078740(0000) GS:ffff88081d940000(0000)
>>> knlGS:0000000000000000
>>> [321808.123772]  CS:   0010  DS:  0000  ES:  0000  CR0: 0000000080050033
>>> [321808.123789] CR2: ffffc90017456008 CR3: 00000007ec30e000 CR4:
>>> 00000000001427e0
>>> [321808.123810] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
>>> 0000000000000000
>>> [321808.123831] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
>>> 0000000000000400
>>> |
>>>
>>> I am not able to debug the problem. The setup was working fine until now.
>>> I am unable to understand what is causing this behaviour. Please help.
>>>
>>>
>>> Thanks,
>>>
>>> Priyanka
>>>
>>>
>>>
>>> _______________________________________________
>>> Mailing list:
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>>> Post to     : openstack at lists.openstack.org
>>> Unsubscribe :
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>>>
>>>
>>
>
>
>



-- 
Life w/ Linux <http://i-shinobu.hatenablog.com/>