[Openstack-operators] Network get unstable, put the whole system on halt

Salman Toor salman.toor at it.uu.se
Wed May 22 10:01:51 UTC 2013


Hi again, 

Could anyone share some thoughts on this matter?

Regards..
Salman. 
 

On May 20, 2013, at 11:57 AM, Salman Toor wrote:

> Hi, 
> 
> We are running Grizzly with Open vSwitch as the Quantum plugin. The details of our system are as follows.
> 
> The controller and compute nodes run Ubuntu 12.04.5 with kernel 3.5 and Open vSwitch 1.4.0+build0, using GRE tunnels.
> 
> With very little activity everything works fine, but as soon as we increased the load on the system, the kernel log on the controller started to grow until it filled the entire disk and halted the whole system. This happens within 2 to 3 hours.
> 
> Most of the log is filled with the following messages:
> 
> 
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598519] Call Trace:
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598519]  <IRQ> [<ffffffff81052c9f>] warn_slowpath_common+0x7f/0xc0
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598524]  [<ffffffff81052d96>] warn_slowpath_fmt+0x46/0x50
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598526]  [<ffffffff8157501b>] ? skb_release_data.part.47+0xcb/0x110
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598528]  [<ffffffff8169abd0>] skb_warn_bad_offload+0xbe/0xc9
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598531]  [<ffffffff8157f396>] skb_gso_segment+0x246/0x2c0
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598536]  [<ffffffffa03dd02f>] ovs_tnl_send+0x1ef/0xc90 [openvswitch]
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598539]  [<ffffffff8169e7de>] ? _raw_spin_lock+0xe/0x20
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598541]  [<ffffffff810e0001>] ? kdb_bc+0x191/0x240
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598544]  [<ffffffff810e4fe4>] ? handle_edge_irq+0x94/0x130
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598552]  [<ffffffffa03de52e>] ovs_vport_send+0x1e/0x50 [openvswitch]
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598561]  [<ffffffffa03d5552>] do_execute_actions+0x3e2/0x790 [openvswitch]
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598570]  [<ffffffffa03d5968>] ovs_execute_actions+0x68/0x110 [openvswitch]
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598579]  [<ffffffffa03d802e>] ovs_dp_process_received_packet+0x6e/0x150 [openvswitch]
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598589]  [<ffffffffa03de4ff>] ovs_vport_receive+0x5f/0x70 [openvswitch]
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598595]  [<ffffffffa03e0e07>] patch_send+0x27/0x50 [openvswitch]
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598599]  [<ffffffffa03de52e>] ovs_vport_send+0x1e/0x50 [openvswitch]
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598603]  [<ffffffffa03d5552>] do_execute_actions+0x3e2/0x790 [openvswitch]
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598607]  [<ffffffffa03de52e>] ? ovs_vport_send+0x1e/0x50 [openvswitch]
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598610]  [<ffffffffa03d5552>] ? do_execute_actions+0x3e2/0x790 [openvswitch]
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598613]  [<ffffffffa03d5968>] ovs_execute_actions+0x68/0x110 [openvswitch]
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598617]  [<ffffffffa03d802e>] ovs_dp_process_received_packet+0x6e/0x150 [openvswitch]
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598621]  [<ffffffffa045b9b2>] ? tcp_in_window+0x342/0x5e0 [nf_conntrack]
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598626]  [<ffffffffa03de4ff>] ovs_vport_receive+0x5f/0x70 [openvswitch]
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598630]  [<ffffffffa03e0143>] internal_dev_xmit+0x23/0x30 [openvswitch]
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598632]  [<ffffffff815848b6>] dev_hard_start_xmit+0x256/0x550
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598634]  [<ffffffff81584e7c>] dev_queue_xmit+0x2cc/0x470
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598637]  [<ffffffff8159f87a>] ? eth_header+0x3a/0xf0
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598640]  [<ffffffff8158c832>] neigh_resolve_output+0x122/0x210
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598642]  [<ffffffff815adf85>] ? nf_hook_slow+0x75/0x150
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598644]  [<ffffffff815ba840>] ? ip_fragment+0x810/0x810
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598646]  [<ffffffff815ba9be>] ip_finish_output+0x17e/0x2d0
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598648]  [<ffffffff815bb4a6>] ip_output+0x66/0xa0
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598650]  [<ffffffff815b58d0>] ? inet_del_protocol+0x40/0x40
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598653]  [<ffffffff815b7689>] ip_forward_finish+0x69/0x80
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598655]  [<ffffffff815b7931>] ip_forward+0x291/0x3e0
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598657]  [<ffffffff815b59dd>] ip_rcv_finish+0x10d/0x370
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598660]  [<ffffffff815b6291>] ip_rcv+0x201/0x300
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598662]  [<ffffffff81582a13>] ? netif_receive_skb+0x23/0x90
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598664]  [<ffffffff81582576>] __netif_receive_skb+0x4c6/0x540
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598666]  [<ffffffff815835c1>] process_backlog+0xb1/0x190
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598668]  [<ffffffff815832f4>] net_rx_action+0x134/0x240
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598671]  [<ffffffff8105ba88>] __do_softirq+0xa8/0x210
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598679]  [<ffffffff8169e7de>] ? _raw_spin_lock+0xe/0x20
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598686]  [<ffffffff816a841c>] call_softirq+0x1c/0x30
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598694]  [<ffffffff81016245>] do_softirq+0x65/0xa0
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598702]  [<ffffffff8105be6e>] irq_exit+0x8e/0xb0
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598710]  [<ffffffff816a8c73>] do_IRQ+0x63/0xe0
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598715]  [<ffffffff8169ec6a>] common_interrupt+0x6a/0x6a
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598716]  <EOI>
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598717] ---[ end trace 4ed1c8725cfe8f94 ]---
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598733] ------------[ cut here ]------------
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598736] WARNING: at /build/buildd/linux-lts-quantal-3.5.0/net/core/dev.c:1904 skb_warn_bad_offload+0xbe/0xc9()
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598737] Hardware name: PowerEdge M610
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598738] : caps=(0x00000000400158e9, 0x0000000000000000) len=2856 data_len=1402 gso_size=1402 gso_type=1 ip_summed=1
> May 20 06:31:04 ukko233-cern-controller kernel: [67607.598739] Modules linked in: 8021q garp xt_conntrack ipt_REDIRECT ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE xt_state ipt_REJECT xt_CHECKSUM bridge stp llc xt_tcpudp iptable_filter iptable_mangle iptable_nat nf_nat vesafb nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 ip_tables x_tables openvswitch(O) iscsi_trgt(O) nfsd nfs lockd fscache auth_rpcgss nfs_acl sunrpc ib_iser ext2 rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi gpio_ich coretemp kvm_intel kvm dcdbas microcode wmi acpi_power_meter lpc_ich joydev ioatdma dca i7core_edac edac_core mac_hid lp parport hid_generic usbhid hid usb_storage uas mptsas mptscsih mptbase scsi_transport_sas bnx2x libcrc32c mdio bnx2
> 
> The kernel log grows to more than 30 GB within a few hours.
> 
> Has anybody else experienced this?
> 
> Any hint that could help us fix this problem would be appreciated.
> 
> Regards.
> Salman. 
> 
> 
> 
> 
> _______________________________________________
> OpenStack-operators mailing list
> OpenStack-operators at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
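
To get a quick picture of the log flood described above, a small script can count the skb_warn_bad_offload warnings and group them by the parameters printed with each one. The following is only a minimal sketch: the names PARAMS, IP_SUMMED, SAMPLE and summarize are made up for illustration, the sample lines are copied from the log above, and the ip_summed names are an assumption based on include/linux/skbuff.h in 3.x kernels.

```python
import re
from collections import Counter

# Matches the parameter line that accompanies each skb_warn_bad_offload warning.
PARAMS = re.compile(
    r"len=(?P<len>\d+) data_len=(?P<data_len>\d+) "
    r"gso_size=(?P<gso_size>\d+) gso_type=(?P<gso_type>\d+) "
    r"ip_summed=(?P<ip_summed>\d+)"
)

# Assumption: checksum states as defined in skbuff.h for 3.x kernels.
IP_SUMMED = {
    0: "CHECKSUM_NONE",
    1: "CHECKSUM_UNNECESSARY",
    2: "CHECKSUM_COMPLETE",
    3: "CHECKSUM_PARTIAL",
}

# Two lines taken from the log in this message.
SAMPLE = [
    "May 20 06:31:04 ukko233-cern-controller kernel: [67607.598736] "
    "WARNING: at /build/buildd/linux-lts-quantal-3.5.0/net/core/dev.c:1904 "
    "skb_warn_bad_offload+0xbe/0xc9()",
    "May 20 06:31:04 ukko233-cern-controller kernel: [67607.598738] "
    ": caps=(0x00000000400158e9, 0x0000000000000000) len=2856 data_len=1402 "
    "gso_size=1402 gso_type=1 ip_summed=1",
]


def summarize(lines):
    """Return (number of warnings, Counter of distinct parameter sets)."""
    warnings = 0
    params = Counter()
    for line in lines:
        # Each warning starts with a "WARNING: at ... skb_warn_bad_offload" line.
        if "WARNING:" in line and "skb_warn_bad_offload" in line:
            warnings += 1
        match = PARAMS.search(line)
        if match:
            fields = {k: int(v) for k, v in match.groupdict().items()}
            # Replace the numeric checksum state with a readable name.
            fields["ip_summed"] = IP_SUMMED.get(fields["ip_summed"], "unknown")
            params[tuple(sorted(fields.items()))] += 1
    return warnings, params


if __name__ == "__main__":
    count, seen = summarize(SAMPLE)
    print("warnings:", count)
    for key, hits in seen.most_common():
        print(hits, dict(key))
```

To run it against a real log, pass an open file object instead of SAMPLE (on Ubuntu the kernel log is usually /var/log/kern.log, but that path is an assumption about the setup). Because the file is read line by line, it should also cope with a log of many gigabytes.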



