[openstack-dev] [networking-ovs-dpdk]
Mooney, Sean K
sean.k.mooney at intel.com
Wed Nov 18 12:19:21 UTC 2015
Hi, that is great to know.
I will report this behavior internally to our DPDK team.
I already have a patch to change our default target to native-linuxapp,
https://review.openstack.org/#/c/246375/, which should merge shortly.
I'm glad it is now working for you.
From: Prathyusha Guduri [mailto:prathyushaconnects at gmail.com]
Sent: Wednesday, November 18, 2015 6:13 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [networking-ovs-dpdk]
Thanks a lot Sean, that was helpful.
Changing the target from ivshmem to native-linuxapp removed the error, and it no longer hangs at creating the external bridge.
All processes (nova-api, neutron, ovs-vswitchd, etc.) did start.
Thanks,
Prathyusha
On Tue, Nov 17, 2015 at 7:57 PM, Mooney, Sean K <sean.k.mooney at intel.com> wrote:
We mainly test with 2M hugepages, not 1G; however, our CI does use 1G pages.
We recently noticed a different, unrelated issue with using the ivshmem target when building DPDK
(https://bugs.launchpad.net/networking-ovs-dpdk/+bug/1517032).
Instead of modifying DPDK, can you try
changing the default DPDK build target to x86_64-native-linuxapp-gcc?
This can be done by adding
RTE_TARGET=x86_64-native-linuxapp-gcc to the local.conf
and removing the following file to force a rebuild: "/opt/stack/ovs/BUILD_COMPLETE"
I agree with your assessment, though: this appears to be a timing issue in DPDK 2.0.
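A minimal sketch of the two steps above, for anyone who wants to script them. The helper name set_rte_target and the local.conf location in the usage comment are assumptions for illustration; only RTE_TARGET and the BUILD_COMPLETE path come from this thread.

```shell
# Sketch: set RTE_TARGET in a devstack local.conf, replacing any existing value.
# Assumption: appending is safe because [[local|localrc]] is the last section
# of the file; edit by hand if your local.conf has later sections.
set_rte_target() {
    local conf="$1" target="$2" tmp
    if grep -q '^RTE_TARGET=' "$conf"; then
        # Rewrite the existing line through a temp file (keeps file permissions).
        tmp="$(mktemp)"
        sed "s|^RTE_TARGET=.*|RTE_TARGET=${target}|" "$conf" > "$tmp" \
            && cat "$tmp" > "$conf"
        rm -f "$tmp"
    else
        echo "RTE_TARGET=${target}" >> "$conf"
    fi
}

# Usage (the local.conf path is an assumed location, adjust to your checkout):
#   set_rte_target "$HOME/devstack/local.conf" x86_64-native-linuxapp-gcc
#   rm -f /opt/stack/ovs/BUILD_COMPLETE   # forces the rebuild on the next stack.sh
```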
From: Prathyusha Guduri [mailto:prathyushaconnects at gmail.com]
Sent: Tuesday, November 17, 2015 1:42 PM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [networking-ovs-dpdk]
Here is stack.sh log -
2015-11-17 13:38:50.010 | Loading uio module
2015-11-17 13:38:50.028 | Loading DPDK UIO module
2015-11-17 13:38:50.038 | starting ovs db
2015-11-17 13:38:50.038 | binding nics
2015-11-17 13:38:50.039 | starting vswitchd
2015-11-17 13:38:50.190 | sudo RTE_SDK=/opt/stack/DPDK-v2.0.0 RTE_TARGET=build /opt/stack/DPDK-v2.0.0/tools/dpdk_nic_bind.py -b igb_uio 0000:07:00.0
2015-11-17 13:38:50.527 | sudo ovs-vsctl --no-wait --may-exist add-port br-eth1 dpdk0 -- set Interface dpdk0 type=dpdk
2015-11-17 13:38:51.671 | Waiting for ovs-vswitchd to start...
2015-11-17 13:38:52.685 | Waiting for ovs-vswitchd to start...
2015-11-17 13:38:53.702 | Waiting for ovs-vswitchd to start...
2015-11-17 13:38:54.720 | Waiting for ovs-vswitchd to start...
2015-11-17 13:38:55.733 | Waiting for ovs-vswitchd to start...
2015-11-17 13:38:56.749 | Waiting for ovs-vswitchd to start...
2015-11-17 13:38:57.768 | Waiting for ovs-vswitchd to start...
2015-11-17 13:38:58.787 | Waiting for ovs-vswitchd to start...
2015-11-17 13:38:59.802 | Waiting for ovs-vswitchd to start...
2015-11-17 13:39:00.818 | Waiting for ovs-vswitchd to start...
2015-11-17 13:39:01.836 | Waiting for ovs-vswitchd to start...
2015-11-17 13:39:02.849 | Waiting for ovs-vswitchd to start...
2015-11-17 13:39:03.866 | Waiting for ovs-vswitchd to start...
2015-11-17 13:39:04.884 | Waiting for ovs-vswitchd to start...
2015-11-17 13:39:05.905 | Waiting for ovs-vswitchd to start...
2015-11-17 13:39:06.923 | Waiting for ovs-vswitchd to start...
2015-11-17 13:39:07.937 | Waiting for ovs-vswitchd to start...
2015-11-17 13:39:08.956 | Waiting for ovs-vswitchd to start...
2015-11-17 13:39:09.973 | Waiting for ovs-vswitchd to start...
2015-11-17 13:39:10.988 | Waiting for ovs-vswitchd to start...
2015-11-17 13:39:12.004 | Waiting for ovs-vswitchd to start...
2015-11-17 13:39:13.022 | Waiting for ovs-vswitchd to start...
2015-11-17 13:39:14.040 | Waiting for ovs-vswitchd to start...
2015-11-17 13:39:15.060 | Waiting for ovs-vswitchd to start...
2015-11-17 13:39:16.073 | Waiting for ovs-vswitchd to start...
2015-11-17 13:39:17.089 | Waiting for ovs-vswitchd to start...
2015-11-17 13:39:18.108 | Waiting for ovs-vswitchd to start...
2015-11-17 13:39:19.121 | Waiting for ovs-vswitchd to start...
2015-11-17 13:39:20.138 | Waiting for ovs-vswitchd to start...
2015-11-17 13:39:21.156 | Waiting for ovs-vswitchd to start...
2015-11-17 13:39:22.169 | Waiting for ovs-vswitchd to start...
2015-11-17 13:39:23.185 | Waiting for ovs-vswitchd to start...
On Tue, Nov 17, 2015 at 6:50 PM, Prathyusha Guduri <prathyushaconnects at gmail.com> wrote:
Hi Sean,
Here is ovs-vswitchd.log
2015-11-13T12:48:01Z|00001|dpdk|INFO|User-provided -vhost_sock_dir in use: /var/run/openvswitch
EAL: Detected lcore 0 as core 0 on socket 0
EAL: Detected lcore 1 as core 1 on socket 0
EAL: Detected lcore 2 as core 2 on socket 0
EAL: Detected lcore 3 as core 3 on socket 0
EAL: Detected lcore 4 as core 4 on socket 0
EAL: Detected lcore 5 as core 5 on socket 0
EAL: Detected lcore 6 as core 0 on socket 0
EAL: Detected lcore 7 as core 1 on socket 0
EAL: Detected lcore 8 as core 2 on socket 0
EAL: Detected lcore 9 as core 3 on socket 0
EAL: Detected lcore 10 as core 4 on socket 0
EAL: Detected lcore 11 as core 5 on socket 0
EAL: Support maximum 128 logical core(s) by configuration.
EAL: Detected 12 lcore(s)
EAL: VFIO modules not all loaded, skip VFIO support...
EAL: Searching for IVSHMEM devices...
EAL: No IVSHMEM configuration found!
EAL: Setting up memory...
EAL: Ask a virtual area of 0x180000000 bytes
EAL: Virtual area found at 0x7f1e00000000 (size = 0x180000000)
EAL: remap_all_hugepages(): mmap failed: Cannot allocate memory
EAL: Failed to remap 1024 MB pages
PANIC in rte_eal_init():
Cannot init memory
7: [/usr/sbin/ovs-vswitchd() [0x40b803]]
6: [/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7f1fb52d3ec5]]
5: [/usr/sbin/ovs-vswitchd() [0x40a822]]
4: [/usr/sbin/ovs-vswitchd() [0x675432]]
3: [/usr/sbin/ovs-vswitchd() [0x442155]]
2: [/usr/sbin/ovs-vswitchd() [0x407c9f]]
1: [/usr/sbin/ovs-vswitchd() [0x447828]]
Before this, hugepages were free and port binding had also been done, so I suspected a DPDK-specific issue. remap_all_hugepages() in /opt/stack/DPDK-v2.0.0/lib/librte_eal/linuxapp/eal/eal_memory.c first unmaps and then mmaps, and the mmap is what fails here. On the DPDK mailing list I found that the unmap can take long enough for the following mmap to fail, and that putting a sleep(1) between the unmap and the mmap is supposed to solve the issue. Please check the link below:
https://lists.01.org/pipermail/dpdk-ovs/2014-April/000864.html
After making that change, the ovs-vswitchd command hangs at this point:
2015-11-17T10:52:38Z|00001|dpdk|INFO|User-provided -vhost_sock_dir in use: /var/run/openvswitch
2015-11-17 10:52:38.680 | EAL: Detected lcore 0 as core 0 on socket 0
2015-11-17 10:52:38.680 | EAL: Detected lcore 1 as core 1 on socket 0
2015-11-17 10:52:38.680 | EAL: Detected lcore 2 as core 2 on socket 0
2015-11-17 10:52:38.680 | EAL: Detected lcore 3 as core 3 on socket 0
2015-11-17 10:52:38.680 | EAL: Detected lcore 4 as core 4 on socket 0
2015-11-17 10:52:38.680 | EAL: Detected lcore 5 as core 5 on socket 0
2015-11-17 10:52:38.680 | EAL: Detected lcore 6 as core 0 on socket 0
2015-11-17 10:52:38.680 | EAL: Detected lcore 7 as core 1 on socket 0
2015-11-17 10:52:38.680 | EAL: Detected lcore 8 as core 2 on socket 0
2015-11-17 10:52:38.680 | EAL: Detected lcore 9 as core 3 on socket 0
2015-11-17 10:52:38.680 | EAL: Detected lcore 10 as core 4 on socket 0
2015-11-17 10:52:38.680 | EAL: Detected lcore 11 as core 5 on socket 0
2015-11-17 10:52:38.680 | EAL: Support maximum 128 logical core(s) by configuration.
2015-11-17 10:52:38.680 | EAL: Detected 12 lcore(s)
2015-11-17 10:52:38.687 | EAL: VFIO modules not all loaded, skip VFIO support...
2015-11-17 10:52:38.687 | EAL: Searching for IVSHMEM devices...
2015-11-17 10:52:38.687 | EAL: No IVSHMEM configuration found!
2015-11-17 10:52:38.687 | EAL: Setting up memory...
2015-11-17 10:52:39.252 | EAL: Ask a virtual area of 0x1c00000 bytes
2015-11-17 10:52:39.252 | EAL: Virtual area found at 0x7fcab3a00000 (size = 0x1c00000)
2015-11-17 10:52:53.265 | EAL: Ask a virtual area of 0x200000 bytes
2015-11-17 10:52:53.266 | EAL: Virtual area found at 0x7fcab3600000 (size = 0x200000)
2015-11-17 10:52:54.266 | EAL: Ask a virtual area of 0x200000 bytes
2015-11-17 10:52:54.266 | EAL: Virtual area found at 0x7fcab3200000 (size = 0x200000)
2015-11-17 10:52:55.267 | EAL: Ask a virtual area of 0x22c00000 bytes
2015-11-17 10:52:55.267 | EAL: Virtual area found at 0x7fca90400000 (size = 0x22c00000)
2015-11-17 10:57:33.574 | EAL: Ask a virtual area of 0x1800000 bytes
2015-11-17 10:57:33.574 | EAL: Virtual area found at 0x7fca8ea00000 (size = 0x1800000)
2015-11-17 10:57:45.585 | EAL: Ask a virtual area of 0xd9800000 bytes
2015-11-17 10:57:45.585 | EAL: Virtual area found at 0x7fc9b5000000 (size = 0xd9800000)
2015-11-17 11:26:50.605 | EAL: Ask a virtual area of 0x200000 bytes
2015-11-17 11:26:50.605 | EAL: Virtual area found at 0x7fc9b4c00000 (size = 0x200000)
2015-11-17 11:26:51.606 | EAL: Ask a virtual area of 0x200000 bytes
2015-11-17 11:26:51.606 | EAL: Virtual area found at 0x7fc9b4800000 (size = 0x200000)
2015-11-17 11:26:52.608 | EAL: Requesting 1024 pages of size 2MB from socket 0
2015-11-17 11:26:53.111 | EAL: TSC frequency is ~3491914 KHz
2015-11-17 11:26:53.111 | EAL: Master lcore 1 is ready (tid=b73cd700;cpuset=[1])
2015-11-17 11:26:53.111 | PMD: ENICPMD trace: rte_enic_pmd_init
2015-11-17 11:26:53.111 | EAL: PCI device 0000:07:00.0 on NUMA socket 0
2015-11-17 11:26:53.111 | EAL: probe driver: 8086:10d3 rte_em_pmd
2015-11-17 11:26:53.111 | EAL: PCI memory mapped at 0x7fcab5600000
2015-11-17 11:26:53.111 | EAL: PCI memory mapped at 0x7fcab730f000
2015-11-17 11:26:53.111 | EAL: PCI memory mapped at 0x7fcab73d6000
2015-11-17 11:26:53.189 | PMD: eth_em_dev_init(): port_id 0 vendorID=0x8086 deviceID=0x10d3
2015-11-17 11:26:53.190 | 2015-11-17T11:26:53Z|00002|ovs_numa|INFO|Discovered 12 CPU cores on NUMA node 0
2015-11-17 11:26:53.190 | 2015-11-17T11:26:53Z|00003|ovs_numa|INFO|Discovered 1 NUMA nodes and 12 CPU cores
2015-11-17 11:26:53.190 | 2015-11-17T11:26:53Z|00004|memory|INFO|10680 kB peak resident set size after 2054.5 seconds
2015-11-17 11:26:53.190 | 2015-11-17T11:26:53Z|00005|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connecting...
2015-11-17 11:26:53.190 | 2015-11-17T11:26:53Z|00006|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connected
2015-11-17 11:26:53.194 | 2015-11-17T11:26:53Z|00007|ofproto_dpif|INFO|netdev at ovs-netdev: Datapath supports recirculation
2015-11-17 11:26:53.194 | 2015-11-17T11:26:53Z|00008|ofproto_dpif|INFO|netdev at ovs-netdev: MPLS label stack length probed as 3
2015-11-17 11:26:53.194 | 2015-11-17T11:26:53Z|00009|ofproto_dpif|INFO|netdev at ovs-netdev: Datapath supports unique flow ids
2015-11-17 11:26:53.195 | 2015-11-17T11:26:53Z|00010|bridge|INFO|bridge br-eth1: added interface br-eth1 on port 65534
2015-11-17 11:26:53.197 | 2015-11-17T11:26:53Z|00011|dpif_netlink|ERR|Generic Netlink family 'ovs_datapath' does not exist. The Open vSwitch kernel module is probably not loaded.
2015-11-17 11:26:53.287 | Zone 0: name:<MALLOC_S0_HEAP_0>, phys:0x9b600000, len:0xb00000, virt:0x7fca8ea00000, socket_id:0, flags:0
2015-11-17 11:26:53.287 | Zone 1: name:<RG_MP_log_history>, phys:0x36600000, len:0x2080, virt:0x7fcab3600000, socket_id:0, flags:0
2015-11-17 11:26:53.287 | Zone 2: name:<MP_log_history>, phys:0x9c100000, len:0x28a0c0, virt:0x7fca8f500000, socket_id:0, flags:0
2015-11-17 11:26:53.287 | Zone 3: name:<rte_eth_dev_data>, phys:0x36602080, len:0x1f400, virt:0x7fcab3602080, socket_id:0, flags:0
2015-11-17 11:26:53.287 | PMD: eth_em_tx_queue_setup(): sw_ring=0x7fca8f4efd40 hw_ring=0x7fcab3621480 dma_addr=0x36621480
2015-11-17 11:26:53.287 | PMD: eth_em_rx_queue_setup(): sw_ring=0x7fca8f4ebc40 hw_ring=0x7fcab3631480 dma_addr=0x36631480
2015-11-17 11:26:53.368 | PMD: eth_em_start(): <<
2015-11-17 11:26:53.368 | 2015-11-17T11:26:53Z|00012|dpdk|INFO|Port 0: 68:05:ca:1b:ca:c9
2015-11-17 11:26:53.405 | PMD: eth_em_tx_queue_setup(): sw_ring=0x7fca8f4efe00 hw_ring=0x7fcab3621480 dma_addr=0x36621480
2015-11-17 11:26:53.405 | PMD: eth_em_rx_queue_setup(): sw_ring=0x7fca8f4ebdc0 hw_ring=0x7fcab3631480 dma_addr=0x36631480
2015-11-17 11:26:53.486 | PMD: eth_em_start(): <<
2015-11-17 11:26:53.486 | 2015-11-17T11:26:53Z|00013|dpdk|INFO|Port 0: 68:05:ca:1b:ca:c9
2015-11-17 11:26:53.487 | 2015-11-17T11:26:53Z|00014|dpif_netdev|INFO|Created 1 pmd threads on numa node 0
2015-11-17 11:26:53.487 | 2015-11-17T11:26:53Z|00001|dpif_netdev(pmd10)|INFO|Core 0 processing port 'dpdk0'
2015-11-17 11:26:53.488 | 2015-11-17T11:26:53Z|00002|dpif_netdev(pmd10)|INFO|Core 0 processing port 'dpdk0'
2015-11-17 11:26:53.488 | 2015-11-17T11:26:53Z|00015|bridge|INFO|bridge br-eth1: added interface dpdk0 on port 1
2015-11-17 11:26:53.488 | 2015-11-17T11:26:53Z|00016|bridge|INFO|bridge br-int: added interface br-int on port 65534
2015-11-17 11:26:53.488 | 2015-11-17T11:26:53Z|00017|bridge|INFO|bridge br-eth1: using datapath ID 00006805ca1bcac9
2015-11-17 11:26:53.488 | 2015-11-17T11:26:53Z|00018|connmgr|INFO|br-eth1: added service controller "punix:/var/run/openvswitch/br-eth1.mgmt"
2015-11-17 11:26:53.489 | 2015-11-17T11:26:53Z|00019|bridge|INFO|bridge br-int: using datapath ID 00002ef7b66a8742
2015-11-17 11:26:53.489 | 2015-11-17T11:26:53Z|00020|connmgr|INFO|br-int: added service controller "punix:/var/run/openvswitch/br-int.mgmt"
2015-11-17 11:26:53.490 | 2015-11-17T11:26:53Z|00021|dpif_netdev|INFO|Created 2 pmd threads on numa node 0
2015-11-17 11:26:53.492 | 2015-11-17T11:26:53Z|00022|bridge|INFO|ovs-vswitchd (Open vSwitch) 2.4.90
2015-11-17 11:26:53.493 | 2015-11-17T11:26:53Z|00001|dpif_netdev(pmd23)|INFO|Core 2 processing port 'dpdk0'
2015-11-17 11:27:03.494 | 2015-11-17T11:27:03Z|00023|memory|INFO|peak resident set size grew 93% in last 10.3 seconds, from 10680 kB to 20572 kB
2015-11-17 11:27:03.494 | 2015-11-17T11:27:03Z|00024|memory|INFO|handlers:4 ports:3 revalidators:2 rules:10
ubuntu at ubuntu-Precision-Tower-5810:/opt/stack/DPDK-v2.0.0/lib/librte_eal/linuxapp/eal$ ps -Al | grep ovs
5 S 0 1681 2595 0 80 0 - 4433 poll_s ? 00:00:00 ovsdb-server
4 S 0 1716 1715 0 80 0 - 4636 wait pts/3 00:00:00 ovs-dpdk
4 S 0 2124 1716 99 80 0 - 870841 poll_s pts/3 03:42:31 ovs-vswitchd
So now ovs-vswitchd runs, unlike last time.
I really don't understand what I am missing.
On Tue, Nov 17, 2015 at 5:14 PM, Mooney, Sean K <sean.k.mooney at intel.com> wrote:
Can you provide the ovs-vswitchd log from ${OVS_LOG_DIR}/ovs-vswitchd.log
(/tmp/ovs-vswitchd.log in your case)?
If the vswitch fails to start we clean up by unmounting the hugepages.
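When the log is long, a quick filter can pull out the lines that matter. This is only a sketch: the function name find_eal_errors is made up here, and the pattern list is an assumption built from the failure messages quoted elsewhere in this thread, so it will miss failures that print something different.

```shell
# Print the EAL/PANIC failure lines from an ovs-vswitchd log file.
# The patterns are taken from the messages quoted in this thread and are
# not an exhaustive list of DPDK error strings.
find_eal_errors() {
    local log="$1"
    grep -E 'PANIC in|mmap failed|Failed to remap|Cannot init memory' "$log"
}

# Usage:
#   find_eal_errors /tmp/ovs-vswitchd.log
```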
From: Prathyusha Guduri [mailto:prathyushaconnects at gmail.com]
Sent: Tuesday, November 17, 2015 7:37 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [networking-ovs-dpdk]
Hi Sean,
While debugging the ovs-dpdk-init script I realised that the main issue is with the following command
$ screen -dms ovs-vswitchd sudo sg $qemu_group -c "umask 002; ${OVS_INSTALL_DIR}/sbin/ovs-vswitchd --dpdk -vhost_sock_dir $OVS_DB_SOCKET_DIR -c $OVS_CORE_MASK -n $OVS_MEM_CHANNELS --proc-type primary --huge-dir $OVS_HUGEPAGE_MOUNT --socket-mem $OVS_SOCKET_MEM $pciAddressWhitelist -- unix:$OVS_DB_SOCKET 2>&1 | tee ${OVS_LOG_DIR}/ovs-vswitchd.log"
which I guess starts the ovs-vswitchd application. Before this command, the hugepages are mounted and port binding is also done, but the screen command still fails.
I verified the db.sock and conf.db files.
Any help is highly appreciated.
Thanks,
Prathyusha
On Mon, Nov 16, 2015 at 5:12 PM, Prathyusha Guduri <prathyushaconnects at gmail.com> wrote:
Hi Sean,
Thanks for your response.
in your case though you are using 1GB hugepages so I don’t think this is related to memory fragmentation
or a lack of free hugepages.
to use preallocated 1GB page with ovs you should instead set the following in your local.conf
OVS_HUGEPAGE_MOUNT_PAGESIZE=1G
OVS_ALLOCATE_HUGEPAGES=False
Added the above two parameters to the local.conf. The same problem again.
Basically it throws this error -
2015-11-16 11:31:44.741 | starting vswitchd
2015-11-16 11:31:44.863 | sudo RTE_SDK=/opt/stack/DPDK-v2.0.0 RTE_TARGET=build /opt/stack/DPDK-v2.0.0/tools/dpdk_nic_bind.py -b igb_uio 0000:07:00.0
2015-11-16 11:31:45.169 | sudo ovs-vsctl --no-wait --may-exist add-port br-eth1 dpdk0 -- set Interface dpdk0 type=dpdk
2015-11-16 11:31:46.314 | Waiting for ovs-vswitchd to start...
2015-11-16 11:31:47.442 | libvirt-bin stop/waiting
2015-11-16 11:31:49.473 | libvirt-bin start/running, process 2255
2015-11-16 11:31:49.477 | [ERROR] /etc/init.d/ovs-dpdk:563 ovs-vswitchd application failed to start
Manually mounting /mnt/huge and then commenting out that part of the /etc/init.d/ovs-dpdk script also throws the same error.
Using a 1G hugepage size should not cause any memory-related problem. I don't understand why it is not mounting then.
Here is the /opt/stack/networking-ovs-dpdk/devstack/ovs-dpdk/ovs-dpdk.conf
RTE_SDK=${RTE_SDK:-/opt/stack/DPDK}
RTE_TARGET=${RTE_TARGET:-x86_64-ivshmem-linuxapp-gcc}
OVS_INSTALL_DIR=/usr
OVS_DB_CONF_DIR=/etc/openvswitch
OVS_DB_SOCKET_DIR=/var/run/openvswitch
OVS_DB_CONF=$OVS_DB_CONF_DIR/conf.db
OVS_DB_SOCKET=OVS_DB_SOCKET_DIR/db.sock
OVS_SOCKET_MEM=2048,2048
OVS_MEM_CHANNELS=4
OVS_CORE_MASK=${OVS_CORE_MASK:-2}
OVS_PMD_CORE_MASK=${OVS_PMD_CORE_MASK:-4}
OVS_LOG_DIR=/tmp
OVS_LOCK_DIR=''
OVS_SRC_DIR=/opt/stack/ovs
OVS_DIR=${OVS_DIR:-${OVS_SRC_DIR}}
OVS_UTILS=${OVS_DIR}/utilities/
OVS_DB_UTILS=${OVS_DIR}/ovsdb/
OVS_DPDK_DIR=$RTE_SDK
OVS_NUM_HUGEPAGES=${OVS_NUM_HUGEPAGES:-5}
OVS_HUGEPAGE_MOUNT=${OVS_HUGEPAGE_MOUNT:-/mnt/huge}
OVS_HUGEPAGE_MOUNT_PAGESIZE=''
OVS_BOND_MODE=$OVS_BOND_MODE
OVS_BOND_PORTS=$OVS_BOND_PORTS
OVS_BRIDGE_MAPPINGS=eth1
OVS_PCI_MAPPINGS=0000:07:00.0#eth1
OVS_DPDK_PORT_MAPPINGS=''
OVS_TUNNEL_CIDR_MAPPING=''
OVS_ALLOCATE_HUGEPAGES=True
OVS_INTERFACE_DRIVER='igb_uio'
I verified OVS_DB_SOCKET_DIR and all the others; conf.db and db.sock exist. So why is ovs-vswitchd failing to start? Am I missing something?
Thanks,
Prathyusha
On Mon, Nov 16, 2015 at 4:39 PM, Mooney, Sean K <sean.k.mooney at intel.com> wrote:
Hi
Yes, sorry for the delay in responding to you and Samta.
In your case, assuming you are using 2MB hugepages, it is easy to hit DPDK's default max memory segments.
This can be changed by setting OVS_DPDK_MEM_SEGMENTS=<arbitrary large number that you will never hit>
in the local.conf and recompiling. To do this, simply remove the build complete file in /opt/stack/ovs:
rm -f /opt/stack/ovs/BUILD_COMPLETE
In your case, though, you are using 1GB hugepages, so I don't think this is related to memory fragmentation
or a lack of free hugepages.
To use preallocated 1GB pages with OVS you should instead set the following in your local.conf:
OVS_HUGEPAGE_MOUNT_PAGESIZE=1G
OVS_ALLOCATE_HUGEPAGES=False
Regards
sean
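The choice between the two sets of settings above depends on the hugepage size the kernel reports, which can be read from /proc/meminfo. A rough sketch follows; the function names are hypothetical, and the 2MB branch only prints a reminder because no concrete OVS_DPDK_MEM_SEGMENTS value is given in this thread.

```shell
# Read the default hugepage size in kB from a meminfo-style file.
hugepage_size_kb() {
    awk '/^Hugepagesize:/ { print $2 }' "${1:-/proc/meminfo}"
}

# Print the local.conf lines suggested above when 1G pages are in use.
suggest_hugepage_settings() {
    local size_kb
    size_kb="$(hugepage_size_kb "${1:-/proc/meminfo}")"
    if [ "$size_kb" = "1048576" ]; then
        echo "OVS_HUGEPAGE_MOUNT_PAGESIZE=1G"
        echo "OVS_ALLOCATE_HUGEPAGES=False"
    else
        # No specific value is recommended in the thread, so none is printed.
        echo "# ${size_kb} kB pages: consider raising OVS_DPDK_MEM_SEGMENTS and rebuilding"
    fi
}

# Usage:
#   suggest_hugepage_settings            # reads /proc/meminfo
```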
From: Prathyusha Guduri [mailto:prathyushaconnects at gmail.com]
Sent: Monday, November 16, 2015 6:20 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [networking-ovs-dpdk]
Hi all,
I have a similar problem to Samta's and am stuck at the same place. The following command
$sudo ovs-vsctl br-set-external-id br-ex bridge-id br-ex
hangs forever. As Sean said, it might be because of the ovs-vswitchd process.
> The vswitchd process may exit if it failed to allocate memory (due to memory fragmentation or lack of free hugepages)
> if the ovs-vswitchd.log is not available can you check that the hugepage mount point was created in
> /mnt/huge and that it is mounted
> Run
> ls -al /mnt/huge
> and
> mount
>
$mount
/dev/sda6 on / type ext4 (rw,errors=remount-ro)
proc on /proc type proc (rw,noexec,nosuid,nodev)
sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
none on /sys/fs/cgroup type tmpfs (rw)
none on /sys/fs/fuse/connections type fusectl (rw)
none on /sys/kernel/debug type debugfs (rw)
none on /sys/kernel/security type securityfs (rw)
udev on /dev type devtmpfs (rw,mode=0755)
devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=0620)
tmpfs on /run type tmpfs (rw,noexec,nosuid,size=10%,mode=0755)
none on /run/lock type tmpfs (rw,noexec,nosuid,nodev,size=5242880)
none on /run/shm type tmpfs (rw,nosuid,nodev)
none on /run/user type tmpfs (rw,noexec,nosuid,nodev,size=104857600,mode=0755)
none on /sys/fs/pstore type pstore (rw)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,relatime,cpuset)
cgroup on /sys/fs/cgroup/cpu type cgroup (rw,relatime,cpu)
cgroup on /sys/fs/cgroup/cpuacct type cgroup (rw,relatime,cpuacct)
cgroup on /sys/fs/cgroup/memory type cgroup (rw,relatime,memory)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,relatime,devices)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,relatime,freezer)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,relatime,blkio)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,relatime,perf_event)
cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,relatime,hugetlb)
systemd on /sys/fs/cgroup/systemd type cgroup (rw,noexec,nosuid,nodev,none,name=systemd)
gvfsd-fuse on /run/user/1000/gvfs type fuse.gvfsd-fuse (rw,nosuid,nodev,user=ubuntu)
/mnt/huge is my mount point, so no mounting is happening.
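Scanning a long mount listing by eye is error-prone, so the check can be scripted. A small sketch, assuming the "X on PATH type FSTYPE (opts)" layout shown in the mount output in this thread; the function name is hypothetical.

```shell
# Succeeds (exit 0) if the mount table on stdin lists a hugetlbfs mount
# at the given path, fails (exit 1) otherwise.
# Assumes the "<dev> on <path> type <fstype> (<opts>)" layout printed by mount.
is_hugetlbfs_mounted() {
    local mount_point="$1"
    awk -v mp="$mount_point" '
        $2 == "on" && $3 == mp && $5 == "hugetlbfs" { found = 1 }
        END { exit !found }
    '
}

# Usage:
#   mount | is_hugetlbfs_mounted /mnt/huge && echo mounted || echo "not mounted"
```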
ovs-vswitchd.log says
2015-11-13T12:48:01Z|00001|dpdk|INFO|User-provided -vhost_sock_dir in use: /var/run/openvswitch
EAL: Detected lcore 0 as core 0 on socket 0
EAL: Detected lcore 1 as core 1 on socket 0
EAL: Detected lcore 2 as core 2 on socket 0
EAL: Detected lcore 3 as core 3 on socket 0
EAL: Detected lcore 4 as core 4 on socket 0
EAL: Detected lcore 5 as core 5 on socket 0
EAL: Detected lcore 6 as core 0 on socket 0
EAL: Detected lcore 7 as core 1 on socket 0
EAL: Detected lcore 8 as core 2 on socket 0
EAL: Detected lcore 9 as core 3 on socket 0
EAL: Detected lcore 10 as core 4 on socket 0
EAL: Detected lcore 11 as core 5 on socket 0
EAL: Support maximum 128 logical core(s) by configuration.
EAL: Detected 12 lcore(s)
EAL: VFIO modules not all loaded, skip VFIO support...
EAL: Searching for IVSHMEM devices...
EAL: No IVSHMEM configuration found!
EAL: Setting up memory...
EAL: Ask a virtual area of 0x180000000 bytes
EAL: Virtual area found at 0x7f1e00000000 (size = 0x180000000)
EAL: remap_all_hugepages(): mmap failed: Cannot allocate memory
EAL: Failed to remap 1024 MB pages
PANIC in rte_eal_init():
Cannot init memory
7: [/usr/sbin/ovs-vswitchd() [0x40b803]]
6: [/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7f1fb52d3ec5]]
5: [/usr/sbin/ovs-vswitchd() [0x40a822]]
4: [/usr/sbin/ovs-vswitchd() [0x675432]]
3: [/usr/sbin/ovs-vswitchd() [0x442155]]
2: [/usr/sbin/ovs-vswitchd() [0x407c9f]]
1: [/usr/sbin/ovs-vswitchd() [0x447828]]
I have configured hugepages in the /boot/grub/grub.cfg file, so there are free hugepages.
AnonHugePages: 378880 kB
HugePages_Total: 6
HugePages_Free: 6
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 1048576 kB
It failed to allocate memory because the mounting was not done. I do not understand why mounting is not done when there are free hugepages.
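As a sanity check on the numbers above, the free hugepage memory can be compared against what OVS_SOCKET_MEM asks for. This is a sketch with hypothetical function names; it compares totals only and says nothing about how pages are distributed across NUMA nodes.

```shell
# Total free hugepage memory in MB, from a meminfo-style file.
free_hugepage_mb() {
    awk '
        /^HugePages_Free:/ { free = $2 }
        /^Hugepagesize:/   { size_kb = $2 }
        END { print free * size_kb / 1024 }
    ' "${1:-/proc/meminfo}"
}

# Sum of the per-socket values (in MB) in an OVS_SOCKET_MEM string such as "2048,2048".
requested_socket_mem_mb() {
    echo "$1" | awk -F, '{ total = 0; for (i = 1; i <= NF; i++) total += $i; print total }'
}

# Usage:
#   free_hugepage_mb                      # reads /proc/meminfo
#   requested_socket_mem_mb "2048,2048"
```

With the values posted in this thread (6 free pages of 1048576 kB, OVS_SOCKET_MEM=2048,2048) this gives 6144 MB free against 4096 MB requested in total.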
Also, the DPDK binding did happen.
$../DPDK-v2.0.0/tools/dpdk_nic_bind.py --status
Network devices using DPDK-compatible driver
============================================
0000:07:00.0 '82574L Gigabit Network Connection' unused=igb_uio
Network devices using kernel driver
===================================
0000:00:19.0 'Ethernet Connection I217-LM' if=eth0 drv=e1000e unused=igb_uio *Active*
0000:06:02.0 '82540EM Gigabit Ethernet Controller' if=eth2 drv=e1000 unused=igb_uio
Other network devices
=====================
None
I am using a 1G NIC for the port (eth1) that is bound to DPDK. Is that a problem? Does a DPDK-bound port necessarily need a 10G NIC? I don't think it is a problem anyway, because the binding is done. Please correct me if I am going wrong.
Thanks,
Prathyusha
On Wed, Nov 11, 2015 at 3:52 PM, Samta Rangare <samtarangare at gmail.com> wrote:
Hi Sean,
Thanks for replying back, response inline.
On Mon, Nov 9, 2015 at 8:24 PM, Mooney, Sean K <sean.k.mooney at intel.com> wrote:
> Hi
> Can you provide some more information regarding your deployment?
>
> Can you check which kernel you are using.
>
> uname -a
Linux ubuntu 3.16.0-50-generic #67~14.04.1-Ubuntu SMP Fri Oct 2 22:07:51 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
>
> If you are using a 3.19 kernel, changes to some locking code in the kernel broke synchronization in DPDK 2.0, and DPDK 2.1 must be used instead.
> In general it is not advisable to use a 3.19 kernel with dpdk as it can lead to non-deterministic behavior.
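The kernel check described above can be scripted against the output of uname -r. A minimal sketch; the function name is hypothetical and it only matches the 3.19 series mentioned here.

```shell
# Succeeds if the given kernel release string belongs to the 3.19 series.
is_kernel_3_19() {
    case "$1" in
        3.19|3.19.*|3.19-*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage:
#   if is_kernel_3_19 "$(uname -r)"; then
#       echo "3.19 kernel detected: DPDK 2.0 is affected, see above"
#   fi
```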
>
> When devstack hangs can you connect with a second ssh session and run
> sudo service ovs-dpdk status
> and
> ps aux | grep ovs
>
sudo service ovs-dpdk status
sourcing config
/opt/stack/logs/ovs-vswitchd.pid is not running
Not all processes are running restart!!!
1
ubuntu at ubuntu:~/samta/devstack$ ps -ef | grep ovs
root 13385 1 0 15:17 ? 00:00:00 /usr/sbin/ovsdb-server --detach --pidfile=/opt/stack/logs/ovsdb-server.pid --remote=punix:/usr/local/var/run/openvswitch/db.sock --remote=db:Open_vSwitch,Open_vSwitch,manager_options
ubuntu 24451 12855 0 15:45 pts/0 00:00:00 grep --color=auto ovs
>
> When the deployment hangs at sudo ovs-vsctl br-set-external-id br-ex bridge-id br-ex
> It usually means that the ovs-vswitchd process has exited.
>
The above result shows that ovs-vswitchd is not running.
> This can happen for a number of reasons.
> The vswitchd process may exit if it failed to allocate memory (due to memory fragmentation or lack of free hugepages)
> if the ovs-vswitchd.log is not available can you check that the hugepage mount point was created in
> /mnt/huge and that it is mounted
> Run
> ls -al /mnt/huge
> and
> mount
>
ls -al /mnt/huge
total 4
drwxr-xr-x 2 libvirt-qemu kvm 0 Nov 11 15:18 .
drwxr-xr-x 3 root root 4096 May 15 00:09 ..
ubuntu at ubuntu:~/samta/devstack$ mount
/dev/mapper/ubuntu--vg-root on / type ext4 (rw,errors=remount-ro)
proc on /proc type proc (rw,noexec,nosuid,nodev)
sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
none on /sys/fs/cgroup type tmpfs (rw)
none on /sys/fs/fuse/connections type fusectl (rw)
none on /sys/kernel/debug type debugfs (rw)
none on /sys/kernel/security type securityfs (rw)
udev on /dev type devtmpfs (rw,mode=0755)
devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=0620)
tmpfs on /run type tmpfs (rw,noexec,nosuid,size=10%,mode=0755)
none on /run/lock type tmpfs (rw,noexec,nosuid,nodev,size=5242880)
none on /run/shm type tmpfs (rw,nosuid,nodev)
none on /run/user type tmpfs (rw,noexec,nosuid,nodev,size=104857600,mode=0755)
none on /sys/fs/pstore type pstore (rw)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,relatime,cpuset)
cgroup on /sys/fs/cgroup/cpu type cgroup (rw,relatime,cpu)
cgroup on /sys/fs/cgroup/cpuacct type cgroup (rw,relatime,cpuacct)
cgroup on /sys/fs/cgroup/memory type cgroup (rw,relatime,memory)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,relatime,devices)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,relatime,freezer)
cgroup on /sys/fs/cgroup/net_cls type cgroup (rw,relatime,net_cls)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,relatime,blkio)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,relatime,perf_event)
cgroup on /sys/fs/cgroup/net_prio type cgroup (rw,relatime,net_prio)
cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,relatime,hugetlb)
/dev/sda1 on /boot type ext2 (rw)
systemd on /sys/fs/cgroup/systemd type cgroup (rw,noexec,nosuid,nodev,none,name=systemd)
hugetlbfs-kvm on /run/hugepages/kvm type hugetlbfs (rw,mode=775,gid=106)
nodev on /mnt/huge type hugetlbfs (rw,uid=106,gid=106)
nodev on /mnt/huge type hugetlbfs (rw,uid=106,gid=106)
> then check how many hugepages are mounted
>
> cat /proc/meminfo | grep -i huge
>
cat /proc/meminfo | grep Huge
AnonHugePages: 292864 kB
HugePages_Total: 5
HugePages_Free: 5
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 1048576 kB
>
> the vswitchd process may also exit if it failed to initialize dpdk interfaces.
> This can happen if no interface is compatible with the igb-uio or vfio-pci drivers
> (note in the vfio-pci case all interfaces in the same iommu group must be bound to the vfio-pci driver and
> the iommu must be enabled in the kernel command line with VT-d enabled in the bios)
>
> Can you check which interfaces are bound to the dpdk driver by running the following command
>
> /opt/stack/DPDK-v2.0.0/tools/dpdk_nic_bind.py --status
>
/opt/stack/DPDK-v2.0.0/tools/dpdk_nic_bind.py --status
Network devices using DPDK-compatible driver
============================================
<none>
Network devices using kernel driver
===================================
0000:01:00.0 'Ethernet Controller 10-Gigabit X540-AT2' if=p1p1 drv=ixgbe unused=igb_uio
0000:02:00.0 'Ethernet Controller XL710 for 40GbE QSFP+' if=p4p1 drv=i40e unused=igb_uio
0000:03:00.0 'Ethernet Controller XL710 for 40GbE QSFP+' if=p2p1 drv=i40e unused=igb_uio
0000:06:00.0 'I350 Gigabit Network Connection' if=em1 drv=igb unused=igb_uio *Active*
0000:06:00.1 'I350 Gigabit Network Connection' if=em2 drv=igb unused=igb_uio
Other network devices
=====================
0000:01:00.1 'Ethernet Controller 10-Gigabit X540-AT2' unused=igb_uio
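To make the check on the status listing mechanical, the entries under the "DPDK-compatible driver" heading can be counted. This sketch assumes the section headings appear exactly as in the outputs pasted in this thread; the function name is hypothetical.

```shell
# Count PCI devices listed under "Network devices using DPDK-compatible driver"
# in dpdk_nic_bind.py --status output read from stdin.
# Assumes the section headings look exactly like those quoted in this thread.
count_dpdk_bound() {
    awk '
        /^Network devices using DPDK-compatible driver/ { in_dpdk = 1; next }
        /^Network devices using kernel driver/ || /^Other network devices/ { in_dpdk = 0 }
        in_dpdk && /^[0-9a-fA-F]+:[0-9a-fA-F]+:/ { n++ }
        END { print n + 0 }
    '
}

# Usage:
#   /opt/stack/DPDK-v2.0.0/tools/dpdk_nic_bind.py --status | count_dpdk_bound
```

A result of 0, as in the listing above where the section shows <none>, means no interface is bound to a DPDK driver.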
>
> Finally can you confirm that ovs-dpdk compiled successfully by either checking the xstack.log or
> checking for the BUILD_COMPLETE file in /opt/stack/ovs
BUILD_COMPLETE exists in /opt/stack/ovs, though it is empty.
>
> Regards
> sean
>
>
>
>
> -----Original Message-----
> From: Samta Rangare [mailto:samtarangare at gmail.com]
> Sent: Monday, November 9, 2015 2:31 PM
> To: Czesnowicz, Przemyslaw
> Cc: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [networking-ovs-dpdk]
>
> Thanks for replying, Przemyslaw. There is no ovs-vswitchd.log in /opt/stack/logs/. This is all it contains: (ovsdb-server.pid, screen).
>
> When I cancel stack.sh (Ctrl-C) and try to rerun $sudo ovs-vsctl br-set-external-id br-ex bridge-id br-ex, it doesn't hang. That means the vSwitch was running, doesn't it?
>
> But rerunning stack.sh after unstack hangs again.
>
> Thanks,
> Samta
>
> On Mon, Nov 9, 2015 at 7:50 PM, Czesnowicz, Przemyslaw <przemyslaw.czesnowicz at intel.com> wrote:
>> Hi Samta,
>>
>> This usually means that the vSwitch is not running/has crashed.
>> Can you check in /opt/stack/logs/ovs-vswitchd.log ? There should be an error msg there.
>>
>> Regards
>> Przemek
>>
>>> -----Original Message-----
>>> From: Samta Rangare [mailto:samtarangare at gmail.com]
>>> Sent: Monday, November 9, 2015 1:51 PM
>>> To: OpenStack Development Mailing List (not for usage questions)
>>> Subject: [openstack-dev] [networking-ovs-dpdk]
>>>
>>> Hello Everyone,
>>>
>>> I am installing devstack with networking-ovs-dpdk. The local.conf
>>> looks exactly like the one available in
>>> /opt/stack/networking-ovs-dpdk/doc/source/_downloads/local.conf.single_node.
>>> So I believe all the necessary configuration will be taken care of.
>>>
>>> However, I am stuck at the place where devstack is trying to set the
>>> external-id ($ sudo ovs-vsctl br-set-external-id br-ex bridge-id
>>> br-ex). As soon as it reaches this point it just hangs forever. I
>>> tried commenting out this line in
>>> lib/neutron_plugin/ml2 (I know this is wrong) and then all services
>>> came up except the ovs-dpdk agent and the ovs agent.
>>>
>>> BTW I am deploying it on Ubuntu 14.04. Any pointers will be really helpful.
>>>
>>> Thanks,
>>> Samta
>>>
>>> __________________________________________________________________________
>>> OpenStack Development Mailing List (not for usage questions)
>>> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev