[openstack-dev] [networking-ovs-dpdk]

Prathyusha Guduri prathyushaconnects at gmail.com
Mon Nov 16 11:42:42 UTC 2015


Hi Sean,

Thanks for your response.

> In your case, though, you are using 1GB hugepages, so I don't think this is
> related to memory fragmentation
> or a lack of free hugepages.
>
> To use preallocated 1GB pages with OVS you should instead set the following
> in your local.conf:
>
> OVS_HUGEPAGE_MOUNT_PAGESIZE=1G
> OVS_ALLOCATE_HUGEPAGES=False

I added the above two parameters to local.conf, but the same problem occurs
again. It throws this error:
2015-11-16 11:31:44.741 | starting vswitchd
2015-11-16 11:31:44.863 | sudo RTE_SDK=/opt/stack/DPDK-v2.0.0
RTE_TARGET=build /opt/stack/DPDK-v2.0.0/tools/dpdk_nic_bind.py -b igb_uio
0000:07:00.0
2015-11-16 11:31:45.169 | sudo ovs-vsctl --no-wait --may-exist add-port
br-eth1 dpdk0 -- set Interface dpdk0 type=dpdk
2015-11-16 11:31:46.314 | Waiting for ovs-vswitchd to start...
2015-11-16 11:31:47.442 | libvirt-bin stop/waiting
2015-11-16 11:31:49.473 | libvirt-bin start/running, process 2255
2015-11-16 11:31:49.477 | [ERROR] /etc/init.d/ovs-dpdk:563 ovs-vswitchd
application failed to start


Manually mounting /mnt/huge and then commenting out that part of the
/etc/init.d/ovs-dpdk script also throws the same error.
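
For reference, the manual mount was roughly of this form (the exact options
may differ):

sudo mkdir -p /mnt/huge
sudo mount -t hugetlbfs -o pagesize=1G nodev /mnt/huge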

Using a 1G hugepage size should not cause any memory-related problem, so I
don't understand why the mount is not happening.
Here is /opt/stack/networking-ovs-dpdk/devstack/ovs-dpdk/ovs-dpdk.conf:

RTE_SDK=${RTE_SDK:-/opt/stack/DPDK}
RTE_TARGET=${RTE_TARGET:-x86_64-ivshmem-linuxapp-gcc}

OVS_INSTALL_DIR=/usr
OVS_DB_CONF_DIR=/etc/openvswitch
OVS_DB_SOCKET_DIR=/var/run/openvswitch
OVS_DB_CONF=$OVS_DB_CONF_DIR/conf.db
OVS_DB_SOCKET=$OVS_DB_SOCKET_DIR/db.sock

OVS_SOCKET_MEM=2048,2048
OVS_MEM_CHANNELS=4
OVS_CORE_MASK=${OVS_CORE_MASK:-2}
OVS_PMD_CORE_MASK=${OVS_PMD_CORE_MASK:-4}
OVS_LOG_DIR=/tmp
OVS_LOCK_DIR=''
OVS_SRC_DIR=/opt/stack/ovs
OVS_DIR=${OVS_DIR:-${OVS_SRC_DIR}}
OVS_UTILS=${OVS_DIR}/utilities/
OVS_DB_UTILS=${OVS_DIR}/ovsdb/
OVS_DPDK_DIR=$RTE_SDK
OVS_NUM_HUGEPAGES=${OVS_NUM_HUGEPAGES:-5}
OVS_HUGEPAGE_MOUNT=${OVS_HUGEPAGE_MOUNT:-/mnt/huge}
OVS_HUGEPAGE_MOUNT_PAGESIZE=''
OVS_BOND_MODE=$OVS_BOND_MODE
OVS_BOND_PORTS=$OVS_BOND_PORTS
OVS_BRIDGE_MAPPINGS=eth1
OVS_PCI_MAPPINGS=0000:07:00.0#eth1
OVS_DPDK_PORT_MAPPINGS=''
OVS_TUNNEL_CIDR_MAPPING=''
OVS_ALLOCATE_HUGEPAGES=True
OVS_INTERFACE_DRIVER='igb_uio'

I verified OVS_DB_SOCKET_DIR and all the others; conf.db and db.sock exist.
So why is ovs-vswitchd failing to start? Am I missing something?



Thanks,
Prathyusha


On Mon, Nov 16, 2015 at 4:39 PM, Mooney, Sean K <sean.k.mooney at intel.com>
wrote:

>
>
> Hi
>
>
>
> Yes, sorry for the delay in responding to you and Samta.
>
>
>
> In your case, assuming you are using 2MB hugepages, it is easy to hit DPDK's
> default maximum number of memory segments.
>
>
>
> This can be changed by setting OVS_DPDK_MEM_SEGMENTS=<an arbitrarily large
> number that you will never hit>
> in the local.conf and recompiling. To do this, simply remove the
> BUILD_COMPLETE file in /opt/stack/ovs:
>
> rm -f /opt/stack/ovs/BUILD_COMPLETE
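>
> For example, a sketch only (the exact value is arbitrary as long as it is
> large enough for your page count), in local.conf:
>
> OVS_DPDK_MEM_SEGMENTS=2048
>
> and then rerun stack.sh so that ovs is rebuilt.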
>
>
>
> In your case, though, you are using 1GB hugepages, so I don't think this is
> related to memory fragmentation
> or a lack of free hugepages.
>
>
>
> To use preallocated 1GB pages with OVS you should instead set the following
> in your local.conf:
>
>
>
> OVS_HUGEPAGE_MOUNT_PAGESIZE=1G
>
> OVS_ALLOCATE_HUGEPAGES=False
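>
> With OVS_ALLOCATE_HUGEPAGES=False the 1G pages must already be reserved by
> the kernel at boot. A minimal sketch, assuming GRUB and the six-page
> reservation shown in your meminfo output, is kernel parameters of the form
>
> default_hugepagesz=1G hugepagesz=1G hugepages=6
>
> added to GRUB_CMDLINE_LINUX in /etc/default/grub, followed by update-grub
> and a reboot.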
>
>
>
> Regards
>
> sean
>
>
>
> From: Prathyusha Guduri [mailto:prathyushaconnects at gmail.com]
> Sent: Monday, November 16, 2015 6:20 AM
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [networking-ovs-dpdk]
>
>
>
> Hi all,
>
> I have a similar problem to Samta's and am stuck at the same place. The
> following command
>
> $sudo ovs-vsctl br-set-external-id br-ex bridge-id br-ex
>
> hangs forever. As Sean said, it might be because of the ovs-vswitchd process.
>
> > The vswitchd process may exit if it failed to allocate memory (due to
> > memory fragmentation or a lack of free hugepages).
> > If the ovs-vswitchd.log is not available, can you check that the hugepage
> > mount point was created in /mnt/huge and that it is mounted?
> > Run
> >         ls -al /mnt/huge
> > and
> >         mount
> >
>
> $mount
>
> /dev/sda6 on / type ext4 (rw,errors=remount-ro)
> proc on /proc type proc (rw,noexec,nosuid,nodev)
> sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
> none on /sys/fs/cgroup type tmpfs (rw)
> none on /sys/fs/fuse/connections type fusectl (rw)
> none on /sys/kernel/debug type debugfs (rw)
> none on /sys/kernel/security type securityfs (rw)
> udev on /dev type devtmpfs (rw,mode=0755)
> devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=0620)
> tmpfs on /run type tmpfs (rw,noexec,nosuid,size=10%,mode=0755)
> none on /run/lock type tmpfs (rw,noexec,nosuid,nodev,size=5242880)
> none on /run/shm type tmpfs (rw,nosuid,nodev)
> none on /run/user type tmpfs
> (rw,noexec,nosuid,nodev,size=104857600,mode=0755)
> none on /sys/fs/pstore type pstore (rw)
> cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,relatime,cpuset)
> cgroup on /sys/fs/cgroup/cpu type cgroup (rw,relatime,cpu)
> cgroup on /sys/fs/cgroup/cpuacct type cgroup (rw,relatime,cpuacct)
> cgroup on /sys/fs/cgroup/memory type cgroup (rw,relatime,memory)
> cgroup on /sys/fs/cgroup/devices type cgroup (rw,relatime,devices)
> cgroup on /sys/fs/cgroup/freezer type cgroup (rw,relatime,freezer)
> cgroup on /sys/fs/cgroup/blkio type cgroup (rw,relatime,blkio)
> cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,relatime,perf_event)
> cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,relatime,hugetlb)
> systemd on /sys/fs/cgroup/systemd type cgroup
> (rw,noexec,nosuid,nodev,none,name=systemd)
> gvfsd-fuse on /run/user/1000/gvfs type fuse.gvfsd-fuse
> (rw,nosuid,nodev,user=ubuntu)
>
> /mnt/huge is my mount point, so no mount is happening.
>
> ovs-vswitchd.log says
>
> 2015-11-13T12:48:01Z|00001|dpdk|INFO|User-provided -vhost_sock_dir in use:
> /var/run/openvswitch
> EAL: Detected lcore 0 as core 0 on socket 0
> EAL: Detected lcore 1 as core 1 on socket 0
> EAL: Detected lcore 2 as core 2 on socket 0
> EAL: Detected lcore 3 as core 3 on socket 0
> EAL: Detected lcore 4 as core 4 on socket 0
> EAL: Detected lcore 5 as core 5 on socket 0
> EAL: Detected lcore 6 as core 0 on socket 0
> EAL: Detected lcore 7 as core 1 on socket 0
> EAL: Detected lcore 8 as core 2 on socket 0
> EAL: Detected lcore 9 as core 3 on socket 0
> EAL: Detected lcore 10 as core 4 on socket 0
> EAL: Detected lcore 11 as core 5 on socket 0
> EAL: Support maximum 128 logical core(s) by configuration.
> EAL: Detected 12 lcore(s)
> EAL: VFIO modules not all loaded, skip VFIO support...
> EAL: Searching for IVSHMEM devices...
> EAL: No IVSHMEM configuration found!
> EAL: Setting up memory...
> EAL: Ask a virtual area of 0x180000000 bytes
> EAL: Virtual area found at 0x7f1e00000000 (size = 0x180000000)
> EAL: remap_all_hugepages(): mmap failed: Cannot allocate memory
> EAL: Failed to remap 1024 MB pages
> PANIC in rte_eal_init():
> Cannot init memory
> 7: [/usr/sbin/ovs-vswitchd() [0x40b803]]
> 6: [/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)
> [0x7f1fb52d3ec5]]
> 5: [/usr/sbin/ovs-vswitchd() [0x40a822]]
> 4: [/usr/sbin/ovs-vswitchd() [0x675432]]
> 3: [/usr/sbin/ovs-vswitchd() [0x442155]]
> 2: [/usr/sbin/ovs-vswitchd() [0x407c9f]]
> 1: [/usr/sbin/ovs-vswitchd() [0x447828]]
>
> I have configured hugepages via the /boot/grub/grub.cfg file, so there are
> free hugepages:
>
>
> AnonHugePages:    378880 kB
> HugePages_Total:       6
> HugePages_Free:        6
> HugePages_Rsvd:        0
> HugePages_Surp:        0
> Hugepagesize:    1048576 kB
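>
> (To double-check that those boot parameters actually took effect, cat
> /proc/cmdline should show the hugepagesz=1G / hugepages entries.)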
>
> It failed to allocate memory because the mount was not done. I do not
> understand why the mount is not happening when there are free hugepages.
>
> Also, the DPDK binding did happen:
>
> $../DPDK-v2.0.0/tools/dpdk_nic_bind.py --status
>
> Network devices using DPDK-compatible driver
> ============================================
> 0000:07:00.0 '82574L Gigabit Network Connection' unused=igb_uio
>
> Network devices using kernel driver
> ===================================
> 0000:00:19.0 'Ethernet Connection I217-LM' if=eth0 drv=e1000e
> unused=igb_uio *Active*
> 0000:06:02.0 '82540EM Gigabit Ethernet Controller' if=eth2 drv=e1000
> unused=igb_uio
>
> Other network devices
> =====================
>
> None
>
> I am using a 1G NIC for the port (eth1) that is bound to DPDK. Is that a
> problem? Does the DPDK-bound port necessarily need a 10G NIC? I don't
> think it's a problem anyway, because the binding is done. Please correct me
> if I am going wrong...
>
> Thanks,
>
> Prathyusha
>
>
>
>
>
>
>
> On Wed, Nov 11, 2015 at 3:52 PM, Samta Rangare <samtarangare at gmail.com>
> wrote:
>
> Hi Sean,
>
> Thanks for the reply; responses are inline.
>
> On Mon, Nov 9, 2015 at 8:24 PM, Mooney, Sean K <sean.k.mooney at intel.com>
> wrote:
> > Hi
> > Can you provide some more information regarding your deployment?
> >
> > Can you check which kernel you are using.
> >
> > uname -a
>
> Linux ubuntu 3.16.0-50-generic #67~14.04.1-Ubuntu SMP Fri Oct 2 22:07:51
> UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
>
> >
> > If you are using a 3.19 kernel, changes to some locking code in the
> > kernel broke synchronization in DPDK 2.0 and require DPDK 2.1 to be used
> > instead.
> > In general it is not advisable to use a 3.19 kernel with DPDK, as it can
> > lead to non-deterministic behavior.
> >
> > When devstack hangs can you connect with a second ssh session and run
> >         sudo service ovs-dpdk status
> > and
> >         ps aux | grep ovs
> >
>
> sudo service ovs-dpdk status
>
> sourcing config
>
> /opt/stack/logs/ovs-vswitchd.pid is not running
>
> Not all processes are running restart!!!
>
> 1
>
> ubuntu at ubuntu:~/samta/devstack$ ps -ef | grep ovs
>
> root     13385     1  0 15:17 ?        00:00:00 /usr/sbin/ovsdb-server
> --detach --pidfile=/opt/stack/logs/ovsdb-server.pid
> --remote=punix:/usr/local/var/run/openvswitch/db.sock
> --remote=db:Open_vSwitch,Open_vSwitch,manager_options
>
> ubuntu   24451 12855  0 15:45 pts/0    00:00:00 grep --color=auto ovs
>
>
>
> >
> > When the deployment hangs at sudo ovs-vsctl br-set-external-id br-ex
> > bridge-id br-ex,
> > it usually means that the ovs-vswitchd process has exited.
> >
>
> The above result shows that ovs-vswitchd is not running.
>
> > This can happen for a number of reasons.
> > The vswitchd process may exit if it failed to allocate memory (due to
> > memory fragmentation or a lack of free hugepages).
> > If the ovs-vswitchd.log is not available, can you check that the hugepage
> > mount point was created in /mnt/huge and that it is mounted?
> > Run
> >         ls -al /mnt/huge
> > and
> >         mount
> >
>
> ls -al /mnt/huge
>
> total 4
>
> drwxr-xr-x 2 libvirt-qemu kvm     0 Nov 11 15:18 .
>
> drwxr-xr-x 3 root         root 4096 May 15 00:09 ..
>
>
>
> ubuntu at ubuntu:~/samta/devstack$ mount
>
> /dev/mapper/ubuntu--vg-root on / type ext4 (rw,errors=remount-ro)
>
> proc on /proc type proc (rw,noexec,nosuid,nodev)
>
> sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
>
> none on /sys/fs/cgroup type tmpfs (rw)
>
> none on /sys/fs/fuse/connections type fusectl (rw)
>
> none on /sys/kernel/debug type debugfs (rw)
>
> none on /sys/kernel/security type securityfs (rw)
>
> udev on /dev type devtmpfs (rw,mode=0755)
>
> devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=0620)
>
> tmpfs on /run type tmpfs (rw,noexec,nosuid,size=10%,mode=0755)
>
> none on /run/lock type tmpfs (rw,noexec,nosuid,nodev,size=5242880)
>
> none on /run/shm type tmpfs (rw,nosuid,nodev)
>
> none on /run/user type tmpfs
> (rw,noexec,nosuid,nodev,size=104857600,mode=0755)
>
> none on /sys/fs/pstore type pstore (rw)
>
> cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,relatime,cpuset)
>
> cgroup on /sys/fs/cgroup/cpu type cgroup (rw,relatime,cpu)
>
> cgroup on /sys/fs/cgroup/cpuacct type cgroup (rw,relatime,cpuacct)
>
> cgroup on /sys/fs/cgroup/memory type cgroup (rw,relatime,memory)
>
> cgroup on /sys/fs/cgroup/devices type cgroup (rw,relatime,devices)
>
> cgroup on /sys/fs/cgroup/freezer type cgroup (rw,relatime,freezer)
>
> cgroup on /sys/fs/cgroup/net_cls type cgroup (rw,relatime,net_cls)
>
> cgroup on /sys/fs/cgroup/blkio type cgroup (rw,relatime,blkio)
>
> cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,relatime,perf_event)
>
> cgroup on /sys/fs/cgroup/net_prio type cgroup (rw,relatime,net_prio)
>
> cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,relatime,hugetlb)
>
> /dev/sda1 on /boot type ext2 (rw)
>
> systemd on /sys/fs/cgroup/systemd type cgroup
> (rw,noexec,nosuid,nodev,none,name=systemd)
>
> hugetlbfs-kvm on /run/hugepages/kvm type hugetlbfs (rw,mode=775,gid=106)
>
> nodev on /mnt/huge type hugetlbfs (rw,uid=106,gid=106)
>
> nodev on /mnt/huge type hugetlbfs (rw,uid=106,gid=106)
>
>
>
> > then check how many hugepages are allocated:
> >
> >         cat /proc/meminfo | grep -i huge
> >
>
>
>
> cat /proc/meminfo | grep Huge
>
> AnonHugePages:    292864 kB
>
> HugePages_Total:       5
>
> HugePages_Free:        5
>
> HugePages_Rsvd:        0
>
> HugePages_Surp:        0
>
> Hugepagesize:    1048576 kB
>
>
> >
> > The vswitchd process may also exit if it fails to initialize the DPDK
> > interfaces.
> > This can happen if no interface is compatible with the igb_uio or
> > vfio-pci drivers.
> > (Note: in the vfio-pci case all interfaces in the same iommu group must be
> > bound to the vfio-pci driver, and the iommu must be enabled on the kernel
> > command line with VT-d enabled in the BIOS.)
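> >
> > (For the iommu part, a typical example would be something like
> > intel_iommu=on iommu=pt on the kernel command line; the exact flags depend
> > on the platform.)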
> >
> > Can you check which interfaces are bound to the DPDK driver by running
> > the following command:
> >
> >         /opt/stack/DPDK-v2.0.0/tools/dpdk_nic_bind.py --status
> >
>
> /opt/stack/DPDK-v2.0.0/tools/dpdk_nic_bind.py --status
>
>
>
> Network devices using DPDK-compatible driver
>
> ============================================
>
> <none>
>
>
>
> Network devices using kernel driver
>
> ===================================
>
> 0000:01:00.0 'Ethernet Controller 10-Gigabit X540-AT2' if=p1p1 drv=ixgbe
> unused=igb_uio
>
> 0000:02:00.0 'Ethernet Controller XL710 for 40GbE QSFP+' if=p4p1 drv=i40e
> unused=igb_uio
>
> 0000:03:00.0 'Ethernet Controller XL710 for 40GbE QSFP+' if=p2p1 drv=i40e
> unused=igb_uio
>
> 0000:06:00.0 'I350 Gigabit Network Connection' if=em1 drv=igb
> unused=igb_uio *Active*
>
> 0000:06:00.1 'I350 Gigabit Network Connection' if=em2 drv=igb
> unused=igb_uio
>
>
>
> Other network devices
>
> =====================
>
> 0000:01:00.1 'Ethernet Controller 10-Gigabit X540-AT2' unused=igb_uio
>
>
> >
> > Finally, can you confirm that ovs-dpdk compiled successfully by either
> > checking the xstack.log or
> > checking for the BUILD_COMPLETE file in /opt/stack/ovs?
>
> BUILD_COMPLETE exists in /opt/stack/ovs, though it's empty.
>
> >
> > Regards
> > sean
> >
> >
> >
> >
> > -----Original Message-----
> > From: Samta Rangare [mailto:samtarangare at gmail.com]
> > Sent: Monday, November 9, 2015 2:31 PM
> > To: Czesnowicz, Przemyslaw
> > Cc: OpenStack Development Mailing List (not for usage questions)
> > Subject: Re: [openstack-dev] [networking-ovs-dpdk]
> >
> > Thanks for replying, Przemyslaw. There is no ovs-vswitchd.log in
> > /opt/stack/logs/. This is all it contains: ovsdb-server.pid, screen.
> >
> > When I cancelled stack.sh (Ctrl-C) and reran $sudo ovs-vsctl
> > br-set-external-id br-ex bridge-id br-ex, it didn't hang; that means the
> > vSwitch was running, doesn't it?
> >
> > But rerunning stack.sh after unstack hangs again.
> >
> > Thanks,
> > Samta
> >
> > On Mon, Nov 9, 2015 at 7:50 PM, Czesnowicz, Przemyslaw <
> przemyslaw.czesnowicz at intel.com> wrote:
> >> Hi Samta,
> >>
> >> This usually means that the vSwitch is not running/has crashed.
> >> Can you check in /opt/stack/logs/ovs-vswitchd.log? There should be an
> >> error msg there.
> >>
> >> Regards
> >> Przemek
> >>
> >>> -----Original Message-----
> >>> From: Samta Rangare [mailto:samtarangare at gmail.com]
> >>> Sent: Monday, November 9, 2015 1:51 PM
> >>> To: OpenStack Development Mailing List (not for usage questions)
> >>> Subject: [openstack-dev] [networking-ovs-dpdk]
> >>>
> >>> Hello Everyone,
> >>>
> >>> I am installing devstack with networking-ovs-dpdk. The local.conf
> >>> looks exactly like the one available in /opt/stack/networking-ovs-
> >>> dpdk/doc/source/_downloads/local.conf.single_node,
> >>> so I believe all the necessary configuration will be taken care of.
> >>>
> >>> However, I am stuck at the place where devstack is trying to set the
> >>> external-id ($ sudo ovs-vsctl br-set-external-id br-ex bridge-id
> >>> br-ex). As soon as it hits this place it just hangs forever. I
> >>> tried commenting this line out of
> >>> lib/neutron_plugin/ml2 (I know this is wrong) and then all services
> >>> came up except the ovs-dpdk agent and the ovs agent.
> >>>
> >>> BTW I am deploying it on Ubuntu 14.04. Any pointer will be really
> >>> helpful.
> >>>
> >>> Thanks,
> >>> Samta
> >>>
> >
> >

