[Openstack] [neutron][ml2][sriov] Issues with neutron behaviour

Itzik Brown itzikb at redhat.com
Sun Feb 8 09:14:27 UTC 2015


Hi,
There is a weekly meeting for PCI Passthrough:
https://wiki.openstack.org/wiki/Meetings/Passthrough

I think it's a good idea to attend the meetings and share ideas there.

BR,
Itzik

On 02/06/2015 07:14 AM, Akilesh K wrote:
> I do understand that, and that is why I believe it should not be that 
> way. The profile should rather be populated by the sriov-nic-switch-agent 
> that is running on the compute node. That way it is possible to do 
> interface-attach, because the profile is already populated and nova 
> doesn't have to do it, and the agent can keep track of the devices 
> that are available instead of nova tracking them.
>
> There is already a discussion on cleaning up the interaction between 
> nova and neutron, and our case could probably be a part of it.
>
> Thank you,
> Ageeleshwar K
>
> On Thu, Feb 5, 2015 at 6:45 PM, Irena Berezovsky
> <irenab.dev at gmail.com> wrote:
>
>
>
>     On Thu, Feb 5, 2015 at 9:38 AM, Akilesh K
>     <akilesh1597 at gmail.com> wrote:
>
>         I know that vif_type is binding_failed on a multinode setup,
>         and I also know why it happens.
>
>         As for interface-attach, I got it to work for sriov ports and
>         even verified that it works inside the instance. The trick was
>         to specify the profile with pci_slot and pci_vendor_info during
>         port create, in case anyone else wants to do this.
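>
>         Roughly, the command I used looked like this (a sketch from my
>         setup: the network name, host, PCI address and vendor:product
>         pair are mine, and the type=dict syntax is worth
>         double-checking against your neutronclient version):
>
>         neutron port-create private-net \
>             --binding:vnic_type direct \
>             --binding:host_id compute1 \
>             --binding:profile type=dict \
>             pci_slot=0000:03:10.1,pci_vendor_info=8086:10ed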
>
>     binding:profile is not supposed to be populated by the user; it
>     can be set only under admin credentials, and in the SR-IOV case it
>     should actually be populated by nova. Manual population of the
>     profile with pci_slot details can be very dangerous, since you skip
>     the phase in which this pci slot is reserved by nova. The system
>     may become inconsistent.
>
>
>
>         Thank you,
>         Ageeleshwar K
>
>         On Thu, Feb 5, 2015 at 12:19 PM, Irena Berezovsky
>         <irenab.dev at gmail.com> wrote:
>
>             Hi Akilesh,
>             Please see my responses inline.
>             Hope this help,
>
>             BR,
>             Irena
>
>             On Thu, Feb 5, 2015 at 6:14 AM, Akilesh K
>             <akilesh1597 at gmail.com> wrote:
>
>                 Hi Irena,
>
>                 Issue 1 - I agree. You are correct.
>
>                 Issue 2
>                 The behavior you outlined:
>                 1. When a port is created with vnic_type=direct, the
>                 vif_type is 'unbound'. The pci_vendor_info will be
>                 available during port update when the 'nova boot'
>                 command is invoked and a PCI device is allocated.
>                 This happens when the controller and compute are on
>                 the same host, not when they are on different hosts.
>                 On a multiserver setup, vif_type is set to
>                 binding_failed during port create.
>
>             This is strange, since the port-create operation is a pure
>             neutron API call, and its behaviour should not differ
>             between a multiserver and an all-in-one setup.
>
>                 Second, I am not doing 'nova boot'; instead I am doing
>                 'nova interface-attach'. In this case the
>                 pci_vendor_info is not updated by anyone but me, and
>                 pci_slot is also not populated.
>
>             interface-attach is currently not supported for SR-IOV
>             ports. There is a proposed blueprint to support this:
>             https://review.openstack.org/#/c/139910/.
>             So for now, the only option to provide a PCI passthrough
>             vNIC is the one described in the previously referenced wiki
>             page: create a neutron port with vnic_type=direct and then
>             'nova boot' with the pre-created port.
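>
>             In other words (a sketch; the network, flavor, image and
>             instance names are placeholders):
>
>             port_id=$(neutron port-create private-net \
>                 --binding:vnic_type direct | awk '/ id /{print $4}')
>             nova boot --flavor m1.small --image fedora21 \
>                 --nic port-id=$port_id sriov-vm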
>
>             Do you still think this is correct?
>
>
>
>
>                 On Wed, Feb 4, 2015 at 8:08 PM, Irena Berezovsky
>                 <irenab.dev at gmail.com> wrote:
>
>                     Hi Akilesh,
>                     please see inline
>
>                     On Wed, Feb 4, 2015 at 11:32 AM, Akilesh K
>                     <akilesh1597 at gmail.com> wrote:
>
>                         Hi,
>                         Issue 1:
>                         I do not understand what you mean. I did
>                         specify the physical_network. What I am trying
>                         to say is that some physical networks exist
>                         only on the compute node and not on the network
>                         node. We are unable to create a network on
>                         those physnets. The workaround was to fake
>                         their existence on the network node too, which
>                         I believe is the wrong way to do it.
>
>                     Every physical network should be defined at the
>                     controller node, including the range of
>                     segmentation ids (i.e. vlan ids) available for
>                     allocation. When a virtual network is created, you
>                     should verify that it has an associated network
>                     type and segmentation id (assuming you are using
>                     the provider network extension).
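>
>                     For example (a sketch; the network name, physnet
>                     and vlan id below are placeholders):
>
>                     neutron net-create sriov-net \
>                         --provider:network_type vlan \
>                         --provider:physical_network Physnet1 \
>                         --provider:segmentation_id 100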
>
>
>                         Issue 2:
>                         I looked directly into the code after looking
>                         at the logs.
>
>                         1. What neutron (the sriov mech driver) does is
>                         load the default list of
>                         'supported_pci_vendor_devs', then pick up the
>                         profile->pci_vendor_info from the port
>                         definition we sent in the port create request
>                         and check whether it is supported. If not, it
>                         says 'binding_failed'.
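>
>                         That list comes from the sriov mech driver's
>                         config, something like (a sketch; the path may
>                         differ per distro, and the vendor:product pair
>                         is the one from my setup):
>
>                         # /etc/neutron/plugins/ml2/ml2_conf_sriov.ini
>                         [ml2_sriov]
>                         supported_pci_vendor_devs = 8086:10ed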
>
>                     When a port is created with vnic_type=direct, the
>                     vif_type is 'unbound'. The pci_vendor_info will be
>                     available during port update when the 'nova boot'
>                     command is invoked and a PCI device is allocated.
>
>
>                         I am fine with this.
>
>                         2. Then, when I attach the created port to a
>                         host, nova's vif driver (hw_veb) looks for
>                         profile->pci_slot in the context of the port
>                         that was supplied and fails to attach it to the
>                         instance if it is not present.
>
>                     The nova vif driver receives profile->pci_slot
>                     from neutron, but it was actually filled in earlier
>                     by nova during port-update.
>
>                         This is what I think should be done by neutron
>                         itself: neutron's sriov mech driver should have
>                         updated the port with the pci_slot details when
>                         the port was created, and this does happen on a
>                         single-machine install. We need to find out why
>                         it does not happen on a multi-node install
>                         (possibly because the mech driver is not
>                         running on the host with the sriov devices) and
>                         fix it.
>
>                     I suggest following the instructions at
>                     https://wiki.openstack.org/wiki/SR-IOV-Passthrough-For-Networking;
>                     this should work for you.
>
>                     I hope you guys can understand what I mean.
>
>
>                         Thank you,
>                         Ageeleshwar K
>
>
>                         On Wed, Feb 4, 2015 at 2:49 PM, Itzik Brown
>                         <itzikb at redhat.com> wrote:
>
>                             Hi,
>
>                             Issue 1:
>                             You must specify the physical networks.
>                             Please look at:
>                             https://wiki.openstack.org/wiki/SR-IOV-Passthrough-For-Networking
>
>                             Issue 2:
>                             AFAIK the agent is supported by only one
>                             vendor.
>                             Can you please look for errors in
>                             Neutron's log?
>
>                             Thanks,
>                             Itzik
>
>                             On 02/04/2015 09:12 AM, Akilesh K wrote:
>>                             Hi,
>>                             I found two issues with the way neutron
>>                             behaves on a multi-server install. I got
>>                             it to work, but I do not think this is the
>>                             right way to do it. It might be a bug we
>>                             might want to fix, and one for which I
>>                             could volunteer.
>>
>>                             Setup - Multi-server Juno on Ubuntu.
>>
>>                             Machine 1 - Controller
>>                             All API servers, plus the l3, dhcp and
>>                             ovs agents
>>
>>                             Machine 2 - Compute
>>                             nova-compute, neutron-ovs-agent, neutron
>>                             sriov agent.
>>
>>
>>                             Issue 1:
>>
>>                             Controller node has physnets 'External',
>>                             'Internal' configured in ml2
>>
>>                             Compute node has physnets 'Internal',
>>                             'Physnet1', 'Physnet2' configured in ml2
>>
>>                             When I do neutron net-create
>>                             --provider:physical_network Physnet1, it
>>                             complains that 'Physnet1' is not
>>                             available.
>>
>>                             Of course it's not available on the
>>                             controller, but it is available on the
>>                             compute node, and there is no way to tell
>>                             neutron to host that network on the
>>                             compute node alone.
>>
>>                             Workaround:
>>                             I had to include 'Physnet1' on the
>>                             controller node also to get it to work,
>>                             except that there are no bridge mappings
>>                             for this physnet.
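>>
>>                             Concretely, something like this in the
>>                             controller's ml2 config (a sketch; the
>>                             vlan ranges are placeholders, and only
>>                             the Physnet1 entry matters here):
>>
>>                             # /etc/neutron/plugins/ml2/ml2_conf.ini
>>                             [ml2_type_vlan]
>>                             network_vlan_ranges = External,Internal:100:199,Physnet1:200:299
>>                             # note: no Physnet1 entry in the
>>                             # controller's ovs bridge_mappings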
>>
>>
>>                             Issue 2:
>>
>>                             This is related to the sriov agent. This
>>                             agent is configured only on the compute
>>                             node, as that node alone has supported
>>                             devices.
>>
>>                             When I do a port-create with
>>                             --binding:vnic_type direct
>>                             --binding:host_id <compute node>, the port
>>                             is created but with binding:vif_type
>>                             'binding_failed', and naturally I could
>>                             not attach it to any instance.
>>
>>                             I looked at the code and figured out that
>>                             the neutron API expects binding:profile as
>>                             well, in the format
>>                             {"pci_slot": "0000:03:10.1",
>>                             "pci_vendor_info": "8086:10ed"}
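>>
>>                             For reference, you can inspect what a port
>>                             ended up with via something like (I
>>                             believe -F narrows the fields; <port-id>
>>                             is the uuid of the port created above):
>>
>>                             neutron port-show <port-id> \
>>                                 -F binding:vif_type -F binding:profile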
>>
>>                             Is this how it should be? Because on a
>>                             single-machine install I did not have to
>>                             do this; however, on a multiserver one I
>>                             had to give even the pci address in the
>>                             exact format to get it to work.
>>
>>                             I have a serious feeling that this could
>>                             be a lot simpler if neutron could take
>>                             care of finding the details in a smart way
>>                             rather than relying on the administrator
>>                             to find which device is available and
>>                             configure it.
>>
>>
>>                             Note:
>>                             1. If I can get some expert advice, I can
>>                             fix both of these.
>>                             2. I am not sure if this question should
>>                             rather be sent to the openstack-dev list.
>>                             Let me know.
>>
>>
>>                             Thank you,
>>                             Ageeleshwar K
>>
>
