[openstack-dev] [nova] [neutron] PCI pass-through network support
Henry Gessau
gessau at cisco.com
Tue Oct 29 21:23:29 UTC 2013
On Tue, Oct 29, at 4:31 pm, Jiang, Yunhong <yunhong.jiang at intel.com> wrote:
> Henry, why do you think the "service VM" needs the entire PF instead of a
> VF? I think the SR-IOV NIC should provide QoS and performance isolation.
I was speculating. I just thought it might be a good idea to leave open the
possibility of assigning a PF to a VM if the need arises.
Neutron service VMs are a new thing. I will be following the discussions and
there is a summit session for them. It remains to be seen if there is any
desire/need for full PF ownership of NICs. But if a service VM owns the PF
and has the right NIC driver it could do some advanced features with it.
> As to assigning an entire PCI device to a guest, that should be OK since
> usually the PF and VF have different device IDs. The tricky thing is, at
> least for some PCI devices, you can't configure some NICs to have SR-IOV
> enabled while others don't.
Thanks for the warning. :) Perhaps the cloud admin might plug an extra
NIC into just a few nodes (one or two per rack, maybe) for the purpose of
running service VMs there. Again, just speculating. I don't know how hard it
is to manage non-homogeneous nodes.
>
> Thanks
> --jyh
>
>> -----Original Message-----
>> From: Henry Gessau [mailto:gessau at cisco.com]
>> Sent: Tuesday, October 29, 2013 8:10 AM
>> To: OpenStack Development Mailing List (not for usage questions)
>> Subject: Re: [openstack-dev] [nova] [neutron] PCI pass-through network
>> support
>>
>> Lots of great info and discussion going on here.
>>
>> One additional thing I would like to mention is regarding PF and VF usage.
>>
>> Normally VFs will be assigned to instances, and the PF will either not be
>> used at all, or maybe some agent in the host of the compute node might
>> have
>> access to the PF for something (management?).
>>
>> There is a neutron design track around the development of "service VMs".
>> These are dedicated instances that run neutron services like routers,
>> firewalls, etc. It is plausible that a service VM would like to use PCI
>> passthrough and get the entire PF. This would allow it to have complete
>> control over a physical link, which I think will be wanted in some cases.
>>
>> --
>> Henry
>>
>> On Tue, Oct 29, at 10:23 am, Irena Berezovsky <irenab at mellanox.com> wrote:
>>
>> > Hi,
>> >
>> > I would like to share some details regarding the support provided by the
>> > Mellanox plugin. It enables networking via SRIOV pass-through devices or
>> > macvtap interfaces. The plugin is available here:
>> > https://github.com/openstack/neutron/tree/master/neutron/plugins/mlnx
>> >
>> > To support either the PCI pass-through device or the macvtap interface
>> > type of vNICs, we set the neutron port profile:vnic_type according to the
>> > required VIF type and then use the created port to 'nova boot' the VM.
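As an illustration of that flow, here is a minimal sketch of the port body one might hand to neutron's create_port before booting the VM. The key names, in particular `profile`/`vnic_type`, follow the description above and are assumptions for illustration, not a documented API contract:

```python
# Hypothetical sketch, not actual Mellanox plugin code: build the request
# body for a neutron port whose profile carries the required vNIC type.
# 'nova boot' would then consume the created port via its port-id.

def build_sriov_port_body(network_id, vnic_type):
    """Return a create_port body carrying the requested vNIC type."""
    if vnic_type not in ("direct", "macvtap"):
        raise ValueError("unsupported vnic_type: %s" % vnic_type)
    return {
        "port": {
            "network_id": network_id,
            "profile": {"vnic_type": vnic_type},
        }
    }

body = build_sriov_port_body("net-1234", "direct")
print(body["port"]["profile"]["vnic_type"])  # direct
```

The same body shape would work for the macvtap case by swapping the vnic_type value.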
>> >
>> > To overcome the missing scheduler awareness for PCI devices, which was
>> > not yet part of the Havana release, we have an additional service
>> > (embedded switch daemon) that runs on each compute node.
>> >
>> > This service manages the SRIOV resource allocation, answers vNIC
>> > discovery queries and applies VLAN/MAC configuration using standard
>> > Linux APIs (code is here:
>> > https://github.com/mellanox-openstack/mellanox-eswitchd ). The embedded
>> > switch daemon serves as a glue layer between the VIF driver and the
>> > Neutron agent.
>> >
>> > In the Icehouse release, when SRIOV resource allocation is already part
>> > of Nova, we plan to eliminate the need for the embedded switch daemon
>> > service. What is left to figure out is how to tie the neutron port to the
>> > PCI device and invoke the networking configuration.
>> >
>> >
>> >
>> > In our case what we have is actually a hardware VEB that is not
>> > programmed via either 802.1Qbg or 802.1Qbh, but configured locally by the
>> > Neutron agent. We also support both Ethernet and InfiniBand physical
>> > network L2 technology. This means that we apply different configuration
>> > commands to set the configuration on the VF.
>> >
>> > I guess what we have to figure out is how to support the generic case of
>> > PCI device networking support, covering the HW VEB, 802.1Qbg and
>> > 802.1Qbh cases.
>> >
>> >
>> >
>> > BR,
>> >
>> > Irena
>> >
>> >
>> >
>> > *From:* Robert Li (baoli) [mailto:baoli at cisco.com]
>> > *Sent:* Tuesday, October 29, 2013 3:31 PM
>> > *To:* Jiang, Yunhong; Irena Berezovsky; prashant.upadhyaya at aricent.com;
>> > chris.friesen at windriver.com; He, Yongli; Itzik Brown
>> > *Cc:* OpenStack Development Mailing List; Brian Bowen (brbowen); Kyle
>> > Mestery (kmestery); Sandhya Dasu (sadasu)
>> > *Subject:* Re: [openstack-dev] [nova] [neutron] PCI pass-through
>> > network support
>> >
>> >
>> >
>> > Hi Yunhong,
>> >
>> >
>> >
>> > I haven't looked at Mellanox in much detail. I think that we'll get more
>> > details from Irena down the road. Regarding your question, I can only
>> > answer based on my experience with Cisco's VM-FEX. In a nutshell:
>> >
>> > -- a vNIC is connected to an external switch. Once the host is booted
>> > up, all the PFs and VFs provisioned on the vNIC will be created, as well
>> > as all the corresponding ethernet interfaces.
>> >
>> > -- As far as Neutron is concerned, a neutron port can be associated
>> > with a VF. One way to do so is to specify this requirement in the -nic
>> > option, providing information such as:
>> >
>> > . PCI alias (this is the same alias as defined in your nova
>> > blueprints)
>> >
>> > . direct pci-passthrough/macvtap
>> >
>> > . port profileid that is compliant with 802.1Qbh
>> >
>> > -- similar to how you translate the nova flavor with PCI requirements
>> > to PCI requests for scheduling purposes, Nova API (the nova-api
>> > component) can translate the above to PCI requests for scheduling
>> > purposes. I can give more detail later on this.
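That translation step could be reduced to something like the following sketch. Field names such as `pci_alias`, `vnic_type` and `profileid` are illustrative, taken from the list above, not nova's real data model:

```python
# Hypothetical sketch: turning per-nic SR-IOV requirements into PCI
# requests that a scheduler could match against a host's PCI inventory.

def nics_to_pci_requests(nic_specs):
    """Each -nic spec that carries a PCI alias becomes one PCI request."""
    requests = []
    for nic in nic_specs:
        alias = nic.get("pci_alias")
        if alias is None:
            continue  # an ordinary virtual NIC; no PCI device needed
        requests.append({
            "alias": alias,
            "count": 1,
            "vnic_type": nic.get("vnic_type", "direct"),
            "profileid": nic.get("profileid"),
        })
    return requests

reqs = nics_to_pci_requests([
    {"net_id": "net-1"},  # plain vNIC, ignored for PCI scheduling
    {"net_id": "net-2", "pci_alias": "intel_vf", "profileid": "my-port-profile"},
])
print(len(reqs))  # 1
```

The resulting request list would then be merged with any PCI requests coming from the flavor before the scheduler makes its host decision.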
>> >
>> >
>> >
>> > Regarding your last question, since the vNIC is already connected with
>> > the external switch, the vNIC driver will be responsible for
>> > communicating the port profile to the external switch. As you already
>> > know, libvirt provides several ways to specify a VM to be booted up with
>> > SRIOV. For example, in the following interface definition:
>> >
>> >
>> >
>> >   <interface type='hostdev' managed='yes'>
>> >     <source>
>> >       <address type='pci' domain='0' bus='0x09' slot='0x0' function='0x01'/>
>> >     </source>
>> >     <mac address='01:23:45:67:89:ab'/>
>> >     <virtualport type='802.1Qbh'>
>> >       <parameters profileid='my-port-profile'/>
>> >     </virtualport>
>> >   </interface>
>> >
>> >
>> >
>> > The SRIOV VF (bus 0x09, VF 0x01) will be allocated, and the port profile
>> > 'my-port-profile' will be used to provision this VF. Libvirt will be
>> > responsible for invoking the vNIC driver to configure this VF with the
>> > port profile 'my-port-profile'. The driver will talk to the external
>> > switch using the 802.1Qbh standards to complete the VF's configuration
>> > and binding with the VM.
>> >
>> >
>> >
>> > Now that nova PCI passthrough is responsible for
>> > discovering/scheduling/allocating a VF, the rest of the puzzle is to
>> > associate this PCI device with the feature that's going to use it, and
>> > the feature will be responsible for configuring it. You can also see from
>> > the above example that in one implementation of SRIOV, the feature (in
>> > this case neutron) may not need to do much in terms of working with the
>> > external switch; the work is actually done by libvirt behind the scenes.
>> >
>> >
>> >
>> > Now the questions are:
>> >
>> > -- how the port profile gets defined/managed
>> >
>> > -- how the port profile gets associated with a neutron network
>> >
>> > The first question will be specific to the particular product, and
>> > therefore a particular neutron plugin has to manage that.
>> >
>> > There may be several approaches to address the second question. For
>> > example, in the simplest case, a port profile can be associated with a
>> > neutron network. This has some significant drawbacks. Since the port
>> > profile defines features for all the ports that use it, the one port
>> > profile to one neutron network mapping would mean all the ports on the
>> > network will have exactly the same features (for example, QoS
>> > characteristics). To make it flexible, the binding of a port profile to a
>> > port may be done at port creation time.
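That flexible binding could be sketched as follows. The names here are hypothetical, purely to illustrate a per-port profile overriding a network-wide default:

```python
# Illustrative sketch of the binding discussed above: a port-level
# profileid, set at port creation time, wins over the profile associated
# with the neutron network. Neither dict shape is real neutron schema.

def resolve_port_profile(port, network):
    """Per-port profileid wins; otherwise fall back to the network default."""
    profileid = port.get("profileid")
    if profileid:
        return profileid
    return network.get("default_profileid")

network = {"id": "net-1", "default_profileid": "tenant-default"}
print(resolve_port_profile({"id": "p1"}, network))
print(resolve_port_profile({"id": "p2", "profileid": "gold-qos"}, network))
```

With this resolution order, the one-profile-per-network mapping remains available as a convenient default without forcing identical features onto every port.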
>> >
>> >
>> >
>> > Let me know if the above answered your question.
>> >
>> >
>> >
>> > thanks,
>> >
>> > Robert
>> >
>> >
>> >
>> > On 10/29/13 3:03 AM, "Jiang, Yunhong" <yunhong.jiang at intel.com> wrote:
>> >
>> >
>> >
>> > Robert, is it possible to have an IRC meeting? I'd prefer an IRC
>> > meeting because it's more openstack style and also keeps the minutes
>> > clearly.
>> >
>> >
>> >
>> > As to your flow, can you give a more detailed example? For example, I
>> > can consider a user specifying the instance with a -nic option that
>> > gives a network id; how does nova then derive the PCI device
>> > requirement? I assume the network id should define the switches that
>> > the device can connect to, but how is that information translated to
>> > the PCI property requirement? Will this translation happen before the
>> > nova scheduler makes the host decision?
>> >
>> >
>> >
>> > Thanks
>> >
>> > --jyh
>> >
>> >
>> >
>> > *From:* Robert Li (baoli) [mailto:baoli at cisco.com]
>> > *Sent:* Monday, October 28, 2013 12:22 PM
>> > *To:* Irena Berezovsky; prashant.upadhyaya at aricent.com; Jiang, Yunhong;
>> > chris.friesen at windriver.com; He, Yongli; Itzik Brown
>> > *Cc:* OpenStack Development Mailing List; Brian Bowen (brbowen); Kyle
>> > Mestery (kmestery); Sandhya Dasu (sadasu)
>> > *Subject:* Re: [openstack-dev] [nova] [neutron] PCI pass-through network
>> > support
>> >
>> >
>> >
>> > Hi Irena,
>> >
>> >
>> >
>> > Thank you very much for your comments. See inline.
>> >
>> >
>> >
>> > --Robert
>> >
>> >
>> >
>> > On 10/27/13 3:48 AM, "Irena Berezovsky" <irenab at mellanox.com> wrote:
>> >
>> >
>> >
>> > Hi Robert,
>> >
>> > Thank you very much for sharing the information regarding your
>> > efforts. Can you please share your idea of the end-to-end flow? How
>> > do you suggest binding Nova and Neutron?
>> >
>> >
>> >
>> > The end-to-end flow is actually encompassed in the blueprints in a
>> > nutshell. I will reiterate it below. The binding between Nova and
>> > Neutron occurs with the neutron v2 API that nova invokes in order to
>> > provision the neutron services. The vif driver is responsible for
>> > plugging an instance into the networking setup that neutron has
>> > created on the host.
>> >
>> >
>> >
>> > Normally, one will invoke the "nova boot" api with the -nic option to
>> > specify the nic with which the instance will be connected to the
>> > network. It currently allows net-id, fixed ip and/or port-id to be
>> > specified for the option. However, it doesn't allow one to specify
>> > special networking requirements for the instance. Thanks to the nova
>> > pci-passthrough work, one can specify PCI passthrough device(s) in the
>> > nova flavor. But it doesn't provide means to tie up these PCI devices,
>> > in the case of ethernet adapters, with networking services. Therefore
>> > the idea is actually simple, as indicated by the blueprint titles: to
>> > provide means to tie up SRIOV devices with neutron services. A work
>> > flow would roughly look like this for 'nova boot':
>> >
>> >
>> >
>> > -- Specify networking requirements in the -nic option.
>> > Specifically for SRIOV, allow the following to be specified in addition
>> > to the existing required information:
>> >
>> > . PCI alias
>> >
>> > . direct pci-passthrough/macvtap
>> >
>> > . port profileid that is compliant with 802.1Qbh
>> >
>> > The above information is optional. In its absence, the existing
>> > behavior remains.
>> >
>> > -- If special networking requirements exist, the Nova api creates PCI
>> > requests in the nova instance type for scheduling purposes.
>> >
>> > -- The Nova scheduler schedules the instance based on the requested
>> > flavor plus the PCI requests that are created for networking.
>> >
>> > -- Nova compute invokes neutron services with the PCI passthrough
>> > information, if any.
>> >
>> > -- Neutron performs its normal operations based on the request,
>> > such as allocating a port, assigning ip addresses, etc. Specific to
>> > SRIOV, it should validate information such as the profileid, and store
>> > it in its db. It's also possible to associate a port profileid with a
>> > neutron network so that the port profileid becomes optional in the -nic
>> > option. Neutron returns nova the port information, especially the PCI
>> > passthrough related information in the port binding object. Currently,
>> > the port binding object contains the following information:
>> >
>> > binding:vif_type
>> > binding:host_id
>> > binding:profile
>> > binding:capabilities
>> >
>> > -- Nova constructs the domain xml and plugs in the instance by
>> > calling the vif driver. The vif driver can build up the interface xml
>> > based on the port binding information.
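That last step can be sketched with the standard library's ElementTree. This is an illustrative vif-driver fragment, not nova's actual code, and the keys assumed inside the binding dict (the pci address fields plus `profileid`) are assumptions for illustration:

```python
import xml.etree.ElementTree as ET

# Hypothetical sketch of a vif driver building the libvirt <interface>
# element (as in the 802.1Qbh example earlier in the thread) from port
# binding information returned by neutron.

def build_interface_xml(binding_profile, mac):
    """Render a hostdev <interface> element from assumed binding keys."""
    iface = ET.Element("interface", type="hostdev", managed="yes")
    source = ET.SubElement(iface, "source")
    ET.SubElement(source, "address", {
        "type": "pci",
        "domain": binding_profile["domain"],
        "bus": binding_profile["bus"],
        "slot": binding_profile["slot"],
        "function": binding_profile["function"],
    })
    ET.SubElement(iface, "mac", address=mac)
    vport = ET.SubElement(iface, "virtualport", type="802.1Qbh")
    ET.SubElement(vport, "parameters", profileid=binding_profile["profileid"])
    return ET.tostring(iface, encoding="unicode")

xml = build_interface_xml(
    {"domain": "0x0000", "bus": "0x09", "slot": "0x0",
     "function": "0x01", "profileid": "my-port-profile"},
    "01:23:45:67:89:ab",
)
print('profileid="my-port-profile"' in xml)  # True
```

The interesting design point is that everything the driver needs is carried in the port binding object, so no extra round-trip to neutron is required at plug time.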
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> > The blueprints you registered make sense. On the Nova side, there is
>> > a need to bind between the requested virtual network and the PCI
>> > device/interface to be allocated as a vNIC.
>> >
>> > On the Neutron side, there is a need to support networking
>> > configuration of the vNIC. Neutron should be able to identify the
>> > PCI device/macvtap interface in order to apply configuration. I
>> > think it makes sense to provide neutron integration via a dedicated
>> > Modular Layer 2 mechanism driver to allow PCI pass-through vNIC
>> > support along with other networking technologies.
>> >
>> >
>> >
>> > I haven't sorted through this yet. A neutron port could be associated
>> > with a PCI device or not, which is a common feature, IMHO. However, an
>> > ML2 driver may be needed specific to a particular SRIOV technology.
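For what such a dedicated mechanism driver might look like, here is a standalone skeleton. A real driver would subclass neutron's ML2 MechanismDriver and its hooks receive context objects rather than plain dicts, so everything below is shape-only and assumed, not existing neutron code:

```python
# Standalone sketch of an SR-IOV ML2 mechanism driver. Method names echo
# ML2's precommit/postcommit hook convention, but the plain-dict arguments
# and return values are simplifications for illustration.

class SriovMechanismDriverSketch:
    SUPPORTED_VNIC_TYPES = ("direct", "macvtap")

    def create_port_precommit(self, port):
        """Validate SR-IOV specific attributes before the port is stored."""
        vnic_type = port.get("profile", {}).get("vnic_type")
        if vnic_type and vnic_type not in self.SUPPORTED_VNIC_TYPES:
            raise ValueError("unsupported vnic_type: %s" % vnic_type)

    def create_port_postcommit(self, port):
        """Apply VLAN/MAC configuration on the VF after commit (stubbed)."""
        return {"configured": port["id"]}

driver = SriovMechanismDriverSketch()
driver.create_port_precommit({"id": "p1", "profile": {"vnic_type": "direct"}})
print(driver.create_port_postcommit({"id": "p1"}))  # {'configured': 'p1'}
```

The split mirrors the point made above: validation happens alongside the common port logic, while the SR-IOV-specific VF configuration stays isolated in one driver.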
>> >
>> >
>> >
>> >
>> >
>> > During the Havana release, we introduced the Mellanox Neutron plugin
>> > that enables networking via SRIOV pass-through devices or macvtap
>> > interfaces.
>> >
>> > We want to integrate our solution with the PCI pass-through Nova
>> > support. I will be glad to share more details if you are interested.
>> >
>> >
>> >
>> >
>> >
>> > Good to know that you already have a SRIOV implementation. I found
>> > out some information online about the mlnx plugin, but need more time
>> > to get to know it better. And certainly I'm interested in knowing its
>> > details.
>> >
>> >
>> >
>> > The PCI pass-through networking support is planned to be discussed
>> > during the summit: http://summit.openstack.org/cfp/details/129. I
>> > think it's worth drilling down into a more detailed proposal and
>> > presenting it during the summit, especially since it impacts both the
>> > nova and neutron projects.
>> >
>> >
>> >
>> > I agree. Maybe we can steal some time in that discussion.
>> >
>> >
>> >
>> > Would you be interested in collaborating on this effort? Would you
>> > be interested in exchanging more emails or setting up an IRC/WebEx
>> > meeting during this week before the summit?
>> >
>> >
>> >
>> > Sure. If folks want to discuss it before the summit, we can schedule a
>> > webex later this week. Otherwise, we can continue the discussion over
>> > email.
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> > Regards,
>> >
>> > Irena
>> >
>> >
>> >
>> > *From:* Robert Li (baoli) [mailto:baoli at cisco.com]
>> > *Sent:* Friday, October 25, 2013 11:16 PM
>> > *To:* prashant.upadhyaya at aricent.com; Irena Berezovsky;
>> > yunhong.jiang at intel.com; chris.friesen at windriver.com;
>> > yongli.he at intel.com
>> > *Cc:* OpenStack Development Mailing List; Brian Bowen (brbowen);
>> > Kyle Mestery (kmestery); Sandhya Dasu (sadasu)
>> > *Subject:* Re: [openstack-dev] [nova] [neutron] PCI pass-through
>> > network support
>> >
>> >
>> >
>> > Hi Irena,
>> >
>> >
>> >
>> > This is Robert Li from Cisco Systems. Recently, I was tasked to
>> > investigate such support for Cisco's systems that support VM-FEX,
>> > which is a SRIOV technology supporting 802.1Qbh. I was able to bring
>> > up nova instances with SRIOV interfaces, and establish networking
>> > between the instances that employ the SRIOV interfaces. Certainly,
>> > this was accomplished with hacking and some manual intervention.
>> > Based on this experience and my study of the two existing nova
>> > pci-passthrough blueprints that have been implemented and committed
>> > into Havana
>> > (https://blueprints.launchpad.net/nova/+spec/pci-passthrough-base and
>> > https://blueprints.launchpad.net/nova/+spec/pci-passthrough-libvirt), I
>> > registered a couple of blueprints (one on the Nova side, the other on
>> > the Neutron side):
>> >
>> >
>> >
>> >
>> > https://blueprints.launchpad.net/nova/+spec/pci-passthrough-sriov
>> > https://blueprints.launchpad.net/neutron/+spec/pci-passthrough-sriov
>> >
>> >
>> >
>> > in order to address SRIOV support in openstack.
>> >
>> >
>> >
>> > Please take a look at them and see if they make sense, and let me
>> > know of any comments or questions. We can also discuss this at the
>> > summit, I suppose.
>> >
>> >
>> >
>> > I noticed that there is another thread on this topic, so I am copying
>> > those folks from that thread as well.
>> >
>> >
>> >
>> > thanks,
>> >
>> > Robert
>> >
>> >
>> >
>> > On 10/16/13 4:32 PM, "Irena Berezovsky" <irenab at mellanox.com> wrote:
>> >
>> >
>> >
>> > Hi,
>> >
>> > One of the next steps for PCI pass-through I would like to discuss
>> > is the support for PCI pass-through vNICs.
>> >
>> > While nova takes care of PCI pass-through device resource
>> > management and VIF settings, neutron should manage their
>> > networking configuration.
>> >
>> > I would like to register a summit proposal to discuss the support
>> > for PCI pass-through networking.
>> >
>> > I am not sure what would be the right topic under which to discuss
>> > PCI pass-through networking, since it involves both nova and
>> > neutron.
>> >
>> > There is already a session registered by Yongli on the nova topic to
>> > discuss the PCI pass-through next steps.
>> >
>> > I think PCI pass-through networking is quite a big topic and it is
>> > worth having a separate discussion.
>> >
>> > Are there any other people who are interested in discussing it and
>> > sharing their thoughts and experience?
>> >
>> >
>> >
>> > Regards,
>> >
>> > Irena
>> >
>> >
>> >
>> >
>> >
>> > _______________________________________________
>> > OpenStack-dev mailing list
>> > OpenStack-dev at lists.openstack.org
>> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>> >
>>
>
>